Is there a way to convert CSV-formatted text embedded in a website to an actual .csv file to use read.csv() on?

Question

On baseball-reference.com, the tables can be switched to a CSV view, but the site does not provide an actual .csv link. I would like to know whether the CSV-formatted text on the page can be converted to a .csv file so that it becomes a usable table.

I tried the 'rvest' package with the following code:

library(rvest)

# Los Angeles Dodgers
dodgerBatting <- read_html('https://www.baseball-reference.com/teams/LAD/2019.shtml')
# Try to select the element that holds the CSV text
dodgerCSV <- dodgerBatting %>%
  html_nodes('#csv_team_batting') %>%
  html_text()
print(head(dodgerCSV))

The result is an empty character vector, character(0).

Tags: r, rstudio

Answer


You can extract the tables on the webpage with the html_table() function from rvest.

library(rvest)
url <- "https://www.baseball-reference.com/teams/LAD/2019.shtml"

# Parse the page and convert each table rvest finds into a data frame
out_table <- url %>% read_html() %>% html_table()

This returns a list of data frames. Individual data frames can be accessed with out_table[[1]], out_table[[2]], and so on. You might need to do some cleaning before using them.
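
What "cleaning" means depends on the table, so the following is only a minimal sketch. It assumes the scraped table contains rows that repeat the header and that numeric columns arrive as text. The data frame raw_table and its column names are hypothetical stand-ins for out_table[[1]], not the actual contents of the page.

```r
# Hypothetical stand-in for out_table[[1]]; real column names and values may differ
raw_table <- data.frame(
  Name = c("Player A", "Name", "Player B"),
  HR   = c("47", "HR", "35"),
  stringsAsFactors = FALSE
)

# Drop rows that merely repeat the header text
clean_table <- raw_table[raw_table$Name != "Name", , drop = FALSE]

# Let R guess a proper type for each column (text stays character)
clean_table[] <- lapply(clean_table, type.convert, as.is = TRUE)

# Renumber the rows after filtering
rownames(clean_table) <- NULL
str(clean_table)
```

Using type.convert() with as.is = TRUE keeps text columns as character vectors instead of turning them into factors.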

If you need them in CSV format, you can write them out with the write.csv() function:

write.csv(out_table[[1]], "/path/of/the/file.csv")
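
To tie this back to read.csv(), here is a small round-trip sketch. The data frame batting, the temporary file path, and the csv_text string are all made up for illustration. It also shows that CSV-formatted text already held in a string can be parsed directly through the text argument of read.csv(), without writing a file first.

```r
# Hypothetical stand-in for one scraped table
batting <- data.frame(
  Name = c("Player A", "Player B"),
  HR   = c(47L, 35L),
  stringsAsFactors = FALSE
)

# Write to a temporary file; row.names = FALSE avoids an extra index column
csv_path <- tempfile(fileext = ".csv")
write.csv(batting, csv_path, row.names = FALSE)

# Read the file back into a data frame
batting_again <- read.csv(csv_path, stringsAsFactors = FALSE)

# CSV text held in a string can be parsed without an intermediate file
csv_text <- "Name,HR\nPlayer A,47\nPlayer B,35"
from_text <- read.csv(text = csv_text, stringsAsFactors = FALSE)
```

Without row.names = FALSE, write.csv() adds a column of row names, which then shows up as an extra unnamed column when the file is read back.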
