Getting a table with rvest

Problem description

I am trying to scrape a table.

Tags: web-scraping, rvest

Solution


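The token needs to be sent as a request header, x-xsrf-token, rather than passed as a query parameter. The token value can also change from session to session, so you need to read it from the cookies of a fresh session and URL-decode it. After that, convert the returned data to a data frame to get the result: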

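library(rvest)

# Open a session on the page so the server sets the XSRF-TOKEN cookie.
# Note: html_session() was renamed session() in rvest 1.0.0.
pg <- html_session("https://www.barchart.com/options/stocks-by-sector?page=1")

# Look up the token in the session cookies; the cookie value is percent-encoded.
cookies <- pg$response$cookies
token <- URLdecode(cookies$value[cookies$name == "XSRF-TOKEN"])

# Call the JSON endpoint, sending the token as the x-xsrf-token header.
# rvest:::request_GET is an internal, unexported function, so it may not
# exist in other rvest versions.
pg <- 
  pg %>% rvest:::request_GET(
    "https://www.barchart.com/proxies/core-api/v1/quotes/get?lists=stocks.optionable.by_sector.all.us&fields=symbol%2CsymbolName%2ClastPrice%2CpriceChange%2CpercentChange%2ChighPrice%2ClowPrice%2Cvolume%2CtradeTime%2CsymbolCode%2CsymbolType%2ChasOptions&orderBy=symbol&orderDir=asc&meta=field.shortName%2Cfield.type%2Cfield.description&hasOptions=true&page=1&limit=1000000&raw=1",
    config = httr::add_headers(`x-xsrf-token` = token)
  )

# Parse the JSON body and bind the "raw" part of each record into one data frame.
data_raw <- httr::content(pg$response)
data <- 
  purrr::map_dfr(
    data_raw$data,
    function(x){
      as.data.frame(x$raw)
    }
  )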
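
The cookie lookup step can also be tried in isolation, without touching the network. The following is only a minimal sketch: the helper name extract_xsrf_token and the cookie values are made up for illustration, and it assumes the cookies come as a data frame with name and value columns, like the one in pg$response$cookies.

```r
# Hypothetical helper (not part of rvest or httr): pull the XSRF token
# out of a cookie table and URL-decode it.
# `cookies` is assumed to be a data frame with `name` and `value` columns.
extract_xsrf_token <- function(cookies, cookie_name = "XSRF-TOKEN") {
  value <- cookies$value[cookies$name == cookie_name]
  if (length(value) == 0) {
    stop("Cookie not found: ", cookie_name)
  }
  # Cookie values are percent-encoded; decode before using as a header value.
  utils::URLdecode(value[[1]])
}

# Made-up cookie values, for illustration only.
example_cookies <- data.frame(
  name = c("laravel_session", "XSRF-TOKEN"),
  value = c("abc123", "eyJpdiI6IkFCQyJ9%3D"),
  stringsAsFactors = FALSE
)
example_token <- extract_xsrf_token(example_cookies)
```

Failing loudly when the cookie is missing makes it easier to notice that the site has changed its cookie name, rather than silently sending an empty header.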
