Parsing raw response data from a POST request (R)

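Problem description

I am trying to scrape the table of recent events from the following link: https://www.tapology.com/fightcenter

When you visit the link, the table shows upcoming events, so you have to click the schedule menu and change the option to "Results".

I have captured what appears to be the raw data in the variable resp (see below), but I don't know what language that code is written in or how to parse it.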

library(httr)

url <- "https://www.tapology.com/fightcenter_events"

fd <- list(
  group = "all",
  region = "",
  schedule = "results",
  sport = "all"
)

postdata <- POST(url = url, query = fd, encode = "form",
                 add_headers(
                   "Accept" = "text/javascript, application/javascript, application/ecmascript, application/x-ecmascript, */*; q=0.01",
                   "Content-Type" = "application/x-www-form-urlencoded; charset=UTF-8",
                   "Cookie" = "_ga=GA1.2.1873043703.1537368153; __utmz=88071069.1563301531.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); remember_id=149246; remember_token=315e68b7a95fa6cda391fc3e2ae0e1fb1466335ed9a15480558bd4ef8d52d832; __utmc=88071069; __utma=88071069.1873043703.1537368153.1563983348.1563985208.3; _tapology_mma_session=Z2RWaU1XZ0hOQmIwcUhjN1Bac0twN0JZQktnVUlLUjVsVkdMMDR4bTBITGdnSDFlRW9WeHprQ2lRaWdJM0lRbW5PNTFYSG9kbVlaMWFlR3liZmEyZWhnRWVVNm03UVIwRUJLWHl1MmJXRlQ1dEFJTGJsTnVLQWx4MWpUMTJOYlBxQ1N1Y0pQREZlZTNzMDA0NTJINEpLS2FMNXZvaXZjQ3g2dFMzM1dJeTRmekc4TG5JTk9YZDlZdWx5WnpZd3luZlY1ZXliQ0RWS1B1aXJYQnpqVVp4UT09LS10am5XNVI0c0pXa2p1dHJ5OW9PME5nPT0%3D--7488fef85f733279f15da594ea47f0345aa16938",
                   "Host" = "www.tapology.com",
                   "Origin" = "https://www.tapology.com",
                   "User-Agent" = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36",
                   "Referer" = "https://www.tapology.com/fightcenter",
                   "X-CSRF-Token" = "NS9M1Y5RMShdIfFaIKpYiqr+JuOZ8kwZvn9KSW7daZmgT9eJ4Q0ZyGLZSUHR4wjCdiE840HcQzLHHZSe0WgVJw==",
                   "X-Requested-With" = "XMLHttpRequest"
                   )
)

resp <- content(postdata, "text")

substr(resp, 1, 200)
[1] "$(\".fightcenterEvents\").html(\"<h3>\\n<span>Event Results<\\/span>\\n<span class=\\'moreLink\\'>  <nav class=\\\"pagination\\\" role=\\\"navigation\\\" aria-label=\\\"pager\\\">\\n    \\n    \\n        <span class=\\\"page "

Tags: r

Solution
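
The first 200 characters of resp suggest that the response is JavaScript rather than HTML or JSON: a jQuery call of the form $(".fightcenterEvents").html("...") whose argument is an HTML fragment written as an escaped string literal. The following is a minimal sketch, assuming the whole response has that shape, of how the HTML might be recovered using base R only. The names resp_sample, unescape_js and extract_js_html are hypothetical, and resp_sample is a short made-up stand-in for the real response, not data from the site.

```r
# Hypothetical stand-in for `resp`; the real response is much longer.
resp_sample <- "$(\".fightcenterEvents\").html(\"<h3>\\n<span>Event Results<\\/span>\\n<nav class=\\\"pagination\\\"><\\/nav>\\n<\\/h3>\");"

# Undo JavaScript string escapes in a single left-to-right pass.
# Handles \n, \t, \r and "backslash + any other character" (e.g. \" \' \/ \\).
# \uXXXX escapes are NOT handled in this sketch.
unescape_js <- function(s) {
  m <- gregexpr("\\\\.", s, perl = TRUE)   # each backslash plus the next character
  esc <- regmatches(s, m)[[1]]
  if (length(esc) == 0) return(s)          # nothing to unescape
  ch <- substring(esc, 2)                  # the character after the backslash
  special <- c(n = "\n", t = "\t", r = "\r")
  repl <- ifelse(ch %in% names(special), unname(special[ch]), ch)
  regmatches(s, m) <- list(repl)
  s
}

# Pull the string literal out of `.html("...")` and unescape it.
# Assumption: the response contains one such call; the greedy match runs
# from the first `.html("` to the last `")` in the text.
extract_js_html <- function(js) {
  m <- regmatches(js, regexec('(?s)\\.html\\("(.*)"\\)', js, perl = TRUE))[[1]]
  if (length(m) < 2) stop("no .html(\"...\") call found in the response")
  unescape_js(m[2])
}

html <- extract_js_html(resp_sample)
cat(html, "\n")
```

With the real response, extract_js_html(resp) would take the place of the sample call. The resulting string could then be passed to an HTML parser such as xml2::read_html(); which nodes hold the event rows would have to be checked against the actual markup.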

