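r - How do I extract the backend data behind a chart on a website?
Problem description
I am trying to extract the backend data for a chart at the following link: https://coronavirus.jhu.edu/map.html.
In the top-right corner you will see a yellow bubble chart that plots total coronavirus cases by country/region. I need to extract the backend data behind that chart. I did some research online and found this site: https://onlinejournalismblog.com/2017/05/10/how-to-find-data-behind-chart-map-using-inspector/. It was very helpful: I was able to get a network link, and when I open that link in the browser I see the backend data in JSON format.
I tried to extract the data in the following ways: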
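library(jsonlite)  # provides fromJSON()
library(httr)      # provides GET()
url <- "..above link.."
x <- fromJSON(url)  # attempt 1: parse the JSON straight from the URL
x <- GET(url)       # attempt 2: fetch the raw HTTP response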
Each time, I get an error.
I need to extract this chart's backend data for all countries.
Thanks in advance for your help.
## Error I get from
print(x)
Response [https://services9.arcgis.com/N9p5hsImWXAccRNI/arcgis/rest/services/Nc2JKvYFoAEOFCG5JSI6/FeatureServer/4/query?f=json&where=(Confirmed%3C%3E0)%20AND%20(Country_Region%3D%27Senegal%27)&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=OBJECTID%2CConfirmed%2CLast_Update&orderByFields=Last_Update%20asc&outSR=102100&resultOffset=0&resultRecordCount=1000&cacheHint=true]
Date: 2020-03-27 14:15
Status: 403
Content-Type: text/html
Size: 919 B
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>403 ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Request blocked.
We can't connect to the server for this app or website at this time. There might be too much traffic or a config...
<BR clear="all">
## str(x) gives me
str(x)
List of 10
$ url : chr "https://services9.arcgis.com/N9p5hsImWXAccRNI/arcgis/rest/services/Nc2JKvYFoAEOFCG5JSI6/FeatureServer/4/query?f"| __truncated__
$ status_code: int 403
$ headers :List of 9
..$ server : chr "CloudFront"
..$ date : chr "Fri, 27 Mar 2020 14:15:15 GMT"
..$ content-type : chr "text/html"
..$ content-length: chr "919"
..$ connection : chr "keep-alive"
..$ x-cache : chr "Error from cloudfront"
..$ via : chr "1.1 7ddf939a79757069f5b9d04b0ce928cf.cloudfront.net (CloudFront)"
..$ x-amz-cf-pop : chr "BLR50-C1"
..$ x-amz-cf-id : chr "obYT8TG43RpvXHq8VoWYPnSYLIdSwQgf4tFkP1oshW9xtmEkSj5AJA=="
..- attr(*, "class")= chr [1:2] "insensitive" "list"
$ all_headers:List of 1
..$ :List of 3
.. ..$ status : int 403
.. ..$ version: chr "HTTP/1.1"
.. ..$ headers:List of 9
.. .. ..$ server : chr "CloudFront"
.. .. ..$ date : chr "Fri, 27 Mar 2020 14:15:15 GMT"
.. .. ..$ content-type : chr "text/html"
.. .. ..$ content-length: chr "919"
.. .. ..$ connection : chr "keep-alive"
.. .. ..$ x-cache : chr "Error from cloudfront"
.. .. ..$ via : chr "1.1 7ddf939a79757069f5b9d04b0ce928cf.cloudfront.net (CloudFront)"
.. .. ..$ x-amz-cf-pop : chr "BLR50-C1"
.. .. ..$ x-amz-cf-id : chr "obYT8TG43RpvXHq8VoWYPnSYLIdSwQgf4tFkP1oshW9xtmEkSj5AJA=="
.. .. ..- attr(*, "class")= chr [1:2] "insensitive" "list"
$ cookies :'data.frame': 0 obs. of 7 variables:
..$ domain : logi(0)
..$ flag : logi(0)
..$ path : logi(0)
..$ secure : logi(0)
..$ expiration: 'POSIXct' num(0)
..$ name : logi(0)
..$ value : logi(0)
$ content : raw [1:919] 3c 21 44 4f ...
$ date : POSIXct[1:1], format: "2020-03-27 14:15:15"
$ times : Named num [1:6] 0 0.000159 0.000164 0.000517 0.004218 ...
..- attr(*, "names")= chr [1:6] "redirect" "namelookup" "connect" "pretransfer" ...
$ request :List of 7
..$ method : chr "GET"
..$ url : chr "https://services9.arcgis.com/N9p5hsImWXAccRNI/arcgis/rest/services/Nc2JKvYFoAEOFCG5JSI6/FeatureServer/4/query?f"| __truncated__
..$ headers : Named chr "application/json, text/xml, application/xml, */*"
.. ..- attr(*, "names")= chr "Accept"
..$ fields : NULL
..$ options :List of 2
.. ..$ useragent: chr "libcurl/7.64.1 r-curl/4.3 httr/1.4.1"
.. ..$ httpget : logi TRUE
..$ auth_token: NULL
..$ output : list()
.. ..- attr(*, "class")= chr [1:2] "write_memory" "write_function"
..- attr(*, "class")= chr "request"
$ handle :Class 'curl_handle' <externalptr>
- attr(*, "class")= chr "response"
Solution
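The output in the question shows that the 403 comes from CloudFront (`server: CloudFront`, `x-cache: Error from cloudfront`) and that the request went out with the default `libcurl/... httr/...` user agent. One thing worth trying is to send the request the way the browser does, with a browser-like `User-Agent` and a `Referer` pointing at the map page. This is only a sketch under assumptions: I can't confirm that these headers are what the server checks (the block could also be regional or rate-based), that layer `4` holds every country, or that 1000 records per page is enough. The helper names `build_query_url` and `fetch_features` are made up for illustration; the endpoint, field names, and query parameters are copied from the failing URL in the question.

```r
library(httr)      # GET(), add_headers(), user_agent(), modify_url()
library(jsonlite)  # fromJSON()

# Endpoint copied from the failing request shown in the question.
base_url <- paste0(
  "https://services9.arcgis.com/N9p5hsImWXAccRNI/arcgis/rest/services/",
  "Nc2JKvYFoAEOFCG5JSI6/FeatureServer/4/query"
)

# Hypothetical helper: build the query URL.
# Leaving the Country_Region filter out of `where` asks for all countries.
build_query_url <- function(where = "Confirmed <> 0") {
  modify_url(base_url, query = list(
    f = "json",
    where = where,
    returnGeometry = "false",
    outFields = "Country_Region,Confirmed,Last_Update",
    orderByFields = "Last_Update asc",
    resultOffset = "0",
    resultRecordCount = "1000"
  ))
}

# Hypothetical helper: fetch one page and return the attributes as a data frame.
# Assumption: browser-like headers may get past the 403; this is not guaranteed.
fetch_features <- function(url) {
  resp <- GET(
    url,
    user_agent("Mozilla/5.0"),
    add_headers(Referer = "https://coronavirus.jhu.edu/map.html")
  )
  stop_for_status(resp)  # raise an R error on 4xx/5xx instead of parsing HTML
  parsed <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
  parsed$features$attributes
}
```

Usage would be `df <- fetch_features(build_query_url())`. If the response still has status 403, `stop_for_status()` stops with an error rather than letting `fromJSON()` choke on the HTML error page. If more than 1000 rows exist, the request would have to be repeated with a larger `resultOffset`.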