rvest read_html causes a core dump for some links

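Problem description

See the code below. For some links, it causes R to abort with a core dump. I have included one example link, which may help with debugging.

The operating system is Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-75-generic x86_64).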

>R

R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> packageVersion("rvest")
[1] ‘0.3.2’
> link <- 'https://www.xerox.com/en-us/digital-printing/custom-print-production'
> library(rvest)
Loading required package: xml2
> result <- read_html(link)
*** %n in writable segment detected ***
Aborted (core dumped)

Notes:

  1. The same link works fine on a Mac.

  2. Many other links work fine on Ubuntu.

Tags: r, rvest

Solution


The installed xml2 was outdated.

packageVersion("xml2") returned '1.1.1'. Once I updated to 1.2.0, the error went away.
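
The version check and update described here could be sketched in R roughly as follows. This is only an illustration, not a confirmed fix for every setup: `needs_update` and `update_xml2_if_old` are hypothetical helper names, and the threshold of 1.2.0 is simply the version reported above to make the error disappear.

```r
# Hypothetical helper: TRUE when the installed version is older than the
# minimum. package_version() compares components numerically, so "1.10.0"
# is treated as newer than "1.2.0".
needs_update <- function(installed, minimum = "1.2.0") {
  package_version(installed) < package_version(minimum)
}

# Hypothetical helper: reinstall xml2 from the configured repository only
# when the installed copy is older than 1.2.0 (assumption: xml2 is already
# installed, as it is when rvest loads).
update_xml2_if_old <- function() {
  installed <- as.character(utils::packageVersion("xml2"))
  if (needs_update(installed)) {
    install.packages("xml2")  # restart R afterwards so the new version loads
  }
  invisible(installed)
}
```

Calling `update_xml2_if_old()`, restarting R, and then retrying `read_html(link)` would mirror the steps in the answer. The comparison is kept in its own function so it can be checked without touching the installed packages.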

