r - Reading a JSON file: "Error in parse_con(txt, bigint_as_char) : lexical error: invalid char in json text"
Problem description
I am running into the following problem with the jsonlite::fromJSON function. Here is a reproducible example:
tmp <- tempfile()
url <- "http://dadosabertos.c3sl.ufpr.br/curitibaurbs/2019_10_01_tabelaVeiculo.json.xz"
download.file(url, destfile =tmp,quiet = FALSE, mode = "w")
wb <- jsonlite::fromJSON(tmp)
I get this error:
wb <- jsonlite::fromJSON(tmp)
# Error in parse_con(txt, bigint_as_char) :
# lexical error: invalid char in json text.
# 4444444444444444444444444},{"COD_LINHA":"519","NOME_LINHA":"
# (right here) ------^
# In addition: Warning message:
# In readBin(3L, raw(0), 32768L) : lzma decoder corrupt data
Here is my session info:
> sessionInfo()
# R version 4.0.1 (2020-06-06)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows Server 2012 R2 x64 (build 9600)
#
# Matrix products: default
#
# locale:
# [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252 LC_MONETARY=Portuguese_Brazil.1252
# [4] LC_NUMERIC=C LC_TIME=Portuguese_Brazil.1252
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# loaded via a namespace (and not attached):
# [1] compiler_4.0.1 tools_4.0.1 jsonlite_1.7.0
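The same reproducible code works on my Linux machine, which has the following session info, but I am still not sure what causes the error on Windows.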
> sessionInfo()
# R version 4.0.2 (2020-06-22)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 18.04.4 LTS
#
# Matrix products: default
# BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
# LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#
# locale:
# [1] LC_CTYPE=pt_BR.UTF-8 LC_NUMERIC=C LC_TIME=pt_BR.UTF-8
# [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=pt_BR.UTF-8 LC_MESSAGES=en_US.UTF-8
# [7] LC_PAPER=pt_BR.UTF-8 LC_NAME=C LC_ADDRESS=C
# [10] LC_TELEPHONE=C LC_MEASUREMENT=pt_BR.UTF-8 LC_IDENTIFICATION=C
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# loaded via a namespace (and not attached):
# [1] Rcpp_1.0.5 rstudioapi_0.11 magrittr_1.5 hms_0.5.3
# [5] units_0.6-7 tidyselect_1.1.0 lattice_0.20-41 R6_2.4.1
# [9] rlang_0.4.7 stringr_1.4.0 dplyr_1.0.0 tools_4.0.2
# [13] rgdal_1.5-12 grid_4.0.2 data.table_1.13.0 KernSmooth_2.23-17
# [17] e1071_1.7-3 DBI_1.1.0 ellipsis_0.3.1 class_7.3-17
# [21] tibble_3.0.3 lwgeom_0.2-5 lifecycle_0.2.0 sf_0.9-5
# [25] crayon_1.3.4 zip_2.0.4 readr_1.3.1 purrr_0.3.4
# [29] gtfs2gps_1.2-3 vctrs_0.3.2 glue_1.4.1.9000 openxlsx_4.1.5
# [33] sp_1.4-2 stringi_1.4.6 compiler_4.0.2 pillar_1.4.6
# [37] generics_0.0.2 classInt_0.4-3 jsonlite_1.7.0 pkgconfig_2.0.3
Solution
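As ritchie-sacramento pointed out above, we can use mode = "wb" instead of mode = "w". The downloaded file is xz-compressed binary data, and on Windows writing it in text mode ("w") can corrupt it, which matches the "lzma decoder corrupt data" warning seen there and not on Linux.
See the description of the mode argument in
> help("download.file")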
mode
character. The mode with which to write the file. Useful values are "w", "wb" (binary), "a" (append) and "ab". Not used for methods "wget" and "curl". See also ‘Details’, notably about using "wb" for Windows.
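For reference, here is a minimal sketch of the corrected download step, wrapped in a small function. The name read_remote_json is a hypothetical helper made up for this illustration and is not part of jsonlite; the only change relative to the code in the question is mode = "wb".

```r
# Sketch only: `read_remote_json` is a hypothetical helper, not a jsonlite function.
read_remote_json <- function(url) {
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)  # remove the temporary file when done
  # mode = "wb" writes the downloaded bytes unchanged (binary mode).
  # With mode = "w" on Windows, the compressed file can end up corrupted.
  download.file(url, destfile = tmp, quiet = TRUE, mode = "wb")
  # Parse the downloaded file, exactly as in the question.
  jsonlite::fromJSON(tmp)
}
```

With the URL from the question, the call would be wb <- read_remote_json(url). On Linux and macOS the mode argument makes no practical difference, so using "wb" for any non-text download keeps the code portable.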