r - Convert JSON to data.frame using tidyjson
Problem description
I have a JSON file that I want to convert to a table. This is easy to do with the jsonlite library. However, if the file is big, the conversion takes a significant amount of time, so I am testing tidyjson in the hope of speeding up the process.
My JSON file looks as follows:
x = '[
  {
    "id": 1,
    "A": [
      {
        "B": "b1",
        "C": [
          "c1"
        ]
      }
    ]
  },
  {
    "id": 2,
    "A": [
      {
        "B": "b1",
        "C": [
          "c2"
        ]
      }
    ]
  }
]
'
That's how I process it:
library(tidyjson)
library(dplyr)
x %>% gather_array() %>%
  spread_values(id = jstring("id")) %>%
  enter_object("A") %>% gather_array() %>%
  spread_values(B = jstring("B")) %>%
  enter_object("C") %>% gather_array() %>%
  spread_values(C = jstring("C")) %>%
  select(id, B, C)
Outcome I get:
  ..JSON     id    B     C
  <chr>      <chr> <chr> <chr>
1 "\"c1\""   1     b1    NA
2 "\"c2\""   2     b1    NA
Cannot figure out what is wrong with the code and why it doesn’t work well for C. Any help is much appreciated.
UPDATE: Expected output:
  id    B     C
  <chr> <chr> <chr>
1 1     b1    c1
2 2     b1    c2
UPDATE 2: the jsonlite way:
y = jsonlite::fromJSON(x)
cbind(id = y$id, do.call(rbind.data.frame, y$A))
  id  B  C
1  1 b1 c1
2  2 b1 c2
I am not sure that this is the fastest way of using jsonlite in this case.
Solution
We may use fromJSON from jsonlite
library(jsonlite)
library(tidyr)
library(dplyr)
fromJSON(x) %>%
  unnest_wider(A) %>%
  unnest(C) %>%
  unnest(C)
-output
# A tibble: 2 x 3
     id B     C
  <int> <chr> <chr>
1     1 b1    c1
2     2 b1    c2
Or another option is
library(reticulate)
library(rrapply)
py_run_string(paste0("x = ", x))
rrapply(py$x, how = 'bind')
  A.1.C A.1.B id
1    c1    b1  1
2    c2    b1  2
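As for why the tidyjson pipeline in the question returns NA for C: after enter_object("C") %>% gather_array(), each row's JSON is already the bare string ("c1" or "c2"), not an object, so spread_values(C = jstring("C")) looks for a key named "C" that does not exist at that level. A minimal sketch of one possible fix, assuming a tidyjson version that provides append_values_string(), is to capture the value at the current level instead. The variable name res is only for illustration.

```r
library(tidyjson)
library(dplyr)

# Same JSON as in the question
x <- '[
  {"id": 1, "A": [{"B": "b1", "C": ["c1"]}]},
  {"id": 2, "A": [{"B": "b1", "C": ["c2"]}]}
]'

res <- x %>%
  gather_array() %>%                      # one row per top-level object
  spread_values(id = jstring("id")) %>%   # id is read as a string, as in the question
  enter_object("A") %>%
  gather_array() %>%                      # one row per element of A
  spread_values(B = jstring("B")) %>%
  enter_object("C") %>%
  gather_array() %>%                      # one row per element of C (a bare string)
  append_values_string("C") %>%           # store the string at this level in column C
  select(id, B, C)

res
```

Depending on the tidyjson version, the result may still carry the ..JSON column; wrapping it in as.data.frame() or tibble::as_tibble() is one way to drop it. Whether this is faster than jsonlite on a large file would need to be measured.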