Importing JSON data from a SQL DB into an R data frame

Question

I would like to know whether there is a way to import JSON data from a MySQL database into an R data frame.

I have a table like this:

id  created_at   json
1   2020-07-01   {"name":"Dent, Arthur","group":"Green","age (y)":43,"height (cm)":187,"wieght (kg)":89,"sensor":34834834}
2   2020-07-01   {"name":"Doe, Jane","group":"Blue","age (y)":23,"height (cm)":172,"wieght (kg)":67,"sensor":12342439}
3   2020-07-01   {"name":"Curt, Travis","group":"Red","age (y)":13,"height (cm)":128,"wieght (kg)":47,"sensor":83287699}

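I want to fetch the "id" and "json" columns. I am using the RMySQL package to pull the data from the database into an R data frame, but this only gives me the "id" column; the "json" column contains nothing but NA in every row.

Is there a way to import/load the data so that the json column is populated, and possibly to extract the "sensor" part of the JSON value?

The result would be a data frame (df) like this: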

id   json
1    {"name":"Dent, Arthur","group":"Green","age (y)":43,"height (cm)":187,"wieght (kg)":89,"sensor":34834834}
2    {"name":"Doe, Jane","group":"Blue","age (y)":23,"height (cm)":172,"wieght (kg)":67,"sensor":12342439}
3    {"name":"Curt, Travis","group":"Red","age (y)":13,"height (cm)":128,"wieght (kg)":47,"sensor":83287699}

or, with the extracted value:

id   sensor
1    "sensor":34834834
2    "sensor":12342439
3    "sensor":83287699

Many thanks for any suggestions.

Tags: sql, r, json, dataframe

Solution


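Use unnest_wider from tidyr:

library(dplyr)

# Connect to the MySQL database (database name, credentials and host are placeholders)
con <- DBI::dbConnect(RMySQL::MySQL(), 'db_name', user = 'user', password = 'pass', host = 'hostname')

# Lazy reference to the table; no rows are fetched yet
t <- tbl(con, 'table_name')

t %>%
  # pull the rows into a local tibble
  collect() %>%
  # parse each JSON string into a list; transmute() keeps only j, so id is dropped
  transmute(j = purrr::map(json, jsonlite::fromJSON)) %>%
  # spread the parsed lists into one column per JSON key
  tidyr::unnest_wider(j)

DBI::dbDisconnect(con)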

Result:

# A tibble: 3 x 6
  name         group `age (y)` `height (cm)` `wieght (kg)`   sensor
  <chr>        <chr>     <int>         <int>         <int>    <int>
1 Dent, Arthur Green        43           187            89 34834834
2 Doe, Jane    Blue         23           172            67 12342439
3 Curt, Travis Red          13           128            47 83287699
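
The question also asks for a data frame with only id and the sensor value. A minimal sketch of that step is shown below; it assumes the rows are already in a local data frame, so df here is a hand-built stand-in with the values from the question rather than the result of a real query, and sensors is just an illustrative name.

```r
library(dplyr)

# Stand-in for the rows fetched from the database (values copied from the question)
df <- tibble::tibble(
  id = 1:3,
  json = c(
    '{"name":"Dent, Arthur","group":"Green","age (y)":43,"height (cm)":187,"wieght (kg)":89,"sensor":34834834}',
    '{"name":"Doe, Jane","group":"Blue","age (y)":23,"height (cm)":172,"wieght (kg)":67,"sensor":12342439}',
    '{"name":"Curt, Travis","group":"Red","age (y)":13,"height (cm)":128,"wieght (kg)":47,"sensor":83287699}'
  )
)

# Parse each JSON string and keep only its "sensor" field next to id
sensors <- df %>%
  mutate(sensor = purrr::map_int(json, ~ jsonlite::fromJSON(.x)$sensor)) %>%
  select(id, sensor)

print(sensors)
```

Using mutate() instead of transmute() is what keeps the id column; select() then drops the raw JSON text.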

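If you only want to retrieve data from the past 24 hours (as the OP requested), replace the tbl(con, 'table_name') statement with the query below. It compares created_at directly, because wrapping the column in DATE() discards the time of day and can exclude rows that are less than 24 hours old:

t <- DBI::dbGetQuery(con, 'SELECT * FROM `table_name` WHERE `created_at` > NOW() - INTERVAL 1 DAY')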
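
The question reports that the json column arrives in R as NA. One possible cause, which is an assumption here and not something the thread confirms, is that the column uses MySQL's native JSON type and the driver does not convert it. If so, casting the column to text inside the query may help, and the sensor value can also be extracted on the server. The sketch below assumes MySQL 5.7 or later; json_query and fetch_json_rows are hypothetical names, and table_name is a placeholder.

```r
# Query that returns the JSON column as plain text and extracts "sensor" server-side.
# Assumption: the `json` column has MySQL's JSON type (MySQL >= 5.7).
json_query <- paste(
  "SELECT `id`,",
  "       CAST(`json` AS CHAR) AS `json`,",
  "       JSON_UNQUOTE(JSON_EXTRACT(`json`, '$.sensor')) AS `sensor`",
  "FROM `table_name`",
  sep = "\n"
)

# Hypothetical helper: `con` is an open DBI connection such as the one created above
fetch_json_rows <- function(con) {
  DBI::dbGetQuery(con, json_query)
}
```

Calling fetch_json_rows(con) should return id, json and sensor as ordinary columns. The sensor column comes back as text in this sketch, so as.integer() may be needed before using it as a number.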
