R converting FB json friend list to data frame

Problem description

I have a JSON file that I'm trying to convert to a data frame. The file follows the pattern shown below. (It comes from FB: you can download your entire friend list/profile in HTML or JSON format.)

  {
    "friends": [
      {
        "name": "Archie Andrews",
        "timestamp": 1539780292
      },
      {
        "name": "Betty Cooper",
        "timestamp": 1539005874
      },
      {
        "name": "Veronica Lodge",
        "timestamp": 1537680925
      },
      {
        "name": "Sabrina Spellman",
        "timestamp": 1381680968,
        "contact_info": "creepyhouse@666.com"
      }
    ]
  }

In general, I'm able to convert this into a dataframe with 2 columns (name, timestamp) using this code:

  library(rjson)
  friends <- fromJSON(file = "xxx.json")
  data_frame <- data.frame(matrix(unlist(friends), nrow = lengths(friends)+1, byrow = T), stringsAsFactors = FALSE)

However, the annoying thing is when a record has contact_info, as in Sabrina's case. That value gets extracted too, so every value after it shifts out of its column and the arrangement is thrown off. Hence the need for nrow = lengths(friends)+1

Archie Andrews      1539780292
Betty Cooper        1539005874
Veronica Lodge      1537680925
Sabrina Spellman    1381680968
creepyhouse@666.com Jughead Jones
1343582935          Midge Klump

Is there a way that when extracting the lists into 2 columns, for every list I'll just take the first 2 elements (name, timestamp)? Ultimately, I don't care for the contact_info and I just want to have a 2-column dataframe.

Tags: rjsondataframe

Solution


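If I understand your question correctly, you can simply drop the unwanted columns afterwards. Note that jsonlite::read_json and jsonlite::fromJSON, with data-frame simplification enabled, convert the xxx.json file to a list object whose first element is a data.frame. You can extract that element from the list with the [[ subsetting operator.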

df <- jsonlite::read_json(path = "test.json", simplifyDataFrame = TRUE)[[1]] ## note the "[[" subsetting operator

df <- df[, c("name", "timestamp")] ## select the columns as desired

Result:

> df
              name  timestamp
1   Archie Andrews 1539780292
2     Betty Cooper 1539005874
3   Veronica Lodge 1537680925
4 Sabrina Spellman 1381680968
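
If you would rather stay with rjson, as in the question, one possible alternative is to pick the two fields by name from each record instead of flattening everything with unlist. The following is only a sketch under a few assumptions: the hand-built `friends` list is a stand-in that is assumed to mirror the nested list rjson's fromJSON returns for this file, `friends_df` is an illustrative name, and every record is assumed to contain both name and timestamp.

```r
# Stand-in for friends <- fromJSON(file = "xxx.json"); assumed to mirror the
# parsed structure: a list with one element, "friends", holding one list per friend.
friends <- list(
  friends = list(
    list(name = "Archie Andrews", timestamp = 1539780292),
    list(name = "Betty Cooper", timestamp = 1539005874),
    list(name = "Veronica Lodge", timestamp = 1537680925),
    list(name = "Sabrina Spellman", timestamp = 1381680968,
         contact_info = "creepyhouse@666.com")
  )
)

# Extract each field by name, so extra fields such as contact_info
# never enter the result and cannot shift the columns.
friends_df <- data.frame(
  name = vapply(friends$friends,
                function(x) as.character(x[["name"]]), character(1)),
  timestamp = vapply(friends$friends,
                     function(x) as.numeric(x[["timestamp"]]), numeric(1)),
  stringsAsFactors = FALSE
)
```

Because vapply requires exactly one value per record, this would stop with an error on a record that lacks name or timestamp, rather than silently misaligning rows the way the matrix(unlist(...)) approach does.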
