Nested dictionary in list to DataFrame (Python)

Problem description

I have JSON input from an API:

{
  "api_info": {
    "status": "healthy"
  },
  "items": [
    {
      "timestamp": "time", 
      "stock_data": [
        {
          "ticker": "string",
          "industry": "string",
          "Description": "string"
        }
      ],
      "ISIN": xxx,
      "update_datetime": "time"
    }
  ]
}

I initially ran:

apiRawData = requests.get(url).json()['items']

Then I ran the json_normalize method:

apiExtractedData = pd.json_normalize(apiRawData,'stock_data',errors='ignore')

Here is the initial output, where stock_data is still contained within a list:

                                             stock_data  ISIN  update_datetime
0  [{'description': 'zzz', 'industry': 'C', 'ticker...]   123             time

What I would like to achieve is a DataFrame with the following headers and corresponding rows:

  description industry ticker  ISIN update_datetime
0       'zzz'      'C'    xxx   123            time

Please direct me to an existing answered question if there is one :) Cheers.

Tags: python-3.x, dataframe, json-normalize

Solution


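I think you can convert your existing DataFrame into the expected one with the code below. It reads each field from the first element of the stock_data list, so it assumes every list holds exactly one record: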

apiExtractedData['description'] = apiExtractedData['stock_data'].apply(lambda x: x[0]['description'])
apiExtractedData['industry'] = apiExtractedData['stock_data'].apply(lambda x: x[0]['industry'])
apiExtractedData['ticker'] = apiExtractedData['stock_data'].apply(lambda x: x[0]['ticker'])

Then drop the stock_data column:

apiExtractedData = apiExtractedData.drop(columns=['stock_data'])
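
As a possible alternative, json_normalize can flatten the nested list directly when it is given both a record_path and a meta list, which avoids the per-column apply calls and also handles stock_data lists with more than one entry. The sketch below assumes pandas 1.0 or later (where pd.json_normalize is available) and uses a hypothetical api_raw_data sample with made-up values that mirror the structure of the 'items' list in the question; it is not the real API response.

```python
import pandas as pd

# Hypothetical sample shaped like the 'items' list from the question
# (values are placeholders, not real API data).
api_raw_data = [
    {
        "timestamp": "time",
        "stock_data": [
            {"ticker": "xxx", "industry": "C", "description": "zzz"}
        ],
        "ISIN": 123,
        "update_datetime": "time",
    }
]

# record_path names the nested list to expand into one row per element;
# meta names the parent-level fields to repeat on each of those rows.
flat = pd.json_normalize(
    api_raw_data,
    record_path="stock_data",
    meta=["ISIN", "update_datetime"],
    errors="ignore",  # fill missing meta keys with NaN instead of raising
)

# Reorder the columns to match the layout requested in the question.
flat = flat[["description", "industry", "ticker", "ISIN", "update_datetime"]]
print(flat)
```

With the real data, api_raw_data would be replaced by the list returned from requests.get(url).json()['items'].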
