Iterating over the items of a multi-level list containing nested dictionaries and writing them to a CSV in Python

Problem description

I want to take a multi-level list that also contains sub-dictionaries and write it to a CSV with headers. My JSON looks like this:

"features": [
{
  "type": "Feature",
  "properties": {
    "xyz": 1,
    "abc": "pqr",
    "mmi": null
  },
  "geometry": {
    "type": "pt",
    "coordinates": [
      -118.8957,
      38.8607,
      5.3
    ]
  },
  "id": "abc101"
},

It should produce the output below: [image] The hierarchy shown in the image is exactly what I want to get, but I have not found a suitable solution yet.

Thanks in advance for any help with the above.

Tags: python, pandas

Solution


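I suggest using json_normalize, then set_index on all non-hierarchical columns (those without a . in the column name), and finally splitting the column names on . to get a MultiIndex: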

a = {"features": [
{
  "type": "Feature",
  "properties": {
    "xyz": 1,
    "abc": "pqr",
    "mmi": 'null'
  },
  "geometry": {
    "type": "pt",
    "coordinates": [
      -118.8957,
      38.8607,
      5.3
    ]
  },
  "id": "abc101"
},{
  "type": "Feature",
  "properties": {
    "xyz": 1,
    "abc": "pqr",
    "mmi": 'null'
  },
  "geometry": {
    "type": "pt",
    "coordinates": [
      -118.8957,
      38.8607,
      5.3
    ]
  },
  "id": "abc101"
}]}

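import pandas as pd

# pandas.io.json.json_normalize is deprecated since pandas 1.0; use pd.json_normalize
df = pd.json_normalize(a['features']).set_index(['id','type'])
# split dotted names such as 'geometry.type' into two header levels
df.columns = df.columns.str.split('.', expand=True)
print(df)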
                                 geometry      properties          
                              coordinates type        abc   mmi xyz
id     type                                                        
abc101 Feature  [-118.8957, 38.8607, 5.3]   pt        pqr  null   1
       Feature  [-118.8957, 38.8607, 5.3]   pt        pqr  null   1
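
To make the role of each step a little clearer, here is a minimal sketch on a single record. The `records` list is a made-up, shortened version of the data above, and the variable names are only for illustration. It shows that json_normalize yields dotted column names, and that after moving the undotted columns into the index, splitting on . gives a two-level header.

```python
import pandas as pd

# hypothetical, shortened sample based on the question's data
records = [
    {
        "id": "abc101",
        "type": "Feature",
        "properties": {"xyz": 1, "abc": "pqr"},
        "geometry": {"type": "pt", "coordinates": [-118.8957, 38.8607, 5.3]},
    }
]

flat = pd.json_normalize(records)
# nested keys become dotted column names such as 'properties.xyz'
flat_columns = sorted(flat.columns)

# 'id' and 'type' have no '.', so they go into the index before splitting
nested = flat.set_index(["id", "type"])
nested.columns = nested.columns.str.split(".", expand=True)

levels = nested.columns.nlevels
top_labels = sorted(set(nested.columns.get_level_values(0)))
print(flat_columns)
print(levels, top_labels)
```

If id and type were left among the columns, splitting them would produce an empty second level for those two names, which is why they are moved into the index first.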

Edit:

If you want to read the file back with a MultiIndex, it is better not to remove the duplicated first-level column labels:

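import pandas as pd

df.to_csv('test.csv')

# the first two columns hold the index, the first two rows hold the header levels
df = pd.read_csv('test.csv', index_col=[0,1], header=[0,1])
print(df)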
                                 geometry      properties        
                              coordinates type        abc mmi xyz
id     type                                                      
abc101 Feature  [-118.8957, 38.8607, 5.3]   pt        pqr NaN   1
       Feature  [-118.8957, 38.8607, 5.3]   pt        pqr NaN   1
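
One thing to be aware of when reading the file back: a CSV stores only text, so the coordinates lists come back as strings. The sketch below is not part of the original answer; it assumes the lists were written with their default Python representation and uses ast.literal_eval to turn them back into lists. An in-memory buffer stands in for test.csv, and the small frame is a made-up sample.

```python
import ast
import io

import pandas as pd

# hypothetical one-row frame with the same shape as the answer's df
df = pd.DataFrame(
    {
        ("geometry", "coordinates"): [[-118.8957, 38.8607, 5.3]],
        ("geometry", "type"): ["pt"],
    },
    index=pd.MultiIndex.from_tuples([("abc101", "Feature")], names=["id", "type"]),
)

buf = io.StringIO()  # stands in for 'test.csv'
df.to_csv(buf)
buf.seek(0)

back = pd.read_csv(buf, index_col=[0, 1], header=[0, 1])
raw = back[("geometry", "coordinates")]
# after the round trip each cell is a string like '[-118.8957, 38.8607, 5.3]'
restored = raw.apply(ast.literal_eval)
print(type(raw.iloc[0]).__name__, type(restored.iloc[0]).__name__)
```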

But if you really need to blank out the repeated first-level labels:

import numpy as np
import pandas as pd

df = pd.json_normalize(a['features']).set_index(['id','type'])
df.columns = df.columns.str.split('.', expand=True)

# first and second header levels
s = df.columns.get_level_values(0)
s1 = df.columns.get_level_values(1)
# keep the first occurrence of each top-level label, blank the repeats
s0 = np.where(s.duplicated(),'',s)
df.columns = [s0, s1]

df.to_csv('test.csv')
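
As a small illustration of the np.where step, the sketch below applies it to a hand-written top level (the labels are taken from the output above, and the variable names are only for illustration). Note that duplicated() flags every later occurrence of a label, not just consecutive ones, so this assumes columns sharing a top-level label sit next to each other.

```python
import numpy as np
import pandas as pd

# hypothetical top level of the column MultiIndex
top = pd.Index(["geometry", "geometry", "properties", "properties", "properties"])

# True for every occurrence of a label after its first one
is_repeat = top.duplicated()
# replace the repeats with an empty string, keep the first occurrence
blanked = list(np.where(is_repeat, "", top))
print(blanked)
```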
