How do I put this into a DataFrame in Python?

Problem description

I have a list that I got from an API:

[[{'$type': 'Tfl.Api.Presentation.Entities.Line, Tfl.Api.Presentation.Entities',
   'id': 'piccadilly',
   'name': 'Piccadilly',
   'modeName': 'tube',
   'disruptions': [],
   'created': '2019-08-20T16:25:25.35Z',
   'modified': '2019-08-20T16:25:25.35Z',
   'lineStatuses': [],
   'routeSections': [],
   'serviceTypes': [{'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Regular',
     'uri': '/Line/Route?ids=Piccadilly&serviceTypes=Regular'},
    {'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Night',
     'uri': '/Line/Route?ids=Piccadilly&serviceTypes=Night'}],
   'crowding': {'$type': 'Tfl.Api.Presentation.Entities.Crowding, Tfl.Api.Presentation.Entities'}}],
 [{'$type': 'Tfl.Api.Presentation.Entities.Line, Tfl.Api.Presentation.Entities',
   'id': 'victoria',
   'name': 'Victoria',
   'modeName': 'tube',
   'disruptions': [],
   'created': '2019-08-20T16:25:25.36Z',
   'modified': '2019-08-20T16:25:25.36Z',
   'lineStatuses': [],
   'routeSections': [],
   'serviceTypes': [{'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Regular',
     'uri': '/Line/Route?ids=Victoria&serviceTypes=Regular'},
    {'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Night',
     'uri': '/Line/Route?ids=Victoria&serviceTypes=Night'}]}]]

I want a DataFrame with these columns: id, name, modeName, disruptions, serviceTypes, and so on, but I can't find the right way to build it.

This is what I tried:

dflines = pd.DataFrame(columns = ["id", "name", "modeName", "disruptions", "serviceTypes"])

for i, row in range(len(info)):
    id = row["id"]
    name = row["name"]
    modeName = row["modeName"]
    disruptions = row["disruptions"]
    dflines.loc[i] = [id, name, modeName, disruptions, want, serviceTypes]

dflines.head(20)

I got this error:

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)
<ipython-input-80-bec7efd03786> in <module>
      1 dflines = pd.DataFrame(columns = ["id", "name", "modeName", "disruptions", "serviceTypes"])
      2 
----> 3 for i, row in range(len(info)):
      4     id = row["id"]
      5     name = row["name"]

TypeError: cannot unpack non-iterable int object

Can anyone help me?

Tags: python, for-loop

Solution


import numpy as np
import pandas as pd

info=[[{'$type': 'Tfl.Api.Presentation.Entities.Line, Tfl.Api.Presentation.Entities',
   'id': 'piccadilly',
   'name': 'Piccadilly',
   'modeName': 'tube',
   'disruptions': [],
   'created': '2019-08-20T16:25:25.35Z',
   'modified': '2019-08-20T16:25:25.35Z',
   'lineStatuses': [],
   'routeSections': [],
   'serviceTypes': [{'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Regular',
     'uri': '/Line/Route?ids=Piccadilly&serviceTypes=Regular'},
    {'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Night',
     'uri': '/Line/Route?ids=Piccadilly&serviceTypes=Night'}],
   'crowding': {'$type': 'Tfl.Api.Presentation.Entities.Crowding, Tfl.Api.Presentation.Entities'}}],
 [{'$type': 'Tfl.Api.Presentation.Entities.Line, Tfl.Api.Presentation.Entities',
   'id': 'victoria',
   'name': 'Victoria',
   'modeName': 'tube',
   'disruptions': [],
   'created': '2019-08-20T16:25:25.36Z',
   'modified': '2019-08-20T16:25:25.36Z',
   'lineStatuses': [],
   'routeSections': [],
   'serviceTypes': [{'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Regular',
     'uri': '/Line/Route?ids=Victoria&serviceTypes=Regular'},
    {'$type': 'Tfl.Api.Presentation.Entities.LineServiceTypeInfo, Tfl.Api.Presentation.Entities',
     'name': 'Night',
     'uri': '/Line/Route?ids=Victoria&serviceTypes=Night'}]}]]
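# Each line dict is wrapped in its own one-element list; squeeze the
# (n, 1) object array down to a flat list of dicts.
info = np.squeeze(info).tolist()

dflines = pd.DataFrame(columns=["id", "name", "modeName", "disruptions", "serviceTypes"])
dfserviceTypes = pd.DataFrame(columns=["$type", "name", "uri"])

i = 0  # next row label in dflines
j = 0  # next row label in dfserviceTypes
for dic in info:
    for key in dic:
        # Copy only the wanted top-level fields; values are stored as strings.
        if key in dflines.columns.tolist():
            dflines.loc[i, key] = str(dic[key])

        # Expand each nested service type into its own row.
        if key == 'serviceTypes':
            for dic2 in dic[key]:
                for key2 in dic2:
                    if key2 in dfserviceTypes.columns.tolist():
                        dfserviceTypes.loc[j, key2] = str(dic2[key2])
                j += 1
    i += 1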

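Keep in mind that the data is easier to inspect when it is split into two DataFrames. That way you do not have to nest one DataFrame inside another to avoid losing information.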
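
One thing the two-DataFrame split does not record is which line each service type belongs to. As a rough sketch (not part of the original answer), `pd.json_normalize` can build the service-type table and carry the line `id` along as a key. The trimmed-down `info` literal and the names `lines`, `service_df`, and `line_id` are assumptions made for illustration.

```python
import pandas as pd

# Trimmed-down stand-in for the API response (assumed shape: a list of
# one-element lists, each holding a single line dict).
info = [
    [{"id": "piccadilly", "name": "Piccadilly", "modeName": "tube",
      "serviceTypes": [
          {"name": "Regular", "uri": "/Line/Route?ids=Piccadilly&serviceTypes=Regular"},
          {"name": "Night", "uri": "/Line/Route?ids=Piccadilly&serviceTypes=Night"}]}],
    [{"id": "victoria", "name": "Victoria", "modeName": "tube",
      "serviceTypes": [
          {"name": "Regular", "uri": "/Line/Route?ids=Victoria&serviceTypes=Regular"},
          {"name": "Night", "uri": "/Line/Route?ids=Victoria&serviceTypes=Night"}]}],
]

# Unwrap the inner one-element lists to get a flat list of line dicts.
lines = [wrapper[0] for wrapper in info]

# One row per service type; the parent line's "id" is copied into each row.
# meta_prefix avoids a name clash between line fields and service-type fields.
service_df = pd.json_normalize(
    lines,
    record_path="serviceTypes",
    meta=["id"],
    meta_prefix="line_",
)

print(service_df)
```

Here `line_id` can serve as a join key back to the table of lines, so neither table needs to hold nested data.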

dflines

Output:

           id        name  modeName  disruptions                                       serviceTypes
0  piccadilly  Piccadilly      tube           []  [{'$type': 'Tfl.Api.Presentation.Entities.Line...
1    victoria    Victoria      tube           []  [{'$type': 'Tfl.Api.Presentation.Entities.Line...

And the service types:

dfserviceTypes

Output:

                                               $type     name                                              uri
0  Tfl.Api.Presentation.Entities.LineServiceTypeI...  Regular  /Line/Route?ids=Piccadilly&serviceTypes=Regular
1  Tfl.Api.Presentation.Entities.LineServiceTypeI...    Night    /Line/Route?ids=Piccadilly&serviceTypes=Night
2  Tfl.Api.Presentation.Entities.LineServiceTypeI...  Regular    /Line/Route?ids=Victoria&serviceTypes=Regular
3  Tfl.Api.Presentation.Entities.LineServiceTypeI...    Night      /Line/Route?ids=Victoria&serviceTypes=Night
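
As for the TypeError in the question: `range(len(info))` yields plain integers, so `for i, row in range(len(info))` tries to unpack an int into two names and fails. `enumerate(info)` yields `(index, element)` pairs instead. Each element is itself a one-element list, so the dict has to be taken out with `[0]` before indexing by key. The following is a minimal sketch under those assumptions; the shortened `info` literal and the names `columns`, `rows`, and `lines_df` are made up for illustration.

```python
import pandas as pd

# Shortened stand-in for the API response (assumed shape: a list of
# one-element lists, each holding a single line dict).
info = [
    [{"id": "piccadilly", "name": "Piccadilly", "modeName": "tube",
      "disruptions": [],
      "serviceTypes": [{"name": "Regular"}, {"name": "Night"}]}],
    [{"id": "victoria", "name": "Victoria", "modeName": "tube",
      "disruptions": [],
      "serviceTypes": [{"name": "Regular"}, {"name": "Night"}]}],
]

columns = ["id", "name", "modeName", "disruptions", "serviceTypes"]

rows = []
# enumerate() yields (index, element) pairs, which is what "for i, row in ..."
# needs; the index is kept here only to mirror the loop in the question.
for i, wrapper in enumerate(info):
    row = wrapper[0]  # unwrap the inner one-element list
    rows.append({col: row[col] for col in columns})

# Building the frame once from a list of dicts avoids growing it row by row.
lines_df = pd.DataFrame(rows, columns=columns)

print(lines_df[["id", "name", "modeName"]])
```

Unlike the loop-based answer, this keeps `disruptions` and `serviceTypes` as Python lists rather than converting them to strings.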
