How do I iterate through a nested list to store values in a dataframe?

Question

Given a nested dictionary neighborhood_data, the first item, i.e. neighborhood_data[0], displays:

{'type': 'Feature',
 'geometry': {'type': 'MultiPolygon',
  'coordinates': [[[[28.073783, -26.343133],
     [28.071239, -26.351536],
     [28.068717, -26.350644],
     [28.06663, -26.351362],
     [28.065161, -26.352135],
     [28.064671, -26.35399]]]],
'properties': {'cartodb_id': 1,
  'subplace_c': 761001001,
  'province': 'Gauteng',
  'wardid': '74202012',
  'district_m': 'Sedibeng',
  'local_muni': 'Midvaal',
  'main_place': 'Alberton',
  'mp_class': 'Settlement',
  'sp_name': 'Brenkondown',
  'suburb_nam': 'Brenkondown',
  'metro': 'Johannesburg',
  'african': 330,
  'white': 24,
  'asian': 0,
  'coloured': 2,
  'other': 0,
  'totalpop': 356}}}

Then I created an empty dataframe, neighborhoods:

# define the dataframe columns
column_names = ['Province', 'District', 'Local_municipality','Main Place', 'Suburb','Metro','Latitude','Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

However, when I loop through neighborhood_data to store the relevant data in the neighborhoods dataframe, I get the following error:

for data in neighborhood_data:
    province = data['properties']['province']
    district = data['properties']['district_m']
    local_muni_name = suburb_name = data['properties']['local_muni'] 
    suburb_name = data['properties']['suburb_nam']
    metro = data['properties']['metro']
    
    suburb_latlon = data['geometry']['coordinates']
    subur_lat = suburb_latlon[[[[1]]]]
    suburb_lon = suburb_latlon[[[[0]]]]
    
    neighborhoods = neighborhoods.append({'Province': province,
                                          'District': district,
                                          'Local_municipality': local_muni_name,
                                          'Main place': main_place,
                                          'Suburb': suburb_name,
                                          'Metro': metro,
                                          'Latitude': suburb_lat,
                                          'Longitude': suburb_lon}, ignore_index=True)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-a5dc74ed4207> in <module>
      7 
      8     suburb_latlon = data['geometry']['coordinates']
----> 9     subur_lat = suburb_latlon[[[[1]]]]
     10     suburb_lon = suburb_latlon[[[[0]]]]
     11 

TypeError: list indices must be integers or slices, not list

So how can I store the latitude and longitude coordinates in the 'Latitude' and 'Longitude' columns of the empty dataframe?

Tags: python, pandas, geojson, nested-lists

Answer


Your dictionary is malformed: it is missing a closing square bracket in the coordinates key. Let's assume this is the correct dictionary:

{'geometry': {'coordinates': [[[[28.073783, -26.343133],
     [28.071239, -26.351536],
     [28.068717, -26.350644],
     [28.06663, -26.351362],
     [28.065161, -26.352135],
     [28.064671, -26.35399]]]],
  'properties': {'african': 330,
   'asian': 0,
   'cartodb_id': 1,
   'coloured': 2,
   'district_m': 'Sedibeng',
   'local_muni': 'Midvaal',
   'main_place': 'Alberton',
   'metro': 'Johannesburg',
   'mp_class': 'Settlement',
   'other': 0,
   'province': 'Gauteng',
   'sp_name': 'Brenkondown',
   'subplace_c': 761001001,
   'suburb_nam': 'Brenkondown',
   'totalpop': 356,
   'wardid': '74202012',
   'white': 24},
  'type': 'MultiPolygon'},
 'type': 'Feature'}

Then, instead of indexing like this:

suburb_latlon = data['geometry']['coordinates']
subur_lat = suburb_latlon[[[[1]]]] # <--- Indexing error here
suburb_lon = suburb_latlon[[[[0]]]] # <--- Indexing error here

we want to do the following (unpacking one extra list level at a time until we reach the coordinates):

suburb_latlon = data['geometry']['coordinates']
# One integer index per nesting level. This picks the first point only; it is
# not clear why the first point should be the one to use, so adjust the
# indices to whatever your logic requires.
suburb_lat = suburb_latlon[0][0][0][1]
suburb_lon = suburb_latlon[0][0][0][0]
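
For reference, the original expression suburb_latlon[[[[1]]]] fails because Python reads it as indexing the list with another list, [[[1]]], and list indices must be integers or slices. Below is a minimal sketch of the whole loop with the chained indexing applied. It is an illustration under assumptions, not the asker's exact data: the sample feature is shortened and uses the standard GeoJSON layout with 'properties' next to 'geometry', and the names feature_to_row and rows_to_dataframe are hypothetical helpers. It also assumes, as the answer does, that the first point of the first ring is the one wanted.

```python
# Hypothetical, shortened sample in standard GeoJSON layout: 'properties' sits
# next to 'geometry', and MultiPolygon coordinates nest as
# polygon -> ring -> point.
neighborhood_data = [
    {
        "type": "Feature",
        "geometry": {
            "type": "MultiPolygon",
            "coordinates": [[[[28.073783, -26.343133],
                              [28.071239, -26.351536],
                              [28.068717, -26.350644]]]],
        },
        "properties": {
            "province": "Gauteng",
            "district_m": "Sedibeng",
            "local_muni": "Midvaal",
            "main_place": "Alberton",
            "suburb_nam": "Brenkondown",
            "metro": "Johannesburg",
        },
    }
]

COLUMN_NAMES = ["Province", "District", "Local_municipality", "Main Place",
                "Suburb", "Metro", "Latitude", "Longitude"]


def feature_to_row(feature):
    """Flatten one feature into a dict keyed by the dataframe column names."""
    props = feature["properties"]
    # Chain one integer index per level: polygon 0, ring 0, point 0.
    point = feature["geometry"]["coordinates"][0][0][0]
    return {
        "Province": props["province"],
        "District": props["district_m"],
        "Local_municipality": props["local_muni"],
        "Main Place": props["main_place"],
        "Suburb": props["suburb_nam"],
        "Metro": props["metro"],
        # GeoJSON stores each point as [longitude, latitude].
        "Latitude": point[1],
        "Longitude": point[0],
    }


# Collect plain dicts first, then build the dataframe in one step.
rows = [feature_to_row(feature) for feature in neighborhood_data]


def rows_to_dataframe(rows):
    """Build the dataframe from the collected rows in a single call."""
    # Imported here so the row-building part above runs without pandas.
    import pandas as pd
    return pd.DataFrame(rows, columns=COLUMN_NAMES)
```

Calling rows_to_dataframe(rows) then yields one dataframe row per feature. Building the list first and constructing the dataframe once avoids DataFrame.append inside the loop, which was removed in pandas 2.0.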
