Parsing a string stored in a pandas column and putting part of it into other columns of the same DataFrame

Problem description

I wrote code that calls the Melissa API to geocode addresses and stores the results in a pandas column. The calls to the Melissa API work fine, and this is the format I get back:

df['MelissaResults'][1]

{'Version': '3.0.1.160',
 'TransmissionReference': '',
 'TransmissionResults': '',
 'TotalRecords': '1',
 'Records': [{'RecordID': '1',
   'Results': 'AE01,GE02',
   'FormattedAddress': '2413;12B Ironside Street',
   'Organization': '2413',
   'AddressLine1': '12B Ironside Street',
   'AddressLine2': '',
   'AddressLine3': '',
   'AddressLine4': '',
   'AddressLine5': '',
   'AddressLine6': '',
   'AddressLine7': '',
   'AddressLine8': '',
   'SubPremises': '',
   'DoubleDependentLocality': '',
   'DependentLocality': '',
   'Locality': 'Red Deer',
   'SubAdministrativeArea': '',
   'AdministrativeArea': 'AB',
   'PostalCode': '',
   'PostalCodeType': ' ',
   'AddressType': '1',
   'AddressKey': '000000',
   'SubNationalArea': '',
   'CountryName': 'Canada',
   'CountryISO3166_1_Alpha2': 'CA',
   'CountryISO3166_1_Alpha3': 'CAN',
   'CountryISO3166_1_Numeric': '124',
   'CountrySubdivisionCode': 'CA-AB',
   'Thoroughfare': 'Ironside St',
   'ThoroughfarePreDirection': '',
   'ThoroughfareLeadingType': '',
   'ThoroughfareName': 'Ironside',
   'ThoroughfareTrailingType': 'St',
   'ThoroughfarePostDirection': '',
   'DependentThoroughfare': '',
   'DependentThoroughfarePreDirection': '',
   'DependentThoroughfareLeadingType': '',
   'DependentThoroughfareName': '',
   'DependentThoroughfareTrailingType': '',
   'DependentThoroughfarePostDirection': '',
   'Building': '',
   'PremisesType': '',
   'PremisesNumber': '12B',
   'SubPremisesType': '',
   'SubPremisesNumber': '',
   'PostBox': '',
   'Latitude': '',
   'Longitude': '',
   'DeliveryIndicator': 'U',
   'MelissaAddressKey': '',
   'MelissaAddressKeyBase': '',
   'PostOfficeLocation': '',
   'SubPremiseLevel': '',
   'SubPremiseLevelType': '',
   'SubPremiseLevelNumber': '',
   'SubBuilding': '',
   'SubBuildingType': '',
   'SubBuildingNumber': '',
   'UTC': 'UTC-07:00',
   'DST': 'Y',
   'DeliveryPointSuffix': '',
   'CensusKey': '245100302002015',
   'Extras': {}}]}

From there, I am trying to store part of the result, namely the latitude and longitude, in separate columns.


df_inglewood1['MelissaResults'][1]['Records'][0]['Latitude']

Output: ''

def get_lat(data):
    record = data['Records'][0]
    return record['Latitude']


def get_lng(data):
    record = data['Records'][0]
    return record['Longitude']

df.loc[:, 'Melissa_Latitude'] = df.loc[:, 'MelissaResults'].apply(get_lat)
df.loc[:, 'Melissa_Longitude'] = df.loc[:, 'MelissaResults'].apply(get_lng)

KeyError                                  Traceback (most recent call last)
<ipython-input-46-305a0f421ef6> in <module>
----> 1 df.loc[:, 'Melissa_Latitude'] = df.loc[:, 'MelissaResults'].apply(get_lat)
      2 df.loc[:, 'Melissa_Longitude'] = df.loc[:, 'MelissaResults'].apply(get_lng)

~/virt_env/virt2/lib/python3.6/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   4211             else:
   4212                 values = self.astype(object)._values
-> 4213                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4214 
   4215         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-45-2466e025cf7d> in get_lat(data)
      1 def get_lat(data):
----> 2     record = data['Records'][0]
      3     return record['Latitude']
      4 #get_lat(df.iloc[0, -1])
      5 

KeyError: 'Records'

What am I missing here?

Tags: python, pandas, parsing, geocoding

Solution
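A `KeyError: 'Records'` raised inside `get_lat` means that at least one value in `MelissaResults` is a dict without a `'Records'` key (a plain string would raise `TypeError` instead). The row printed in the question does contain `'Records'`, so the failure presumably comes from a different row, for example a response that returned no records. The following is a minimal sketch under that assumption, not a confirmed fix: `get_field` and `missing` are hypothetical names, and the sample DataFrame, including its coordinates, is made up purely for illustration.

```python
import pandas as pd

# Made-up sample data standing in for the real API responses.
df = pd.DataFrame({'MelissaResults': [
    {'Version': '3.0.1.160',
     'Records': [{'RecordID': '1', 'Latitude': '52.2', 'Longitude': '-113.8'}]},
    {'Version': '3.0.1.160',
     'Records': [{'RecordID': '1', 'Latitude': '', 'Longitude': ''}]},
    {'Version': '3.0.1.160'},   # assumption: a response without 'Records'
    None,                       # assumption: a request that stored nothing
]})


def get_field(data, field):
    """Return `field` from the first record, or None if it is unavailable."""
    if not isinstance(data, dict):
        return None
    records = data.get('Records') or []
    if not records or not isinstance(records[0], dict):
        return None
    value = records[0].get(field, '')
    # The API output in the question uses '' for "no value"; map that to None.
    return value if value != '' else None


# Flag the rows that would have triggered the KeyError (or a TypeError).
missing = df['MelissaResults'].apply(
    lambda d: not (isinstance(d, dict) and 'Records' in d)
)

# Extract the fields defensively, then convert the strings to floats.
df['Melissa_Latitude'] = pd.to_numeric(
    df['MelissaResults'].apply(lambda d: get_field(d, 'Latitude')),
    errors='coerce',
)
df['Melissa_Longitude'] = pd.to_numeric(
    df['MelissaResults'].apply(lambda d: get_field(d, 'Longitude')),
    errors='coerce',
)
```

Inspecting `df.loc[missing, 'MelissaResults']` shows what the API actually returned for the problematic rows, which is worth checking before deciding whether those addresses should be retried or simply left as `NaN`.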

