How to handle exceptions when populating a Pandas DataFrame?

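Problem description

I'm trying to populate a DataFrame with historical hourly weather data, which I fetch by calling the DarkSky API. However, some fields are occasionally missing, which raises a KeyError.

Here is what the API sends back for the hourly data: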

'summary': 'Mostly cloudy throughout the day.',
'icon': 'partly-cloudy-day',
'data': [{
   'time': 1528354800,
   'summary': 'Partly Cloudy',
   'icon': 'partly-cloudy-night',
   'precipIntensity': 0,
   'precipProbability': 0,
   'temperature': 12.94,
   'apparentTemperature': 12.94,
   'dewPoint': 9.36,
   'humidity': 0.79,
   'pressure': 1011.4,
   'windSpeed': 2.69,
   'windGust': 2.69,
   'windBearing': 252,
   'cloudCover': 0.33,
   'uvIndex': 0,
   'visibility': 13.818}]

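So when I populate my DataFrame I get a KeyError, because sometimes precipIntensity and precipProbability are absent and there is a field called precipType instead.

This is how I'm trying to populate the DataFrame: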

VICTORIA = 48.407326, -123.329773
        dt = datetime(2018, month, day).isoformat()
        weather = forecast('APIKEY', *VICTORIA, time = dt)
        weather.refresh(units='si')
        for hour in weather['hourly']['data']:
            daily_weather = daily_weather.append(
            {'time': hour['time'],
             'realtime': datetime.fromtimestamp(hour['time']),
             'summary': hour['summary'],
             'icon': hour['icon'],
             'precipIntensity': hour['precipIntensity'],
             'precipProbability': hour['precipProbability'],
             'temperature': hour['temperature'],
             'apparentTemperature': hour['apparentTemperature'],
             'dewPoint': hour['dewPoint'],
             'humidity': hour['humidity'],
             'pressure': hour['pressure'],
             'windSpeed': hour['windSpeed'],
             'windBearing': hour['windBearing'],
             'cloudCover': hour['cloudCover'],
             'uvIndex': hour['uvIndex'],
             'visibility': hour['visibility'],
             }, ignore_index=True)

I tried using a try/except statement to handle the exception, like this:

for hour in weather['hourly']['data']:
        daily_weather = daily_weather.append(
        {'time': hour['time'],
         'realtime': datetime.fromtimestamp(hour['time']),
         'summary': hour['summary'],
         'icon': hour['icon'],
         'temperature': hour['temperature'],
         'apparentTemperature': hour['apparentTemperature'],
         'dewPoint': hour['dewPoint'],
         'humidity': hour['humidity'],
         'pressure': hour['pressure'],
         'windSpeed': hour['windSpeed'],
         'windBearing': hour['windBearing'],
         'cloudCover': hour['cloudCover'],
         'uvIndex': hour['uvIndex'],
         'visibility': hour['visibility'],
         }, ignore_index=True)
        try:
            daily_weather = daily_weather.append({'precipIntensity': hour['precipIntensity'], 'precipProbability': hour['precipProbability']}, ignore_index=True)
        except KeyError:
            daily_weather = daily_weather.append({'precipType': hour['precipType']}, ignore_index=True)

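However, the precipIntensity values end up in separate, otherwise empty rows instead of in the same rows as the other fields:

[Image: DataFrame output]

I would appreciate some advice on how to use exception handling while populating the DataFrame. Thank you.

Tags: python, pandas, api, dataframe, try-catch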

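Solution

Your code makes two separate append calls per hour, and each call creates its own row in the output. Instead, keep the dictionary for each row in a local variable, fill it in completely, and then append it once.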
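To see why two appends give two rows, here is a minimal sketch. It assumes pandas is installed and uses the `pd.DataFrame` constructor on a list of dictionaries rather than `DataFrame.append`, which was removed in pandas 2.0. The values are taken from the API response in the question; the variable names `split` and `merged` are made up for illustration.

```python
import pandas as pd

# Two dictionaries -> two rows; keys missing from a dictionary become NaN.
split = pd.DataFrame([
    {'time': 1528354800, 'temperature': 12.94},
    {'precipIntensity': 0, 'precipProbability': 0},
])

# One dictionary holding every field -> one complete row.
merged = pd.DataFrame([
    {'time': 1528354800, 'temperature': 12.94,
     'precipIntensity': 0, 'precipProbability': 0},
])

print(split.shape)   # (2, 4)
print(merged.shape)  # (1, 4)
```

The first frame reproduces the symptom from the question: the precipitation values sit in a row of their own, and every other column in that row is NaN.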

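For readability, I would also suggest not using try/except here and doing a direct if check instead. You can even automate it for several optional fields.

Example (untested):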

rows = []  # one dictionary per hour
for hour in weather['hourly']['data']:
    row = {
        'time': hour['time'],
        'realtime': datetime.fromtimestamp(hour['time']),
        'summary': hour['summary'],
        'icon': hour['icon'],
        'temperature': hour['temperature'],
        'apparentTemperature': hour['apparentTemperature'],
        'dewPoint': hour['dewPoint'],
        'humidity': hour['humidity'],
        'pressure': hour['pressure'],
        'windSpeed': hour['windSpeed'],
        'windBearing': hour['windBearing'],
        'cloudCover': hour['cloudCover'],
        'uvIndex': hour['uvIndex'],
        'visibility': hour['visibility'],
    }
    # Copy each optional field only when the API actually returned it.
    for field in ('precipIntensity', 'precipProbability', 'precipType'):
        if field in hour:
            row[field] = hour[field]
    rows.append(row)

# Add all rows at once; fields missing from a row become NaN.
# Assumes daily_weather is an existing DataFrame and pandas is imported as pd.
daily_weather = pd.concat([daily_weather, pd.DataFrame(rows)], ignore_index=True)

Or, to make it tidier:

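fields = ('time', 'summary', 'icon', 'temperature', 'apparentTemperature',
          'dewPoint', 'humidity', 'pressure', 'windSpeed', 'windBearing',
          'cloudCover', 'uvIndex', 'visibility',
          'precipIntensity', 'precipProbability', 'precipType')

rows = []  # one dictionary per hour
for hour in weather['hourly']['data']:
    row = {
        'realtime': datetime.fromtimestamp(hour['time'])
    }
    # Copy every field that is present in this hour's record.
    for field in fields:
        if field in hour:
            row[field] = hour[field]
    rows.append(row)

# Assumes daily_weather is an existing DataFrame and pandas is imported as pd.
daily_weather = pd.concat([daily_weather, pd.DataFrame(rows)], ignore_index=True)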
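A possible variation on the same idea, offered as a sketch rather than as part of the original answer: `dict.get` returns None for a missing key, so every row ends up with the same set of keys and no if check is needed. The names `FIELDS`, `build_row`, and `sample_hours` are hypothetical, the second sample record is made up, and converting the timestamp in UTC is an assumption chosen so the result does not depend on the machine's time zone.

```python
from datetime import datetime, timezone

# Hypothetical field list; trim or extend it to match your data.
FIELDS = ('time', 'summary', 'temperature',
          'precipIntensity', 'precipProbability', 'precipType')


def build_row(hour):
    """Return one row dict; fields absent from `hour` map to None."""
    row = {field: hour.get(field) for field in FIELDS}
    # Assumption: interpret the Unix timestamp in UTC.
    row['realtime'] = datetime.fromtimestamp(hour['time'], tz=timezone.utc)
    return row


# Made-up sample records shaped like the hourly entries in the question.
sample_hours = [
    {'time': 1528354800, 'summary': 'Partly Cloudy', 'temperature': 12.94,
     'precipIntensity': 0, 'precipProbability': 0},
    {'time': 1528358400, 'summary': 'Light Rain', 'temperature': 12.5,
     'precipType': 'rain'},
]

rows = [build_row(hour) for hour in sample_hours]
print(rows[1]['precipIntensity'])  # None
print(rows[1]['precipType'])       # rain
```

Passing `rows` to `pd.DataFrame(rows)` should then give one row per hour, with the missing values shown as None or NaN depending on the column type.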
