Parsing information from a list of dicts to the Series level

Problem description

As the result of a SQL query against a PostgreSQL database, I have a DataFrame.

Each value in my_table['receipt'] has the following structure:

{'id': '272f9730-000f-5000', 'paid': True, 'test': False, 'amount': {'value': '100.00', 'currency': 'RUB'}, 'status': 'succeeded', 'metadata': {'scid': '1311111'}, 'recipient': {'account_id': '555565', 'gateway_id': '5555138'}, 'created_at': '2020-10-31T15:32:00.308Z', 'refundable': True, 'captured_at': '2020-10-31T15:32:01.942Z', 'description': 'renting', 'income_amount': {'value': '96.50', 'currency': 'RUB'}, 'payment_method': {'id': '555555e-000f-5000-8000081', 'type': 'bank_card', 'saved': True, 'title': 'Bank card *3322'}, 'refunded_amount': {'value': '0.00', 'currency': 'RUB'}, 'receipt_registration': 'pending', 'authorization_details': {'rrn': '037777777776', 'auth_code': '5555555'}}

Could you help me understand how to add an "rrn" column to the my_table DataFrame using the information from "receipt"?

The furthest I have got:

pd.DataFrame(day_payments['receipt'][0], index = ['rrn'])['authorization_details']

I don't know how to correctly add the values to every row of my DataFrame.

Tags: python, pandas, list

Solution


You can apply a lambda function to the column of dicts to extract the relevant values:

import pandas as pd

df = pd.DataFrame({'a':[{'foo':1, 'bar':2}, {'foo':5, 'bar':6}]})

#           a
# 0 {'foo': 1, 'bar': 2}
# 1 {'foo': 5, 'bar': 6}

# If you only want a single column with a particular key from the dict, then just add as a series:
# df['foo'] = df['a'].apply(lambda x: x['foo'])
# Otherwise...

# Get keys from the first row dict - these become the new column names
new_cols  = df['a'].iloc[0].keys()

# Dynamically create new columns by looping over new column names
for col in new_cols:
    df[col] = df['a'].apply(lambda x: x[col])

df

# Output:

#             a            foo   bar
# 0 {'foo': 1, 'bar': 2}    1     2
# 1 {'foo': 5, 'bar': 6}    5     6
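
For the nested "rrn" value in the question, the same apply idea can go one level deeper. The sketch below is only an illustration under some assumptions: the two-row my_table is a made-up stand-in for the real query result, the second row (with no authorization_details) is invented to show missing data, and extract_rrn is a hypothetical helper name. It also shows pd.json_normalize as one possible alternative when several nested fields are needed.

```python
import pandas as pd

# Hypothetical stand-in for my_table; only the keys needed here are kept.
my_table = pd.DataFrame({
    'receipt': [
        {'id': '272f9730-000f-5000',
         'authorization_details': {'rrn': '037777777776', 'auth_code': '5555555'}},
        {'id': 'no-auth-details'},  # invented row: nested dict is missing
    ]
})

def extract_rrn(receipt):
    """Return receipt['authorization_details']['rrn'], or None if absent."""
    if not isinstance(receipt, dict):
        return None
    # .get avoids a KeyError when a receipt lacks the nested dict or the key
    return (receipt.get('authorization_details') or {}).get('rrn')

# One new column, filled for every row
my_table['rrn'] = my_table['receipt'].apply(extract_rrn)

# Alternative: flatten all nested dicts into dotted column names,
# e.g. 'authorization_details.auth_code'
flat = pd.json_normalize(my_table['receipt'].tolist())
flat.index = my_table.index  # keep rows aligned with my_table
my_table['auth_code'] = flat['authorization_details.auth_code']
```

If every receipt is guaranteed to contain the key, the shorter form from the answer also works: my_table['receipt'].apply(lambda r: r['authorization_details']['rrn']). The helper is only there so that incomplete receipts yield a missing value instead of raising.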
