Pandas: dictionary column to columns

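Problem description

I want to extract the overall key from the dictionaries in the ratings column and add it as a separate column. This is what I have tried so far: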

import ast
import pandas as pd

def try_literal_eval(e):
    try:
        return ast.literal_eval(e)
    except ValueError:
        return {'overall': 0}

res = pd.DataFrame(df['ratings'].apply(try_literal_eval).tolist())
# axis must be passed by keyword in pandas 2.0 and later
output = pd.concat((df.drop('ratings', axis=1), res), axis=1)
output

df

customer_id    rating 
44224         {'overall': 5, 'description': 3}
55243         {'overall': 3, 'description': 2}

Desired output_df

customer_id    overall_rating
44224          5
55243          3

Tags: python, pandas

Solution

df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall')) should give you the result:

import pandas as pd

c = ['customer_id', 'rating']
d = [[44224, {'overall': 5, 'description': 3}],
     [55243, {'overall': 3, 'description': 2}]]
df = pd.DataFrame(d, columns=c)
print(df)
# Pull the 'overall' value out of each dictionary
df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall'))
print(df)

The output of this is:

Original DataFrame:

   customer_id                            rating
0        44224  {'overall': 5, 'description': 3}
1        55243  {'overall': 3, 'description': 2}

Updated DataFrame:

   customer_id                            rating  overall_rating
0        44224  {'overall': 5, 'description': 3}               5
1        55243  {'overall': 3, 'description': 2}               3

Alternatively, you can use:

df['overall_rating'] = pd.DataFrame([x for x in df['rating']])['overall']

The output of this will be the same:

import pandas as pd

c = ['customer_id', 'rating']
d = [[44224, {'overall': 5, 'description': 3}],
     [55243, {'overall': 3, 'description': 2}]]
df = pd.DataFrame(d, columns=c)
print(df)
# Build a temporary DataFrame from the dictionaries and take its 'overall' column
df['overall_rating'] = pd.DataFrame([x for x in df['rating']])['overall']
print(df)

Original DataFrame:

   customer_id                            rating
0        44224  {'overall': 5, 'description': 3}
1        55243  {'overall': 3, 'description': 2}

Updated DataFrame:

   customer_id                            rating  overall_rating
0        44224  {'overall': 5, 'description': 3}               5
1        55243  {'overall': 3, 'description': 2}               3
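
One caveat with the DataFrame-based variant: the temporary frame gets a fresh 0..n-1 index, and column assignment aligns on index labels, so the values only line up when df still has its default index. The following is a minimal sketch, assuming a df with a non-default index (the labels 10 and 20 are made up for illustration), that passes index=df.index and also expands every key into its own column:

```python
import pandas as pd

# Hypothetical data with a non-default index (labels chosen for illustration).
df = pd.DataFrame(
    {
        "customer_id": [44224, 55243],
        "rating": [
            {"overall": 5, "description": 3},
            {"overall": 3, "description": 2},
        ],
    },
    index=[10, 20],
)

# One column per dictionary key, reusing df's index so the rows align.
expanded = pd.DataFrame(df["rating"].tolist(), index=df.index)

# Suffix the new columns and attach them in place of the dictionary column.
output = df.drop(columns="rating").join(expanded.add_suffix("_rating"))
print(output)
```

If only the overall value is needed, selecting expanded["overall"] and assigning it to df["overall_rating"] gives the same column as the one-liner above.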

Example with a dictionary that has a float value and a dictionary that has no 'overall' entry:

import pandas as pd

c = ['customer_id', 'rating']
d = [[44224, {'overall': 5, 'description': 3}],
     [55243, {'overall': 3, 'description': 2}],
     [11223, {'overall': 1.5, 'description': 2}],
     [12345, {'description': 3}]]
df = pd.DataFrame(d, columns=c)
print(df)
# dict.get returns None for a missing key, which pandas stores as NaN
df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall'))
print(df)

The output of this is:

Input DataFrame:

   customer_id                              rating
0        44224    {'overall': 5, 'description': 3}
1        55243    {'overall': 3, 'description': 2}
2        11223  {'overall': 1.5, 'description': 2}
3        12345                  {'description': 3}

Updated DataFrame:

   customer_id                              rating  overall_rating
0        44224    {'overall': 5, 'description': 3}             5.0
1        55243    {'overall': 3, 'description': 2}             3.0
2        11223  {'overall': 1.5, 'description': 2}             1.5
3        12345                  {'description': 3}             NaN
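
If NaN is not wanted for a missing key, dict.get accepts a default value. Separately, ast.literal_eval raises ValueError when it is handed an actual dict rather than a string, so the attempt in the question would likely fall back to {'overall': 0} for every row if the column already holds dicts. The sketch below handles both dicts and dict-like strings; the helper name parse_rating, the sample rows, and the default of 0 are illustrative assumptions, not part of the original answer:

```python
import ast
import pandas as pd

def parse_rating(value):
    """Return a dict from a dict or a dict-like string; an empty dict otherwise."""
    if isinstance(value, dict):
        return value
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError, TypeError):
        return {}
    return parsed if isinstance(parsed, dict) else {}

# Hypothetical mixed input: strings, a real dict, and unparseable text.
df = pd.DataFrame({
    "customer_id": [44224, 55243, 12345, 99999],
    "rating": [
        "{'overall': 5, 'description': 3}",  # dict stored as a string
        {"overall": 3, "description": 2},     # real dict
        "{'description': 3}",                 # no 'overall' key
        "not a dict",                         # cannot be parsed
    ],
})

# Fall back to 0 when the key is missing or the value cannot be parsed.
df["overall_rating"] = df["rating"].apply(lambda x: parse_rating(x).get("overall", 0))
print(df)
```

Whether 0 is a sensible default depends on the data; keeping NaN makes missing ratings easier to tell apart from real zeros.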
