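python - Pandas dictionary column to columns
Problem description
I want to extract the overall key from the dicts in the ratings column and add it as a separate column. This is what I have tried so far: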
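import ast
import pandas as pd

def try_literal_eval(e):
    try:
        return ast.literal_eval(e)
    except (ValueError, SyntaxError):
        # Fall back to a default rating when the value cannot be parsed
        return {'overall': 0}

res = pd.DataFrame(df['ratings'].apply(try_literal_eval).tolist())
output = pd.concat((df.drop(columns='ratings'), res), axis=1)
output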
df
customer_id  rating
44224        {'overall': 5, 'description': 3}
55243        {'overall': 3, 'description': 2}
Desired output_df
customer_id  overall_rating
44224        5
55243        3
Solution
The following line should give you the result:
df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall'))
A complete example:
import pandas as pd

c = ['customer_id', 'rating']
d = [[44224, {'overall': 5, 'description': 3}],
     [55243, {'overall': 3, 'description': 2}]]
df = pd.DataFrame(d, columns=c)
print(df)
# Look up the 'overall' key in each row's dictionary
df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall'))
print(df)
The output of this is:
Original DataFrame:
   customer_id                            rating
0        44224  {'overall': 5, 'description': 3}
1        55243  {'overall': 3, 'description': 2}
Updated DataFrame:
   customer_id                            rating  overall_rating
0        44224  {'overall': 5, 'description': 3}               5
1        55243  {'overall': 3, 'description': 2}               3
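The attempt in the question calls ast.literal_eval, which suggests the column may sometimes hold the string form of a dictionary rather than a real dict. Under that assumption, a minimal sketch that accepts both could look like the following. The helper name extract_overall and its default parameter are hypothetical and do not come from the original answer.

```python
import ast

import pandas as pd


def extract_overall(value, default=None):
    """Return the 'overall' entry of a dict, or of a string that encodes a dict."""
    if isinstance(value, str):
        try:
            value = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # The string is not a valid Python literal
            return default
    if isinstance(value, dict):
        return value.get('overall', default)
    return default


df = pd.DataFrame({
    'customer_id': [44224, 55243],
    # First row holds a real dict, second row holds its string form
    'rating': [{'overall': 5, 'description': 3},
               "{'overall': 3, 'description': 2}"],
})
df['overall_rating'] = df['rating'].apply(extract_overall)
print(df)
```

If every value is already a dict, the plain x.get('overall') lambda is enough and the parsing step can be skipped.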
Alternatively, you can use:
df['overall_rating'] = pd.DataFrame([x for x in df['rating']])['overall']
This produces the same output. A complete example:
import pandas as pd

c = ['customer_id', 'rating']
d = [[44224, {'overall': 5, 'description': 3}],
     [55243, {'overall': 3, 'description': 2}]]
df = pd.DataFrame(d, columns=c)
print(df)
# Build a frame with one column per dictionary key, then pick 'overall'
df['overall_rating'] = pd.DataFrame([x for x in df['rating']])['overall']
# Equivalent: df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall'))
print(df)
Original DataFrame:
   customer_id                            rating
0        44224  {'overall': 5, 'description': 3}
1        55243  {'overall': 3, 'description': 2}
Updated DataFrame:
   customer_id                            rating  overall_rating
0        44224  {'overall': 5, 'description': 3}               5
1        55243  {'overall': 3, 'description': 2}               3
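One caveat with the DataFrame-based variant: the temporary frame gets a fresh 0..n-1 index, and column assignment aligns on index labels. If df has been filtered or carries a custom index, the values can end up misaligned or NaN. A minimal sketch that avoids this by reusing the original index is shown here. The index labels 10 and 20 and the variable name expanded are made up for illustration.

```python
import pandas as pd

# Non-default index on purpose, to show why index=df.index matters
df = pd.DataFrame(
    {'customer_id': [44224, 55243],
     'rating': [{'overall': 5, 'description': 3},
                {'overall': 3, 'description': 2}]},
    index=[10, 20],
)

# One column per dictionary key; reuse df's index so rows stay aligned
expanded = pd.DataFrame(df['rating'].tolist(), index=df.index)
df['overall_rating'] = expanded['overall']
print(df)
```

The expanded frame also exposes the other keys (here description), which is handy if more than one of them should become a column.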
Example with a dictionary that holds a float value and a dictionary that has no 'overall' entry:
import pandas as pd

c = ['customer_id', 'rating']
d = [[44224, {'overall': 5, 'description': 3}],
     [55243, {'overall': 3, 'description': 2}],
     [11223, {'overall': 1.5, 'description': 2}],
     [12345, {'description': 3}]]
df = pd.DataFrame(d, columns=c)
print(df)
# dict.get returns None for a missing key, which pandas shows as NaN
df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall'))
print(df)
The output of this is:
Input DataFrame:
   customer_id                              rating
0        44224    {'overall': 5, 'description': 3}
1        55243    {'overall': 3, 'description': 2}
2        11223  {'overall': 1.5, 'description': 2}
3        12345                  {'description': 3}
Updated DataFrame:
   customer_id                              rating  overall_rating
0        44224    {'overall': 5, 'description': 3}             5.0
1        55243    {'overall': 3, 'description': 2}             3.0
2        11223  {'overall': 1.5, 'description': 2}             1.5
3        12345                  {'description': 3}             NaN
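If NaN is not wanted for rows without an 'overall' entry, dict.get accepts a fallback as its second argument. The sketch below uses 0, mirroring the default in the question's own attempt, and then drops the rating column to match the desired output_df. Treat the choice of 0 as an assumption; whether it is a sensible placeholder depends on how the ratings are used afterwards.

```python
import pandas as pd

df = pd.DataFrame(
    [[44224, {'overall': 5, 'description': 3}],
     [11223, {'overall': 1.5, 'description': 2}],
     [12345, {'description': 3}]],
    columns=['customer_id', 'rating'],
)

# Second argument of dict.get is the value used when the key is missing
df['overall_rating'] = df['rating'].apply(lambda x: x.get('overall', 0))

# Keep only the columns shown in the desired output
output_df = df.drop(columns='rating')
print(output_df)
```

Because one rating is a float, the whole overall_rating column is stored as floats.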