Python - Get the top 5 items from a dictionary-type column in a pandas DataFrame

Problem description

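I have a DataFrame in which one column holds dictionaries. Each dictionary contains a large number of items, which is causing memory problems. My solution is to keep only the top 10 items from each dictionary. I already have code for this, but it raises an error: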

TypeError: '<' not supported between instances of 'dict' and 'dict'

I put together a small sample just to show the problem:

import pandas as pd
import datetime

res = pd.DataFrame([])
res_tmp = pd.DataFrame([])
d = {'club': ['A1', 'B1'], 'score': [3, 4]}
df = pd.DataFrame(data=d)

# Note: DataFrame.append was removed in pandas 2.0, so this loop needs pandas < 2.0.
for index, row in df.iterrows():
    total = int(row['score']) * -1
    res_tmp = res_tmp.append({'today': str(datetime.datetime.now()), 'total': total}, ignore_index=True)
    res = res.append({'club': row['club'], 'details': res_tmp.to_dict('dict')}, ignore_index=True)

# This is the line that raises the TypeError: each x[1] is itself a dict.
res['details'] = res['details'].apply(lambda y: (sorted(y.items(), key=lambda x: x[1]))[:1])

What am I doing wrong? Note: the example has only two rows, which is why I take the top 1 instead of the top 10.

Thanks!

Tags: python, pandas, dictionary

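Solution

As the error message says, dicts have no defined ordering. If you want to sort dicts, you have to supply a key function that defines the sort order. You extracted the value with x[1], but that value is itself a dict, so you also have to convert it into some type for which '<' is defined. For example: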

key = lambda x: list(x[1].values())
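
A list-valued key like the one above only helps when the inner values can be compared with each other. In the question's data one column holds strings and the other holds integers, so sorting the columns against each other would still fail. The following is a minimal sketch of one alternative, assuming the actual goal is to keep the n rows with the smallest 'total' inside each 'details' cell. The function name top_n_rows, the sort_column parameter, and the sample data are hypothetical and not part of the original post.

```python
from heapq import nsmallest


def top_n_rows(details, n, sort_column="total"):
    """Keep the n row labels with the smallest value in sort_column.

    `details` is assumed to have the column -> {row label -> value}
    shape that DataFrame.to_dict('dict') produces.
    """
    ranking = details[sort_column]
    # Rank row labels by their scalar value, so no dict is ever compared.
    keep = nsmallest(n, ranking, key=ranking.get)
    # Rebuild every column with only the selected row labels.
    return {column: {label: values[label] for label in keep}
            for column, values in details.items()}


# Hypothetical data in the same shape as one 'details' cell.
details = {
    "today": {0: "2024-01-01", 1: "2024-01-02", 2: "2024-01-03"},
    "total": {0: -3, 1: -4, 2: -1},
}
trimmed = top_n_rows(details, 2)
```

Under the same assumption, this could be applied to the column with something like res['details'].apply(lambda d: top_n_rows(d, 10)). If "top" should mean the largest totals instead, heapq.nlargest can be swapped in for nsmallest.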
