pandas: how to return the original records for the largest values?

Question

Data

data = [
    {"content": "1", "title": "app sotre", "info": "", "time": 1578877014},
    {"content": "2", "title": "app", "info": "", "time": 1579877014},
    {"content": "3", "title": "pandas", "info": "", "time": 1582877014},
    {"content": "12", "title": "a", "info": "", "time": 1582876014},
    {"content": "33", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

My code

import pandas as pd
s = pd.Series(data)
print(pd.to_numeric(s.str.get('content'),errors='coerce').nlargest(3,keep='all'))

But this only returns the values. I know how to use nlargest, but I need the full records, not just the numbers:

[12,33,16]

I want the 3 records with the largest content values:

[
    {"content": "12", "title": "a", "info": "", "time": 1582876014},
    {"content": "33", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

Tags: python, pandas

Answer

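I think the problem is that nlargest is applied to the extracted values, so it returns only those values. To get the original records, take the index of the nlargest result and use it to select from the original Series with Series.loc: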

idx = pd.to_numeric(s.str.get('content'),errors='coerce').nlargest(3,keep='all').index

print (s.loc[idx].tolist())
[{'content': '33', 'title': 'apple', 'info': '', 'time': 1581877014}, 
 {'content': '16', 'title': 'banana', 'info': '', 'time': 1561877014},
 {'content': '12', 'title': 'a', 'info': '', 'time': 1582876014}]
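
One detail of the snippet above that is easy to miss: with keep='all', nlargest keeps every row tied with the smallest selected value, so the result can contain more than n items. A minimal sketch with made-up numbers (the `scores` Series is hypothetical, not part of the question's data):

```python
import pandas as pd

# Hypothetical scores with a tie at the cut-off value 3.
scores = pd.Series([5, 3, 3, 1])

first_only = scores.nlargest(2)             # default keep='first'
with_ties = scores.nlargest(2, keep="all")  # keeps every row tied at the cut-off

print(first_only.tolist())  # [5, 3]
print(with_ties.tolist())   # [5, 3, 3]
```

If exactly n records are required, dropping keep='all' and relying on the default is probably the simpler choice.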

If the output should follow the original order of the Series instead, add Series.sort_index:

print (s.loc[idx].sort_index().tolist())
[{'content': '12', 'title': 'a', 'info': '', 'time': 1582876014}, 
 {'content': '33', 'title': 'apple', 'info': '', 'time': 1581877014}, 
 {'content': '16', 'title': 'banana', 'info': '', 'time': 1561877014}]
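
The two variants above can be combined into one self-contained sketch. The helper name `top_n_records` and its `keep_order` flag are made up for illustration; the pandas calls are the same ones used in this answer.

```python
import pandas as pd

data = [
    {"content": "1", "title": "app sotre", "info": "", "time": 1578877014},
    {"content": "2", "title": "app", "info": "", "time": 1579877014},
    {"content": "3", "title": "pandas", "info": "", "time": 1582877014},
    {"content": "12", "title": "a", "info": "", "time": 1582876014},
    {"content": "33", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]


def top_n_records(records, key, n=3, keep_order=False):
    """Return the n dicts whose numeric value under `key` is largest."""
    s = pd.Series(records)
    # Pull the field out of each dict and convert it to a number;
    # values that cannot be parsed become NaN and are never selected.
    values = pd.to_numeric(s.str.get(key), errors="coerce")
    idx = values.nlargest(n, keep="all").index
    picked = s.loc[idx]
    if keep_order:
        # Restore the original position order instead of value order.
        picked = picked.sort_index()
    return picked.tolist()


by_value = top_n_records(data, "content")
by_position = top_n_records(data, "content", keep_order=True)

print([r["content"] for r in by_value])     # ['33', '16', '12']
print([r["content"] for r in by_position])  # ['12', '33', '16']
```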

I know the question asks for a Series, but if a DataFrame is acceptable, the same result is simpler to get:

df = pd.DataFrame(data)
df['content'] = pd.to_numeric(df['content'], errors='coerce')
df = df.nlargest(3, 'content')
print (df)
   content   title info        time
4       33   apple       1581877014
5       16  banana       1561877014
3       12       a       1582876014

print (df.to_dict('records'))
[{'content': 33, 'title': 'apple', 'info': '', 'time': 1581877014}, 
 {'content': 16, 'title': 'banana', 'info': '', 'time': 1561877014}, 
 {'content': 12, 'title': 'a', 'info': '', 'time': 1582876014}]
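
Note that the DataFrame version above overwrites content with integers, while the desired output in the question keeps the original strings. One possible way around this, sketched here under the assumption that the string values should be preserved, is to compute the numeric key as a separate Series (the names `key` and `top` are illustrative) and use its index to select rows:

```python
import pandas as pd

data = [
    {"content": "1", "title": "app sotre", "info": "", "time": 1578877014},
    {"content": "2", "title": "app", "info": "", "time": 1579877014},
    {"content": "3", "title": "pandas", "info": "", "time": 1582877014},
    {"content": "12", "title": "a", "info": "", "time": 1582876014},
    {"content": "33", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

df = pd.DataFrame(data)

# Numeric copy of the column, used only for ranking; df itself is unchanged.
key = pd.to_numeric(df["content"], errors="coerce")

# Select the rows whose key is among the 3 largest, then convert back to dicts.
top = df.loc[key.nlargest(3).index]
records = top.to_dict("records")

print([r["content"] for r in records])  # ['33', '16', '12']
```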
