Pandas Series contains: AttributeError: 'Series' object has no attribute 'contains'

Problem description

My data

data = [{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
     {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
     {"content": "3", "title": "ches", "info": "", "time": 1582877014},
     {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
     {"content": "15", "title": "apple", "info": "", "time": 1581877014},
     {"content": "16", "title": "banana", "info": "", "time": 1561877014},
     ]

My code

import pandas as pd

index = [i['content'] for i in data]

s = pd.Series(data, index)
# This line raises the AttributeError shown below
print(s[s.str.get('title').contains('ches', regex=True)])

This raises an error

AttributeError: 'Series' object has no attribute 'contains'

I want to filter the data like this. How do I use contains? Documentation for contains: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html#pandas.Series.str.contains

I expect the result to be

[
{"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
{"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
{"content": "3", "title": "ches", "info": "", "time": 1582877014},
]

Tags: python, pandas

Solution


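It is better to use a structure that fits the data. Use a DataFrame.

A DataFrame offers better column and row operations. Your data is two-dimensional: it has items, and each item has attributes with values. It therefore fits a 2D structure like a DataFrame rather than a 1D structure like a Series.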

>>> df = pd.DataFrame(data)
>>> df
  content     title info        time
0       1  chestnut       1578877014
1       2  chestnut       1579877014
2       3      ches       1582877014
3      aa        ap       1582876014
4      15     apple       1581877014
5      16    banana       1561877014

>>> df[df.title.str.contains('ches')]
  content     title info        time
0       1  chestnut       1578877014
1       2  chestnut       1579877014
2       3      ches       1582877014
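
The question asks for a list of dicts rather than a table, so the filtered rows may need converting back. A minimal sketch, assuming the same `data` as in the question; the names `mask` and `matches` are only illustrative:

```python
import pandas as pd

data = [
    {"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
    {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
    {"content": "3", "title": "ches", "info": "", "time": 1582877014},
    {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
    {"content": "15", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

df = pd.DataFrame(data)

# Boolean mask: True for rows whose title contains the substring "ches"
mask = df["title"].str.contains("ches")

# Convert the matching rows back into a list of dicts, one per row
matches = df[mask].to_dict("records")
print(matches)
```

`to_dict("records")` returns one dict per row, which matches the shape of the original input list.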

For a Series (not recommended)

s[s.apply(lambda x: x.get('title')).str.contains('ches')]
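
The original error appears to come from calling `.contains` directly on the Series returned by `.str.get('title')`. `contains` is a method of the `.str` accessor, so the accessor has to be used a second time. A minimal sketch of that fix, keeping the Series from the question; the names `titles` and `filtered` are only illustrative:

```python
import pandas as pd

data = [
    {"content": "1", "title": "chestnut", "info": "", "time": 1578877014},
    {"content": "2", "title": "chestnut", "info": "", "time": 1579877014},
    {"content": "3", "title": "ches", "info": "", "time": 1582877014},
    {"content": "aa", "title": "ap", "info": "", "time": 1582876014},
    {"content": "15", "title": "apple", "info": "", "time": 1581877014},
    {"content": "16", "title": "banana", "info": "", "time": 1561877014},
]

index = [item["content"] for item in data]
s = pd.Series(data, index=index)

# .str.get('title') pulls the 'title' value out of each dict,
# giving a plain Series of strings
titles = s.str.get("title")

# .contains lives on the .str accessor, not on the Series itself
filtered = s[titles.str.contains("ches")]
print(filtered.tolist())
```

This keeps each element as a dict, but every lookup goes through the `.str` accessor, which is why the DataFrame form is usually easier to work with.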
