python - Count occurrences per category and drop rows listwise if a condition is not met
Problem description
Given:
import pandas as pd
lis1= ('apple','orange','strawberry','strawberry','strawberry','apple','orange','orange','orange','strawberry')
lis2= ("lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review")
df = pd.DataFrame({'category': lis1, 'review': lis2})
df
     category              review
0       apple  lorem ipsum review
1      orange  lorem ipsum review
2  strawberry  lorem ipsum review
3  strawberry  lorem ipsum review
4  strawberry  lorem ipsum review
5       apple  lorem ipsum review
6      orange  lorem ipsum review
7      orange  lorem ipsum review
8      orange  lorem ipsum review
9  strawberry  lorem ipsum review
Wanted:
lis1= ('orange','strawberry','strawberry','strawberry','orange','orange','orange','strawberry')
lis2= ("lorem ipsum review","lorem ipsum review", "lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review","lorem ipsum review")
pd.DataFrame({'category':lis1, 'review': lis2})
     category              review
0      orange  lorem ipsum review
1  strawberry  lorem ipsum review
2  strawberry  lorem ipsum review
3  strawberry  lorem ipsum review
4      orange  lorem ipsum review
5      orange  lorem ipsum review
6      orange  lorem ipsum review
7  strawberry  lorem ipsum review
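I need code that counts how often each unique category occurs (I was thinking of nunique()) and drops every category that appears fewer than 3 times. In the example, apple is the only category that appears just twice, so all of its rows are removed (listwise deletion).
Solution
You can filter on the result of groupby and transform: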
df[df.groupby('category')['category'].transform('count').gt(2)]
     category              review
1      orange  lorem ipsum review
2  strawberry  lorem ipsum review
3  strawberry  lorem ipsum review
4  strawberry  lorem ipsum review
6      orange  lorem ipsum review
7      orange  lorem ipsum review
8      orange  lorem ipsum review
9  strawberry  lorem ipsum review
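To see why this works, it can help to look at the intermediate values. The following is a minimal sketch, assuming the frame from the question is stored in `df`; the names `counts`, `mask`, and `result` are illustrative and not part of the original answer. The final `reset_index(drop=True)` is an optional extra step that renumbers the rows so they match the 0 to 7 index shown in the wanted output.

```python
import pandas as pd

lis1 = ('apple', 'orange', 'strawberry', 'strawberry', 'strawberry',
        'apple', 'orange', 'orange', 'orange', 'strawberry')
df = pd.DataFrame({'category': lis1,
                   'review': ['lorem ipsum review'] * len(lis1)})

# transform('count') broadcasts each group's row count back onto every row
# of that group, so the result has the same length and index as df.
counts = df.groupby('category')['category'].transform('count')

# Boolean mask: True where the row's category occurs more than 2 times.
mask = counts.gt(2)

# Keep the matching rows; reset_index(drop=True) renumbers them from 0
# (assumption: the question wants a fresh index, as in its wanted output).
result = df[mask].reset_index(drop=True)
print(result)
```

Because `transform` returns a Series aligned with the original rows, it can be used directly as a row filter without a merge or join.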
Another solution is value_counts + map:
df[df.category.map(df['category'].value_counts()).gt(2)]
     category              review
1      orange  lorem ipsum review
2  strawberry  lorem ipsum review
3  strawberry  lorem ipsum review
4  strawberry  lorem ipsum review
6      orange  lorem ipsum review
7      orange  lorem ipsum review
8      orange  lorem ipsum review
9  strawberry  lorem ipsum review
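If the threshold needs to vary, the value_counts + map idea could be wrapped in a small helper. This is only a sketch: `drop_rare_categories` and its `min_count` parameter are hypothetical names introduced here for illustration, and the index reset is an assumption based on the wanted output in the question.

```python
import pandas as pd

def drop_rare_categories(frame, column, min_count=3):
    """Return the rows whose value in `column` occurs at least `min_count` times.

    Hypothetical helper, not part of pandas.
    """
    # value_counts() gives a Series mapping each category to its frequency.
    freq = frame[column].value_counts()
    # map() looks up that frequency for every row; ge() builds the row mask.
    keep = frame[column].map(freq).ge(min_count)
    # Renumber the surviving rows from 0 (assumed to be desired).
    return frame[keep].reset_index(drop=True)

lis1 = ('apple', 'orange', 'strawberry', 'strawberry', 'strawberry',
        'apple', 'orange', 'orange', 'orange', 'strawberry')
df = pd.DataFrame({'category': lis1,
                   'review': ['lorem ipsum review'] * len(lis1)})

filtered = drop_rare_categories(df, 'category', min_count=3)
print(filtered)
```

Here `ge(min_count)` with `min_count=3` selects the same rows as `gt(2)` in the one-liner above; it just states the threshold as "at least 3" rather than "more than 2".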