python - How to group items into buckets of 1-10?
Problem description
I am testing a very basic line of code.
modDF['RatingDecile'] = pd.cut(modDF['RatingScore'], 10)
This gives me ranges of rating scores in 10 buckets. Instead of the range, how can I see 1, 2, 3, etc., up to 10?
So, instead of this.
Score RatingQuantile
0 (26.3, 29.0]
6 (23.6, 26.3]
7 (23.6, 26.3]
8 (26.3, 29.0]
10 (18.2, 20.9]
... ...
9763 (23.6, 26.3]
9769 (20.9, 23.6]
9829 (20.9, 23.6]
9889 (23.6, 26.3]
9949 (20.9, 23.6]
How can I get something like this?
Score RatingQuantile
0 10
6 8
7 8
8 10
10 6
... ...
9763 8
9769 5
9829 5
9889 5
9949 5
I tried this.
modDF['DecileRank'] = pd.qcut(modDF['RatingScore'],10,labels=False)
I got this error.
ValueError: Bin edges must be unique: array([ 2., 20., 25., 27., 27., 27., 27., 27., 27., 27., 29.]).
You can drop duplicate edges by setting the 'duplicates' kwarg
The error makes sense to me. I just don't know the work-around for this issue. Thoughts?
Solution
If I pass a Series to qcut(), I have no problem. I assume your data looks like the data I am using.
import pandas as pd
import numpy as np
data = {'values':np.random.randint(1,30,size=1000)}
df = pd.DataFrame(data)
df['ranks'] = pd.qcut(df['values'],10,labels=False)
print(df)
Output:
values ranks
0 18 5
1 22 7
2 5 1
3 12 3
4 14 4
.. ... ...
995 22 7
996 13 4
997 26 8
998 3 0
999 22 7
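The ranks above start at 0, while the question asks for buckets numbered 1 to 10. A minimal sketch of one way to get there: since labels=False returns integer bin codes, adding 1 shifts them to 1..10. The same works with pd.cut if equal-width buckets are wanted instead of quantiles. The data and the column names decile and width_bucket are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): each value 1..29 repeated 10 times
df = pd.DataFrame({'values': np.tile(np.arange(1, 30), 10)})

# labels=False yields integer codes 0..9; adding 1 shifts them to 1..10
df['decile'] = pd.qcut(df['values'], 10, labels=False) + 1

# Equal-width alternative: 10 bins of identical width over the value range
df['width_bucket'] = pd.cut(df['values'], 10, labels=False) + 1

print(df.head())
```

Note that qcut aims for buckets with similar counts, while cut splits the value range evenly, so the two columns will generally differ.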
Afterwards, you can use groupby() or other Series functions to check simple things, such as the limits of each bin:
df_info = df.groupby('ranks').agg(
min_score=pd.NamedAgg(column='values',aggfunc='min'),
max_score=pd.NamedAgg(column='values',aggfunc='max'),
count_cases=pd.NamedAgg(column='values',aggfunc='count'))
print(df_info)
Output:
min_score max_score count_cases
ranks
0 1 3 137
1 4 5 72
2 6 8 105
3 9 11 96
4 12 14 98
5 15 17 107
6 18 20 91
7 21 23 99
8 24 27 121
9 28 29 74
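The random data above is spread evenly, so it does not reproduce the ValueError from the question. That error appears when many rows share one score, as with the repeated 27 in the reported bin edges. Two workarounds that may help, sketched here on made-up skewed data: duplicates='drop' merges the tied edges (and so returns fewer than 10 buckets), while ranking with method='first' before calling qcut breaks ties by order of appearance and gives 10 buckets of equal size. The Series below is hypothetical, not the asker's data.

```python
import pandas as pd

# Hypothetical skewed scores: most rows share the value 27
scores = pd.Series([2, 20, 25] + [27] * 16 + [29], name='RatingScore')

# Option 1: drop duplicate bin edges; this yields fewer than 10 buckets
dropped = pd.qcut(scores, 10, labels=False, duplicates='drop') + 1

# Option 2: rank first so every value is unique, then cut into deciles
ranked = pd.qcut(scores.rank(method='first'), 10, labels=False) + 1

print(dropped.value_counts().sort_index())
print(ranked.value_counts().sort_index())
```

The trade-off with the second option is that rows with identical scores can end up in different buckets, so it suits cases where equal bucket sizes matter more than keeping ties together.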