python-3.x - Binning with pd.cut beyond the bin range (replacing NaN with "Max_val")
Problem description
import pandas as pd

df = pd.DataFrame({'days': [0, 31, 45, 35, 19, 70, 80]})
df['range'] = pd.cut(df.days, [0, 30, 60])
df
The code above uses pd.cut to convert a numerical column into a categorical column. pd.cut assigns each value a category based on the bin edges passed, here [0,30,60]. In this case rows 0, 5 and 6 are categorized as NaN because their values fall outside the intervals built from [0,30,60]. What I want is for 0 to be categorized as <0, and for 70 and 80 to both be categorized as >60. If possible, I would also like dynamic text labels A, B, C, D, E, depending on the number of categories created.
Solution
For the first part, adding -np.inf and np.inf to the bins will ensure that everything gets a bin:
In [5]: df = pd.DataFrame({'days': [0, 31, 45, 35, 19, 70, 80]})
   ...: df['range'] = pd.cut(df.days, [-np.inf, 0, 30, 60, np.inf])
   ...: df
   ...:
Out[5]:
   days         range
0     0   (-inf, 0.0]
1    31  (30.0, 60.0]
2    45  (30.0, 60.0]
3    35  (30.0, 60.0]
4    19   (0.0, 30.0]
5    70   (60.0, inf]
6    80   (60.0, inf]
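If you want readable names instead of interval notation, pd.cut also accepts a labels argument. The following is a minimal sketch, not the only way to do it: the label strings are hypothetical placeholders I chose for illustration, and there must be exactly one label per interval. It also shows why 0 was NaN in the question: the default intervals are closed on the right only, so (0, 30] does not contain 0.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'days': [0, 31, 45, 35, 19, 70, 80]})

# Original edges: the intervals are (0, 30] and (30, 60],
# so 0, 70 and 80 fall outside every bin and become NaN.
bounded = pd.cut(df['days'], [0, 30, 60])
n_missing_bounded = int(bounded.isna().sum())

# Open-ended edges: every finite value lands in some interval.
edges = [-np.inf, 0, 30, 60, np.inf]

# Hypothetical label text, one per interval (len(edges) - 1 labels).
labels = ['<=0', '0-30', '30-60', '>60']

df['range'] = pd.cut(df['days'], bins=edges, labels=labels)
n_missing_open = int(df['range'].isna().sum())

print(df)
```

Note that with these edges the value 0 lands in the first bin, (-inf, 0], which is why the sketch labels it '<=0' rather than '<0'.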
For the second, you can use .cat.codes to get the bin index and do some tweaking from there:
In [8]: df['range'].cat.codes.apply(lambda x: chr(x + ord('A')))
Out[8]:
0    A
1    C
2    C
3    C
4    B
5    D
6    D
dtype: object
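As an alternative to post-processing the codes, the letters can be generated from the number of bins and handed to pd.cut directly. This is only a sketch under the assumption that there are at most 26 bins; the names edges, letters and the 'label' column are mine, not part of the original code.

```python
import string

import numpy as np
import pandas as pd

df = pd.DataFrame({'days': [0, 31, 45, 35, 19, 70, 80]})
edges = [-np.inf, 0, 30, 60, np.inf]

# One uppercase letter per interval: A, B, C, ...
# Assumes len(edges) - 1 <= 26; longer edge lists need another naming scheme.
letters = list(string.ascii_uppercase[:len(edges) - 1])

df['label'] = pd.cut(df['days'], bins=edges, labels=letters)
print(df)
```

One difference worth knowing: with the .cat.codes approach a value that received no bin has code -1, which chr(x + ord('A')) would turn into '@', whereas with labels= such a value simply stays NaN.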