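python - Pandas: How to do math calculations on filtered rows based on two columns?
Question
I have the following data frame, and I need to perform math operations on filtered rows.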
data = pd.DataFrame({'name': ['jpy','jpy','eur','usd','usd','usd'],'currency':['jpy_23','jpy_23','eur_15', 'thb_20','thb_20','thb_20'],
'sal':[15.0,20.0,25.0,30.0,20.0,15.0 ]})
I want to group by two columns, as follows:
df1 = data.groupby(['name','currency'])
Then I want to perform the following calculations for each group:
len(data[(data['sal']>25)])/len(data.index)
len(data[(data['sal']<=25)])/len(data.index)
len(data[(data['sal']>=0) & (data['sal']<5)])/len(data.index)
len(data[(data['sal']>=5) & (data['sal']<15)])/len(data.index)
len(data[(data['sal']>=15) & (data['sal']<25)])/len(data.index)
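Finally, the expected data frame should look like the one below, with the empty columns holding the calculated values. Please suggest how to get the expected output.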
new_data = pd.DataFrame({'name': ['jpy','eur','usd'],'currency':['jpy_23','eur_15','thb_20'],
'>25':[ ], '<= 25': [ ], 'Between 0 & 5': [ ], 'Between 5 & 15' : [ ], 'Between >15 & 25': [ ]})
Answer
Perhaps something like this:
In [4]: bins = [0, 5, 15, 25, float("inf")]
...: groups = data.groupby(['name', 'currency', pd.cut(data['sal'], bins)])
...: d = groups.size().unstack()
...: d.div(d.sum(axis=1), axis=0)
Out[4]:
sal            (0.0, 5.0]  (5.0, 15.0]  (15.0, 25.0]  (25.0, inf]
name currency
eur  eur_15           0.0     0.000000      1.000000     0.000000
     jpy_23           NaN          NaN           NaN          NaN
     thb_20           NaN          NaN           NaN          NaN
jpy  eur_15           NaN          NaN           NaN          NaN
     jpy_23           0.0     0.500000      0.500000     0.000000
     thb_20           NaN          NaN           NaN          NaN
usd  eur_15           NaN          NaN           NaN          NaN
     jpy_23           NaN          NaN           NaN          NaN
     thb_20           0.0     0.333333      0.333333     0.333333
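Two things about the output above are worth noting. `pd.cut` builds right-closed intervals by default, so a salary of exactly 15 lands in `(5.0, 15.0]`, whereas the question's conditions are left-closed (`sal >= 15` and `sal < 25`). The result also carries all-NaN rows for `name`/`currency` pairs that never occur in the data. As an alternative, here is a minimal sketch that evaluates the question's conditions directly as boolean flags and takes their mean per group. It assumes each fraction should be relative to the size of its own group rather than to the whole frame; the names `flags` and `result` are illustrative only, and the column labels are copied from the expected frame in the question.

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['jpy', 'jpy', 'eur', 'usd', 'usd', 'usd'],
    'currency': ['jpy_23', 'jpy_23', 'eur_15', 'thb_20', 'thb_20', 'thb_20'],
    'sal': [15.0, 20.0, 25.0, 30.0, 20.0, 15.0],
})

sal = data['sal']

# One 0/1 column per condition from the question; the mean of such a
# column within a group is the fraction of that group's rows matching it.
# Assumption: the denominator is the group size, not len(data.index).
flags = pd.DataFrame({
    'name': data['name'],
    'currency': data['currency'],
    '>25': (sal > 25).astype(float),
    '<= 25': (sal <= 25).astype(float),
    'Between 0 & 5': ((sal >= 0) & (sal < 5)).astype(float),
    'Between 5 & 15': ((sal >= 5) & (sal < 15)).astype(float),
    'Between >15 & 25': ((sal >= 15) & (sal < 25)).astype(float),
})

# Only name/currency pairs that actually occur become rows.
result = flags.groupby(['name', 'currency'], as_index=False).mean()
print(result)
```

With these left-closed conditions, both `jpy` rows (15 and 20) count towards the 15 to 25 range, which differs from the 0.5/0.5 split in the binned output above.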