How do I compute conditional density in a DataFrame?

Problem description

I have a DataFrame like the one below.

amplitude   -13.125 |-13.125 |-11.875 |-11.875 |-11.25  |-11.25
duration -----------|--------|--------|--------|--------|--------
1           NaN     |NaN     |NaN     |NaN     |NaN     |NaN
2           NaN     |0.008032|NaN     |NaN     |NaN     |NaN
3           0.004016|NaN     |NaN     |NaN     |0.004016|0.004016
4           0.9     |NaN     |NaN     |NaN     |NaN     |NaN
5           NaN     |NaN     |NaN     |NaN     |NaN     |NaN
--------------------|--------|--------|--------|--------|--------
sum         0.904016|0.008032|NaN     |NaN     |0.004016|0.004016

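How do I find the value at the intersection of a row and a column in the DataFrame? I would also like to compute a density by dividing each value I find by the corresponding value in the "sum" row. Example: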

duration        amplitude       density 
3               -13.125      0.004016/0.904016  
2               -13.125      0.008032/0.008032
... 

Tags: python, pandas, dataframe, data-science, probability-density

Solution


Assuming the sum row is the last row of the DataFrame:

# Divide every row except the last (the sum row) by the last row
density = (df.iloc[:-1] / df.iloc[-1]).stack().to_frame('density')

Result:

                     density
duration amplitude          
2        -13.125    1.000000
3        -13.125    0.004442
         -11.250    1.000000
         -11.250    1.000000
4        -13.125    0.995558
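
For readers who want to reproduce this end to end, here is a minimal self-contained sketch. It makes several assumptions that are not in the original post: the table is rebuilt by hand from the question, the sum row is computed with `df.sum(min_count=1)` rather than stored as the last row, and the long-format result is assembled from NumPy positions instead of `stack()` to sidestep any ambiguity from the duplicated amplitude labels. The variable names (`data`, `totals`, `ratios`, `density`) are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the example table from the question (assumed layout:
# rows = duration 1..5, columns = amplitude, with duplicated labels).
amplitudes = [-13.125, -13.125, -11.875, -11.875, -11.25, -11.25]
data = np.full((5, 6), np.nan)
data[1, 1] = 0.008032
data[2, 0] = 0.004016
data[2, 4] = 0.004016
data[2, 5] = 0.004016
data[3, 0] = 0.9
df = pd.DataFrame(
    data,
    index=pd.Index([1, 2, 3, 4, 5], name="duration"),
    columns=pd.Index(amplitudes, name="amplitude"),
)

# Value at one row/column intersection, looked up by position because
# the amplitude labels are not unique (duration 3, first -13.125 column).
value = df.iloc[2, 0]

# Column totals; min_count=1 keeps all-NaN columns as NaN instead of 0.
totals = df.sum(min_count=1).to_numpy()

# Divide each cell by its column total (broadcast across rows).
ratios = df.to_numpy() / totals

# Keep only the non-NaN cells and report them in long format.
rows, cols = np.nonzero(~np.isnan(ratios))
density = pd.DataFrame({
    "duration": df.index.to_numpy()[rows],
    "amplitude": df.columns.to_numpy()[cols],
    "density": ratios[rows, cols],
})
print(density)
```

This should give the same five rows as the result above, only as a flat table with a default integer index rather than a `(duration, amplitude)` MultiIndex.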
