Groupby, count within ranges, and spread into columns in Python

Problem description

I have a DataFrame dx:

   City     Val1        Val2
x1  N1      0.0000      1.0
x2  N1      790.9980    6.0
x3  N2      110.3755    4.0
x4  N1      191.8150    2.0
x5  N2      574.3100    3.0
x6  N1      725.4000    4.0
x7  N3      362.7000    2.0
x8  N1      196.4800    1.0
x9  N3      230.8200    2.0
x10 N1      259.8830    1.0

Reproducible example:

import pandas as pd

dx = pd.DataFrame({'City': {'x1': 'N1',
  'x2': 'N1',
  'x3': 'N2',
  'x4': 'N1',
  'x5': 'N2',
  'x6': 'N1',
  'x7': 'N3',
  'x8': 'N1',
  'x9': 'N3',
  'x10': 'N1'},
 'Val1': {'x1': 0.0,
  'x2': 790.998,
  'x3': 110.3755,
  'x4': 191.81500000000003,
  'x5': 574.31,
  'x6': 725.4,
  'x7': 362.7,
  'x8': 196.48,
  'x9': 230.82,
  'x10': 259.883},
 'Val2': {'x1': 1.0,
  'x2': 6.0,
  'x3': 4.0,
  'x4': 2.0,
  'x5': 3.0,
  'x6': 4.0,
  'x7': 2.0,
  'x8': 1.0,
  'x9': 2.0,
  'x10': 1.0}})

dx

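I want to group the rows by City and count how many Val1 values fall into each range defined by the bin edges [0, 200, 400, 600, 800, 1000].

Expected output: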

      (0, 200]   (200, 400]   (400, 600]   (600, 800]   (800, 1000]
City
N1     2          1            0            2            0
N2     1          0            1            0            0
N3     0          2            0            0            0

Tags: python, pandas

Solution


Use crosstab with cut, and add the missing categories with DataFrame.reindex and cat.categories:

bins = [0,200,400,600,800,1000]

s = pd.cut(dx['Val1'], bins=bins)
# The grouping column in dx is 'City'; reindex restores bins that have no rows
df = pd.crosstab(dx['City'], s).reindex(s.cat.categories, axis=1, fill_value=0)
print(df)
Val1   (0, 200]  (200, 400]  (400, 600]  (600, 800]  (800, 1000]
City
N1            2           1           0           2            0
N2            1           0           1           0            0
N3            0           2           0           0            0
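To see why the reindex step matters, here is a minimal, self-contained sketch. It rebuilds the question's data in a compact form (only City and Val1 are needed) and compares the crosstab before and after reindexing. The names raw and full are illustrative only and are not part of the original answer. Whether crosstab drops the empty bin on its own may depend on the pandas version, so the sketch prints both column counts rather than assuming them.

```python
import pandas as pd

# Same City/Val1 data as in the question, written compactly.
dx = pd.DataFrame(
    {
        "City": ["N1", "N1", "N2", "N1", "N2", "N1", "N3", "N1", "N3", "N1"],
        "Val1": [0.0, 790.998, 110.3755, 191.815, 574.31,
                 725.4, 362.7, 196.48, 230.82, 259.883],
    },
    index=[f"x{i}" for i in range(1, 11)],
)

bins = [0, 200, 400, 600, 800, 1000]
s = pd.cut(dx["Val1"], bins=bins)

# crosstab on its own: bins with no rows may be left out of the columns.
raw = pd.crosstab(dx["City"], s)

# Reindexing on the full list of categories guarantees one column per bin,
# filling bins that were absent with 0.
full = raw.reindex(s.cat.categories, axis=1, fill_value=0)

print(len(raw.columns), len(full.columns))
print(full)
```

Reindexing is harmless when all bins are already present, so it is a safe final step either way.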

If you use DataFrame.pivot_table, or GroupBy.size with Series.unstack, all categories are added:

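# Option 1: pivot_table with the binned Val1 values as columns.
# observed=False is passed explicitly because the default for categorical
# keys has changed across pandas versions.
df = dx.pivot_table(index='City', columns=pd.cut(dx['Val1'], bins=bins), aggfunc='size', observed=False)

# Option 2: group by City and the binned Val1 values, count, then unstack.
df = dx.groupby(['City', pd.cut(dx['Val1'], bins=bins)], observed=False).size().unstack()
print(df)
Val1   (0, 200]  (200, 400]  (400, 600]  (600, 800]  (800, 1000]
City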
N1            2           1           0           2            0
N2            1           0           1           0            0
N3            0           2           0           0            0
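The groupby variant can be checked end to end with the sketch below, which again rebuilds the question's City and Val1 columns compactly. The variable names binned and counts are illustrative. It passes observed=False explicitly, on the assumption that the reader's pandas version may not default to it, and it also shows why the table sums to 9 rather than 10: x1 has Val1 equal to 0.0, which lies outside the left-open interval (0, 200].

```python
import pandas as pd

dx = pd.DataFrame(
    {
        "City": ["N1", "N1", "N2", "N1", "N2", "N1", "N3", "N1", "N3", "N1"],
        "Val1": [0.0, 790.998, 110.3755, 191.815, 574.31,
                 725.4, 362.7, 196.48, 230.82, 259.883],
    },
    index=[f"x{i}" for i in range(1, 11)],
)

bins = [0, 200, 400, 600, 800, 1000]
binned = pd.cut(dx["Val1"], bins=bins)

# observed=False asks groupby to emit every category of the binned key,
# including bins that no row falls into.
counts = (
    dx.groupby(["City", binned], observed=False)
      .size()
      .unstack(fill_value=0)
)
print(counts)

# Rows whose value falls outside every bin become NaN and are not counted.
print(binned.isna().sum())
```

If the value 0.0 should be counted in the first bin, pd.cut accepts include_lowest=True, but that would change the counts from the expected output in the question.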
