How to get the maximum number of consecutive 1s and 0s from a Pandas DataFrame

Problem description

I want to get the maximum number of consecutive 1s and 0s in each row of a pandas DataFrame.

import pandas as pd
d=[[0,0,1,0,1,0],[0,0,0,1,1,0],[1,0,1,1,1,1]]
df = pd.DataFrame(data=d)
df
Out[4]: 
   0  1  2  3  4  5
0  0  0  1  0  1  0
1  0  0  0  1  1  0
2  1  0  1  1  1  1

The output should look like this:

Out[5]: 
   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2      
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1

Tags: python, pandas, dataframe, max, row

Solution


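Use boolean masking with eq and shift. For each cell we check whether the current value equals 1 (or 0) and whether the adjacent value (the previous or the next one) also equals 1 (or 0). Combining the two checks with & gives an array of True and False values, which we can then sum along axis=1.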

# Mask for zeros: the current value is 0 and the previous value is 0
m1 = df.eq(0) & df.shift(axis=1).eq(0)
# The first column has no previous value, so it is always counted
m2 = df.shift(axis=1).isna()

# Mask for ones: the current value is 1 and the next value is 1
m3 = df.eq(1) & df.shift(-1, axis=1).eq(1)
# The last column has no next value, so it is always counted
m4 = df.shift(-1, axis=1).isna()

# Note: this matches the longest run for the sample data, but it can overcount
# when a row has more than one run of length 2 or more, and it returns 1 when
# the value does not occur in the row at all.
df['Ones'] = (m3 | m4).sum(axis=1)
df['Zeros'] = (m1 | m2).sum(axis=1)

Output

   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1
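
For rows with several separate runs of the same value, a run-length approach may be more robust than summing shifted masks. The following is a minimal sketch, not part of the original answer: it assumes every column of the frame holds only 0s and 1s, and the helper name longest_run is hypothetical. It groups consecutive equal values with itertools.groupby and keeps the longest group for the requested value.

```python
from itertools import groupby

import pandas as pd


def longest_run(values, target):
    """Return the length of the longest run of consecutive `target` values.

    Hypothetical helper for illustration; returns 0 if `target` never occurs.
    """
    return max(
        (sum(1 for _ in group) for key, group in groupby(values) if key == target),
        default=0,
    )


d = [[0, 0, 1, 0, 1, 0], [0, 0, 0, 1, 1, 0], [1, 0, 1, 1, 1, 1]]
df = pd.DataFrame(data=d)

# Snapshot the original cells first so the new columns do not feed into each other.
rows = df.to_numpy()
df["Ones"] = [longest_run(row, 1) for row in rows]
df["Zeros"] = [longest_run(row, 0) for row in rows]
print(df)
```

This iterates row by row in Python, so it trades speed for clarity; on very wide or very long frames a vectorized run-length computation would likely be faster.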
