Pandas: find the maximum number of consecutive declining weeks in a column

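Problem description

I have a DataFrame like the one below. For each ID and product, I want to find the longest run of consecutive weeks in which orders declined.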

import pandas as pd

raw_data = {'ID': ['101', '101', '101','101', '101', '101', '102', '102', '102', '102','102', '103', '103', '103', '103','104', '104', '104', '104','104','104'],
            'product':['x','x','x','x','x','x','z','z','z','z','z','y','y','y','y','x','x','x','x','x','x'],
            'Week': ['201828','201829','201830','201831','201832','201833','201829','201830','201831','201832','201830','201831','201832','201833','201830','201831','201832','201833','201834','201835','201836'],
            'Orders': ['-15%','-4%','-6%','6%','-10%','15%','-26%','-15%','-56%','-15%','-4%', '5%', '-10%', '-10%', '15%', '-20%', '-11%','10%', '-15%', '-20%','-26%']}

df2 = pd.DataFrame(raw_data, columns = ['ID','product','Week','Orders'])

Desired output:

[image: output]

Tags: python, python-3.x, pandas

Solution


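One approach is to use cumsum to create an additional grouping key:

# True for weeks where Orders is negative
s = df2['Orders'].str.contains('-')

# (~s).groupby(ID).cumsum() increases at every non-negative week, so it labels each run of declines.
# Series.max(level=...) was removed in pandas 2.0; groupby(level=...).max() is the equivalent.
df2[s].groupby([df2.ID, (~s).groupby(df2['ID']).cumsum(), df2['product']]).size().groupby(level=[0, 2]).max()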
Out[202]: 
ID   product
101  x          3
102  z          5
103  y          2
104  x          3
dtype: int64
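
To make the cumsum trick easier to follow, here is a minimal step-by-step sketch of the same idea. It assumes a smaller, illustrative frame (only the rows for IDs 101 and 103 from the question), and the names `is_drop`, `run`, and `longest` are hypothetical helpers introduced here, not part of the original answer.

```python
import pandas as pd

# Illustrative subset of the question's data (IDs 101 and 103 only).
df = pd.DataFrame({
    'ID':      ['101'] * 6 + ['103'] * 4,
    'product': ['x'] * 6 + ['y'] * 4,
    'Orders':  ['-15%', '-4%', '-6%', '6%', '-10%', '15%',
                '5%', '-10%', '-10%', '15%'],
})

# Step 1: flag the weeks where the change is negative.
is_drop = df['Orders'].str.startswith('-')

# Step 2: running count of non-negative weeks within each (ID, product).
# The count stays constant across a run of consecutive drops,
# so it serves as a label for that run.
run = (~is_drop).astype(int).groupby([df['ID'], df['product']]).cumsum()

# Step 3: keep only the drop rows and count how many fall in each run.
drops = df.assign(run=run)[is_drop]
run_lengths = drops.groupby(['ID', 'product', 'run']).size()

# Step 4: the longest run per (ID, product).
longest = run_lengths.groupby(level=['ID', 'product']).max()
print(longest)
```

Note that a group with no declining weeks at all would be missing from `longest` rather than reported as 0; if that case matters, the result could be reindexed over all (ID, product) pairs with a fill value of 0.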
