Pandas: identify the first row in a group with a given column value

Problem description

I have a DataFrame with three columns:

    ID       Date    Status
0    1   1/1/2000  Complete
1    1   1/4/2000  ReOpened
2    1  1/10/2000  ReOpened
3    1  1/11/2000    Closed
4    1  1/15/2000  ReOpened
5    2   1/2/2000  ReOpened
6    2   1/4/2000  ReOpened
7    2  1/10/2000    Closed
8    3  1/20/2000    Closed
9    3  1/22/2000    Closed
10   4  1/25/2000  ReOpened

For each ID that has a "ReOpened" status, I need the row showing the first "ReOpened" by date. So my output would look like:

   ID ProductionDate    Status
0   1       1/4/2000  ReOpened
1   2       1/2/2000  ReOpened
2   4      1/25/2000  ReOpened

I tried: df = pd.np.where(df.Status.str.contains("ReOpened"), df.groupby(['ID']).first(),0) but this does not work.

Tags: python, pandas

Solution


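drop_duplicates should be enough: filter to the ReOpened rows, then keep the first remaining row per ID. This relies on the rows already being in date order within each ID, as they are in the sample data.

df[df.Status.eq('ReOpened')].drop_duplicates(['ID'])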
#    ID       Date    Status
#1    1   1/4/2000  ReOpened
#5    2   1/2/2000  ReOpened
#10   4  1/25/2000  ReOpened
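
If the rows are not guaranteed to be in date order, the same idea can be made explicit by sorting before dropping duplicates. The following is a minimal sketch, not the answer's original code: it rebuilds the sample frame from the question, assumes the Date strings are in month/day/year format, and uses a temporary helper column named _parsed (a hypothetical name) so that sorting is chronological rather than alphabetical.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4],
    "Date": ["1/1/2000", "1/4/2000", "1/10/2000", "1/11/2000", "1/15/2000",
             "1/2/2000", "1/4/2000", "1/10/2000", "1/20/2000", "1/22/2000",
             "1/25/2000"],
    "Status": ["Complete", "ReOpened", "ReOpened", "Closed", "ReOpened",
               "ReOpened", "ReOpened", "Closed", "Closed", "Closed",
               "ReOpened"],
})

# Assumption: dates are month/day/year. Parsing avoids string ordering,
# where "1/10/2000" would sort before "1/4/2000".
parsed = pd.to_datetime(df["Date"], format="%m/%d/%Y")

first_reopened = (
    df.assign(_parsed=parsed)                      # temporary sort key
      .loc[lambda d: d["Status"].eq("ReOpened")]   # keep only ReOpened rows
      .sort_values(["ID", "_parsed"])              # earliest date first per ID
      .drop_duplicates("ID")                       # keep the first row per ID
      .drop(columns="_parsed")                     # remove the helper column
)

print(first_reopened)
```

IDs with no ReOpened row (ID 3 here) drop out at the filtering step, which matches the output requested in the question.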
