Making new pandas columns based on min and max values

Problem description

Given this DataFrame:

    HOUSEID     PERSONID    STRTTIME    ENDTIME TDTRPNUM
0   20000017    1            955          1020     1
1   20000017    1           1130          1132     2
2   20000017    1           1330          1400     3
3   20000017    2            958          1020     1
4   20000017    2           1022          1025     2
5   20000017    2           1120          1122     3
6   20000017    2           1130          1132     4

I want to create two new columns, firsttrip_time and lasttrip_time. For each HOUSEID and PERSONID group, firsttrip_time should hold the STRTTIME of the row with the minimum TDTRPNUM, and lasttrip_time should hold the ENDTIME of the row with the maximum TDTRPNUM.

Expected result:

    HOUSEID     PERSONID    firsttrip_time  lasttrip_time   
0   20000017      1          955              1400             
1   20000017      2          958              1132      

I have tried this to get the min and max, but I have no idea how to continue:

grouped = df.groupby(['HOUSEID', 'PERSONID','STRTTIME', 'ENDTIME'])['TDTRPNUM']
max = grouped.max()
min = grouped.min()

Can you help me with this or give me a hint?

Thank you.

Tags: python, pandas

Solution


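Sort by TDTRPNUM so that the first and last rows of each group are the trips with the smallest and largest trip number, use groupby with agg, and finally rename your columns: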

print (df.sort_values(["HOUSEID","PERSONID","TDTRPNUM"])
         .groupby(["HOUSEID", "PERSONID"], as_index=False)
         .agg({"STRTTIME":"first","ENDTIME":"last"})
         .rename(columns={"STRTTIME":"firsttrip_time","ENDTIME":"lasttrip_time"}))

    HOUSEID  PERSONID  firsttrip_time  lasttrip_time
0  20000017         1             955           1400
1  20000017         2             958           1132
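
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand and, as a variation that is not part of the original answer, uses named aggregation (available in pandas 0.25 and later, which is assumed here) so that the separate rename step is not needed. The variable name result is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "HOUSEID": [20000017] * 7,
    "PERSONID": [1, 1, 1, 2, 2, 2, 2],
    "STRTTIME": [955, 1130, 1330, 958, 1022, 1120, 1130],
    "ENDTIME": [1020, 1132, 1400, 1020, 1025, 1122, 1132],
    "TDTRPNUM": [1, 2, 3, 1, 2, 3, 4],
})

# Sorting by TDTRPNUM within each HOUSEID/PERSONID group makes "first"
# pick the row with the smallest trip number and "last" the row with
# the largest one.
result = (
    df.sort_values(["HOUSEID", "PERSONID", "TDTRPNUM"])
      .groupby(["HOUSEID", "PERSONID"], as_index=False)
      # Named aggregation: new_column=(source_column, aggregation)
      .agg(firsttrip_time=("STRTTIME", "first"),
           lasttrip_time=("ENDTIME", "last"))
)

print(result)
```

The sort is what ties first and last to the minimum and maximum TDTRPNUM; without it, they would simply follow the existing row order of the frame.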
