python - making new pandas columns based on min and max values
Problem description
Given this data frame:
HOUSEID PERSONID STRTTIME ENDTIME TDTRPNUM
0 20000017 1 955 1020 1
1 20000017 1 1130 1132 2
2 20000017 1 1330 1400 3
3 20000017 2 958 1020 1
4 20000017 2 1022 1025 2
5 20000017 2 1120 1122 3
6 20000017 2 1130 1132 4
I want to make two new columns, firsttrip_time and lasttrip_time. Within each HOUSEID and PERSONID group, firsttrip_time should hold the STRTTIME of the row with the minimum TDTRPNUM, and lasttrip_time should hold the ENDTIME of the row with the maximum TDTRPNUM.
Expected result:
HOUSEID PERSONID firsttrip_time lasttrip_time
0 20000017 1 955 1400
1 20000017 2 958 1132
I have tried this to get the min and max, but I have no idea how to continue:
grouped = df.groupby(['HOUSEID', 'PERSONID','STRTTIME', 'ENDTIME'])['TDTRPNUM']
max = grouped.max()
min = grouped.min()
Can you help me with this or give me a hint?
Thank you.
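Solution
Sort by TDTRPNUM within each group, use groupby with agg to take the first STRTTIME and the last ENDTIME, and finally rename the columns: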
print(df.sort_values(["HOUSEID", "PERSONID", "TDTRPNUM"])
        .groupby(["HOUSEID", "PERSONID"], as_index=False)
        .agg({"STRTTIME": "first", "ENDTIME": "last"})
        .rename(columns={"STRTTIME": "firsttrip_time", "ENDTIME": "lasttrip_time"}))
HOUSEID PERSONID firsttrip_time lasttrip_time
0 20000017 1 955 1400
1 20000017 2 958 1132
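To try this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and wraps the same sort/groupby/agg/rename chain in a helper. The function name first_last_trip_times is made up for illustration and is not part of pandas. Sorting first is what makes "first" and "last" correspond to the minimum and maximum TDTRPNUM, so the rows may arrive in any order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "HOUSEID": [20000017] * 7,
    "PERSONID": [1, 1, 1, 2, 2, 2, 2],
    "STRTTIME": [955, 1130, 1330, 958, 1022, 1120, 1130],
    "ENDTIME": [1020, 1132, 1400, 1020, 1025, 1122, 1132],
    "TDTRPNUM": [1, 2, 3, 1, 2, 3, 4],
})


def first_last_trip_times(frame):
    """Hypothetical helper: one row per (HOUSEID, PERSONID) with the start
    time of the lowest-numbered trip and the end time of the highest-numbered trip."""
    return (
        frame.sort_values(["HOUSEID", "PERSONID", "TDTRPNUM"])
        .groupby(["HOUSEID", "PERSONID"], as_index=False)
        .agg({"STRTTIME": "first", "ENDTIME": "last"})
        .rename(columns={"STRTTIME": "firsttrip_time", "ENDTIME": "lasttrip_time"})
    )


result = first_last_trip_times(df)
print(result)
```

The attempt in the question also grouped by STRTTIME and ENDTIME, which puts every trip in its own group; grouping only by HOUSEID and PERSONID keeps all of a person's trips together so that the first and last rows can be picked out.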