首页 > 解决方案 > 如何使用一个df的开始和结束时间范围进行分组并使用python从另一个df中找到每个车辆插槽的开始位置和结束位置?

问题描述

我有一个数据框:df1:这是车辆的插槽:

    CompanyID   RegistrationNo  slotStartTime           slotEndTime
1   602         veh1            2020-07-27 21:12:00 2020-07-27 22:12:00
2   602         veh1            2020-07-27 21:30:00 2020-07-27 22:30:00
3   602         veh2            2020-07-28 22:16:00 2020-07-28 23:16:00

另一个:df2:从这个数据中我想找到插槽的开始位置和插槽的结束位置

    RegistrationNo  GPSTime         Location
0   veh1            2020-07-27 21:12:00 loc1
1   veh1            2020-07-27 21:15:00 loc2
2   veh1            2020-07-27 21:20:00 loc3
3   veh1            2020-07-27 21:30:00 loc4
4   veh1            2020-07-27 21:45:00 loc5
5   veh1            2020-07-27 22:15:00 loc6
6   veh1            2020-07-27 22:29:00 loc7
4   veh2            2020-07-28 21:45:00 loc8
5   veh2            2020-07-28 22:15:00 loc9
6   veh2            2020-07-28 22:29:00 loc10 
7   veh2            2020-07-28 22:50:00 loc11 
7   veh2            2020-07-28 23:16:00 loc12 

预期结果:

    CompanyID   RegistrationNo  slotStartTime           slotEndTime      slotStartloc slotEndLoc
1   602         veh1            2020-07-27 21:12:00 2020-07-27 22:12:00  loc1         loc5
2   602         veh1            2020-07-27 21:30:00 2020-07-27 22:30:00  loc4         loc7
3   602         veh2            2020-07-28 22:16:00 2020-07-28 23:16:00  loc10        loc12

我曾尝试使用 group by daterange 但我猜 bcoz 涉及另一个 df 它不起作用并抛出错误

标签: pythonpandas

解决方案


def compare(reg, start, end):
    startslot = df2[(df2['RegistrationNo']==reg) & (df2['GPSTime'].between(start, end, inclusive=True))]['location'].iloc[0]
    endslot = df2[(df2['RegistrationNo']==reg) & (df2['GPSTime'].between(start, end, inclusive=True))]['location'].iloc[-1]
    return startslot, endslot

df2.sort_values('GPSTime', ascending=True, inplace=True)
df1[['slotStartloc', 'slotEndloc']] = df1.apply(lambda x: compare(x['RegistrationNo'], x['slotStartTime'], x['slotEndTime']), axis=1, result_type='expand')

df1
    CompanyID   RegistrationNo  slotStartTime           slotEndTime      slotStartloc slotEndLoc
1   602         veh1            2020-07-27 21:12:00 2020-07-27 22:12:00  loc1         loc5
2   602         veh1            2020-07-27 21:30:00 2020-07-27 22:30:00  loc4         loc7
3   602         veh2            2020-07-28 22:16:00 2020-07-28 23:16:00  loc10        loc12

推荐阅读