How do I merge two rows in the same DataFrame (pandas)?

Problem description

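I have a dataset with these columns: (Order #, Waybill #, Order date, Scheduled for, Type, Delivery Fee, Cash collection, Worker, Dispatched, Completed, Assigned On, Status)

Each order is represented by two rows, the first with Type PICKUP and the second with Type DELIVERY. Both rows share the same Order # and several other columns, for example: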

   Unnamed: 0     Order #  Waybill #             Order date  \
0           0  9920000150        NaN  01 Aug, 2019 12:30 PM   
1           1  9920000150        NaN  01 Aug, 2019 12:30 PM   

           Scheduled for      Type  Delivery Fee  Cash collection   Worker  \
0  01 Aug, 2019 03:00 PM    PICKUP           NaN              NaN  Driver1   
1  01 Aug, 2019 03:00 PM  DELIVERY           NaN            135.0  Driver1   

              Dispatched              Completed            Assigned On  \
0  01 Aug, 2019 01:49 PM  01 Aug, 2019 01:51 PM  01 Aug, 2019 01:42 PM   
1  01 Aug, 2019 01:55 PM  01 Aug, 2019 02:08 PM  01 Aug, 2019 01:42 PM   

      Status  
0  Completed  
1  Completed  

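I want to merge the two rows into a single row, so that one row has the columns: [Order #, Waybill #, Order date, Scheduled for, Delivery Fee, Cash collection, Worker, Dispatched_pickup, Completed_pickup, Assigned On_pickup, Status_pickup, Dispatched_delivery, Completed_delivery, Assigned On_delivery, Status_delivery]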

I tried the following, but it does not work: df1 = df.assign(cid = df.groupby(['Order #', 'Waybill #', 'Order date' , 'Scheduled for']).cumcount()).set_index(['Order #', 'cid']).unstack(-1).sort_index(1,1)

Tags: python, pandas, data-analysis

Solution


Here is a simple example that you can extend to more columns. I renamed the columns so that they do not collide.

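# Assumes df is the DataFrame shown in the question.
# Split the rows by type (the column is named 'Type' in the data).
pickup_df = df[df['Type'] == "PICKUP"]
delivery_df = df[df['Type'] == "DELIVERY"]

# Keep only the columns of interest; extend these lists as needed.
pickup_df = pickup_df[['Order #', 'Waybill #', 'Order date']]
delivery_df = delivery_df[['Order #', 'Waybill #', 'Order date']]

# Rename the non-key columns so they do not collide after the merge.
pickup_df = pickup_df.rename(columns={'Waybill #': 'Pickup Waybill', 'Order date': 'Pickup Orderdate'})
delivery_df = delivery_df.rename(columns={'Waybill #': 'Delivery Waybill', 'Order date': 'Delivery Orderdate'})

# One row per order: pickup columns on the left, delivery columns on the right.
combined_df = pickup_df.merge(delivery_df, on='Order #', how='left')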
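
To get the `_pickup` / `_delivery` column names the question asks for, one option is to let `merge` add the suffixes instead of renaming by hand. The sketch below is only an illustration under a few assumptions: the toy frame is an abbreviated copy of the sample rows, the `per_leg` list of columns that differ between the two rows is my guess, and taking `Cash collection` from the DELIVERY row is a choice based on the sample (where only that row has a value).

```python
import pandas as pd

# Toy data modeled on the question's sample (abbreviated; not the full column set).
df = pd.DataFrame({
    "Order #": [9920000150, 9920000150],
    "Order date": ["01 Aug, 2019 12:30 PM"] * 2,
    "Type": ["PICKUP", "DELIVERY"],
    "Cash collection": [float("nan"), 135.0],
    "Worker": ["Driver1", "Driver1"],
    "Dispatched": ["01 Aug, 2019 01:49 PM", "01 Aug, 2019 01:55 PM"],
    "Completed": ["01 Aug, 2019 01:51 PM", "01 Aug, 2019 02:08 PM"],
    "Status": ["Completed", "Completed"],
})

# Assumption: these are the columns that differ between the two rows of an order.
per_leg = ["Dispatched", "Completed", "Status"]

# The pickup row supplies the shared columns (Order date, Worker, ...).
pickup = df[df["Type"] == "PICKUP"].drop(columns=["Type", "Cash collection"])

# The delivery row supplies its per-leg columns plus Cash collection (assumption).
delivery = df[df["Type"] == "DELIVERY"][["Order #", "Cash collection"] + per_leg]

# Columns present on both sides (other than the key) receive the suffixes.
combined = pickup.merge(
    delivery, on="Order #", how="left", suffixes=("_pickup", "_delivery")
)

print(combined.columns.tolist())
```

Columns that exist on only one side, such as `Worker` or `Cash collection` here, keep their original names; only the overlapping ones are suffixed. If an order can have more than one PICKUP or DELIVERY row, the merge will produce one row per combination, so this approach relies on exactly one row of each type per order.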
