Look up values from one DataFrame in another DataFrame and return information from the corresponding row / a different column

Problem description

I have 2 DataFrames:

df_Billed = pd.DataFrame({'Bill_Number': [220119, 220120, 220219, 220219, 220419, 220519, 220619, 221219], 'Date': ['1/31/2019', '2/20/2020', '2/28/2019', '6/30/2019', '6/30/2019', '6/30/2019', '6/30/2019', '12/31/2019'], 'Amount': [3312.5, 832.0, 10000.0, -3312.5, 8725.0, 1862.5, 3637.5, 1587.5]})

df_Received = pd.DataFrame({'Bill_Number': [220119, 220219, 220419, 220519, 220619], 'Date': ['4/16/2019', '5/21/2019', '8/2/2019', '8/2/2019', '8/2/2019'], 'Amount': [3312.5, 6687.5, 8725, 1862.5, 3637.5]})

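I am trying to look up each "Bill_Number" in df_Billed to see whether it exists in df_Received. Ideally, if it does, I want to compute the difference between the df_Billed date and the df_Received date for that bill number (to see how many days it took to get paid). If the bill number does not exist in df_Received, I just want to return all rows for that bill number from df_Billed.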

EX: Since df_Billed Bill_Number 220119 is in df_Received, it would return 75 (which is the number of days it took for the bill to be paid 4/16/2019 - 1/31/2019). 

EX: Since df_Billed Bill_Number 221219 is not in df_Received, it would return 12/31/2019 (which is the date it was billed). 

Tags: python, pandas, dataframe

Solution


You can start with a left merge on Bill_Number, which keeps every row of df_Billed:

df_Billed=df_Billed.merge(df_Received,on='Bill_Number',how='left')
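To make the column names used in the next step easier to follow, here is a minimal sketch of what the left merge produces. The frames `billed` and `received` are a two-row subset of the question's data, chosen only for illustration.

```python
import pandas as pd

# Illustrative subset of the question's data (names are hypothetical).
billed = pd.DataFrame({
    "Bill_Number": [220119, 221219],
    "Date": ["1/31/2019", "12/31/2019"],
    "Amount": [3312.5, 1587.5],
})
received = pd.DataFrame({
    "Bill_Number": [220119],
    "Date": ["4/16/2019"],
    "Amount": [3312.5],
})

# how="left" keeps every billed row. Columns that exist in both frames
# get the default suffixes: _x for the left frame, _y for the right one.
merged = billed.merge(received, on="Bill_Number", how="left")

print(merged.columns.tolist())
# ['Bill_Number', 'Date_x', 'Amount_x', 'Date_y', 'Amount_y']

# Bills with no match in `received` have missing values in the _y columns.
print(merged["Date_y"].isna().tolist())
# [False, True]
```

So Date_x is the billing date, Date_y is the payment date, and a missing Date_y marks a bill that has not been paid.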

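Then use apply together with pandas.to_datetime to compute the difference between the dates; rows with no payment date keep their billing date: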

df_Billed['result']=df_Billed.apply(lambda x:x.Date_x if pd.isnull(x.Date_y) 
                    else abs(pd.to_datetime(x.Date_x)-pd.to_datetime(x.Date_y)).days, 
                    axis=1)
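The row-wise apply above works, but on larger frames a column-wise version is usually faster. The following is only a sketch of that alternative, not part of the original answer. It assumes the dates are strings in month/day/year form, and the names `billed_on`, `paid_on` and `days_to_pay` are made up for the example. Unlike the answer's code it does not take the absolute value, so a payment dated before the bill would come out negative.

```python
import pandas as pd

# Stand-in for the merged frame (only the columns needed here).
merged = pd.DataFrame({
    "Bill_Number": [220119, 221219],
    "Date_x": ["1/31/2019", "12/31/2019"],  # billing date
    "Date_y": ["4/16/2019", None],          # payment date, missing if unpaid
})

# Parse both columns once; missing payment dates become NaT.
billed_on = pd.to_datetime(merged["Date_x"], format="%m/%d/%Y")
paid_on = pd.to_datetime(merged["Date_y"], format="%m/%d/%Y")

# Days between billing and payment, as nullable integers (<NA> if unpaid).
days_to_pay = (paid_on - billed_on).dt.days.astype("Int64")

# Same mixed column as the apply version: days if paid, else billing date.
merged["result"] = days_to_pay.astype(object).where(
    paid_on.notna(), merged["Date_x"]
)
print(merged["result"].tolist())
# [75, '12/31/2019']
```

Keeping `days_to_pay` as its own numeric column may be more convenient than the mixed `result` column if you later want to average or sort the payment delays.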

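Finally, I assume you want the final result in a new column next to the original billing data. So below I drop the merged columns Date_y and Amount_y and rename Date_x and Amount_x back to Date and Amount: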

df_Billed.drop(['Date_y','Amount_y'],axis=1,inplace=True)
df_Billed.rename(columns={"Date_x": "Date","Amount_x":"Amount"},inplace=True)

Final DataFrame:

(image from the original answer, not reproduced here)
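As a sketch of what the steps above produce, the following runs them end to end on the question's data. It assumes the dates are entered as quoted month/day/year strings. The variable names `merged` and `final` are introduced here so that nothing is modified in place.

```python
import pandas as pd

df_Billed = pd.DataFrame({
    "Bill_Number": [220119, 220120, 220219, 220219,
                    220419, 220519, 220619, 221219],
    "Date": ["1/31/2019", "2/20/2020", "2/28/2019", "6/30/2019",
             "6/30/2019", "6/30/2019", "6/30/2019", "12/31/2019"],
    "Amount": [3312.5, 832.0, 10000.0, -3312.5,
               8725.0, 1862.5, 3637.5, 1587.5],
})
df_Received = pd.DataFrame({
    "Bill_Number": [220119, 220219, 220419, 220519, 220619],
    "Date": ["4/16/2019", "5/21/2019", "8/2/2019", "8/2/2019", "8/2/2019"],
    "Amount": [3312.5, 6687.5, 8725, 1862.5, 3637.5],
})

# Step 1: left merge keeps every billed row.
merged = df_Billed.merge(df_Received, on="Bill_Number", how="left")

# Step 2: days between the two dates if paid, else the billing date.
merged["result"] = merged.apply(
    lambda x: x.Date_x if pd.isnull(x.Date_y)
    else abs(pd.to_datetime(x.Date_x) - pd.to_datetime(x.Date_y)).days,
    axis=1,
)

# Step 3: drop the payment columns and restore the original names.
final = (
    merged.drop(columns=["Date_y", "Amount_y"])
          .rename(columns={"Date_x": "Date", "Amount_x": "Amount"})
)

print(final["result"].tolist())
# [75, '2/20/2020', 82, 40, 33, 33, 33, '12/31/2019']
```

Bill 220219 appears twice in df_Billed, so both of its rows are matched against the single payment dated 5/21/2019. The second row (billed 6/30/2019) gives 40 only because of the abs() call, since that payment date precedes the billing date.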

