python - 如果条件匹配,则从第二个数据帧中获取数据并将其附加到第一个数据帧
问题描述
我有两个名为emp_crm和verient的数据框。如果满足以下条件,则将 emp_crm 数据帧值附加到verient数据帧。
问题陈述: 如果匹配AND 等于
AND小于AND 大于then,
我想从中获取记录并附加到
,以便匹配emp_crm
附加到 Verient [所有列]。verient
emp_crm[ACD]
Verient[PbxId]
Date of AudioStart
Date of OPEN_DT
Time of OPEN_DT
Time of AudioStart
Time of AudioEnd
[condition: AudioStartTime < Ticket OpenTime < Audio EndTime]
emp_crm[ACD]
Verient[PbxId]
emp_crm's [all columns]
需要注意的地方:
任何记录都不应从verient中删除,只为匹配的记录追加,并将其他记录留空。
我的方法——
- 如果
emp_crm[ACD] == Verient[PbxId]
ANDverient[AudioStartDate] == emp_crm[TicketOpenDate]
ANDStartTime_in_Seconds < OpenTime_in_Seconds < EndTime_in_Seconds
然后
1.1。对于匹配emp_crm[ACD] == Verient[PbxId]
,
1.1.1。将 emp_crm 的 TICKET_NO、Tier_1 附加到 Verient(所有列)
1.1.2。删除不必要的列以使 verient_crm 保持简单。
DataFrame: emp_crm
ACD NID OPEN_DT OPEN_ID TICKET_NO TIER_1 OpenDate OpenTime OpenTime_in_Seconds
6557 n199 2021-05-19 20:13:00 n199 INC000047814 Voice 2021-05-19 20:13:00 72780
7768 n141 2021-04-30 19:39:00 n141 INC000047860 Data 2021-04-30 19:39:00 70740
7768 n141 2021-04-30 15:56:30 n141 INC000047853 Data 2021-04-30 15:56:30 57390
6570 n116 2021-05-19 16:00:27 n116 INC000047848 Ring 2021-05-24 16:10:00 58200
6570 n116 2021-05-19 15:55:02 n116 INC000047784 Voice 2021-05-19 15:55:02 57302
DataFrame: verient
AudioEnd AudioStart PbxId AudioStartDate StartTime AudioEndDate EndTime StartTime_in_Seconds EndTime_in_Seconds
2021-04-30 16:07:24 2021-04-30 15:35:27 6576 2021-04-30 15:35:27 2021-04-30 16:07:24 56127 58044
2021-04-30 15:51:12 2021-04-30 15:46:30 7768 2021-04-30 15:46:30 2021-04-30 15:51:12 56790 57072
2021-04-30 16:16:06 2021-04-30 15:48:22 7768 2021-04-30 15:48:22 2021-04-30 16:16:06 56902 58566
2021-05-19 16:00:47 2021-05-19 15:50:02 6570 2021-05-19 15:50:02 2021-05-19 16:00:47 57002 57647
2021-05-19 16:02:20 2021-05-19 15:53:26 6570 2021-05-19 15:53:26 2021-05-19 16:02:20 57206 57740
Final DataFrame: verient_crm
AudioEnd AudioStart PbxId ACD NID OPEN_DT OPEN_ID TICKET_NO TIER_1
2021-04-30 16:07:24 2021-04-30 15:35:27 6576
2021-04-30 15:51:12 2021-04-30 15:46:30 7768
2021-04-30 16:16:06 2021-04-30 15:48:22 7768 7768 n141 2021-04-30 15:56:30 n141 INC000047853 Data
2021-05-19 16:00:47 2021-05-19 15:50:02 6570 6570 n116 2021-05-19 16:00:27 n116 INC000047848 Ring
2021-05-19 16:02:20 2021-05-19 15:53:26 6570 6570 n116 2021-05-19 15:55:02 n116 INC000047784 Voice
Final Result after dropping unnecessary columns:
------------------------------------------------
verient_crm(AudioEnd,AudioStart,PbxId,ACD, NID,OPEN_DT, OPEN_ID, TICKET_NO, TIER_1)
我的代码和方法:
import pandas as pd
import numpy as np
matchedDict = {}
for record in emp_crm.iterrows():
for vr in verient.iterrows():
if(record == vr): ## verient.pbxLoginID == emp_CRM.ACD
if(vr.AudioStart_in_TimeStamp <= record.OPEN_DT_in_TimeStamp and record.OPEN_DT_in_TimeStamp <= vr.AudioEnd_in_TimeStamp):
matchedDict[vr.index] = record
# Append emp_crm(ACD, NID,OPEN_DT, OPEN_ID, TICKET_NO, TIER_1) with Verient for every matched ACD == PbxID from emp_crm to Verient.
for item in matchedDict:
record = item.value
verient.iloc[item.key,col_index_for_ACD] = record['ACD'] ## Don't know how to fetch col_index_for_tier1
verient.iloc[item.key,col_index_for_NID] = record['NID']
verient.iloc[item.key,col_index_for_OPEN_DT] = record['OPEN_DT']
verient.iloc[item.key,col_index_for_OPEN_ID] = record['OPEN_ID']
verient.iloc[item.key,col_index_for_TICKET_NO] = record['TICKET_NO']
verient.iloc[item.key,col_index_for_TIER_1] = record['TIER_1']
我收到以下错误,无法解决。
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-74-bf0a6bb075ee> in <module>
4 for record in emp_crm.iterrows():
5 for vr in verient.iterrows():
----> 6 if(record['ACD'] == vr['pbxLoginID']): ## verient.pbxLoginID == emp_CRM.ACD or not
7 if(vr.AudioStart_in_TimeStamp <= record.OPEN_DT_in_TimeStamp and record.OPEN_DT_in_TimeStamp <= vr.AudioEnd_in_TimeStamp):
8 matchedDict[vr.index] = record
TypeError: tuple indices must be integers or slices, not str
需要注意的地方:
- 任何记录都不应从 verient 中删除,只为 macthing 记录追加,并将其他记录留空。
解决方案
您可以合并数据帧,然后按如下方式过滤结果数据帧,而不是迭代数据帧(这是一个非常昂贵的操作) -
# Left merge
verient = pd.merge(verient, emp_crm, left_on = ['PbxId'], right_on = ['ACD'], how='left')
# Filter
verient= verient.loc[(verient.AudioStart_in_TimeStamp < verient.OPEN_DT_in_TimeStamp) & (verient.OPEN_DT_in_TimeStamp < verient.AudioEnd_in_TimeStamp)]
然后,您可以根据需要删除不必要的列。
编辑 -要保留数据框中的原始行verient
,您可以将NULL
值分配给emp_crm
条件不匹配的列。而不是filter
,这样做 -
verient.loc[~((verient.AudioStart_in_TimeStamp < verient.OPEN_DT_in_TimeStamp) & (verient.OPEN_DT_in_TimeStamp < verient.AudioEnd_in_TimeStamp)), ['ACD', 'NID', 'OPEN_DT', 'OPEN_ID', 'TICKET_NO', 'TIER_1']] = np.nan
推荐阅读
- azure - AADSTS90094:正在请求您无权授予的权限。请与您的管理员联系,他们可以授予权限
- php - Laravel 5.4:遍历数组并从每个对象中减去数量,直到数量达到 0 停止
- c# - 如何知道 Sitefinity 中的模块内容何时更改
- c# - 统一构建的控制台中显示 UriFormatException。如何解决这个问题?
- specflow - Specflow 外部步骤程序集 - 没有为一个或多个步骤找到匹配的步骤定义
- php - 将数据从文档文件加载到 CKEDITOR - PHP
- python - grpc 服务器中 ThreadPoolExecutor 的目的是什么?
- spring-boot - 在 spring-boot 中配置 http.server.requests 的分布统计
- unity3d - Unity - 带有音频源的多个预制件
- ibm-midrange - 如何保留在 Subfile 上按下 Enter 和在 Popup 上按下 F5 的值