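python - How to store the results of iterating over Pandas DataFrame rows in a new column?
Problem description
I am new to coding in Python. Currently, I am trying to analyse a dataframe that contains multiple workflows. Each workflow has different process steps for initiating and ending the workflow. In a simplified version, my data looks like this: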
   Workflow Initiate   End_1   End_2   End_3
0         1   Name_1      na  Name_1      na
1         2   Name_2      na      na      na
2         3   Name_3      na      na  Name_5
3         4   Name_4  Name_5      na      na
4         5       na      na      na  Name_5
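For each workflow, I want to check whether the name that ended the workflow differs from the name that initiated it.
Iterating over the rows in the following way gives me the desired output: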
for index, row in df.iterrows():
    if ((row['Initiate'] != 'na')
        and ((row['Initiate'] == row['End_1']) or
             (row['Initiate'] == row['End_2']) or
             (row['Initiate'] == row['End_3']))
    ):
        print("Name end equals initiate")
    elif ((row['End_1'] == 'na') and
          (row['End_2'] == 'na') and
          (row['End_3'] == 'na')
    ):
        print("No name ended")
    else:
        print("Different name ended")
Name end equals initiate
No name ended
Different name ended
Different name ended
Different name ended
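However, I would like to add an extra column to the dataframe, e.g. "Analysis", that shows the above result for each workflow.
To do so, I wrapped the code in a function: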
def function_name(a, b, c, d):
    for index, row in df.iterrows():
        if ((a != 'na')
            and ((a == b) or
                 (a == c) or
                 (a == d))
        ):
            return "Name end equals initiate"
        elif ((b == 'na') and
              (c == 'na') and
              (d == 'na')
        ):
            return "No name ended"
        else:
            return "Different name ended"

df['Analysis'] = function_name(row['Initiate'],
                               row['End_1'],
                               row['End_2'],
                               row['End_3'])
print(df)
   Workflow Initiate  ...   End_3              Analysis
0         1   Name_1  ...      na  Different name ended
1         2   Name_2  ...      na  Different name ended
2         3   Name_3  ...  Name_5  Different name ended
3         4   Name_4  ...      na  Different name ended
4         5       na  ...  Name_5  Different name ended
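As you can see, the output differs from the first analysis. I would like to add an extra column to my dataframe that gives the same output as the print statements.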
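A note on why the attempt above labels every row identically: `function_name` is called only once, using the `row` variable left over from the earlier loop (the last row), and its `return` exits on the first pass of the inner loop. The single returned string is then broadcast to the whole column. A minimal sketch of a row-wise alternative, assuming the sample data shown earlier; `classify_row` is a hypothetical helper name, not part of the original post:

```python
import pandas as pd

# Sample data from the question; 'na' is a plain string placeholder.
df = pd.DataFrame({
    'Workflow': [1, 2, 3, 4, 5],
    'Initiate': ['Name_1', 'Name_2', 'Name_3', 'Name_4', 'na'],
    'End_1': ['na', 'na', 'na', 'Name_5', 'na'],
    'End_2': ['Name_1', 'na', 'na', 'na', 'na'],
    'End_3': ['na', 'na', 'Name_5', 'na', 'Name_5'],
})


def classify_row(row):
    """Hypothetical helper: label one row with the same rules as the loop."""
    ends = [row['End_1'], row['End_2'], row['End_3']]
    if row['Initiate'] != 'na' and row['Initiate'] in ends:
        return "Name end equals initiate"
    if all(end == 'na' for end in ends):
        return "No name ended"
    return "Different name ended"


# axis=1 passes each row to the helper; the returned labels form the new column.
df['Analysis'] = df.apply(classify_row, axis=1)
print(df)
```

This stays close to the original loop and is easy to read, but `apply` with `axis=1` still runs Python code once per row, so it can be slow on large dataframes.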
Solution
You should avoid row-wise loops here. Your algorithm is vectorisable:
import numpy as np

df = df.replace('na', np.nan)  # replace the string 'na' with NaN for efficient processing
ends = df.filter(like='End')  # select the columns whose names contain 'End'
# take the last non-null End value in each row and compare it with Initiate
match = ends.ffill(axis=1).iloc[:, -1] == df['Initiate']
nulls = ends.isnull().all(axis=1)  # flag rows where every End column is null
# apply vectorised conditional logic
df['Result'] = np.select([match, nulls], ['Name end equals initiate', 'No name ended'],
                         'Different name ended')
print(df)
   Workflow Initiate   End_1   End_2   End_3                    Result
0         1   Name_1     NaN  Name_1     NaN  Name end equals initiate
1         2   Name_2     NaN     NaN     NaN             No name ended
2         3   Name_3     NaN     NaN  Name_5      Different name ended
3         4   Name_4  Name_5     NaN     NaN      Different name ended
4         5      NaN     NaN     NaN  Name_5      Different name ended
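One caveat: the `match` line in the solution compares only the last non-null `End_*` value with `Initiate`, whereas the loop in the question accepts a match in any `End_*` column. The two agree on the sample data but can differ when a row has more than one ended name. A sketch of an any-column comparison follows, assuming the same column layout. The sixth row and the names `match_any` and `match_last` are hypothetical additions for illustration only:

```python
import numpy as np
import pandas as pd

# Sample data with NaN already in place of 'na'.
# Row index 5 is a hypothetical extra row with two different ended names.
df = pd.DataFrame({
    'Workflow': [1, 2, 3, 4, 5, 6],
    'Initiate': ['Name_1', 'Name_2', 'Name_3', 'Name_4', np.nan, 'Name_6'],
    'End_1': [np.nan, np.nan, np.nan, 'Name_5', np.nan, 'Name_6'],
    'End_2': ['Name_1', np.nan, np.nan, np.nan, np.nan, np.nan],
    'End_3': [np.nan, np.nan, 'Name_5', np.nan, 'Name_5', 'Name_7'],
})

ends = df.filter(like='End')

# Compare every End column with Initiate row by row; NaN never equals anything.
match_any = ends.eq(df['Initiate'], axis=0).any(axis=1)

# For comparison: the last-non-null check used in the solution.
match_last = ends.ffill(axis=1).iloc[:, -1] == df['Initiate']

nulls = ends.isnull().all(axis=1)

df['Result'] = np.select(
    [match_any, nulls],
    ['Name end equals initiate', 'No name ended'],
    default='Different name ended',
)
print(df)
print(match_any.ne(match_last))  # True where the two checks disagree
```

Which check is appropriate depends on whether only the final ending step matters or any ending step counts; the question's loop implies the latter.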