python - Match two columns in Pandas DataFrames, but the order matters

Problem description

I have two DataFrames.

df_1:

and df_2:

My goal is to get the following result, df_result:

idx A X B Y
0   1 A 1 H
1   2 B 2 I
2   4 D 4 J
3   2 F 2 K

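I am trying to match columns A and B, based on column B in df_2.

Columns A and B repeat their contents after reaching 4. The order matters here, because the row in df_1 with idx = 4 does not match the row in df_2 with idx = 5.

I tried using: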

matching = list(set(df_1["A"]) & set(df_2["B"]))

followed by

df1_filt = df_1[df_1['A'].isin(matching)]
df2_filt = df_2[df_2['B'].isin(matching)]

But this does not take the order into account.
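
To illustrate why the set-based filter loses the ordering, here is a minimal sketch. The contents of df_1 and df_2 are not shown above, so the sample values below are assumptions made up for illustration, chosen only to be consistent with df_result.

```python
import pandas as pd

# Hypothetical sample data: the original df_1 and df_2 are not shown,
# so these values are assumptions for illustration only.
df_1 = pd.DataFrame({
    "idx": [0, 1, 2, 3, 4, 5],
    "A": [1, 2, 3, 4, 1, 2],
    "X": list("ABCDEF"),
})
df_2 = pd.DataFrame({
    "idx": [0, 1, 2, 4],
    "B": [1, 2, 4, 2],
    "Y": list("HIJK"),
})

# The set intersection only records which values occur in both columns;
# it carries no information about where those values occur.
matching = list(set(df_1["A"]) & set(df_2["B"]))

df1_filt = df_1[df_1["A"].isin(matching)]
df2_filt = df_2[df_2["B"].isin(matching)]

# The row with idx = 4 (A == 1, X == "E") survives the filter simply because
# the value 1 appears somewhere in df_2["B"], not because a nearby row matches.
print(df1_filt)
print(df2_filt)
```

In this sketch df1_filt keeps five rows while df2_filt keeps four, so the two filtered frames cannot simply be placed side by side.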

I am looking for a solution that does not need many for loops.

Edit:

df_result = (
    pd.merge_asof(
        left=df_1, right=df_2,
        left_on='idx', right_on='idx',
        left_by='A', right_by='B',
        direction='backward', tolerance=2,
    )
    .dropna()
    .drop(labels='idx', axis='columns')
    .reset_index(drop=True)
)

gives me what I want.
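
For reference, a self-contained sketch of the merge_asof call on sample data. The values in df_1 and df_2 below are hypothetical (the original tables are not shown) and were picked only so that the output lines up with df_result. With direction='backward', each row of df_1 is paired with the last row of df_2 whose idx is at or before its own, no more than tolerance=2 away, and whose B equals that row's A. Both frames have to be sorted by idx.

```python
import pandas as pd

# Hypothetical sample data: assumptions for illustration only.
df_1 = pd.DataFrame({
    "idx": [0, 1, 2, 3, 4, 5],
    "A": [1, 2, 3, 4, 1, 2],
    "X": list("ABCDEF"),
})
df_2 = pd.DataFrame({
    "idx": [0, 1, 2, 4],
    "B": [1, 2, 4, 2],
    "Y": list("HIJK"),
})

# Ordered, nearest-key join: match on idx (looking backward, at most 2 apart)
# and additionally require A == B.
merged = pd.merge_asof(
    left=df_1, right=df_2,
    left_on="idx", right_on="idx",
    left_by="A", right_by="B",
    direction="backward", tolerance=2,
)

# Rows of df_1 without a partner carry NaN in the df_2 columns; drop them,
# then drop the helper idx column and renumber the rows.
df_result = (
    merged.dropna()
    .drop(labels="idx", axis="columns")
    .reset_index(drop=True)
)
print(df_result)
```

On this sample data the rows with X equal to C and E are dropped: no row of df_2 has B == 3, and the only B == 1 row lies more than 2 positions before idx = 4. Column B may come back as floats, because unmatched rows hold NaN before dropna is applied.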

Tags: python, pandas
