Check whether a DataFrame cell contains a value from a cell in another DataFrame

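Problem description

I am trying to do the following:

For each row in df1, if str(row['code']) is contained in df2['code'] for any row of df2, I want df2['lamer_url_1'] and df2['shopee_url_1'] in all of those matching rows to take the corresponding values from df1. Then move on to the next row of df1['code'], and so on.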

'''
===============
Initial tables:

df1
         code lamer_url_1 shopee_url_1
0  L61B18H089           b            a
1  L61S19H014           e            d
2  L61S19H015           z            y

df2
               code  lamer_url_1  shopee_url_1  lamer_url_2  shopee_url_2
0  L61B18H089-F1424          NaN           NaN          NaN           NaN
1  L61S19H014-S1500          NaN           NaN          NaN           NaN
2  L61B18H089-F1424          NaN           NaN          NaN           NaN

===============
Expected output:

df2
               code lamer_url_1 shopee_url_1  lamer_url_2  shopee_url_2
0  L61B18H089-F1424           b            a          NaN           NaN
1  L61S19H014-S1500           e            d          NaN           NaN
2  L61B18H089-F1424           b            a          NaN           NaN
'''

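Tags: python, pandas, dataframe

Solution


I assume that the shared part of 'code' in df2 is the characters before the '-'. I also assume that we want 'lamer_url_1' and 'shopee_url_1' from df1, and 'lamer_url_2' and 'shopee_url_2' from df2 (if I am wrong, please correct me in the comments so I can refine the code):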

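import pandas as pd

# Index df1 by its code so it can be joined on the index.
df1.set_index(df1['code'], inplace=True)
# Index df2 by the part of its code before the first '-'.
df2.set_index(df2['code'].apply(lambda x: x.split('-')[0]), inplace=True)
df2.index.names = ['code_join']

# Inner join on the indexes: rows of df2 with no match in df1 are dropped.
df3 = pd.merge(df2[['code', 'lamer_url_2', 'shopee_url_2']],
               df1[['lamer_url_1', 'shopee_url_1']],
               left_index=True, right_index=True)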
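
As a complement to the merge above, here is a minimal self-contained sketch that rebuilds the sample tables from the question and fills the existing columns of df2 in place with Series.map, so the column order of the expected output is kept. It rests on the same assumption (the shared key is the part of df2['code'] before the first '-') and additionally assumes that the values in df1['code'] are unique. The names key and lookup are illustrative only and do not come from the original answer.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "code": ["L61B18H089", "L61S19H014", "L61S19H015"],
    "lamer_url_1": ["b", "e", "z"],
    "shopee_url_1": ["a", "d", "y"],
})
df2 = pd.DataFrame({
    "code": ["L61B18H089-F1424", "L61S19H014-S1500", "L61B18H089-F1424"],
    "lamer_url_1": float("nan"),
    "shopee_url_1": float("nan"),
    "lamer_url_2": float("nan"),
    "shopee_url_2": float("nan"),
})

# Assumption: the join key is everything before the first "-".
key = df2["code"].str.split("-").str[0]

# Look up each key in df1; keys with no match in df1 stay NaN.
# Assumption: df1["code"] is unique, otherwise map() raises an error.
lookup = df1.set_index("code")
for col in ["lamer_url_1", "shopee_url_1"]:
    df2[col] = key.map(lookup[col])

print(df2)
```

Unlike the inner merge, this variant keeps every row of df2, including rows whose code prefix does not appear in df1; those rows simply keep NaN in the two filled columns.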
