Compare two columns in Python and return matches in the first table

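Problem description

I need help with a loop that compares two columns from different tables and returns the matches to the first table.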

data1:
|name   | revenue |
|-------|---------|
|Alice  | 700     |
|Bob    | 1000    |
|Gerry  | 300     |
|Alex   | 600     |
|Kyle   | 800     |
data2:
|Name   | revenue |
|-------|---------|
|Bob    | 900     |
|Gerry  | 400     |
result data1:
|name   | revenue  |  name_result |
|-------|----------|--------------|
|Alice  | 700      |              |
|Bob    | 1000     |  Bob         |
|Gerry  | 300      |  Gerry       |
|Alex   | 600      |              |
|Kyle   | 800      |              |

I tried this code, but got only empty values:

import pandas as pd
import numpy as np

def group_category(category):
    for name in data['name']: 
        if name in data2['Name']:
            return name
        else: name = ''
        return name 
data['name_result'] = data['name'].apply(group_category)

Tags: python, compare, cycle

Solution


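The original attempt returns only empty strings for two reasons: the function loops over the whole column instead of using its argument, and `name in data2['Name']` tests the index labels of the Series rather than its values. With df1 and df2 standing for the question's data1 and data2, use: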

def group_category(category):
    # Return the name if it appears among the values of df2['Name'], else an empty string
    if category in df2['Name'].unique():
        return category
    else:
        return ''

# Finally, apply the function to each element.
# Because it runs on a Series, map() is used in place of apply().
df1['name_result'] = df1['name'].map(group_category)
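
As a minimal sketch of why the membership test matters, the snippet below contrasts `in` on a Series with `in` on its values. The Series `names` is a hypothetical stand-in for `df2['Name']`, built with a default integer index, which is an assumption about how the question's data was loaded.

```python
import pandas as pd

# Hypothetical stand-in for df2['Name'], with the default index 0, 1
names = pd.Series(["Bob", "Gerry"], name="Name")

# `in` on a Series looks at the index labels, not the stored values
label_hit = "Bob" in names   # "Bob" is not an index label
index_hit = 0 in names       # 0 is an index label

# Testing against the values finds the name
value_hit = "Bob" in names.values
unique_hit = "Bob" in names.unique()

print(label_hit, index_hit, bool(value_hit), bool(unique_hit))
```

This is why the function above checks against `df2['Name'].unique()` rather than the Series itself.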

Or

Using isin() and where():

df1['name_result'] = df1['name'].where(df1['name'].isin(df2['Name']), '')
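
For a self-contained check, the sketch below rebuilds the two tables from the question as DataFrames and applies the `isin()` / `where()` approach. The intermediate variable `matches` is introduced only for illustration, and the frames are assumed to be named df1 and df2 as in the answer.

```python
import pandas as pd

# Sample tables from the question
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Gerry", "Alex", "Kyle"],
    "revenue": [700, 1000, 300, 600, 800],
})
df2 = pd.DataFrame({
    "Name": ["Bob", "Gerry"],
    "revenue": [900, 400],
})

# Boolean mask: True where df1's name also appears in df2['Name']
matches = df1["name"].isin(df2["Name"])

# Keep the name where the mask is True, otherwise fill with an empty string
df1["name_result"] = df1["name"].where(matches, "")

print(df1)
```

With this data, `name_result` holds Bob and Gerry in the matching rows and an empty string elsewhere, as in the expected result table. This version avoids a Python-level function call per row, which tends to matter on larger tables.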
