Match lookup raises an error: Can only compare identically-labeled Series objects

Problem description

I have the following:

df1['Combined'] = ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C']
df1['Quantity'] = [0, 60, 75, 149, 205, 500, 250, 300, 500, 40, 45, 75, 80]

df2['Combined'] = ['A', 'A', 'A', 'A', 'B', 'B','B','B', 'C', 'C', 'C']
df2['Min Q'] = [0, 50, 100, 150, 100, 0, 300, 400, 5, 50, 100] 
df2['Max Q'] = [49, 99, 149, 199, 199, 299, 399, 499, 60, 100, 149]

I want to add a column to df1 that returns the matching range from df2. Here is what I tried:

To compute df2['Range']:

df2['Range'] = df2['Min Q'].astype(float).astype(str) + ' - ' + df2['Max Q'].astype(float).astype(str)

To look up df1['Range']:

def lookup_Range(Range):
    match = (df2['Min Q'].astype(float) <= df1['Quantity'].astype(float)) & (df2['Max Q'].astype(float) >= df1['Quantity'].astype(float)) & (df1['Combined'] == df2['Combined'])
    Range = df2['Range'][match]
    return Range.values[0]

df1['Quantity'].apply(lookup_Range)

But I get the following error:

Can only compare identically-labeled Series objects. 

I'm not sure what I'm doing wrong. The column values repeat, but I expected to get a unique match in each case. Thanks for your help.

Tags: pandas, match

Solution
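
On the error itself: inside lookup_Range the argument is never used, so each call compares the whole column df2['Min Q'] (11 rows) against the whole column df1['Quantity'] (13 rows). pandas refuses to compare two Series whose indexes differ. A minimal reproduction, using made-up Series a and b rather than the question's data:

```python
import pandas as pd

a = pd.Series([1, 2, 3])       # index 0..2
b = pd.Series([1, 2, 3, 4])    # index 0..3, so the labels differ from a

message = None
try:
    a <= b                     # element-wise comparison needs identical labels
except ValueError as exc:
    message = str(exc)

print(message)
```

Comparing a scalar against a Series (for example the single quantity passed into the function) does not have this restriction.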


IIUC, you need:

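import numpy as np
import pandas as pd
# Upper bounds of the ranges; np.searchsorted assumes they are sorted ascending
bins = df2['Max Q'].tolist()
#[49, 99, 149, 199, 199, 299, 399, 499]
# Position of the first upper bound >= each quantity, mapped to that row's Range label
df1['bins'] = pd.Series(np.searchsorted(bins, df1['Quantity'].values)).map(df2['Range'].to_dict())
print(df1)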

  Combined  Quantity     bins
0        A         0     0-49
1        A        60    50-99
2        A        75    50-99
3        A       149  100-149
4        A       205    0-299
5        B       500      NaN
6        B       250    0-299
7        B       300  300-399
8        B       500      NaN
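
Note that the lookup above goes by Max Q alone and does not use Combined, and its printed output appears to come from shorter sample data than the question shows. If the match also has to respect Combined, one possible alternative is to merge on Combined and then filter by the bounds. This is only a sketch using the question's data; the names pairs, hits and first are made up for illustration, and keeping the first match when ranges overlap is an arbitrary choice.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Combined': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
    'Quantity': [0, 60, 75, 149, 205, 500, 250, 300, 500, 40, 45, 75, 80],
})
df2 = pd.DataFrame({
    'Combined': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'],
    'Min Q': [0, 50, 100, 150, 100, 0, 300, 400, 5, 50, 100],
    'Max Q': [49, 99, 149, 199, 199, 299, 399, 499, 60, 100, 149],
})
df2['Range'] = (df2['Min Q'].astype(float).astype(str) + ' - '
                + df2['Max Q'].astype(float).astype(str))

# Pair every df1 row with all df2 rows that share the same 'Combined' key;
# reset_index() keeps the original df1 row label in a column named 'index'.
pairs = df1.reset_index().merge(df2, on='Combined', how='left')

# Keep only the pairs whose quantity lies inside [Min Q, Max Q] (inclusive).
hits = pairs[pairs['Quantity'].between(pairs['Min Q'], pairs['Max Q'])]

# If several ranges match one row, keep the first (arbitrary choice).
first = hits.drop_duplicates('index').set_index('index')['Range']

# Assignment aligns on the index, so rows without a match become NaN.
df1['Range'] = first
print(df1)
```

The merge builds every (df1 row, df2 row) pair per key, so it can get large when a key has many ranges; for data of this size that should not matter.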
