首页 > 解决方案 > 通过在python中迭代来条件匹配行

问题描述

我有以下 dfs。

df1

Date          Dollar  EURO  GBP
12/03/2019    100      80   90
12/02/2019    101      81   89
12/01/2019    1000     79   91

df2

Product   Currency  Rate   Date
ABC        EURO      80    12/03/2019
xyz        USD       105   11/30/2019
ert        GBP       90    11/29/2019

基本上我想要的是在 df2 中放置一个新列(具有该货币和日期的汇率),并在 df1 和 df2 的汇率是否匹配时进一步添加一列。

Product   Currency  Rate   Date          df1.rate     Check
ABC        EURO      80    12/03/2019    80           Match
xyz        USD       105   11/30/2019    N/A          Not Match
ert        GBP       90    11/29/2019    N/A          Not Match 

我努力了 。

USD = df2['Currency'] == "US $"
GBP = df2['Currency'] == "GBP"
EURO = df2['Currency'] == "EURO"

if USD:
    df2['Check'] = df2['rate'] == df1['Dollar']
elif GBP:
    df2['Check'] = df2['rate'] == df1['GBP']
elif EURO:
    df2['Check'] = df2['rate'] == df1['EURO']

它在第 1 行给出以下错误。

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

请提出补救措施。

标签: pythonpandasdataframe

解决方案


DataFrame.meltDataFrame.merge左连接和参数一起使用indicator,最后设置值由numpy.where

df = df1.melt('Date', var_name='Currency', value_name='Rate')

df2 = df2.merge(df, how='left', indicator='Check')
mask = df2['Check'].eq('both')
df2['Check'] = np.where(mask, 'Match','Not Match')
print (df2)
  Product Currency  Rate        Date      Check
0     ABC     EURO    80  12/03/2019      Match
1     xyz      USD   105  11/30/2019  Not Match
2     ert      GBP    90  11/29/2019  Not Match

如果需要还列df1.rate添加DataFrame.insert

df2.insert(len(df2.columns)-1, 'df1.rate', df2['Rate'].where(mask))
print (df2)
  Product Currency  Rate        Date  df1.rate      Check
0     ABC     EURO    80  12/03/2019      80.0      Match
1     xyz      USD   105  11/30/2019       NaN  Not Match
2     ert      GBP    90  11/29/2019       NaN  Not Match

详细说明:重塑相同格式数据的值,df1例如:df2melt

print (df1.melt('Date', var_name='Currency', value_name='Rate'))
         Date Currency  Rate
0  12/03/2019   Dollar   100
1  12/02/2019   Dollar   101
2  12/01/2019   Dollar  1000
3  12/03/2019     EURO    80
4  12/02/2019     EURO    81
5  12/01/2019     EURO    79
6  12/03/2019      GBP    90
7  12/02/2019      GBP    89

然后 merge使用左连接,指标参数创建带有信息的新列,如果在两个或左 DataFrame 中匹配:

print (df2.merge(df, how='left', indicator='Check'))
  Product Currency  Rate        Date      Check
0     ABC     EURO    80  12/03/2019       both
1     xyz      USD   105  11/30/2019  left_only
2     ert      GBP    90  11/29/2019  left_only

最后将值替换为mask

df2['Check'] = np.where(mask, 'Match','Not Match')
print (df2)
  Product Currency  Rate        Date      Check
0     ABC     EURO    80  12/03/2019      Match
1     xyz      USD   105  11/30/2019  Not Match
2     ert      GBP    90  11/29/2019  Not Match

推荐阅读