首页 > 解决方案 > 使用 pandas 对数据框进行按列映射和操作

问题描述

我有两个数据框,即df1df2。我想对New_Amount_Dollar来自的列执行操作df2。基本上,df1我有历史货币数据,我想执行给定的按日期操作,CurrencyAmount_Dollardf2计算New_Amount_Dollar列中的值df2

例如,在df2我有第一种货币AUDfor Date = '01-01-2019',所以我想计算New_Amount_Dollar价值这样

New_Amount_Dollar=Amount_Dollar/AUD来自df1 ie的值New_Amount_Dollar = 19298/98 = 196.91

另一个例子,df2我有第三种货币COPDate = '03-01-2019所以我想计算New_Amount_Dollar价值,这样

New_Amount_Dollar = Amount_Dollar/COP来自df1 ie的值New_Amount_Dollar = 5000/0.043 = 116279.06

import pandas as pd
data1 = {'Date':['01-01-2019', '02-01-2019', '03-01-2019', '04-01-2019','05-01-2019'],
        'AUD':[98, 98.5, 99, 99.5, 97],
        'BWP':[30,31,33,32,31],
        'CAD':[0.02,0.0192,0.0196,0.0196,0.0192],
        'BND':[0.99,0.952,0.970,0.980,0.970],
        'COP':[0.05,0.047,0.043,0.047,0.045]}
df1 = pd.DataFrame(data1)

data2 = {'Date':['01-01-2019', '02-01-2019', '03-01-2019', '04-01-2019','05-01-2019'],
        'Currency':['AUD','AUD','COP','CAD','BND'],
        'Amount_Dollar':[19298, 19210, 5000, 200, 2300],
        'New_Amount_Dollar':[0,0,0,0,0]
        }
df2 = pd.DataFrame(data2) 

df1

         Date   AUD  BWP     CAD    BND    COP
0  01-01-2019  98.0   30  0.0200  0.990  0.050
1  02-01-2019  98.5   31  0.0192  0.952  0.047
2  03-01-2019  99.0   33  0.0196  0.970  0.043
3  04-01-2019  99.5   32  0.0196  0.980  0.047
4  05-01-2019  97.0   31  0.0192  0.970  0.045
df2

         Date Currency  Amount_Dollar  New_Amount_Dollar
0  01-01-2019      AUD          19298                  0
1  02-01-2019      AUD          19210                  0
2  03-01-2019      COP           5000                  0
3  04-01-2019      CAD            200                  0
4  05-01-2019      BND           2300                  0

预期结果

         Date Currency  Amount_Dollar  New_Amount_Dollar
0  01-01-2019      AUD          19298             196.91
1  02-01-2019      AUD          19210             195.02
2  03-01-2019      COP           5000          116279.06
3  04-01-2019      CAD            200           10204.08
4  05-01-2019      BND           2300            2371.13

标签: pythonpandas

解决方案


用于数组和除DataFrame.lookup列:DataFrame.set_indexAmount_Dollar

arr = df1.set_index('Date').lookup(df2['Date'], df2['Currency'])

df2['New_Amount_Dollar'] = df2['Amount_Dollar'] /  arr
print (df2)
         Date Currency  Amount_Dollar  New_Amount_Dollar
0  01-01-2019      AUD          19298         196.918367
1  02-01-2019      AUD          19210         195.025381
2  03-01-2019      COP           5000      116279.069767
3  04-01-2019      CAD            200       10204.081633
4  05-01-2019      BND           2300        2371.134021

但如果日期时间不匹配,请使用DataFrame.asfreq

import pandas as pd
data1 = {'Date':['01-01-2019', '02-01-2019', '03-01-2019', 
                 '04-01-2019','05-01-2019','08-01-2019'],
        'AUD':[98, 98.5, 99, 99.5, 97,100],
        'BWP':[30,31,33,32,31,20],
        'CAD':[0.02,0.0192,0.0196,0.0196,0.0192,0.2],
        'BND':[0.99,0.952,0.970,0.980,0.970,.23],
        'COP':[0.05,0.047,0.043,0.047,0.045,0.023]}
df1 = pd.DataFrame(data1)

data2 = {'Date':['01-01-2019', '02-01-2019', '03-01-2019', '04-01-2019','07-01-2019'],
        'Currency':['AUD','AUD','COP','CAD','BND'],
        'Amount_Dollar':[19298, 19210, 5000, 200, 2300],
        'New_Amount_Dollar':[0,0,0,0,0]
        }
df2 = pd.DataFrame(data2) 
print (df1)
         Date    AUD  BWP     CAD    BND    COP
0  01-01-2019   98.0   30  0.0200  0.990  0.050
1  02-01-2019   98.5   31  0.0192  0.952  0.047
2  03-01-2019   99.0   33  0.0196  0.970  0.043
3  04-01-2019   99.5   32  0.0196  0.980  0.047
4  05-01-2019   97.0   31  0.0192  0.970  0.045
5  08-01-2019  100.0   20  0.2000  0.230  0.023

print (df2)
         Date Currency  Amount_Dollar  New_Amount_Dollar
0  01-01-2019      AUD          19298                  0
1  02-01-2019      AUD          19210                  0
2  03-01-2019      COP           5000                  0
3  04-01-2019      CAD            200                  0
4  07-01-2019      BND           2300                  0

df1['Date'] = pd.to_datetime(df1['Date'], dayfirst=True)
df2['Date'] = pd.to_datetime(df2['Date'], dayfirst=True)

print (df1.set_index('Date').asfreq('D', method='ffill'))
              AUD  BWP     CAD    BND    COP
Date                                        
2019-01-01   98.0   30  0.0200  0.990  0.050
2019-01-02   98.5   31  0.0192  0.952  0.047
2019-01-03   99.0   33  0.0196  0.970  0.043
2019-01-04   99.5   32  0.0196  0.980  0.047
2019-01-05   97.0   31  0.0192  0.970  0.045
2019-01-06   97.0   31  0.0192  0.970  0.045
2019-01-07   97.0   31  0.0192  0.970  0.045
2019-01-08  100.0   20  0.2000  0.230  0.023

arr = df1.set_index('Date').asfreq('D', method='ffill').lookup(df2['Date'], df2['Currency'])

df2['New_Amount_Dollar'] = df2['Amount_Dollar'] /  arr
print (df2)
        Date Currency  Amount_Dollar  New_Amount_Dollar
0 2019-01-01      AUD          19298         196.918367
1 2019-01-02      AUD          19210         195.025381
2 2019-01-03      COP           5000      116279.069767
3 2019-01-04      CAD            200       10204.081633
4 2019-01-07      BND           2300        2371.134021

推荐阅读