Python pandas: concatenating DataFrames with a MultiIndex

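Problem description

How can I concatenate two DataFrames that share the same MultiIndex structure, as in the example below?

DataFrame 1: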

EOAN
                  Close
DateTime   Stock       
2021-02-27 EOAN   8.450
2021-03-06 EOAN   8.436
2021-03-13 EOAN   8.812
2021-03-20 EOAN   8.820
2021-03-24 EOAN   9.084

DataFrame 2:

SAP
                   Close
DateTime   Stock        
2021-02-27 SAP    102.06
2021-03-06 SAP    101.78
2021-03-13 SAP    103.04
2021-03-20 SAP    103.60
2021-03-24 SAP    103.06

When the code is executed, I get the following result:

                       0      1
DateTime   Stock               
2021-02-27 EOAN      NaN  8.450
           SAP    102.06    NaN
2021-03-06 EOAN      NaN  8.436
           SAP    101.78    NaN
2021-03-13 EOAN      NaN  8.812
           SAP    103.04    NaN
2021-03-20 EOAN      NaN  8.820
           SAP    103.60    NaN
2021-03-24 EOAN      NaN  9.084
           SAP    103.06    NaN

I build the DataFrames like this:

for stock in stocks:

    df = pandas.DataFrame(app.data, columns=['DateTime', 'Close'])
    df['DateTime'] = pandas.to_datetime(df['DateTime'], yearfirst=False)
    df['Stock'] = my_stock
    df = df.set_index(['DateTime', 'Stock'])
    app.data.clear()
    
    if df_all is None:
        df_all = df
    else:
        df_all = pandas.concat([df,df_all], axis = 1)

df_all.stack()
print(df_all)

What I am trying to get is the following result, which should also work for more than two stocks:

DateTime   Stock   Close            
2021-02-27 EOAN    8.450  
           SAP    102.06
2021-03-06 EOAN    8.436  
           SAP    101.78
2021-03-13 EOAN    8.812  
           SAP    103.04
2021-03-20 EOAN    8.820 
           SAP    103.60    
2021-03-24 EOAN    9.084  
           SAP    103.06    

Tags: python, pandas, dataframe, concatenation

Solution


Sample data:

import pandas as pd

df1 = pd.DataFrame.from_dict({'Close': {('2021-02-27', 'EOAN'): 8.45,
('2021-03-06', 'EOAN'): 8.436,
('2021-03-13', 'EOAN'): 8.812,
('2021-03-20', 'EOAN'): 8.82,
('2021-03-24', 'EOAN'): 9.084}})

df2 = pd.DataFrame({'Close': {('2021-02-27', 'SAP'): 102.06,
('2021-03-06', 'SAP'): 101.78,
('2021-03-13', 'SAP'): 103.04,
('2021-03-20', 'SAP'): 103.6,
('2021-03-24', 'SAP'): 103.06}})

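Concatenating along the index (axis=0) creates a MultiIndex that is the union of the indices of df1 and df2. To get the desired output, you will probably want to call sort_index() after concatenating: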

pd.concat([df1, df2], axis=0).sort_index()
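
The same idea can be applied to the loop from the question. With axis=1, concat aligns the frames on the union of their indices and places their columns side by side, so every row that exists in only one frame is filled with NaN. The following is a minimal sketch, not the asker's actual code: the `raw` dictionary is a hypothetical stand-in for the data read from `app.data`, and `frames` is an illustrative name. It collects one frame per stock in a list and concatenates once along axis=0, which also works for more than two stocks.

```python
import pandas as pd

# Hypothetical stand-in for the per-stock rows that the question reads from `app.data`.
raw = {
    "EOAN": [("2021-02-27", 8.450), ("2021-03-06", 8.436)],
    "SAP": [("2021-02-27", 102.06), ("2021-03-06", 101.78)],
}

frames = []  # illustrative name: one DataFrame per stock
for stock, rows in raw.items():
    df = pd.DataFrame(rows, columns=["DateTime", "Close"])
    df["DateTime"] = pd.to_datetime(df["DateTime"])
    df["Stock"] = stock
    frames.append(df.set_index(["DateTime", "Stock"]))

# Stack the frames row-wise (axis=0) in one call, then sort by the MultiIndex
# so that the stocks are grouped under each date.
df_all = pd.concat(frames, axis=0).sort_index()
print(df_all)
```

Collecting the frames in a list and calling concat once avoids rebuilding `df_all` on every iteration, and the result keeps a single Close column with no NaN padding.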
