Broadcasting two DataFrames

Problem description

I have two DataFrames, as follows:

The first DataFrame, data:

             2019-06-19     2019-06-20     2019-06-21     2019-06-22     2019-06-23     2019-06-24     2019-06-25
 currency                                                                                                         
BCH          485.424079     485.424079      57.574609      57.559609      57.559609      57.559609      57.559609
BTC          202.204572     256.085103     197.291801     177.359726     177.359726     177.359726     252.859726
BTG         4065.370000    4065.370000    4065.370000    4065.370000    4065.370000    4065.370000    4065.370000
ETC        40001.000000   40001.000000   40001.000000   40001.000000   40001.000000   40001.000000       0.000000
ETH         4092.917231    4092.917231    1497.655594    1497.655594    1497.655594    1497.655594    1497.655594

The second DataFrame, sys_bal:

created_at  2019-06-19  2019-06-20  2019-06-21  2019-06-22  2019-06-23  2019-06-24  2019-06-25
 currency                                                                                      
1WO            1997308     1996908     1996908     1996908     1996908     1996908     1996908
ABX             241444      241444      241444      241444      241444      241444      241444
ADH            5981797     5981797     5981797     5981797     5981797     5981797     5981797
ALX             385466      385466      385466      385466      385466      385466      385466
AMLT           4749604     4749604     4749604     4687869     4687869     4687869     4687869
BCH               4547        4547        4483        4463        4465        4467        4403
BRC            1231312     1231312     1231312     1231312     1231312     1231312     1231142
BTC               7366        7342        7287        7307        8292        8635        7772
BTRN          15236038    15236038    15236038    15236038    15236038    15236233    15236233

I tried to add one to the other with pos_bal = sys_bal + data. Both have the same number of columns, but I get an error.

Error:

pos_bal = sys_bal + data
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/ops.py", line 1547, in f
other = _align_method_FRAME(self, other, axis)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/ops.py", line 1481, in _align_method_FRAME
right = to_series(right)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/ops.py", line 1456, in to_series
given_len=len(right)))
ValueError: Unable to coerce to Series, length must be 7: given 2

I printed the dtypes of both DataFrames and got the following:

First DataFrame:

2019-06-19    float64
2019-06-20    float64
2019-06-21    float64
2019-06-22    float64
2019-06-23    float64
2019-06-24    float64
2019-06-25    float64
dtype: object

Second DataFrame:

   created_at
0  2019-06-19    int64
   2019-06-20    int64
   2019-06-21    int64
   2019-06-22    int64
   2019-06-23    int64
   2019-06-24    int64
   2019-06-25    int64
 dtype: object

data.info() output:

<class 'pandas.core.frame.DataFrame'>
Index: 12 entries, BCH to XRP
Data columns (total 7 columns):
2019-06-20    12 non-null float64
2019-06-21    12 non-null float64
2019-06-22    12 non-null float64
2019-06-23    12 non-null float64
2019-06-24    12 non-null float64
2019-06-25    12 non-null float64
2019-06-26    12 non-null float64
dtypes: float64(7)
memory usage: 768.0+ bytes
None

sys_bal.info() output:

<class 'pandas.core.frame.DataFrame'>
 Index: 126 entries, 1WO to ZPR
 Data columns (total 7 columns):
 2019-06-20    126 non-null int64
 2019-06-21    126 non-null int64
 2019-06-22    126 non-null int64
 2019-06-23    126 non-null int64
 2019-06-24    126 non-null int64
 2019-06-25    126 non-null int64
 2019-06-26    126 non-null int64
 dtypes: int64(7)
 memory usage: 7.9+ KB
 None

Tags: python, pandas, datetime

Solution


import pandas as pd
# Small sample frames for illustration (values are kept as strings here)
data=pd.DataFrame({'currency':['BCH','BTC'],'2019-06-19':['485.424079','202.204572'],'2019-06-20':['485.424079','256.085103']})
sys_bal=pd.DataFrame({'currency':['1WO','ABX'],'2019-06-19':['1997308','241444'],'2019-06-20':['1996908','241444']})

Edit: if you get 'dict' object has no attribute 'set_index', it means you are not working with DataFrames as I assumed. Try converting your data first:

data=pd.DataFrame.from_dict(data)
sys_bal=pd.DataFrame.from_dict(sys_bal)
data=data.set_index('currency')
sys_bal=sys_bal.set_index('currency')

df=pd.concat([data,sys_bal])
print(df)
         2019-06-19   2019-06-20
currency                        
BCH       485.424079  485.424079
BTC       202.204572  256.085103
1WO          1997308     1996908
ABX           241444      241444

This should work for you. If it doesn't, take a closer look at your DataFrames: I notice that sys_bal has an extra header name, created_at.
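
If the goal is an element-wise sum rather than a stacked table, a minimal sketch might look like the following. It is not the code above, and it rests on two assumptions: that the columns of sys_bal form a two-level header (an outer level 0 and an inner level named created_at, which is what the dtypes printout in the question suggests), and that currencies present in only one frame should be kept as-is. The small frames are made-up subsets of the data in the question.

```python
import pandas as pd

# Hypothetical miniature version of the float frame from the question.
data = pd.DataFrame(
    {"2019-06-19": [485.424079, 202.204572],
     "2019-06-20": [485.424079, 256.085103]},
    index=pd.Index(["BCH", "BTC"], name="currency"),
)

# Assumption: sys_bal has a two-level column header, with an unnamed outer
# level (0) and an inner level named created_at.
sys_bal = pd.DataFrame(
    [[1997308, 1996908], [4547, 4547], [7366, 7342]],
    index=pd.Index(["1WO", "BCH", "BTC"], name="currency"),
    columns=pd.MultiIndex.from_product(
        [[0], ["2019-06-19", "2019-06-20"]], names=[None, "created_at"]
    ),
)

# Flatten the header so both frames use plain date-string column labels,
# then drop the leftover created_at name from the column axis.
if isinstance(sys_bal.columns, pd.MultiIndex):
    sys_bal.columns = sys_bal.columns.droplevel(0)
sys_bal = sys_bal.rename_axis(None, axis=1)

# add() aligns on row and column labels; fill_value=0 keeps currencies
# that appear in only one of the two frames instead of turning them into NaN.
pos_bal = sys_bal.add(data, fill_value=0)
print(pos_bal)
```

pandas aligns arithmetic on labels rather than on shape, so the two frames do not need the same number of rows once their column labels match.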

