Pandas groupby fails on leap years

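Problem description

I want to draw several different plots of my time series data.

My problem is that the code fails when the data includes a leap year: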

groups = daily_incidents_df.groupby(Grouper(freq='A'))
years = pd.DataFrame()
for name, group in groups:
  print(group)
  years[name.year] = group.values.squeeze()
years.boxplot()
plt.show()

Output:

            num_incidents
date                     
2015-01-01            175
2015-01-02             84
2015-01-03             94
2015-01-04             90
2015-01-05             78
...                   ...
2015-12-27            138
2015-12-28            113
2015-12-29            103
2015-12-30             90
2015-12-31            110

[365 rows x 1 columns]
            num_incidents
date                     
2016-01-01            183
2016-01-02            110
2016-01-03            134
2016-01-04            105
2016-01-05            102
...                   ...
2016-12-27            135
2016-12-28            134
2016-12-29            145
2016-12-30            111
2016-12-31            159

[366 rows x 1 columns]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-6eb0a1a15c64> in <module>()
      3 for name, group in groups:
      4   print(group)
----> 5   years[name.year] = group.values.squeeze()
      6 years.boxplot()
      7 plt.show()

3 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in sanitize_index(data, index, copy)
    609 
    610     if len(data) != len(index):
--> 611         raise ValueError("Length of values does not match length of index")
    612 
    613     if isinstance(data, ABCIndexClass) and not copy:

ValueError: Length of values does not match length of index

Tags: python, pandas, matplotlib, pandas-groupby, boxplot

Solution

You can use concat:

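import pandas as pd

# df is the daily time series with a DatetimeIndex (daily_incidents_df in the question).
# Note: newer pandas versions spell the year-end alias 'YE' instead of 'A'.
groups = df.groupby(pd.Grouper(freq='A'))

# One Series per year; concat pads the shorter (non-leap) year with NaN
years = pd.concat([pd.Series(x.values.flatten(), name=y)
                   for y, x in groups],
                  axis=1)

years.boxplot()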

Output:

[image]

This gives (note the xtick labels):

[image]
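The concat approach works where the column assignment in the question fails because `pd.concat(..., axis=1)` aligns the Series on their index and fills the missing position with NaN, so a 365-day year and a 366-day year can sit side by side. Below is a minimal sketch of that behaviour using synthetic data. The random counts, the seed, and the frame `df` are assumptions standing in for the asker's `daily_incidents_df`, and the years are selected by label instead of by Grouper to keep the example short.

```python
import numpy as np
import pandas as pd

# Synthetic daily counts for 2015 (365 days) and 2016 (366 days, leap year).
idx = pd.date_range("2015-01-01", "2016-12-31", freq="D", name="date")
rng = np.random.default_rng(0)
df = pd.DataFrame({"num_incidents": rng.integers(50, 200, size=len(idx))}, index=idx)

# One Series per year, each with a fresh 0..n-1 index (the dates are dropped).
per_year = [
    pd.Series(df.loc[str(year), "num_incidents"].to_numpy(), name=year)
    for year in (2015, 2016)
]

# concat aligns on the index: 2015 has no row 365, so that cell becomes NaN.
years = pd.concat(per_year, axis=1)

print(years.shape)         # (366, 2)
print(years.isna().sum())  # one NaN in the 2015 column, none in 2016
```

The single NaN in the shorter column should not affect the plot, since `DataFrame.boxplot` leaves missing values out of each box.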

However, instead of using Grouper, I would group by the year of the index:

groups = df.groupby(df.index.year)
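Grouping by `df.index.year` makes each group key a plain integer year, whereas `pd.Grouper` with a yearly frequency produces year-end Timestamp keys; that is presumably the difference in tick labels mentioned above. The sketch below runs this variant end to end on synthetic data. The frame `df` and its `np.arange` values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's data: one value per day across 2015-2016.
idx = pd.date_range("2015-01-01", "2016-12-31", freq="D", name="date")
df = pd.DataFrame({"num_incidents": np.arange(len(idx))}, index=idx)

# Grouping by the integer year yields keys 2015 and 2016 instead of Timestamps.
groups = df.groupby(df.index.year)

# Same concat pattern as before; int(year) keeps the column labels plain ints.
years = pd.concat(
    [
        pd.Series(group["num_incidents"].to_numpy(), name=int(year))
        for year, group in groups
    ],
    axis=1,
)

print(list(years.columns))  # [2015, 2016]
# years.boxplot() would then label the boxes with these integers (needs matplotlib).
```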
