Pandas: Grouping columns into a time series

Problem description

Consider this data:

data = [{'Year':'1959:01','0':138.89,'1':139.39,'2':139.74,'3':139.69,'4':140.68,'5':141.17},
        {'Year':'1959:07','0':141.70,'1':141.90,'2':141.01,'3':140.47,'4':140.38,'5':139.95},
        {'Year':'1960:01','0':139.98,'1':139.87,'2':139.75,'3':139.56,'4':139.61,'5':139.58}]

How can I convert it into a Pandas time series like the following?

Year    Value
1959-01 138.89  
1959-02 139.39  
1959-03 139.74
...
1959-07 141.70
1959-08 141.90
...

Tags: python, pandas, time-series

Solution

Here is one way:

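import pandas as pd

# Year becomes the index; stack() yields a Series keyed by (Year, column label)
s = pd.DataFrame(data).set_index("Year").stack()
# Parse each start date and add the column label as a month offset
s.index = pd.Index([pd.to_datetime(start, format="%Y:%m") + pd.DateOffset(months=int(off))
                    for start, off in s.index], name="Year")
df = s.to_frame("Value")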

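First we set Year as the index, then stack the values so that each one sits next to its (Year, column label) pair. Next we build a new index from the current one: each Year string is parsed as a start date, and the column label ('0' to '5') is added to it as a month offset. Finally we convert the Series to a frame whose single column is named Value.

This gives: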
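The intermediate steps described above can be inspected one at a time. The following is a minimal sketch that reuses the data from the question; the variable name ts is only illustrative and not part of the original answer.

```python
import pandas as pd

data = [{'Year':'1959:01','0':138.89,'1':139.39,'2':139.74,'3':139.69,'4':140.68,'5':141.17},
        {'Year':'1959:07','0':141.70,'1':141.90,'2':141.01,'3':140.47,'4':140.38,'5':139.95},
        {'Year':'1960:01','0':139.98,'1':139.87,'2':139.75,'3':139.56,'4':139.61,'5':139.58}]

# Step 1: Year becomes the index, and stack() moves the column labels
# '0'..'5' into a second index level, giving a Series keyed by (Year, label).
s = pd.DataFrame(data).set_index("Year").stack()
print(s.index[:3].tolist())

# Step 2: one index entry is a (start, offset) pair of strings.
start, off = s.index[1]

# Step 3: parse the start date and shift it by the offset, in months.
ts = pd.to_datetime(start, format="%Y:%m") + pd.DateOffset(months=int(off))
print(ts)
```

Each of the 3 rows contributes 6 values, so the stacked Series has 18 entries, and the second entry ('1959:01', '1') maps to February 1959.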

>>> df

             Value
Year
1959-01-01  138.89
1959-02-01  139.39
1959-03-01  139.74
1959-04-01  139.69
1959-05-01  140.68
1959-06-01  141.17
1959-07-01  141.70
1959-08-01  141.90
1959-09-01  141.01
1959-10-01  140.47
1959-11-01  140.38
1959-12-01  139.95
1960-01-01  139.98
1960-02-01  139.87
1960-03-01  139.75
1960-04-01  139.56
1960-05-01  139.61
1960-06-01  139.58
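
The question shows the index as year and month only (1959-01), whereas the result above carries full dates. If that display matters, one option, which is not part of the original answer, is to convert the datetime index to monthly periods. A minimal sketch, assuming a frame shaped like the first three rows of the result (the name monthly is illustrative):

```python
import pandas as pd

# A small frame shaped like the result above (first three rows only).
df = pd.DataFrame(
    {"Value": [138.89, 139.39, 139.74]},
    index=pd.DatetimeIndex(["1959-01-01", "1959-02-01", "1959-03-01"], name="Year"),
)

# to_period("M") turns each timestamp into a monthly period (YYYY-MM).
monthly = df.copy()
monthly.index = df.index.to_period("M")
print(monthly.index.astype(str).tolist())
```

The values are unchanged; only the index type changes from timestamps to monthly periods.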
