Plot overlapping trends of different years

Problem description

I have several years of monthly data like this:

         time     value
0  2015-01-01  0.982295
1  2015-02-01  3.283557
2  2015-03-01  2.665395
3  2015-04-01  3.124564
4  2015-05-01  4.747362
5  2015-06-01  4.436057
6  2015-07-01  3.925824
7  2015-08-01  4.772219
8  2015-09-01  5.313609
9  2015-10-01  6.213427
10 2015-11-01  6.870897
11 2015-12-01  8.130550
12 2016-01-01  1.984611
13 2016-02-01  1.782809
14 2016-03-01  2.904271
15 2016-04-01  3.029645
16 2016-05-01  3.810806
17 2016-06-01  4.365906
18 2016-07-01  3.922678
19 2016-08-01  4.354115
20 2016-09-01  5.376155
21 2016-10-01  7.290028
22 2016-11-01  6.504523
23 2016-12-01  7.689338
24 2017-01-01  2.158096
25 2017-02-01  1.983260
26 2017-03-01  3.774609
27 2017-04-01  3.570528
28 2017-05-01  3.283161
29 2017-06-01  3.834184
30 2017-07-01  4.388914
31 2017-08-01  5.035261
32 2017-09-01  4.844120
33 2017-10-01  6.206708
34 2017-11-01  6.198993
35 2017-12-01  7.220857
36 2018-01-01  1.346803
37 2018-02-01  2.361194
38 2018-03-01  3.478777
39 2018-04-01  4.093510
40 2018-05-01  3.730770
41 2018-06-01  3.612807
42 2018-07-01  5.524375
43 2018-08-01  5.604300
44 2018-09-01  6.412848
45 2018-10-01  5.463882
46 2018-11-01  6.224526
47 2018-12-01  7.082455
48 2019-01-01  0.893474
49 2019-02-01  1.393201
50 2019-03-01  3.163579
51 2019-04-01  3.506390
52 2019-05-01  3.564924
53 2019-06-01  4.852669
54 2019-07-01  4.087379
55 2019-08-01  4.800931
56 2019-09-01  4.907763
57 2019-10-01  7.235331
58 2019-11-01  6.841004
59 2019-12-01  7.854044

I want to plot the trend of each year overlapped with the others.
I tried this code:

df['year'] = df['time'].dt.year

for year in df['year'].unique():
    plt.plot(df[df['year'] == year]['time'],
             df[df['year'] == year]['value'])

plt.show()

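But this draws the years one after another along a continuous time axis, because each year's dates are distinct x values.

I want the lines to overlap, with the x-axis running from January to December.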
I have already found a related question, but I would rather not use the groupby function and a MultiIndex on my dataframe.

Tags: python, pandas, dataframe, datetime, matplotlib

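Solution

You can extract two things from the time column:

  • the month, to be used as the x-axis
  • the year, to be used as a filter

and use them to plot your data, as in the following code: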

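import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as md

df = pd.read_csv('data.csv')

# Parse the dates, then derive a month column and a year column.
# Parsing the month number with '%m' puts every row in the same
# default year, so all years share one January-to-December axis.
df['time'] = pd.to_datetime(df['time'], format = '%Y-%m-%d')
df['month'] = pd.to_datetime(df['time'].dt.month, format = '%m')
df['year'] = df['time'].dt.year

fig, ax = plt.subplots(figsize = (16, 8))

# One line per year, all plotted against the shared month axis
for year in df['year'].unique():
    subset = df[df['year'] == year]
    ax.plot(subset['month'], subset['value'], label = year)

# One tick per month, labelled with the abbreviated month name
ax.xaxis.set_major_locator(md.MonthLocator())
ax.xaxis.set_major_formatter(md.DateFormatter('%b'))
ax.set_xlim([df['month'].min(), df['month'].max()])

ax.legend()
plt.show()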

This draws one line per year on a shared January-to-December x-axis.
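To see why the month column makes the years line up, here is a minimal sketch with three made-up dates (the sample values are illustrative, not taken from data.csv). Parsing only the month number leaves the year at the parser's default, so rows from different years end up on the same dummy dates:

```python
import pandas as pd

# Made-up sample dates from three different years
times = pd.to_datetime(pd.Series(['2015-03-01', '2016-03-01', '2017-11-01']))

# Same step as in the answer: parse the month number on its own
month = pd.to_datetime(times.dt.month, format='%m')

print(month.dt.year.unique())   # a single shared placeholder year
print(month.dt.month.tolist())  # the original months are preserved
```

Because the placeholder year is never shown (the tick formatter prints only '%b'), it does not matter which year it is, only that it is the same for every row.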

Or, even better, use sns.lineplot to avoid the for loop:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as md
import seaborn as sns

df = pd.read_csv('data.csv')

# Same preparation as before: shared month axis plus a year column
df['time'] = pd.to_datetime(df['time'], format = '%Y-%m-%d')
df['month'] = pd.to_datetime(df['time'].dt.month, format = '%m')
df['year'] = df['time'].dt.year

fig, ax = plt.subplots(figsize = (16, 8))

# One distinct colour per year; hue splits the data into one line per year
palette = sns.color_palette('Set1', n_colors = len(df['year'].unique()))
sns.lineplot(ax = ax,
             data = df,
             x = 'month',
             y = 'value',
             hue = 'year',
             palette = palette,
             errorbar = None)  # seaborn >= 0.12; use ci = None on older versions

# One tick per month, labelled with the abbreviated month name
ax.xaxis.set_major_locator(md.MonthLocator())
ax.xaxis.set_major_formatter(md.DateFormatter('%b'))
ax.set_xlim([df['month'].min(), df['month'].max()])

ax.legend()
plt.show()

This gives almost the same plot.
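If the dummy date axis feels awkward, another option (not part of the answer above, just a sketch of a different technique) is to reshape the data with DataFrame.pivot so that each year becomes a column indexed by month number. This uses neither groupby nor a MultiIndex. The sample values below are made up for illustration:

```python
import pandas as pd

# Hypothetical sample: two years with three months each
df = pd.DataFrame({
    'time': pd.to_datetime(['2015-01-01', '2015-02-01', '2015-03-01',
                            '2016-01-01', '2016-02-01', '2016-03-01']),
    'value': [1.0, 3.3, 2.7, 2.0, 1.8, 2.9],
})

# Rows: month number, columns: year, cells: value
wide = (df.assign(year=df['time'].dt.year, month=df['time'].dt.month)
          .pivot(index='month', columns='year', values='value'))
print(wide)

# wide.plot() would then draw one line per year column (needs matplotlib)
```

The trade-off is that the x-axis then shows month numbers rather than month names, so the tick labels would need to be set separately.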

