Python seaborn: how to format the xlabels as shown in the picture?

Problem description

I want to display a grouped bar chart:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(62918)
df_w = pd.DataFrame({'A': [np.random.randint(15) for _ in range(50)],
                    'B': [np.random.randint(15) for _ in range(50)],
                    'dtime': pd.date_range(start='2020-01-01 08:00:00', freq='30min', periods=50)
                   })
df_l = df_w.melt(id_vars='dtime').sort_values('dtime')

df_l is the long-format version of df_w, with the columns dtime, variable and value.

Now I am trying to plot it:

fig, ax = plt.subplots(figsize = (12,6))    
ax = sns.barplot(data = df_l, x = "dtime", y = 'value', ax=ax, hue='variable')

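This produces a plot whose xlabels are useless. How can I turn them into something like the target picture, for example with a line at midnight and a tick every 3 hours? How do I get this kind of time labeling on my x axis?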

Tags: python, pandas, matplotlib, seaborn

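Solution

Visualizing a time series with lines rather than bars is usually more useful (especially when there are as many data points as in your example dataset), which may explain why neither the pandas nor the seaborn bar plot functions automatically process datetime variables to create nice labels.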
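For comparison, here is a minimal sketch of the line-plot alternative, where matplotlib's date locators and formatters can build the labels for you. The data only mimics the shape of the example dataset, and the 3-hour tick spacing and label formats are assumptions taken from the question rather than part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Random data in the same shape as the example (two series, 30-minute steps)
np.random.seed(62918)
df = pd.DataFrame({'A': np.random.randint(15, size=50),
                   'B': np.random.randint(15, size=50)},
                  index=pd.date_range(start='2020-01-01 08:00:00',
                                      freq='30min', periods=50))

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(df.index, df['A'], label='A')
ax.plot(df.index, df['B'], label='B')

# Major ticks at every midnight, labelled with time and date
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M\n%d-%b'))

# Minor ticks every 3 hours, labelled with the time only
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=range(0, 24, 3)))
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%H:%M'))

# Vertical grid line at each major tick, i.e. at each midnight
ax.grid(which='major', axis='x')
ax.legend(frameon=False)
```

Call plt.show() or fig.savefig(...) to see the result. This works because ax.plot places the points on a real datetime axis, whereas bar plots from pandas and seaborn place the bars at integer positions.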

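Assuming you have a good reason to visualize your time series with bars rather than lines, the only option (as far as I know) is to create the required labels manually. Here are two ways to match the format in the example. In both cases the first time point is labeled with the time and the date, and the remaining labels depend on the time frequency you choose.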

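Note that I create the dataframe with the dtime variable as the index, because this makes it more convenient to create a pandas bar plot than a seaborn one, and the result is almost identical. If you decide to stick with seaborn, the labeling solution stays the same.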

import numpy as np                 # v 1.19.2
import pandas as pd                # v 1.1.3
import matplotlib.pyplot as plt    # v 3.3.2

# Create a dataframe containing a random time series
np.random.seed(62918)
df = pd.DataFrame(dict(A = [np.random.randint(15) for _ in range(50)],
                       B = [np.random.randint(15) for _ in range(50)]),
                  index = pd.date_range(start='2020-01-01 08:00:00',
                                        freq='30min', periods=50))

# Select time frequency parameters for ticks and ticklabels
obs_per_hour = 2 # because datetime freq='30min'
time_step_hour = 4 # frequency of ticks in terms of hours
time_step = obs_per_hour*time_step_hour

# Create pandas bar plot
# Note: by default pandas automatically draws all numerical columns
# contained in the df (here A and B) and uses the index for the x axis
fig, ax = plt.subplots(figsize=(10,5))
df.plot.bar(ax=ax)

# Select ticks according to the selected time step
# Note: the midnight label that contains the date is only visible if
# time_step_hour is a divisor of 8 (i.e. 1, 2, 4, 8), 8 being the hour
# of the first time point of the date range
xticks = ax.get_xticks()
ax.set_xticks(xticks[::time_step])

# Create custom tick labels by iterating through the timestamps of the
# df index using a list comprehension with an if-else statement to show
# the date at the first timestamp and whenever the day changes (every midnight)
# Format codes: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes
xticklabels = [ts.strftime('%H:%M\n%d-%b') if i == 0 or ts.day != df.index.day[i-1]
               else ts.strftime('%H:%M') for i, ts in enumerate(df.index)]
ax.set_xticklabels(xticklabels[::time_step], rotation=0)

plt.legend(frameon=False)
plt.show()
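If you would rather keep the seaborn call from the question, the same labeling idea can be sketched as follows. This assumes that sns.barplot treats each unique dtime value as one category drawn at positions 0, 1, 2, ... in order of appearance (which is why df_l is sorted by dtime). The names timestamps, labels and positions are introduced here for illustration only.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Long-format data in the same shape as df_l from the question
np.random.seed(62918)
df_w = pd.DataFrame({'A': np.random.randint(15, size=50),
                     'B': np.random.randint(15, size=50),
                     'dtime': pd.date_range(start='2020-01-01 08:00:00',
                                            freq='30min', periods=50)})
df_l = df_w.melt(id_vars='dtime').sort_values('dtime')

obs_per_hour = 2      # two observations per hour at a 30-minute frequency
time_step_hour = 4    # one tick every 4 hours
time_step = obs_per_hour * time_step_hour

fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(data=df_l, x='dtime', y='value', hue='variable', ax=ax)

# Assumption: one category per unique timestamp, at positions 0, 1, 2, ...
timestamps = pd.DatetimeIndex(df_l['dtime'].unique())
positions = list(range(len(timestamps)))

# Show the date at the first timestamp and whenever the day changes
labels = [ts.strftime('%H:%M\n%d-%b')
          if i == 0 or ts.day != timestamps[i - 1].day
          else ts.strftime('%H:%M')
          for i, ts in enumerate(timestamps)]

ax.set_xticks(positions[::time_step])
ax.set_xticklabels(labels[::time_step], rotation=0)
ax.set_xlabel('')
ax.legend(frameon=False)
```

With a 4-hour step the ticks land on 08:00, 12:00, 16:00 and so on, and the tick at midnight also carries the date.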


If you want date labels at midnight without necessarily having hour labels there, here is another solution using the same dataset.

obs_per_hour = 2
time_step_hour = 6
time_step = obs_per_hour*time_step_hour

fig, ax = plt.subplots(figsize=(10,5))
df.plot.bar(ax=ax)

# Create lists of ticks and labels depending on selected time step
custom_ticks = []
custom_labels = []
for idx, ts in enumerate(df.index):
    on_step = idx % time_step == 0
    is_midnight = ts.hour == 0 and ts.minute == 0
    # Select ticks at time step and at midnights
    if not (on_step or is_midnight):
        continue
    custom_ticks.append(idx)
    if idx == 0 or (on_step and is_midnight):
        # Label showing hour:minutes and day-month at first timestamp
        # and at midnights when time step falls on midnights
        custom_labels.append(ts.strftime('%H:%M\n%d-%b'))
    elif on_step:
        # Label showing hour:minutes at each time step except at
        # midnights and at first timestamp
        custom_labels.append(ts.strftime('%H:%M'))
    else:
        # Label showing day-month at midnights when time step does
        # not fall on midnights
        custom_labels.append(ts.strftime('\n%d-%b'))
ax.set_xticks(custom_ticks)
ax.set_xticklabels(custom_labels, rotation=0)

plt.legend(frameon=False)
plt.show()
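The question also asks for a line at midnight. A minimal sketch of one way to add it to a pandas bar plot is shown below. It relies on the bars sitting at integer positions 0, 1, 2, ..., so a midnight timestamp in row i is drawn at x = i. The name midnight_positions and the line styling are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Random data in the same shape as the example dataset
np.random.seed(62918)
df = pd.DataFrame({'A': np.random.randint(15, size=50),
                   'B': np.random.randint(15, size=50)},
                  index=pd.date_range(start='2020-01-01 08:00:00',
                                      freq='30min', periods=50))

fig, ax = plt.subplots(figsize=(10, 5))
df.plot.bar(ax=ax)

# Row numbers of the timestamps that fall exactly on midnight
midnight_positions = np.flatnonzero(df.index == df.index.normalize())

for pos in midnight_positions:
    # Shift by half a bar slot so the line separates the two days
    # instead of running through the midnight bars
    ax.axvline(pos - 0.5, color='grey', linestyle='--', linewidth=1)
```

This can be combined with either of the labeling approaches above, since it only adds lines and leaves the ticks untouched.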
