Pandas: Plot a histogram of times in intervals of 10 minutes

Question

I have a dataframe in this format:

    DATE        NAME        ARRIVAL TIME
275 2018-07-05  Adam    19:33:51.579885
276 2018-07-05  Bill    19:38:57.578135
277 2018-07-05  Cindy   19:40:24.704381
278 2018-07-05  Don     19:34:29.689414
279 2018-07-05  Eric    19:33:54.173609

I would like to plot a histogram of arrival times in fixed buckets, e.g. every 10 minutes.

Using the following code from other answers, I've managed to produce this histogram:

df['ARRIVAL TIME'] = pd.to_datetime(df['ARRIVAL TIME'])
plt.hist([t.hour + t.minute/60. for t in df['ARRIVAL TIME']], bins = 8)


That's close to what I want. However, I'd prefer the bins to be "7:30", "7:40", etc.

Tags: python, pandas, matplotlib

Answer


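If you just want to manually change the default tick labels (e.g., see this answer), then the following should work (after running the commands you have already used):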

plt.draw()      # do this so that the labels are generated
ax = plt.gca()  # get the figure axes
xticks = ax.get_xticklabels()  # get the current x-tick labels
newlabels = []
for label in xticks:
    h, m = divmod(float(label.get_text())%12, 1)  # get hours and minutes (in 12 hour clock)
    newlabels.append('{0:02d}:{1:02d}'.format(int(h), int(m*60)))  # create the new label

ax.set_xticklabels(newlabels)  # set the new labels
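
The label conversion in the loop above can also be written as a small standalone function. This is only a sketch: the name `hours_to_label` and its `twelve_hour` flag are made up for illustration and are not part of pandas or matplotlib. It works in whole minutes so that decimal hours such as 19.666... do not get truncated to ":39".

```python
def hours_to_label(hours, twelve_hour=True):
    """Format decimal hours (e.g. 19.5) as an 'hh:mm' string."""
    # Round to whole minutes first, so floating-point error cannot
    # turn 40 minutes into 39 when the value is truncated.
    total_minutes = int(round(hours * 60))
    h, m = divmod(total_minutes, 60)
    # 12-hour clock as in the loop above (19 -> 7, 12 -> 0), else 24-hour
    h = h % 12 if twelve_hour else h % 24
    return '{0:02d}:{1:02d}'.format(h, m)


print(hours_to_label(19.5))            # 07:30
print(hours_to_label(19 + 40 / 60.))   # 07:40
```

With such a helper, the loop body would reduce to `newlabels.append(hours_to_label(float(label.get_text())))`.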

However, if you specifically want the histogram bin edges to fall on 10-minute intervals, you can do the following:

import numpy as np
import matplotlib.pyplot as plt

# get a list of the times
times = [t.hour + t.minute/60. for t in df['ARRIVAL TIME']]

# set the time interval required (in minutes)
tinterval = 10.

# find the lower and upper bin edges (on an integer number of 10 mins past the hour)
lowbin = np.min(times) - np.fmod(np.min(times)-np.floor(np.min(times)), tinterval/60.)
highbin = np.max(times) - np.fmod(np.max(times)-np.ceil(np.max(times)), tinterval/60.)
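# np.arange excludes its stop value, so extend it by half a step to keep the upper edge
bins = np.arange(lowbin, highbin + 0.5*tinterval/60., tinterval/60.)  # set the bin edges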

# create the histogram
plt.hist(times, bins=bins)
ax = plt.gca()  # get the current plot axes
ax.set_xticks(bins)  # set the position of the ticks to the histogram bin edges

# create new labels in hh:mm format (in twelve hour clock)
newlabels = []
for edge in bins:
    h, m = divmod(int(round(edge*60)), 60)  # whole hours and minutes; rounding avoids float truncation (e.g. 7:39)
    newlabels.append('{0:01d}:{1:02d}'.format(h % 12, m))  # create the new label (12 hour clock)

ax.set_xticklabels(newlabels)  # set the new labels
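
As a variation on the approach above (not the method used in the answer itself), the bin edges can be computed in whole minutes since midnight, which avoids floating-point arithmetic on decimal hours. A minimal sketch, assuming the arrival times have already been converted to integer minutes; the function name `minute_bin_edges` and the sample values are illustrative only:

```python
def minute_bin_edges(minutes, interval=10):
    """Return bin edges (minutes since midnight) aligned to `interval`."""
    low = (min(minutes) // interval) * interval      # round down to an edge
    high = -(-max(minutes) // interval) * interval   # round up to an edge
    if high == low:
        # every value sits exactly on one edge; keep one full bin
        high = low + interval
    return list(range(low, high + 1, interval))      # include the upper edge


# Sample arrival times from the question, as minutes since midnight
# (19:33, 19:38, 19:40, 19:34, 19:33)
arrivals = [19 * 60 + 33, 19 * 60 + 38, 19 * 60 + 40, 19 * 60 + 34, 19 * 60 + 33]
edges = minute_bin_edges(arrivals)
labels = ['{0:d}:{1:02d}'.format((e // 60) % 12, e % 60) for e in edges]
print(edges)   # [1170, 1180]
print(labels)  # ['7:30', '7:40']
```

With the dataframe from the question, `arrivals` would be `[t.hour * 60 + t.minute for t in df['ARRIVAL TIME']]`. `edges` can then be passed as `bins=` to `plt.hist` and to `ax.set_xticks`, and `labels` to `ax.set_xticklabels`.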
