Representing intervals on the x-axis of a histogram in Python

Problem description

I am trying to plot the percT column as a histogram in Python. This is my input file:

programName,reqMethID,countT,countN,countU,totalcount,percT,percN,percU
chess,1-9,0,1,0,1,0.0,100.0,0.0
chess,1-16,1,1,0,2,50.0,50.0,0.0
chess,1-4,1,2,0,3,33.33,66.67,0.0
chess,2-9,1,3,0,4,25.0,75.0,0.0
chess,2-16,1,4,0,5,20.0,80.0,0.0
chess,2-4,1,5,0,6,16.67,83.33,0.0
chess,3-9,1,6,0,7,14.29,85.71,0.0
chess,3-16,1,7,0,8,12.5,87.5,0.0
chess,3-4,1,8,0,9,11.11,88.89,0.0
chess,4-9,1,9,0,10,10.0,90.0,0.0
chess,4-16,1,10,0,11,9.09,90.91,0.0
chess,4-4,2,10,0,12,16.67,83.33,0.0
chess,5-9,2,11,0,13,15.38,84.62,0.0
chess,5-16,2,12,0,14,14.29,85.71,0.0
chess,5-4,2,13,0,15,13.33,86.67,0.0
chess,6-9,3,13,0,16,18.75,81.25,0.0
chess,6-16,3,14,0,17,17.65,82.35,0.0
chess,6-4,3,15,0,18,16.67,83.33,0.0
chess,7-9,4,15,0,19,21.05,78.95,0.0
chess,7-16,4,16,0,20,20.0,80.0,0.0
chess,7-4,4,17,0,21,19.05,80.95,0.0
chess,8-9,4,18,0,22,18.18,81.82,0.0
chess,8-16,4,19,0,23,17.39,82.61,0.0
chess,8-4,4,20,0,24,16.67,83.33,0.0
chess,1-10,0,1,0,1,0.0,100.0,0.0
chess,1-17,1,1,0,2,50.0,50.0,0.0
chess,2-10,1,2,0,3,33.33,66.67,0.0
chess,2-17,1,3,0,4,25.0,75.0,0.0
chess,3-10,1,4,0,5,20.0,80.0,0.0
chess,3-17,1,5,0,6,16.67,83.33,0.0
chess,4-10,1,6,0,7,14.29,85.71,0.0
chess,4-17,1,7,0,8,12.5,87.5,0.0
chess,5-10,1,8,0,9,11.11,88.89,0.0
chess,5-17,1,9,0,10,10.0,90.0,0.0
chess,6-10,2,9,0,11,18.18,81.82,0.0

This is the Python code I use to plot the data above as a histogram:

import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv('TNUPercentages.txt', sep=',', index_col=False)

# Ticks every 10 units along the x-axis
X_ticks_array = list(range(0, 100, 10))
plt.xticks(X_ticks_array)

Tdata = dataset['percT']
print(Tdata.head())
plt.hist(Tdata)
plt.xlabel('Percentages of T')
plt.ylabel('Frequency')
plt.show()

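The problem is that I get the plot below. The x-axis shows the values in the percT column, and the y-axis shows how often those values occur. It is hard to tell the frequency of the data at 0 on the x-axis apart from the frequency of the data at 5 or at 10. I want the x-axis to have 11 bins, each representing one of the following intervals: 0, (0, 10], (10, 20], (20, 30], (30, 40], (40, 50], (50, 60], (60, 70], (70, 80], (80, 90], (90, 100]. These intervals correspond to the values in the percT column, and the y-axis should show how often values in each interval occur in the dataset. How can I do this?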

[Image: histogram]

Tags: python, matplotlib, histogram

Solution


The pandas cut and value_counts methods help here:

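import numpy
import pandas
from matplotlib import pyplot

data = pandas.read_csv('TNUPercentages.txt')
# Edges 0, 10, ..., 100 give the right-closed bins (0, 10] ... (90, 100];
# values of exactly 0 fall outside (0, 10] and are not counted.
fig, ax = pyplot.subplots(figsize=(6, 3.5))
(
    pandas.cut(data['percT'], bins=numpy.arange(0, 101, 10))
        .value_counts()
        .sort_index()
        .plot.bar(ax=ax)
)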
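
Because pandas.cut uses right-closed intervals by default, a value of exactly 0 does not land in (0, 10], so the separate 0 bin the question asks for needs an edge of its own. The following is a minimal sketch of one way to get all 11 bins. It assumes that percT is never negative, so an extra edge at -1 can serve as the lower bound of the 0 bin. The helper names bin_percentages and plot_counts, the label strings, and the sample values are illustrative and not part of the original answer.

```python
import pandas as pd


def bin_percentages(values):
    """Count values in the bins 0, (0, 10], (10, 20], ..., (90, 100]."""
    # Assumption: values are never negative, so (-1, 0] only ever holds 0.
    edges = [-1] + list(range(0, 101, 10))  # -1, 0, 10, ..., 100
    labels = ["0"] + [f"({lo}, {lo + 10}]" for lo in range(0, 100, 10)]
    binned = pd.cut(values, bins=edges, labels=labels, right=True)
    # value_counts on categorical data keeps empty bins; sort_index
    # restores the bin order defined by the labels.
    return binned.value_counts().sort_index()


def plot_counts(counts):
    """Draw the binned counts as a bar chart (hypothetical helper)."""
    from matplotlib import pyplot

    fig, ax = pyplot.subplots(figsize=(6, 3.5))
    counts.plot.bar(ax=ax)
    ax.set_xlabel("Percentages of T")
    ax.set_ylabel("Frequency")
    fig.tight_layout()
    pyplot.show()


# A few percT values taken from the input file in the question.
sample = pd.Series([0.0, 50.0, 33.33, 25.0, 20.0, 16.67, 10.0, 9.09, 0.0])
counts = bin_percentages(sample)
print(counts)
```

Calling plot_counts(bin_percentages(data['percT'])) on the full dataset should then draw one bar per bin, including empty ones, which keeps the x-axis layout fixed regardless of the data.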
