How to annotate only one category of a stacked bar chart

Problem description

I have a sample dataframe, shown below.

data = {'Date':['2021-07-18','2021-07-19','2021-07-20','2021-07-21','2021-07-22','2021-07-23'],
    'Invalid':["NaN", 1, 1, "NaN", "NaN", "NaN"],
    'Negative':[23, 24, 17, 24, 20, 23],
    'Positive':["NaN", 1, 1, 1, "NaN", 1]}

df_sample = pd.DataFrame(data) 
df_sample


The code that draws the stacked bar chart, and the figure it produces, are given below.

temp = Graph1_df.set_index(['Dates', 'Results']).sort_index(axis=0).unstack()
temp.columns = temp.columns.get_level_values(1)

f, ax = plt.subplots(figsize=(20, 5))
temp.plot.bar(ax=ax, stacked=True, width = 0.3, color=['blue','green','red'])
ax.title.set_text('Total Test Count vs Dates') 


plt.show()


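Using the code above or any new approach, I want to show values on the chart only for 'Positive'. Note: the third column in the dataframe snippet is the 'Positive' column.

Any help is greatly appreciated. Thanks.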

Tags: python, dataframe, matplotlib, data-visualization, bar-chart

Solution


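  • Plot with pandas.DataFrame.plot using kind='bar'
  • Use .bar_label to add the annotations
    • See this answer for other links and options related to .bar_label
  • Stacked bar plots are drawn from left to right in row order, and from bottom to top in column order.
    • Since 'Positive' is column index 2, only the container with i == 2 needs labels
  • Tested with pandas 1.3.0; requires matplotlib >=3.4.2 and python >=3.8
    • The list comprehension for labels uses an assignment expression, :=, which is only available from python 3.8
      • labels = [f'{v.get_height():.0f}' if ((v.get_height()) > 0) and (i == 2) else '' for v in c] is the option without :=
    • .bar_label is only available from matplotlib 3.4.2
      • This answer shows how to add annotations for matplotlib <3.4.2
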
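The point about container order can be checked directly. The following is a minimal sketch, not part of the original answer: it assumes pandas passes each column name as the label of its bar container, and the names df_demo, ax_demo, container_names and positive_idx are made up for illustration. Looking up the container by name avoids hard-coding i == 2.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# small frame with the same three result columns as the question (illustrative data)
df_demo = pd.DataFrame(
    {"Invalid": [np.nan, 1, 1], "Negative": [23, 24, 17], "Positive": [np.nan, 1, 1]},
    index=["2021-07-18", "2021-07-19", "2021-07-20"],
)

ax_demo = df_demo.plot(kind="bar", stacked=True)

# one BarContainer per column, in column order (bottom of the stack first)
container_names = [c.get_label() for c in ax_demo.containers]

# position of the 'Positive' container, looked up by name instead of hard-coded
positive_idx = container_names.index("Positive")
```

With the columns ordered as in the question, positive_idx should match the i == 2 used in the answer's loop.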
import pandas as pd
import numpy as np  # used for nan

# test dataframe
data = {'Date':['2021-07-18','2021-07-19','2021-07-20','2021-07-21','2021-07-22','2021-07-23'],
    'Invalid':[np.nan, 1, 1, np.nan, np.nan, np.nan],
    'Negative':[23, 24, 17, 24, 20, 23],
    'Positive':[np.nan, 1, 1, 1, np.nan, 1]}

df = pd.DataFrame(data)

# convert the Date column to a datetime format and use the dt accessor to get only the date component
df.Date = pd.to_datetime(df.Date).dt.date

# set Date as index
df.set_index('Date', inplace=True)

# create multi-index column to match OP image
top = ['Size']
current = df.columns
df.columns = pd.MultiIndex.from_product([top, current], names=['', 'Result'])

# display(df)
#               Size
# Result     Invalid Negative Positive
# Date
# 2021-07-18     NaN       23      NaN
# 2021-07-19     1.0       24      1.0
# 2021-07-20     1.0       17      1.0
# 2021-07-21     NaN       24      1.0
# 2021-07-22     NaN       20      NaN
# 2021-07-23     NaN       23      1.0

# reset the top index to a column
df = df.stack(level=0).rename_axis(['Date', 'Size']).reset_index(level=1)

# if there are many top levels that are reset as a column, then select the data to be plotted
sel = df[df.Size.eq('Size')]

# plot
ax = sel.iloc[:, 1:].plot(kind='bar', stacked=True, figsize=(20, 5), title='Total Test Count vs Dates', color=['blue','green','red'])

# add annotations
for i, c in enumerate(ax.containers):
    
    # format the labels
    labels = [f'{w:.0f}' if ((w := v.get_height()) > 0) and (i == 2) else '' for v in c]
    
    # annotate with custom labels
    ax.bar_label(c, labels=labels, label_type='center', fontsize=10)

# pad the spacing between the labels and the top edge of the axes (only needs to run once, after the loop)
ax.margins(y=0.1)
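
For matplotlib versions without .bar_label, the same labels can be placed by hand with ax.text. The following is a minimal sketch under stated assumptions, not the code from the linked answer: it assumes pandas adds the bars to ax.patches column by column, and the names df_old, ax_old, positive_bars and texts are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# illustrative data with the same columns as the question
df_old = pd.DataFrame(
    {"Invalid": [np.nan, 1, 1], "Negative": [23, 24, 17], "Positive": [np.nan, 1, 1]},
    index=["2021-07-18", "2021-07-19", "2021-07-20"],
)

ax_old = df_old.plot(kind="bar", stacked=True, color=["blue", "green", "red"])

# assumption: ax.patches holds the bars column by column, n_rows bars per column
n_rows = len(df_old)
positive_col = df_old.columns.get_loc("Positive")
positive_bars = list(ax_old.patches)[positive_col * n_rows:(positive_col + 1) * n_rows]

texts = []
for bar in positive_bars:
    height = bar.get_height()
    # missing values are drawn as zero-height bars, so skip them
    if height > 0:
        x = bar.get_x() + bar.get_width() / 2  # horizontal center of the bar
        y = bar.get_y() + height / 2           # vertical center of the segment
        texts.append(ax_old.text(x, y, f"{height:.0f}", ha="center", va="center", fontsize=10))
```

Only the 'Positive' segments with a height above zero receive a label, which mirrors the condition in the list comprehension above.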


