How do I annotate the minimum of a Seaborn boxplot, excluding outliers?

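Problem description

I have a dataset that contains some outliers, and I want to annotate the bottom of the boxplot in Python. Here is an example of what I am facing: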

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = {'theta': [1 for i in range(0,10)],
       'error': [10,20,21,22,23,24,25,26,27,28]}

df = pd.DataFrame(data=data)
df

fig,ax1 = plt.subplots(figsize=(8,5))
box_plot = sns.boxplot(x="theta", y='error', data=df, ax = ax1, showfliers = False)
min_value = df.groupby(['theta'])['error'].min().values
for xtick in box_plot.get_xticks():
    idx = df[df['error']==min_value[xtick]].index.values
    text = 'The minimum value before outliers is here'
    box_plot.text(xtick,min_value[xtick]+2, text, 
            horizontalalignment='center',size='x-small',weight='semibold')
    box_plot.plot(xtick,min_value[xtick], marker='*', markersize=20 )

This does not produce what I want: the annotation is placed at the overall minimum (10), which is the outlier.

Instead, I want the annotation at the end of the lower whisker, i.e. the smallest value that is not an outlier.

For this example I could work it out by hand, but I want a more systematic approach that generalizes to other cases.

Tags: python, pandas, matplotlib, graph, seaborn

Solution


According to a related answer, Seaborn boxplots use matplotlib to compute the whiskers and quartiles of the plotted box, so you can get the same values with matplotlib.cbook.boxplot_stats.
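As a quick check of what boxplot_stats returns, here is a minimal sketch that runs it on the 'error' values from the question, without plotting anything. The variable names are illustrative only, and the default whisker factor of 1.5 is assumed to match the one used by the plot.

```python
import numpy as np
from matplotlib.cbook import boxplot_stats

# The 'error' values from the question; 10 is the outlier.
errors = np.array([10, 20, 21, 22, 23, 24, 25, 26, 27, 28])

# boxplot_stats returns a list with one dict of statistics per dataset.
stats = boxplot_stats(errors)[0]

low_whisker = stats["whislo"]    # smallest value not below Q1 - 1.5 * IQR
high_whisker = stats["whishi"]   # largest value not above Q3 + 1.5 * IQR
outliers = stats["fliers"]       # values outside the whiskers

print(low_whisker, high_whisker, outliers)
```

Here Q1 is 21.25 and Q3 is 25.75, so the lower fence is 21.25 - 1.5 * 4.5 = 14.5; the smallest value above it is 20, which is where the lower whisker ends.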

You can also change the range of the y-ticks with ax1.set_yticks, in case you want the axis to cover the outlier at 10.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from matplotlib.cbook import boxplot_stats

data = {'theta': [1 for i in range(0,10)],
       'error': [10,20,21,22,23,24,25,26,27,28]}

df = pd.DataFrame(data=data)
df

fig,ax1 = plt.subplots(figsize=(8,5))
box_plot = sns.boxplot(x="theta", y='error', data=df, ax = ax1, showfliers = False)
# Get the end of the lower whisker: the smallest value that is not an outlier.
# The returned dict also holds other boxplot values,
# e.g. 'whishi', 'q1', 'med', 'q3' and 'fliers'.
low_whisker = boxplot_stats(df.error)[0]['whislo']

for xtick in box_plot.get_xticks():
    text = 'The minimum value before outliers is here'
    # Put the label just below the whisker end and mark the end with a star
    box_plot.text(xtick, low_whisker-2, text,
            horizontalalignment='center', size='x-small', weight='semibold')
    box_plot.plot(xtick, low_whisker, marker='*', markersize=20)

## set the range to include the entire range of the data
ax1.set_yticks(np.arange(min(df.error),max(df.error)+5,5))

plt.show()
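The code above computes a single whisker for the whole 'error' column, which is fine with one group. To generalize to several groups, one option is to compute the whisker per group. The following is only a sketch: the helper names lower_whiskers and annotate_lower_whiskers are hypothetical, and the second 'theta' group is made-up data added for illustration.

```python
import pandas as pd
from matplotlib.cbook import boxplot_stats


def lower_whiskers(df, group_col, value_col, whis=1.5):
    """Return the lower whisker end for each group, in sorted group order."""
    return df.groupby(group_col)[value_col].apply(
        lambda s: boxplot_stats(s.to_numpy(), whis=whis)[0]["whislo"]
    )


def annotate_lower_whiskers(ax, whiskers, text="min (outliers excluded)", offset=2):
    """Label and mark each whisker end on an existing boxplot Axes.

    Assumes the boxes sit at x = 0, 1, 2, ... in the same sorted group order.
    """
    for pos, low in enumerate(whiskers):
        ax.text(pos, low - offset, text,
                horizontalalignment="center", size="x-small", weight="semibold")
        ax.plot(pos, low, marker="*", markersize=20)


# Group 1 is the data from the question; group 2 is invented for this sketch.
df = pd.DataFrame({
    "theta": [1] * 10 + [2] * 10,
    "error": [10, 20, 21, 22, 23, 24, 25, 26, 27, 28,
              5, 30, 31, 32, 33, 34, 35, 36, 37, 38],
})

whiskers = lower_whiskers(df, "theta", "error")
print(whiskers)
```

After drawing the plot with sns.boxplot(x="theta", y="error", data=df, ax=ax1, showfliers=False), calling annotate_lower_whiskers(ax1, whiskers) should place one label per box. If you pass a non-default whis or order to sns.boxplot, pass the same whis here and reorder the whiskers to match.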
