Count and hue percentage for each group in a seaborn countplot

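Problem description

I created the countplot below using pseudo-data. I already have each bar's percentage of the whole sample. However, I also want to annotate each bar with its hue's percentage within its own answer group.

Pseudo-data: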

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import random

lo = 0
hi = 10
size = 256
random.seed(1)

answer = [random.randint(lo, hi) for _ in range(size)]
sex = [random.randint(0, 1) for _ in range(size)]

data = {'sex': sex, 'answer': answer} 
df = pd.DataFrame(data)

di = {0:'A',1:'B',2:'C',3:'D',4:'E',5:'F',
      6:'G',7:'H',8:'I',9:'J',10:'K'}
df = df.replace({'answer': di})

di = {0:'Male',1:'Female'}
df = df.replace({'sex': di})
df = df.sort_values(by=['answer','sex'])

#See count of groups:
pd.pivot_table(df,
              index='answer',
              columns='sex',
              aggfunc='size')

Here is the plot I have tried so far:

#fig, ax = plt.subplots()

total = float(df.shape[0])

sns.set(rc={'figure.figsize':(22,10)})

ax = sns.countplot(y="answer", hue="sex", data=df)

# percentage of bars
for i in ax.patches:
    # get_width pulls left or right; get_y pushes up or down
    ax.text(i.get_width()+.12, i.get_y()+.3, \
            '%' + str(round((i.get_width()/total)*100, 1)), fontsize=15,
            color='dimgrey')
    
ax.set_ylabel('Answers',fontsize=20)
ax.set_xlabel('Count',fontsize=20)
ax.tick_params(axis='x', which='major', labelsize=20)
ax.tick_params(axis='y', which='major', labelsize=20)

plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.,
          prop={'size': 14})

ax.set_title("""
XXXX
""", fontsize=20,loc='left')

#plt.savefig('test.png', dpi=400,bbox_inches='tight')

This is the resulting figure. Note: I currently get the values circled in black, but I want the ones circled in red.

Tags: python, pandas, seaborn

Solution


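Add a ratio column to the pivot table, then write that column's values as text inside the loop over the bars. This takes a little care: the loop over the patches runs 22 times, while the ratio column has only 11 rows, so a conditional branch is needed to map each bar to the right row.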
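The mismatch between 22 patches and 11 ratio rows can be pictured as an index mapping. The sketch below assumes, as the loop in this answer does, that the bars arrive hue by hue (all Female bars first, then all Male bars). `patch_to_group` is a hypothetical helper written for illustration, not a seaborn function.

```python
answers = list('ABCDEFGHIJK')   # the 11 answer groups
hues = ['Female', 'Male']       # assumed hue order of the bars


def patch_to_group(n, answers=answers, hues=hues):
    """Map a patch position n (0..21) to its (answer, hue) pair.

    Hypothetical helper: assumes patches are ordered hue by hue.
    """
    hue_idx, answer_idx = divmod(n, len(answers))
    return answers[answer_idx], hues[hue_idx]


# Show where the first and last bar of each hue land.
for n in (0, 10, 11, 21):
    print(n, patch_to_group(n))
```

Positions 0 to 10 map to the Female bars and 11 to 21 to the Male bars, which is why the ratio row index has to wrap around after 11 bars.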

#See count of groups:
df1 = pd.pivot_table(df,
              index='answer',
              columns='sex',
              aggfunc='size')
df1['ratio'] = df1['Female'] / (df1['Female'] + df1['Male'])
df1
sex     Female  Male     ratio
answer
A            9    20  0.310345
B            8    10  0.444444
C           10     9  0.526316
D           13    11  0.541667
E           11    12  0.478261
F            7    10  0.411765
G           16    10  0.615385
H           15     9  0.625000
I           18    14  0.562500
J            9    13  0.409091
K            8    14  0.363636
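As a possible alternative to the hand-built `ratio` column, the shares of both sexes within each answer group can be computed in one step with `pd.crosstab` and `normalize='index'`. This is a minimal sketch, not part of the original answer; it rebuilds the question's pseudo-data so that it runs on its own, and the name `share` is chosen here for illustration.

```python
import random

import pandas as pd

# Rebuild the question's pseudo-data (same seed and generation order).
random.seed(1)
size = 256
answer = [random.randint(0, 10) for _ in range(size)]
sex = [random.randint(0, 1) for _ in range(size)]
df = pd.DataFrame({'sex': sex, 'answer': answer})
df['answer'] = df['answer'].map(dict(enumerate('ABCDEFGHIJK')))
df['sex'] = df['sex'].map({0: 'Male', 1: 'Female'})

# normalize='index' divides every count by its row (answer group) total,
# so after scaling by 100 each row sums to 100.
share = pd.crosstab(df['answer'], df['sex'], normalize='index') * 100
print(share.round(1))
```

The `Female` column of `share` corresponds to `ratio * 100` above, and the `Male` column to `100 - ratio * 100`.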

import matplotlib.pyplot as plt
import seaborn as sns

total = float(df.shape[0])

sns.set(rc={'figure.figsize':(22,10)})

ax = sns.countplot(y="answer", hue="sex", data=df)

idx = df1.index.to_list()
n_groups = len(idx)  # 11 answer groups
# ax.patches lists all Female bars first, then all Male bars (22 in total)
for n, bar in enumerate(ax.patches):
    # grey: share of the whole sample
    ax.text(bar.get_width()+.12, bar.get_y()+.3, '%' + str(round((bar.get_width()/total)*100, 1)), fontsize=15, color='dimgrey')
    # red: share within the answer group; 'ratio' is the Female share,
    # so the Male bars (second half of the patches) use its complement
    female_pct = df1.loc[idx[n % n_groups], 'ratio'] * 100
    group_pct = female_pct if n < n_groups else 100 - female_pct
    ax.text(bar.get_width()+.82, bar.get_y()+.3, '%' + str(round(group_pct, 1)), fontsize=15, color='r')

ax.set_ylabel('Answers',fontsize=20)
ax.set_xlabel('Count',fontsize=20)
ax.tick_params(axis='x', which='major', labelsize=20)
ax.tick_params(axis='y', which='major', labelsize=20)

plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.,
          prop={'size': 14})
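The red labels could also be drawn without counting patches by hand. The sketch below is not the original answer's method: it assumes matplotlib 3.4 or later (for `Axes.bar_label`), that seaborn creates one bar container per hue level in `hue_order`, and that every answer/sex combination occurs at least once. The names `order`, `hue_order`, `counts`, and `share` are chosen here for illustration.

```python
import random

import matplotlib
matplotlib.use('Agg')  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Rebuild the question's pseudo-data (answers are generated before sexes).
random.seed(1)
size = 256
df = pd.DataFrame({
    'answer': [random.randint(0, 10) for _ in range(size)],
    'sex': [random.randint(0, 1) for _ in range(size)],
})
df['answer'] = df['answer'].map(dict(enumerate('ABCDEFGHIJK')))
df['sex'] = df['sex'].map({0: 'Male', 1: 'Female'})

# Fix both orders explicitly so labels and bars cannot drift apart.
order = sorted(df['answer'].unique())
hue_order = ['Female', 'Male']

# Percentage of each sex within each answer group.
counts = pd.crosstab(df['answer'], df['sex']).reindex(index=order, columns=hue_order)
share = counts.div(counts.sum(axis=1), axis=0) * 100

fig, ax = plt.subplots(figsize=(22, 10))
sns.countplot(y='answer', hue='sex', data=df,
              order=order, hue_order=hue_order, ax=ax)

# Assumption: one container per hue level, bars following `order`.
for container, level in zip(ax.containers, hue_order):
    labels = [f'{pct:.1f}%' for pct in share[level]]
    ax.bar_label(container, labels=labels, padding=3, fontsize=15, color='r')
```

Passing `order` and `hue_order` explicitly means the labels do not depend on how the DataFrame happens to be sorted.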
