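python - Bar chart from a pivot table, with totals and per-group aggregated percentages
Problem description
Here is the challenge: create a DataFrame from the shipwreck.csv file. From this DataFrame, build a pivot table that shows the average fare paid by males/females in each class and the number of males/females who survived in each class. The row index should be the class values. Use margins to include the averages for all males, all females, and all passengers in each class. Print the whole frame. Then create a bar chart showing the survival percentage of males, females, and all passengers in each class. Use the data from the pivot table in the previous question. The bars should have a width of 0.25.
My problem is that I have built the DataFrame using only the specified columns, but I don't understand how to turn the DataFrame into a pivot table and find the average fare for males/females so that I can set up the chart.
Here is my code so far: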
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
matplotlib.rcParams['figure.figsize'] = (10.0, 4.0)
df = pd.read_csv("shipwreck.csv",usecols=
['survived','sex','fare','class'])
df.set_index('survived')  # returns a new DataFrame; the result is not assigned, so df is unchanged
print(df)
# pivot table to get average fares for male/female, then plot it
# use a bar chart with a bar width of 0.25
This is what the DataFrame loaded from the .csv displays:
survived sex fare class
0 0 male 7.2500 Third
1 1 female 71.2833 First
2 1 female 7.9250 Third
3 1 female 53.1000 First
4 0 male 8.0500 Third
5 0 male 8.4583 Third
6 0 male 51.8625 First
7 0 male 21.0750 Third
8 1 female 11.1333 Third
9 1 female 30.0708 Second
10 1 female 16.7000 Third
11 1 female 26.5500 First
12 0 male 8.0500 Third
13 0 male 31.2750 Third
14 0 female 7.8542 Third
15 1 female 16.0000 Second
16 0 male 29.1250 Third
17 1 male 13.0000 Second
18 0 female 18.0000 Third
19 1 female 7.2250 Third
20 0 male 26.0000 Second
21 1 male 13.0000 Second
22 1 female 8.0292 Third
23 1 male 35.5000 First
24 0 female 21.0750 Third
25 1 female 31.3875 Third
26 0 male 7.2250 Third
27 0 male 263.0000 First
28 1 female 7.8792 Third
29 0 male 7.8958 Third
.. ... ... ... ...
861 0 male 11.5000 Second
862 1 female 25.9292 First
863 0 female 69.5500 Third
864 0 male 13.0000 Second
865 1 female 13.0000 Second
866 1 female 13.8583 Second
867 0 male 50.4958 First
868 0 male 9.5000 Third
869 1 male 11.1333 Third
870 0 male 7.8958 Third
871 1 female 52.5542 First
872 0 male 5.0000 First
873 0 male 9.0000 Third
874 1 female 24.0000 Second
875 1 female 7.2250 Third
876 0 male 9.8458 Third
877 0 male 7.8958 Third
878 0 male 7.8958 Third
879 1 female 83.1583 First
880 1 female 26.0000 Second
881 0 male 7.8958 Third
882 0 female 10.5167 Third
883 0 male 10.5000 Second
884 0 male 7.0500 Third
885 0 female 29.1250 Third
886 0 male 13.0000 Second
887 1 female 30.0000 First
888 0 female 23.4500 Third
889 1 male 30.0000 First
890 0 male 7.7500 Third
[891 rows x 4 columns]
This is what the bar chart should look like:
Solution
Here is what you can do:
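import pandas as pd

# Load only the columns needed for the survival percentages
df = pd.read_csv('shipwreck.csv', usecols=['survived', 'sex', 'class'])
# 'survived' is the only column left to aggregate once 'class' and 'sex' are used as keys
df_piv = pd.pivot_table(df,
                        index='class',
                        columns='sex',
                        aggfunc=lambda x: 100 * x.sum() / x.count(),  # % survived per group
                        margins=True,
                        margins_name='Combined')
# Drop the outer 'survived' column level, leaving only the sex labels and 'Combined'
df_piv.columns = df_piv.columns.droplevel()
df_piv.plot.bar(rot=0);  # rot=0 keeps the class labels horizontal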
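The code above covers the survival percentages. For the first part of the challenge (average fare and number of survivors per class and sex, with margins), a minimal sketch could look like the following. It assumes the column names shown in the question and uses a tiny made-up sample instead of shipwreck.csv so that it runs on its own; the names `sample` and `table` are illustrative only.

```python
import pandas as pd

# Made-up rows with the same columns as shipwreck.csv (illustrative data only).
sample = pd.DataFrame({
    "survived": [0, 1, 1, 0, 1, 0, 1, 1],
    "sex": ["male", "female", "female", "male",
            "male", "female", "female", "male"],
    "fare": [10.0, 70.0, 8.0, 50.0, 20.0, 12.0, 30.0, 40.0],
    "class": ["Third", "First", "Third", "First",
              "Second", "Third", "Second", "First"],
})

# Mean fare and count of survivors for each class/sex combination.
# Because 'survived' is 0/1, summing it counts the survivors.
# margins=True adds an 'All' column and an 'All' row, computed with the same functions.
table = pd.pivot_table(
    sample,
    index="class",
    columns="sex",
    values=["fare", "survived"],
    aggfunc={"fare": "mean", "survived": "sum"},
    margins=True,
    margins_name="All",
)
print(table)
```

With the real file, `sample` would be replaced by the DataFrame read from shipwreck.csv with the 'fare' column included in `usecols`.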