python - Jupyter Notebook - visualization problem, how can I fix it?
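Problem description
I am running into a data visualization problem in Jupyter Notebook/Lab, more precisely when using Seaborn (see the attached screenshot).
I have tried running my script in different IDEs (Pycharm and VCS) as well as in a web browser, and the result is the same.
Could you help me solve this problem?
Best,
Romeo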
# Question 1 : How many people are in titanic and how many survivors?
import pandas as pd
df = pd.read_csv('titanic.csv')
n_people = len(df)
print('Number of passengers :',n_people)
n_survived = len(df[df['Survived']==1])
print('Number of survivors :', n_survived)
# Question 2 : How many that survived were female and how many that died were female?
sur_f = df.loc[(df['Survived'] == 1) & (df['Sex']=='female')]
print('Survived and female :',len(sur_f))
died_f = df.loc[(df['Survived'] == 0) & (df['Sex']=='female')]
print('Died and female :',len(died_f))
# Question 3 : How many children were on the titanic?
children = df[df['Age']<18]
print('Number of children (under 18) :',len(children))
# Question 4 : How many children died that were on the ship?
died_c = children.loc[(children['Survived']==0)]
print('Number of children that died :',len(died_c))
# Question 5 : How many people had families with them?
# Use | (or) so that a passenger with either kind of relative aboard is counted
family = df.loc[(df['SibSp']!=0) | (df['Parch']!=0)]
print('Number of people who had family (Siblings/Spouses or Parents/children) aboard :',len(family))
# Question 6 : What is the ratio of female to male?
num_female = len(df[df['Sex']=='female'])
num_male = len(df[df['Sex']=='male'])
ratio_female_male = (num_female / num_male)
ratio_f_t = (num_female/len(df))
ratio_m_t = (num_male/len(df))
print('The ratio female to male is :',round(ratio_female_male,2))
print('The ratio female to total passenger is :',round(ratio_f_t,2))
print('The ratio male to total passenger is :',round(ratio_m_t,2))
# Question 7 : What contributed to the survival of those who survived?
#Convert the male / female
df['Sex'] = df.Sex.map(lambda x: 0 if x == 'male' else 1)
#or
#gen = {'male' : 0, 'female' : 1}
#df['Sex'] = df.Sex.map(gen)
import seaborn as sns
import matplotlib.pyplot as plt
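# Restrict to numeric columns: text columns cannot be correlated and make recent pandas versions raise an error
correlation = df.select_dtypes(include='number').corr(method='pearson')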
plt.figure(figsize=(7,4))
plt.title('Correlation between Features', y=1.05, size = 15)
sns.heatmap(correlation,
            cmap='RdBu_r',
            annot=True,
            linewidths=0.5)  # the documented seaborn parameter is `linewidths`
plt.show()
print('The most influential factor is sex, with a correlation coefficient regarding Survived of : 0.54')
Solution
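The screenshot from the question is not available here, so the exact symptom is unknown. As a minimal sketch, assuming the trouble comes from the correlation step, the code below computes the Pearson matrix over numeric columns only and passes it to `sns.heatmap` with the documented `linewidths` argument. The helper names `numeric_correlation` and `plot_correlation` and the sample rows are hypothetical stand-ins, not part of the original script or of `titanic.csv`.

```python
import pandas as pd


def numeric_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Return the Pearson correlation matrix of the numeric columns only."""
    numeric = df.select_dtypes(include="number")
    return numeric.corr(method="pearson")


def plot_correlation(correlation: pd.DataFrame) -> None:
    """Draw the correlation matrix as an annotated heatmap."""
    # Plotting libraries are imported here so the helper above
    # can be used even where they are not installed.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(7, 4))
    ax.set_title("Correlation between Features", y=1.05, size=15)
    sns.heatmap(correlation, cmap="RdBu_r", annot=True, fmt=".2f",
                linewidths=0.5, ax=ax)
    fig.tight_layout()
    plt.show()


# Hypothetical sample rows standing in for titanic.csv
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 27.0, 54.0],
    "Name": ["A", "B", "C", "D", "E", "F"],  # text column, excluded below
})

# Encode Sex numerically so it takes part in the correlation
sample["Sex"] = sample["Sex"].map({"male": 0, "female": 1})

correlation = numeric_correlation(sample)
print(correlation.round(2))
```

Calling `plot_correlation(correlation)` in a notebook cell should then draw the figure. If the plot still looks wrong, the versions reported by `sns.__version__` and `matplotlib.__version__` are worth checking, since rendering behaviour can differ between releases.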