How to merge multiple DataFrames and display them in a single boxplot in Python?

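Problem description

I am working with a binary classification dataset and want to plot the ages of all samples, of the samples with class == 1, and of the samples with class == 0. How can I merge firstDf, secondDf, and thirdDf and display them in a single boxplot in Python?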

age | class
------------
 1 |  1
 2 |  1
 3 |  0
 4 |  1
 5 |  0
 6 |  1
 7 |  1
 8 |  0
 9 |  0
10 |  1



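import pandas as pd
import matplotlib.pyplot as plt

# first row holds the column names, the remaining rows are the samples
data = [['age', 'class'],
        [1, 1],
        [2, 1],
        [3, 0],
        [4, 1],
        [5, 0],
        [6, 1],
        [7, 1],
        [8, 0],
        [9, 0],
        [10, 1]]

# build the DataFrame from the sample rows
df = pd.DataFrame(data[1:], columns=data[0])

firstDf = df['age']                      # ages of all samples
secondDf = df[df['class'] == 0]['age']   # ages of samples with class == 0
thirdDf = df[df['class'] == 1]['age']    # ages of samples with class == 1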

Expected plot

[Image: expected boxplot]

Tags: python, python-3.x, matplotlib, seaborn, data-science

Solution


# subset dataframes
firstDf = df
secondDf = df[df['class'] == 0]
thirdDf = df[df['class'] == 1]

# combine dataframes and reset index
combined_df = pd.concat([firstDf, secondDf, thirdDf], 
                        keys=['All', 'Class0', 'Class1']).reset_index(level=0)

# drop column 'class'
combined_df = combined_df.drop('class', axis=1)

# rename columns
combined_df.columns = ['category', 'age']

# fix datatype
combined_df['age'] = combined_df['age'].astype('int')

# import seaborn
import seaborn as sns

# plot boxplot
sns.boxplot(data=combined_df, x='category', y='age')

[Image: resulting boxplot]
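
If seaborn is not available, the same three boxes can probably be drawn with matplotlib alone, without concatenating anything. The following is only a minimal sketch under that assumption, not the answer's method: it rebuilds the sample data from the question, and the `groups` dictionary is a name introduced here purely for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    "age":   [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "class": [1, 1, 0, 1, 0, 1, 1, 0, 0, 1],
})

# One Series of ages per box, keyed by the label shown on the x-axis
# ("groups" is a hypothetical helper name, not part of the original answer)
groups = {
    "All": df["age"],
    "Class0": df.loc[df["class"] == 0, "age"],
    "Class1": df.loc[df["class"] == 1, "age"],
}

fig, ax = plt.subplots()

# Axes.boxplot accepts a sequence of arrays and draws one box per array,
# placed at x positions 1, 2, 3, ...
ax.boxplot([ages.to_numpy() for ages in groups.values()])

# Label the boxes in the same order as they were passed in
ax.set_xticks(list(range(1, len(groups) + 1)))
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("age")
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. Compared with the seaborn version, this keeps the three subsets separate instead of building a long-format frame, which may be simpler when only one plot is needed.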
