Adding an average column to a DataFrame with Python

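Question

I want to create a new DataFrame containing sex, number of children, insurance price, and whether the person smokes. Below is a sample of my DataFrame.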

Sex    Children Insurance Smoker
Male      3      392.48    Yes
Male      6      782.68    Yes
Male      6      438.21    No 
Female    1      125.98    Yes
Female    1      58.32     No
Female    4      585.12    Yes
Female    4      356.12    No

So far, this is what I have:

df = pd.DataFrame(insurance).groupby(["sex", "children", "smoker"]).size()

#which outputs
sex      children   smoker
female   1          yes      1
         1          no       1
         4          yes      1
         4          no       1
male     3          yes      2
         6          yes      1
         6          no       1

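How can I add a column with the average insurance for each sex, based on how many children they have and whether they smoke? I tried adding mean("insurance") but, of course, got an error. Thanks a lot for your help!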

Tags: python, datatable, grouping, average, data-cleaning

Answer


df.groupby(["Sex", "Children", "Smoker"], as_index=False)["Insurance"].mean()

#output

      Sex  Children Smoker  Insurance
0  Female         1     No      58.32
1  Female         1    Yes     125.98
2  Female         4     No     356.12
3  Female         4    Yes     585.12
4    Male         3    Yes     392.48
5    Male         6     No     438.21
6    Male         6    Yes     782.68
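
If the goal is instead to keep every original row and attach the group average as a new column, `transform` may be a better fit. A minimal sketch, assuming the sample rows from the question; the column name `MeanInsurance` is made up for illustration:

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Sex": ["Male", "Male", "Male", "Female", "Female", "Female", "Female"],
    "Children": [3, 6, 6, 1, 1, 4, 4],
    "Insurance": [392.48, 782.68, 438.21, 125.98, 58.32, 585.12, 356.12],
    "Smoker": ["Yes", "Yes", "No", "Yes", "No", "Yes", "No"],
})

# transform("mean") returns one value per original row, so the result
# lines up with df and can be assigned directly as a new column.
# "MeanInsurance" is a hypothetical column name.
df["MeanInsurance"] = (
    df.groupby(["Sex", "Children", "Smoker"])["Insurance"].transform("mean")
)
print(df)
```

Unlike the aggregated result above, this keeps all rows of the original frame; rows in the same group simply share the same average.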

Is this what you're after?

      Sex  Children Smoker  size    mean
0  Female         1     No     1   58.32
1  Female         1    Yes     1  125.98
2  Female         4     No     1  356.12
3  Female         4    Yes     1  585.12
4    Male         3    Yes     1  392.48
5    Male         6     No     1  438.21
6    Male         6    Yes     1  782.68
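
The post does not show the code behind the table above. One way to get both the group size and the mean in a single call is sketched here, assuming the sample rows from the question:

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Sex": ["Male", "Male", "Male", "Female", "Female", "Female", "Female"],
    "Children": [3, 6, 6, 1, 1, 4, 4],
    "Insurance": [392.48, 782.68, 438.21, 125.98, 58.32, 585.12, 356.12],
    "Smoker": ["Yes", "Yes", "No", "Yes", "No", "Yes", "No"],
})

# Aggregate the Insurance column twice per group: row count and average.
# reset_index() turns the group keys back into ordinary columns.
summary = (
    df.groupby(["Sex", "Children", "Smoker"])["Insurance"]
      .agg(["size", "mean"])
      .reset_index()
)
print(summary)
```

This also shows why calling `mean("insurance")` after `.size()` fails: `.size()` returns only the counts, so the insurance values are no longer available at that point.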
