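python - For loop over a DataFrame in Python

Problem description

I have a DataFrame called df_civic with the columns state, rank, make/model, model year, thefts. I want to compute the AVG and STD of thefts for each model year.

All the years in the DataFrame are collected with: years_civic = list(pd.unique(df_civic['Model Year']))

My loop looks like this: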

for civic_year in years_civic:
    f = df_civic['Model Year'] == civic_year
    civic_avg = df_civic[f]['Thefts'].mean()
    civic_std = df_civic[f]['Thefts'].std()
    civic_std= np.round(car_std,2)
    civic_avg= np.round(car_avg,2)
    print(civic_avg, civic_std, np.sum(f))

But the output is not what I need; the only correct value is the one from np.sum(f).

The output currently looks like this:

9.0 20.51 1
9.0 20.51 1
9.0 20.51 1
9.0 20.51 1
9.0 20.51 13
9.0 20.51 15
9.0 20.51 3
9.0 20.51 2

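Tags: python, pandas, dataframe, loops, for-loop

Solution

Pandas provides many powerful functions for aggregating data. It is usually best to consider these functions before reaching for a for loop.

For example, you can use: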

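import pandas as pd

# The column name follows the question's code ('Thefts');
# "std" is the sample standard deviation, the same as Series.std().
df_civic.groupby("Model Year").agg({"Thefts": ["mean", "std"]})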
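To see what a grouped aggregation like this returns, here is a minimal self-contained sketch. The sample rows are invented purely for illustration; only the column names 'Model Year' and 'Thefts' come from the question. Adding "count" gives the same number the loop prints with np.sum(f).

```python
import pandas as pd

# Hypothetical sample data; the real df_civic comes from the asker's dataset.
df_civic = pd.DataFrame({
    "Model Year": [1998, 1998, 2000, 2000, 2000],
    "Thefts": [10, 20, 5, 7, 9],
})

# One row per model year: mean, sample standard deviation (ddof=1), row count.
summary = (
    df_civic.groupby("Model Year")["Thefts"]
    .agg(["mean", "std", "count"])
    .round(2)
)
print(summary)
```

The result is a DataFrame indexed by model year, so a single value can be read with something like summary.loc[2000, "mean"].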

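As for your code, something is off: car_std and car_avg are never defined in the loop. np.round is applied to them instead of civic_std and civic_avg, so if they exist at all they are presumably left over from earlier code, which would explain why every line prints the same 9.0 20.51.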
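If you still want the loop, a corrected version might look like the sketch below, which rounds the values computed in the current iteration. The sample data is hypothetical, and the results list is only added here so the values can be inspected afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's df_civic.
df_civic = pd.DataFrame({
    "Model Year": [1998, 1998, 2000, 2000, 2000],
    "Thefts": [10, 20, 5, 7, 9],
})

years_civic = list(pd.unique(df_civic["Model Year"]))

results = []  # illustrative: collects (year, avg, std, row count)
for civic_year in years_civic:
    f = df_civic["Model Year"] == civic_year  # boolean mask for this year
    # Round the values from this iteration, not car_avg / car_std.
    civic_avg = np.round(df_civic.loc[f, "Thefts"].mean(), 2)
    civic_std = np.round(df_civic.loc[f, "Thefts"].std(), 2)
    results.append((civic_year, civic_avg, civic_std, int(np.sum(f))))
    print(civic_year, civic_avg, civic_std, np.sum(f))
```

Note that for a model year with only one row, .std() returns NaN, because pandas computes the sample standard deviation (ddof=1) by default.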
