pandas dataframe update values using groupby

Problem description

I have a pandas dataframe with id, date, and value columns:

import pandas as pd

data = [{'id': 'a', 'date': 1, 'value':3}, {'id':'b', 'date': 1, 'value': 30},
    {'id': 'a', 'date': 2, 'value':5}, {'id':'b', 'date': 2, 'value': 20}] 
test_df = pd.DataFrame(data)

I want to loop over each date and do some calculation on the value column to get an adj_value column:

for idx, daily_df in test_df.groupby('date'):
    # some_function is a placeholder for the actual calculation
    daily_df['adj_value'] = some_function(daily_df['value'])

I have two questions:

  1. I am getting a warning from this: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead"
  2. I want to add the adj_value column to the original test_df

Tags: python, pandas, dataframe, pandas-groupby

Solution


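There is no need to loop over the groupby() result. NamedAgg() computes the adjusted value for each date in a single call. For the purposes of this example I have made up a function; normally I would use a built-in function or a lambda. Note that the result has one row per date, so it still has to be joined back onto test_df to get a per-row column.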

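import pandas as pd

data = [{'id': 'a', 'date': 1, 'value':3}, {'id':'b', 'date': 1, 'value': 30},
    {'id': 'a', 'date': 2, 'value':5}, {'id':'b', 'date': 2, 'value': 20}] 
test_df = pd.DataFrame(data)

def myfunc(x):
    # Made-up aggregation: product of the first two values in the group,
    # or the single value if the group has only one row.
    x = list(x)
    if len(x) > 1:
        return x[0] * x[1]
    return x[0]

# One named output column, computed from "value" within each date group.
test_df.groupby("date").agg(adj_value=pd.NamedAgg(column="value", aggfunc=myfunc))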

Output

      adj_value
date           
1            90
2           100
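
The aggregated frame above is indexed by date, while the question asks for an adj_value column on the original test_df. A minimal sketch of two ways to get there, reusing the same made-up myfunc: transform() broadcasts each group's result back to that group's rows, and join() attaches the aggregated frame by matching the date column against its index. The name joined_df is introduced here only for illustration. Neither approach assigns into a per-group slice, so the "set on a copy" warning from the loop should not arise.

```python
import pandas as pd

data = [{'id': 'a', 'date': 1, 'value': 3}, {'id': 'b', 'date': 1, 'value': 30},
        {'id': 'a', 'date': 2, 'value': 5}, {'id': 'b', 'date': 2, 'value': 20}]
test_df = pd.DataFrame(data)

def myfunc(x):
    # Same made-up aggregation as above: product of the first two values,
    # or the single value if the group has only one row.
    x = list(x)
    if len(x) > 1:
        return x[0] * x[1]
    return x[0]

# Option 1: transform() returns a result aligned with the original rows,
# so the scalar computed for each date is repeated on every row of that date.
test_df['adj_value'] = test_df.groupby('date')['value'].transform(myfunc)

# Option 2: aggregate to one row per date, then join back on the date column.
agg_df = test_df.groupby('date').agg(
    adj_value=pd.NamedAgg(column='value', aggfunc=myfunc)
)
# joined_df is a hypothetical name; the original columns are selected
# explicitly so the adj_value added in option 1 does not collide.
joined_df = test_df[['id', 'date', 'value']].join(agg_df, on='date')

print(test_df)
print(joined_df)
```

If the per-date calculation needs to return a different value for each row (rather than one scalar per date), transform() still applies, as long as the function returns something the same length as the group.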
