After using groupby, how do I get all values from multiple rows into a list?

Problem description

I have a DataFrame with names of people, dates, start/end times, and durations. I want to group by name and date, sum the Duration, and also "sum" the Start and End values by throwing them into a list.

import pandas as pd

# Sample data: one row per time slot, Duration in minutes
df = pd.DataFrame([
    ['Bar', '2/18/2019', '7AM', '9AM', 120],
    ['Bar', '2/18/2019', '9AM', '11AM', 120],
    ['Foo', '2/18/2019', '10AM', '12PM', 120],
    ],
    columns=['Name', 'Date', 'Start', 'End', 'Duration'])

Looking to turn this...

Into this...

Where I am using groupby to get the sum of Duration for Name and Date...

df.groupby(['Name','Date'])['Duration'].sum().reset_index()

...but having a heck of a time trying to figure out how to throw all of those times into a list. I've tried .apply and building a dictionary where the key is Name+date and the value is the list, but to no avail.

Any hints or gentle nudges in the right direction?

Tags: python, pandas

Solution

Try this:

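# Combine each row's Start and End into a single "start-end" string
df['Time'] = df['Start'] + '-' + df['End']

# For each (Name, Date) group, sum Duration and collect the Time strings
df.groupby(['Name', 'Date']).apply(lambda x: pd.Series({
    'Duration': x['Duration'].sum(),
    'Times': x['Time'].values
}))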

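Times now contains NumPy ndarrays of strings, one array per (Name, Date) group. If you want plain Python lists instead, use x['Time'].tolist() in place of x['Time'].values.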
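
As an alternative, here is a minimal sketch that gets the same grouping with named aggregation instead of .apply. It assumes pandas 0.25 or later (where named aggregation is available) and rebuilds the sample DataFrame from the question so it runs on its own. The variable name result is only illustrative, and this is one possible approach, not the answer's original method.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    [
        ['Bar', '2/18/2019', '7AM', '9AM', 120],
        ['Bar', '2/18/2019', '9AM', '11AM', 120],
        ['Foo', '2/18/2019', '10AM', '12PM', 120],
    ],
    columns=['Name', 'Date', 'Start', 'End', 'Duration'],
)

# Combine Start and End into one "start-end" string per row
df['Time'] = df['Start'] + '-' + df['End']

# Named aggregation: sum Duration and gather Time values into a Python list
result = (
    df.groupby(['Name', 'Date'])
      .agg(Duration=('Duration', 'sum'), Times=('Time', list))
      .reset_index()
)

print(result)
```

Here Times holds ordinary Python lists rather than ndarrays, and reset_index() turns Name and Date back into regular columns, matching the shape of the groupby/sum attempt in the question.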

