Find value and expand it to the grouping in pandas

Question

I need to transform a toy pandas DataFrame like this (basically: group by entity, find the value of v where df['gr'] == 'x', and "expand" that value to the entire group):

    entity  gr  v
0   A       x   1
1   A       y   2
2   A       z   3
3   B       z   4
4   B       x   5
5   B       y   6

to this form:

    entity  gr  v   new
0   A       x   1   1
1   A       y   2   1
2   A       z   3   1
3   B       z   4   5
4   B       x   5   5
5   B       y   6   5

Here is my solution:

import pandas as pd

df = pd.DataFrame({'entity': ['A', 'A', 'A','B', 'B', 'B'], 'gr': ['x', 'y', 'z', 'z', 'x', 'y'], 'v': [1,2,3,4,5,6]})

df['new'] = df.loc[df['gr'] == 'x', 'v']
df['new'] = df.groupby('entity')['new'].ffill().bfill().astype(int)

but I wonder whether a better, more concise, or more idiomatic approach to this problem exists.

A slight variation on this problem: instead of df['gr'] == 'x', the mask is df['gr'] == df['different_column'].

Tags: python, pandas, pandas-groupby

Answer


If each entity always has at most one matching row (exactly one value, or none), filter first, then set entity as the index and use Series.map:

df['new'] = df['entity'].map(df[df['gr'] == 'x'].set_index('entity')['v'])

print (df)
  entity gr  v  new
0      A  x  1    1
1      A  y  2    1
2      A  z  3    1
3      B  z  4    5
4      B  x  5    5
5      B  y  6    5
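
One thing to keep in mind with the map approach: an entity that has no 'x' row gets NaN, which turns the new column into floats. Below is a minimal sketch of that case, assuming a hypothetical extra entity 'C' with no matching row (it is not part of the original data). The cast to the nullable Int64 dtype at the end is optional and only one possible way to keep integer values.

```python
import pandas as pd

# Toy data; entity 'C' is a hypothetical addition with no gr == 'x' row.
df = pd.DataFrame({
    'entity': ['A', 'A', 'B', 'B', 'C'],
    'gr': ['x', 'y', 'z', 'x', 'y'],
    'v': [1, 2, 4, 5, 7],
})

# Build an entity -> v lookup from the matching rows only.
lookup = df.loc[df['gr'] == 'x'].set_index('entity')['v']

# Entities missing from the lookup map to NaN, so the result is float64.
df['new'] = df['entity'].map(lookup)

# Optional: the nullable integer dtype keeps whole numbers alongside <NA>.
df['new'] = df['new'].astype('Int64')

print(df)
```

Rows for 'A' and 'B' receive their group's value, while the row for 'C' stays missing.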

Your solution can be changed to use GroupBy.first through GroupBy.transform, which returns the first non-missing value per group:

df['new'] = (df.assign(new = df['v'].where(df['gr'] == 'x'))
               .groupby('entity')['new'].transform('first'))

print (df)
  entity gr  v  new
0      A  x  1  1.0
1      A  y  2  1.0
2      A  z  3  1.0
3      B  z  4  5.0
4      B  x  5  5.0
5      B  y  6  5.0
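
The same transform('first') pattern should also cover the variation from the question, where the mask compares two columns row by row. The following is a sketch under assumptions: the values in different_column are made up for illustration, and the final astype(int) only works when every entity has at least one matching row (otherwise NaN remains and the cast raises).

```python
import pandas as pd

# Toy data; the contents of 'different_column' are hypothetical.
df = pd.DataFrame({
    'entity': ['A', 'A', 'A', 'B', 'B', 'B'],
    'gr': ['x', 'y', 'z', 'z', 'x', 'y'],
    'different_column': ['y', 'y', 'y', 'z', 'z', 'z'],
    'v': [1, 2, 3, 4, 5, 6],
})

# Row-wise mask instead of a comparison against a constant.
mask = df['gr'] == df['different_column']

# Keep v only where the mask holds, then broadcast the first
# non-missing value of each entity to all rows of that entity.
df['new'] = (df['v'].where(mask)
               .groupby(df['entity'])
               .transform('first')
               .astype(int))  # assumes every entity has a match

print(df)
```

Grouping the masked Series by df['entity'] directly avoids the temporary column created with assign; both spellings compute the same values.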
