首页 > 解决方案 > How to shift the pandas column value with group by and create new column using python?

问题描述

I have a data-frame:

    T   B   C   D   value
    0   USP JDF PG  0.956350953
    1   USP JDF PG  0.73231087
    2   USP JDF PG  0.560755661
    3   USP JDF PG  0.429389928
    4   USP JDF PG  0.32879866
    5   USP JDF PG  0.251772461

    0   PSP MRF PM  0.192790847
    1   PSP MRF PM  0.147626593
    2   PSP MRF PM  0.113042768
    3   PSP MRF PM  0.086560742
    4   PSP MRF PM  0.066282542
    5   PSP MRF PM  0.050754826

For group by B, C,D I want to create new shifted column for value and then calculate the difference.

Here is the output that I want.

    A   B   C   D   value        shift       diff
    0   USP JDF PG  0.956350953 0.73231087  0.224040083
    1   USP JDF PG  0.73231087  0.560755661 0.171555209
    2   USP JDF PG  0.560755661 0.429389928 0.131365733
    3   USP JDF PG  0.429389928 0.32879866  0.100591268
    4   USP JDF PG  0.32879866  0.251772461 0.077026199
    5   USP JDF PG  0.251772461 null    null

    0   PSP MRF PM  0.192790847 0.147626593 0.045164254
    1   PSP MRF PM  0.147626593 0.113042768 0.034583825
    2   PSP MRF PM  0.113042768 0.086560742 0.026482026
    3   PSP MRF PM  0.086560742 0.066282542 0.0202782
    4   PSP MRF PM  0.066282542 0.050754826 0.015527716
    5   PSP MRF PM  0.050754826 null    nul

diff=value-shift.

I tried using this code:

df['shift'] = df.groupby(['B', 'C','D'])['value'].shift(-1)

But it didn't work.

标签: python-3.xpandas

解决方案


Plain Vanilla

g = df.groupby(['B', 'C', 'D']).value
df.assign(shift=g.shift(-1), diff=g.diff(-1))

    A    B    C   D     value     shift      diff
0   0  USP  JDF  PG  0.956351  0.732311  0.224040
1   1  USP  JDF  PG  0.732311  0.560756  0.171555
2   2  USP  JDF  PG  0.560756  0.429390  0.131366
3   3  USP  JDF  PG  0.429390  0.328799  0.100591
4   4  USP  JDF  PG  0.328799  0.251772  0.077026
5   5  USP  JDF  PG  0.251772       NaN       NaN
6   0  PSP  MRF  PM  0.192791  0.147627  0.045164
7   1  PSP  MRF  PM  0.147627  0.113043  0.034584
8   2  PSP  MRF  PM  0.113043  0.086561  0.026482
9   3  PSP  MRF  PM  0.086561  0.066283  0.020278
10  4  PSP  MRF  PM  0.066283  0.050755  0.015528
11  5  PSP  MRF  PM  0.050755       NaN       NaN

pipe

df.groupby(['B', 'C', 'D']).value.pipe(
    lambda g: df.assign(shift=g.shift(-1), diff=g.diff(-1))
)

    A    B    C   D     value     shift      diff
0   0  USP  JDF  PG  0.956351  0.732311  0.224040
1   1  USP  JDF  PG  0.732311  0.560756  0.171555
2   2  USP  JDF  PG  0.560756  0.429390  0.131366
3   3  USP  JDF  PG  0.429390  0.328799  0.100591
4   4  USP  JDF  PG  0.328799  0.251772  0.077026
5   5  USP  JDF  PG  0.251772       NaN       NaN
6   0  PSP  MRF  PM  0.192791  0.147627  0.045164
7   1  PSP  MRF  PM  0.147627  0.113043  0.034584
8   2  PSP  MRF  PM  0.113043  0.086561  0.026482
9   3  PSP  MRF  PM  0.086561  0.066283  0.020278
10  4  PSP  MRF  PM  0.066283  0.050755  0.015528
11  5  PSP  MRF  PM  0.050755       NaN       NaN

推荐阅读