Creating time series df with category and date and percentage change

Problem description

I have a dataframe like this:

category:      number:         date:
   dog           100         2020-01-01
   cat           50          2020-01-01
   dog           150         2020-01-02
   mouse         200         2020-01-01
   mouse         150         2020-01-02
   cat           100         2020-01-02

I am trying to create a dataframe that gets the percentage change for each individual category across each date, similar to this:

category:    number:        date:         percentage_change:
  dog          100        2020-01-01              -
  dog          150        2020-01-02             50%
  cat           50        2020-01-01              - 
  cat          100        2020-01-02             100%
  mouse        200        2020-01-01              -
  mouse        150        2020-01-02            -25%

I have tried this:

df['number'].pct_change()

But this doesn't get pct_change for each category.

Any help greatly appreciated.

Tags: python, pandas, dataframe, time-series

Solution


Use DataFrame.sort_values together with GroupBy.pct_change:

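# Sort by category, then date, so each group's rows are in chronological order
df = df.sort_values(['category','date'])
# Change relative to the previous row within the same category (first row is NaN)
df['percentage_change'] = df.groupby('category')['number'].pct_change()
print(df)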
  category  number        date  percentage_change
1      cat      50  2020-01-01                NaN
5      cat     100  2020-01-02               1.00
0      dog     100  2020-01-01                NaN
2      dog     150  2020-01-02               0.50
3    mouse     200  2020-01-01                NaN
4    mouse     150  2020-01-02              -0.25
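
For reference, the step above can be reproduced end to end with the sample data from the question. This is only a sketch: the call to pd.to_datetime is an assumption added here (the original answer sorts the date strings as they are, which happens to work for ISO-formatted dates), and the column names are taken from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "category": ["dog", "cat", "dog", "mouse", "mouse", "cat"],
    "number": [100, 50, 150, 200, 150, 100],
    "date": ["2020-01-01", "2020-01-01", "2020-01-02",
             "2020-01-01", "2020-01-02", "2020-01-02"],
})

# Assumption: parse the strings so the sort is chronological, not lexical
df["date"] = pd.to_datetime(df["date"])

# Order rows by category and date, then compute the change per category
df = df.sort_values(["category", "date"])
df["percentage_change"] = df.groupby("category")["number"].pct_change()

print(df)
```

The values are fractions (0.50 means 50%), and the first row of each category is NaN because there is no earlier row to compare against.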

For percentages:

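import numpy as np
# fillna's downcast= argument is deprecated in recent pandas, so cast to int explicitly
s = df['percentage_change'].mul(100).round().fillna(0).astype(int).astype(str) + '%'
df['percentage_change'] = np.where(df['percentage_change'].isna(), '-', s)
print(df)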
  category  number        date percentage_change
1      cat      50  2020-01-01                 -
5      cat     100  2020-01-02              100%
0      dog     100  2020-01-01                 -
2      dog     150  2020-01-02               50%
3    mouse     200  2020-01-01                 -
4    mouse     150  2020-01-02              -25%
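
If the formatting is needed in more than one place, it could be wrapped in a small helper. The sketch below is an alternative to the np.where approach above, not part of the original answer: format_pct is a hypothetical name, and it uses Series.map with an f-string instead of the fillna/astype chain.

```python
import pandas as pd

def format_pct(changes: pd.Series, missing: str = "-") -> pd.Series:
    """Render fractional changes such as 0.5 as '50%'; NaN becomes `missing`.

    Hypothetical helper for illustration only.
    """
    return changes.map(lambda x: missing if pd.isna(x) else f"{x * 100:.0f}%")

# Same fractions as the percentage_change column shown earlier
changes = pd.Series([float("nan"), 1.0, float("nan"), 0.5, float("nan"), -0.25])
formatted = format_pct(changes)
print(formatted.tolist())
```

Either way the resulting column holds strings, so it is best treated as a display step: keep the numeric column around if further calculations are planned.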
