首页 > 解决方案 > 数据透视表中的计算问题

问题描述

我有下表

例子

我想要的地方

我的问题:

  1. 如何制作包含上述计算的表格?
  2. 如何进行从上一行汇总的计算?
  3. 如果我使用 shift(1) 编写一个函数,该函数从前一个单元格中获取值,我如何让它在 A 完成时停止计算?

标签: pythonaggregate

解决方案


处理您上面提供的数据框结构,这是一种不使用轮班的方法。

import pandas as pd

df = pd.read_excel('my_sample_data.xls', header=[0,1])
>>>df
              Ticker               Date Quantity       Price       Quantity_depo       Price_depo
  Unnamed: 0_level_1 Unnamed: 1_level_1       In   Out    In   Out       Initial Final    Initial Final
0                  A         2001-01-01        0  1000     0  9595           NaN   NaN        NaN   NaN
1                  A         2001-01-02       25   620    25   516           NaN   NaN        NaN   NaN
2                  A         2001-01-03      655  2660   655  2660           NaN   NaN        NaN   NaN
3                  A         2001-01-04        3   555   894   555           NaN   NaN        NaN   NaN
4                  B         2001-01-01        0  1000     0  9595           NaN   NaN        NaN   NaN
5                  B         2001-01-02       25   620    25   516           NaN   NaN        NaN   NaN
6                  B         2001-01-03      655  2660   655  2660           NaN   NaN        NaN   NaN
7                  B         2001-01-04        3   555   894   555           NaN   NaN        NaN   NaN

# Count of rows for each ticker
tkr = df.loc[:,'Ticker']
tkr.columns = ['Ticker']
gp = tkr.groupby('Ticker')['Ticker'].count()
gp_dict = {k:v for k,v in zip(gp.index, gp)}

idx = 0
for v in gp_dict.values():
    # Compute for the first row
    df.loc[idx, ('Quantity_depo', 'Initial')] = 10000 # Example
    df.loc[idx, ('Quantity_depo', 'Final')] = df.loc[idx, ('Quantity_depo', 'Initial')] + df.loc[idx, ('Quantity', 'In')] - df.loc[idx, ('Quantity', 'Out')]
    v += idx

    # Now fill up the rest of the rows
    for ix, value in df.loc[idx+1:v-1].iterrows():
        df.loc[ix, ('Quantity_depo', 'Initial')] = df.loc[ix-1, ('Quantity_depo', 'Final')]
        df.loc[ix, ('Quantity_depo', 'Final')] = df.loc[ix, ('Quantity_depo', 'Initial')] + df.loc[ix, ('Quantity', 'In')] - df.loc[ix, ('Quantity', 'Out')]

    # For the next loop
    idx += v

>>>df
             Ticker               Date Quantity       Price       Quantity_depo         Price_depo
  Unnamed: 0_level_1 Unnamed: 1_level_1       In   Out    In   Out       Initial   Final    Initial Final
0                  A         2001-01-01        0  1000     0  9595       10000.0  9000.0        NaN   NaN
1                  A         2001-01-02       25   620    25   516        9000.0  8405.0        NaN   NaN
2                  A         2001-01-03      655  2660   655  2660        8405.0  6400.0        NaN   NaN
3                  A         2001-01-04        3   555   894   555        6400.0  5848.0        NaN   NaN
4                  B         2001-01-01        0  1000     0  9595       10000.0  9000.0        NaN   NaN
5                  B         2001-01-02       25   620    25   516        9000.0  8405.0        NaN   NaN
6                  B         2001-01-03      655  2660   655  2660        8405.0  6400.0        NaN   NaN
7                  B         2001-01-04        3   555   894   555        6400.0  5848.0        NaN   NaN

推荐阅读