Python: help optimizing this function

Question

data = 
        Symbol   Value  Day
0         AACG  1.8708    1
1         AACG  1.8500    2
2         AACG  1.8869    3
3         AACG  1.8200    4
4         AACG  1.8578    5
...        ...     ...  ...
3407024   ZYXI   5.25    1
3407025   ZYXI   4.96    2
3407026   ZYXI   4.99    3
3407027   ZYXI   4.99    4
3407028   ZYXI   4.95    5
...        ...    ...  ...
3407250   ZYXI  8.1500  227
3407251   ZYXI  8.2600  228
3407252   ZYXI  8.3900  229
3407253   ZYXI  8.1200  230
3407254   ZYXI  8.0700  231
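
import pandas as pd
import numpy as np

# For every row, look up the Value of the same Symbol on each of the 90 previous days
for index, row in data.iterrows():
    for i in range(1, 91):
        cstr = 'day-' + str(i)
        # Rows with the same Symbol whose Day is i days before this row's Day
        match = data.loc[(data['Symbol'] == row['Symbol']) &
                         (data['Day'] == row['Day'] - i), 'Value']
        # Use a real NaN (not the string 'NaN') when there is no single matching row
        val = float(match.iloc[0]) if len(match) == 1 else np.nan
        data.loc[index, cstr] = val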

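This code loops over every row in the dataframe.

For each row, it loops 90 times (i).

On each iteration, it adds a column (day-i) and fills in a value for that row.

The value is the Value of the row that has the same Symbol as the current row, but whose Day is the current row's Day minus i. If no such row exists, the value is NaN.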

output =
  Symbol   Value  Day   day-1   day-2   day-3   day-4... day-89 day-90
0   AACG  1.8708    1     NaN     NaN     NaN     NaN
1   AACG  1.8500    2  1.8708     NaN     NaN     NaN
2   AACG  1.8869    3  1.8500  1.8708     NaN     NaN
3   AACG  1.8200    4  1.8869  1.8500  1.8708     NaN
4   AACG  1.8578    5  1.8200  1.8869  1.8500  1.8708
5   AACG  1.8709    6  1.8578  1.8200  1.8869  1.8500
6   AACG  1.8700    7  1.8709  1.8578  1.8200  1.8869
7   AACG  1.8800    8  1.8700  1.8709  1.8578  1.8200
8   AACG  1.8000    9  1.8800  1.8700  1.8709  1.8578
9   AACG  1.7900   10  1.8000  1.8800  1.8700  1.8709

Tags: python, pandas, numpy, optimization

Answer


Try using shift and pd.concat:

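N = 5  # number of lag columns (the question needs 90)
# df is the question's dataframe (called data there)
parts = []
for symbol, grp in df.groupby('Symbol'):
    # One shifted copy of Value per lag 1..N, within this symbol only
    l = pd.concat([grp['Value'].shift(i).rename(f'Day_{i}') for i in range(1, N + 1)], axis=1)
    parts.append(pd.concat([grp, l], axis=1))
# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
df_new = pd.concat(parts)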
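
The per-group loop above can probably be avoided by calling shift on the grouped column directly. The following is a minimal sketch, not the answerer's code: the sample rows are copied from the question, N is reduced to 4 for readability, and it assumes each Symbol has one row per Day with no missing days (shift moves by row position, not by Day value).

```python
import pandas as pd

# Small sample built from values shown in the question
df = pd.DataFrame({
    'Symbol': ['AACG'] * 5 + ['ZYXI'] * 5,
    'Value': [1.8708, 1.8500, 1.8869, 1.8200, 1.8578,
              5.25, 4.96, 4.99, 4.99, 4.95],
    'Day': [1, 2, 3, 4, 5] * 2,
})

N = 4  # number of lag columns; the question uses 90

# Assumption: after sorting, consecutive rows of a Symbol are consecutive days
df = df.sort_values(['Symbol', 'Day'])
grouped = df.groupby('Symbol')['Value']

# Build all lag columns first, then attach them with a single concat.
# The dict keys become the column names.
lags = pd.concat(
    {f'day-{i}': grouped.shift(i) for i in range(1, N + 1)},
    axis=1,
)
result = pd.concat([df, lags], axis=1)
print(result)
```

Building the lag columns in one concat, rather than assigning 90 columns one at a time, should also avoid the fragmentation warning pandas can emit for repeated column inserts.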

Or:

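def f(x):
    # Work on a copy so the group passed in by groupby is not modified in place
    x = x.copy()
    x['Day-0'] = x['Value']
    for i in range(1, N + 1):
        # Each column is the previous lag shifted down by one more row
        x[f'Day-{i}'] = x[f'Day-{i-1}'].shift()
    return x.drop(columns='Day-0')

# group_keys=False keeps the original row index instead of adding Symbol as an index level
final_df = df.groupby('Symbol', group_keys=False).apply(f)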

final_df: [image of the resulting dataframe]
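
One caveat with the shift-based versions: they move values by row position, while the question's loop matches on the actual Day value (Day minus i). If a Symbol can have missing days, a self-merge on Symbol and Day may be closer to the original behavior. This is a rough sketch under that reading. The helper name add_day_lags and the sample with a gap are made up for illustration, and it assumes each (Symbol, Day) pair appears at most once.

```python
import pandas as pd

# Hypothetical sample with a gap: AACG has no row for Day 3
df = pd.DataFrame({
    'Symbol': ['AACG', 'AACG', 'AACG', 'ZYXI', 'ZYXI'],
    'Value': [1.8708, 1.8500, 1.8200, 5.25, 4.96],
    'Day': [1, 2, 4, 1, 2],
})

N = 3  # number of lag columns; the question uses 90


def add_day_lags(frame, n_lags):
    """Add day-1..day-n_lags columns matched on the Day value, not on row position."""
    out = frame.copy()
    base = frame[['Symbol', 'Day', 'Value']]
    for i in range(1, n_lags + 1):
        # Relabel each row with the Day for which it is the value "i days ago"
        lagged = base.assign(Day=base['Day'] + i).rename(columns={'Value': f'day-{i}'})
        # Left join keeps every original row; days with no match become NaN
        out = out.merge(lagged, on=['Symbol', 'Day'], how='left')
    return out


result = add_day_lags(df, N)
print(result)
```

For the AACG row with Day 4, day-1 is NaN here because Day 3 does not exist, whereas a plain shift would have filled in the Day 2 value.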

