What is the best way to backtest a trading bot in pandas on a large DataFrame without testing the strategy row by row?

Question

What is the fastest way to turn this:

df[["Strat1", "Close"]]

    Strat1  Close
0   Sell    14185.250000
1   Sell    14185.150391
2   Sell    14157.320312
3   Sell    14184.709961
4   Sell    14185.139648
5   Buy     14171.000000
6   Buy     14166.919922
7   Buy     14150.009766
8   Buy     14136.209961
9   Sell    14131.889648
10  Sell    14129.969727
11  Buy     14135.500000
12  Buy     14135.500000
13  Buy     14135.500000
14  Sell    14135.500000
15  Buy     14135.500000

into this?

df[["Strat1", "Close"]]

    Strat1  Close
0   Sell    14185.250000
1           14185.150391
2           14157.320312
3           14184.709961
4           14185.139648
5   Buy     14171.000000
6           14166.919922
7           14150.009766
8           14136.209961
9   Sell    14131.889648
10          14129.969727
11  Buy     14135.500000
12          14135.500000
13          14135.500000
14  Sell    14135.500000
15  Buy     14135.500000

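Do you have to iterate over the DataFrame row by row? My DataFrame has 1.4 million rows, so row-by-row iteration will inevitably take a while, and I have many more columns ("Strat2", "Strat3", ...) to test, which further increases the time needed.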

Tags: python, pandas, dataframe, loops

Answer


Try:

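# Label each run of consecutive identical signals: the comparison with the
# shifted column is True where the signal changes, and cumsum() turns those
# change points into run ids.
run_id = (df["Strat1"] != df["Strat1"].shift()).cumsum()

# Within each run, keep the first signal and blank out the rest.
df["Strat1"] = (
    df["Strat1"]
    .groupby(run_id)
    .transform(lambda x: [x.iat[0], *[""] * (len(x) - 1)])
)
print(df)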

Prints:

   Strat1         Close
0    Sell  14185.250000
1          14185.150391
2          14157.320312
3          14184.709961
4          14185.139648
5     Buy  14171.000000
6          14166.919922
7          14150.009766
8          14136.209961
9    Sell  14131.889648
10         14129.969727
11    Buy  14135.500000
12         14135.500000
13         14135.500000
14   Sell  14135.500000
15    Buy  14135.500000
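
Since the question mentions 1.4 million rows and several strategy columns, it may be worth avoiding the per-group Python lambda altogether. The following is only a sketch of one possible alternative, not the answer's method: it compares each signal with the previous row and blanks out the repeats with `DataFrame.where`, handling all strategy columns in one pass. The sample data, the "Strat2" values, and the `strategy_cols` name are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the "Strat2" values are invented for illustration.
df = pd.DataFrame(
    {
        "Strat1": ["Sell", "Sell", "Buy", "Buy", "Sell", "Buy"],
        "Strat2": ["Buy", "Buy", "Buy", "Sell", "Sell", "Buy"],
        "Close": [14185.25, 14185.15, 14171.0, 14166.92, 14131.89, 14135.5],
    }
)

strategy_cols = ["Strat1", "Strat2"]  # assumed names of the signal columns
signals = df[strategy_cols]

# True where a signal differs from the previous row. Row 0 is always True,
# because the shifted value there is missing.
changed = signals.ne(signals.shift())

# Keep the signal where it changed; replace it with "" everywhere else.
df[strategy_cols] = signals.where(changed, "")
print(df)
```

This should give the same result as the groupby version for each column, since the first row of every run is exactly the row where the signal differs from its predecessor. Whether it is actually faster on the real data is best confirmed by timing both versions.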
