python - Vectorising a loop based on the order of values in a series
Problem description
This question is based on a previous question I answered.
The input looks like:
Index  Results  Price
0      Buy      10
1      Sell     11
2      Buy      12
3      Neutral  13
4      Buy      14
5      Sell     15
I need to find every Buy-Sell sequence (ignoring extra Buy / Sell values out of sequence) and calculate the difference in Price.
The desired output:
Index  Results  Price  Difference
0      Buy      10
1      Sell     11     1
2      Buy      12
3      Neutral  13
4      Buy      14
5      Sell     15     3
My solution is verbose but seems to work:
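import numpy as np
from numba import njit

@njit
def get_diffs(results, prices):
    # results: 0 = Buy, 1 = Sell, NaN = anything else
    res = np.full(prices.shape, np.nan)
    # prev_one starts True so that the first Buy opens a pair
    prev_one, prev_zero = True, False
    for i in range(len(results)):
        if prev_one and (results[i] == 0):
            # Buy with no pair open: remember the opening price
            price_start = prices[i]
            prev_zero, prev_one = True, False
        elif prev_zero and (results[i] == 1):
            # Sell closing the open Buy: record the price difference
            res[i] = prices[i] - price_start
            prev_zero, prev_one = False, True
    return res

# Encode Buy/Sell as 0/1; other values such as Neutral become NaN and are skipped
results = df['Results'].map({'Buy': 0, 'Sell': 1})
df['Difference'] = get_diffs(results.values, df['Price'].values)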
Is there a vectorised method? I'm concerned about code maintainability and performance over a large number of rows.
Edit: Benchmarking code:
import pandas as pd

df = pd.DataFrame.from_dict({'Index': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5},
                             'Results': {0: 'Buy', 1: 'Sell', 2: 'Buy', 3: 'Neutral', 4: 'Buy', 5: 'Sell'},
                             'Price': {0: 10, 1: 11, 2: 12, 3: 13, 4: 14, 5: 15}})
# Repeat the six sample rows 10**4 times (60,000 rows in total)
df = pd.concat([df]*10**4, ignore_index=True)

def jpp(df):
    results = df['Results'].map({'Buy': 0, 'Sell': 1})
    return get_diffs(results.values, df['Price'].values)

%timeit jpp(df)  # 7.99 ms ± 142 µs per loop
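%timeit is an IPython magic, so the line above only runs in IPython or Jupyter. For a plain script, a rough equivalent can be sketched with the standard-library timeit module. The helper best_ms and the stand-in workload encode are hypothetical names introduced here for illustration; swap in jpp (or any other candidate) to time the real thing.

```python
import timeit

import pandas as pd

# Same shape as the benchmark frame: the six sample rows repeated 10**4 times.
base = pd.DataFrame({'Results': ['Buy', 'Sell', 'Buy', 'Neutral', 'Buy', 'Sell'],
                     'Price': [10, 11, 12, 13, 14, 15]})
big = pd.concat([base] * 10**4, ignore_index=True)


def best_ms(func, frame, repeat=5, number=10):
    """Hypothetical helper: best average time per call, in milliseconds."""
    runs = timeit.repeat(lambda: func(frame), repeat=repeat, number=number)
    return min(runs) / number * 1000


def encode(frame):
    # Stand-in workload: only the mapping step from jpp().
    return frame['Results'].map({'Buy': 0, 'Sell': 1})


print(f"{best_ms(encode, big):.2f} ms per call")
```

For a numba function, it is worth calling it once before timing so that compilation time is not counted in the measurement.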
Solution
Use cumcount to number the occurrences of each label, then take the price difference within each occurrence number to find the pairs:
s=df.groupby('Results').cumcount()
df['Diff']=df.Price.groupby(s).diff().loc[df.Results.isin(['Buy','Sell'])]
df
Out[596]:
   Index  Results  Price  Diff
0      0      Buy     10   NaN
1      1     Sell     11   1.0
2      2      Buy     12   NaN
3      3  Neutral     13   NaN
4      4      Buy     14   NaN
5      5     Sell     15   3.0
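As a sanity check, the sketch below compares the cumcount approach with a plain-Python rewrite of the question's loop (no numba needed). The names loop_diffs, cumcount_diffs, sample and edge are made up here and do not come from the original posts. The edge frame is a hypothetical Buy, Buy, Sell, Sell input, chosen because cumcount pairs the nth Buy with the nth Sell, whereas the loop ignores out-of-sequence rows.

```python
import numpy as np
import pandas as pd


def loop_diffs(results, prices):
    """Plain-Python version of the question's state machine."""
    out = np.full(len(prices), np.nan)
    waiting_for_sell = False
    start = np.nan
    for i, (r, p) in enumerate(zip(results, prices)):
        if not waiting_for_sell and r == 'Buy':
            # Buy with no pair open: remember the opening price
            start, waiting_for_sell = p, True
        elif waiting_for_sell and r == 'Sell':
            # Sell closing the open Buy: record the difference
            out[i] = p - start
            waiting_for_sell = False
    return out


def cumcount_diffs(df):
    """The cumcount approach, reindexed so every row gets a value or NaN."""
    s = df.groupby('Results').cumcount()
    diff = df['Price'].groupby(s).diff()
    return diff.loc[df['Results'].isin(['Buy', 'Sell'])].reindex(df.index)


sample = pd.DataFrame({'Results': ['Buy', 'Sell', 'Buy', 'Neutral', 'Buy', 'Sell'],
                       'Price': [10, 11, 12, 13, 14, 15]})
edge = pd.DataFrame({'Results': ['Buy', 'Buy', 'Sell', 'Sell'],
                     'Price': [10, 11, 12, 13]})

for name, frame in [('sample', sample), ('edge', edge)]:
    a = loop_diffs(frame['Results'], frame['Price'])
    b = cumcount_diffs(frame).to_numpy()
    print(name, 'same result:', np.allclose(a, b, equal_nan=True))
```

On the sample data both give 1 and 3 on the two Sell rows. On the edge frame they differ: the loop leaves the last Sell as NaN, while cumcount pairs it with the second Buy and reports 13 - 11 = 2. If out-of-sequence rows can occur in the real data, it seems safer to verify the vectorised result against the loop before relying on it.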