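python - Find the difference in values in a pandas Series based on another time series
Problem description
I have a financial time series in pandas and a "position" time series that takes the value 1 when the trend is positive and -1 otherwise. The position series keeps alternating between 1 and -1. Is there a function or a clever way to find the difference between the start and the end of a positive period? More specifically, I want to sum all of these increments, but to do that I am struggling to find a way to identify the start and end points of each trend. Thanks
Solution
Suppose you have a DataFrame like this: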
         date  value
0  2020-11-12     10
1  2020-11-13     12
2  2020-11-14     15
3  2020-11-15     17
4  2020-11-16     17
5  2020-11-17     11
6  2020-11-18     12
7  2020-11-19      9
8  2020-11-20      7
Say you want to calculate the difference between the first and last value of each ascending period, which gives the following result:
          start_date  first_value  last_value  difference
trend_no
1         2020-11-12           10          17           7
2         2020-11-17           11          12           1
This result is produced by the following code:
import numpy as np
import pandas as pd

# sample data, taken from the first table
df = pd.DataFrame({
    'date': pd.date_range('2020-11-12', periods=9, freq='D'),
    'value': [10, 12, 15, 17, 17, 11, 12, 9, 7],
})

# work out the trend of the series: +1 rising, 0 flat, -1 falling
# (the first row is compared against 0, so it counts as rising)
df['difference'] = df['value'] - df['value'].shift(1).fillna(0)
df['trend'] = np.sign(df['difference'])
# work out the start and end of an ascending series
df['start_ascend'] = df['trend'].shift(-1) > df['trend']
df.loc[0, 'start_ascend'] = True
df['end_ascend'] = df['trend'].shift(-1) < df['trend']
# assign a number to the ascending trends
# note that the trends are not yet delimited correctly
df['trend_no'] = df['start_ascend'].cumsum()
# now work out the borders of the ascending trends
# all records that belong to an ascending trend
# will have df['keep_mask'] == True
df['keep_mask'] = np.nan
# the row right after the end of an ascent switches the mask off
indexer = df['end_ascend'].shift(1, fill_value=False)
df.loc[indexer, 'keep_mask'] = 0.0
# the row that starts an ascent switches it on; ffill carries the state forward
df.loc[df['start_ascend'], 'keep_mask'] = 1.0
df['keep_mask'] = df['keep_mask'].ffill().astype('bool')
# now do the final aggregation
df_res = df[df['keep_mask']].groupby('trend_no').agg(
    start_date=('date', 'first'),
    first_value=('value', 'first'),
    last_value=('value', 'last'),
)
df_res['difference'] = df_res['last_value'] - df_res['first_value']
df_res
To see what the steps above actually do, you can look at the intermediate DataFrame (the difference column is omitted here):
         date  value  trend  start_ascend  end_ascend  trend_no  keep_mask
0  2020-11-12     10    1.0          True       False         1       True
1  2020-11-13     12    1.0         False       False         1       True
2  2020-11-14     15    1.0         False       False         1       True
3  2020-11-15     17    1.0         False        True         1       True
4  2020-11-16     17    0.0         False        True         1      False
5  2020-11-17     11   -1.0          True       False         2       True
6  2020-11-18     12    1.0         False        True         2       True
7  2020-11-19      9   -1.0         False       False         2      False
8  2020-11-20      7   -1.0         False       False         2      False
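The question starts from an existing position series of 1 and -1 values rather than deriving the trend from the prices. A rough sketch for that case, not part of the original answer: label each uninterrupted run of equal positions with a run id, keep only the runs where the position is 1, and take the first and last value per run. The helper name positive_period_gains, the column names date, value and position, and the sample position values are all assumptions made for illustration; the position values were chosen to match the keep_mask column above.

```python
import pandas as pd


def positive_period_gains(df, value_col="value", position_col="position", date_col="date"):
    """Return one row per uninterrupted run of position == 1 (hypothetical helper)."""
    position = df[position_col]
    # A new run starts on every row whose position differs from the previous row.
    run_id = (position != position.shift()).cumsum().rename("run")
    positive = position == 1
    res = (
        df[positive]
        .groupby(run_id[positive])
        .agg(
            start_date=(date_col, "first"),
            end_date=(date_col, "last"),
            first_value=(value_col, "first"),
            last_value=(value_col, "last"),
        )
    )
    res["difference"] = res["last_value"] - res["first_value"]
    return res.reset_index(drop=True)


# Values from the sample data; the position column is an assumed input.
df = pd.DataFrame({
    "date": pd.date_range("2020-11-12", periods=9, freq="D"),
    "value": [10, 12, 15, 17, 17, 11, 12, 9, 7],
    "position": [1, 1, 1, 1, -1, 1, 1, -1, -1],
})

gains = positive_period_gains(df)
total = gains["difference"].sum()  # sum of the increments over all positive periods
print(gains)
print("sum of increments:", total)
```

With these inputs the two positive periods give differences of 7 and 1, the same as in the result table, and their sum is 8. Whether a position of 1 on a given row describes the move into that row or out of it depends on how the position series was built, so the run boundaries may need to be shifted by one row.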