首页 > 解决方案 > 如何围绕每个符号包含多个事件的变量生成一个窗口?

问题描述

我有一个问题..我的面板数据集看起来像没有变量“Window”的数据。现在我正在尝试创建变量“Window”,如下所示:

Symbol  Date        Close       Time      Event  Window
AAPL    09/03/2020  66,542503   16:25:00    NA    NA
AAPL    09/03/2020  71,334999   16:26:00    NA    -4
AAPL    09/03/2020  68,857498   16:27:00    NA    -3
AAPL    09/03/2020  62,057499   16:28:00    NA    -2
AAPL    09/03/2020  69,4925     16:29:00    NA    -1
AAPL    09/03/2020  60,552502   16:30:00    1      0
AAPL    09/03/2020  63,215      16:31:00    NA     1 
AAPL    10/03/2020  61,6675     09:30:00    NA     2 
AAPL    10/03/2020  61,195      09:31:00    NA     3 
AAPL    10/03/2020  57,310001   09:32:00    NA     4  
AAPL    10/03/2020  56,092499   09:33:00    NA    NA 
AAPL    15/03/2020  65,535603   15:45:00    NA    NA
AAPL    15/03/2020  66,357545   15:46:00    NA    NA
AAPL    15/03/2020  62,852345   15:47:00    NA    -4
AAPL    15/03/2020  64,057325   15:48:00    NA    -3
AAPL    16/03/2020  66,494545   09:30:00    NA    -2
AAPL    16/03/2020  63,557967   09:31:00    1     -1
AAPL    16/03/2020  64,415454   09:32:00    NA     0 
AAPL    16/03/2020  62,2357     09:33:00    NA     1 
AAPL    16/03/2020  64,4576     09:34:00    NA     2 
AAPL    16/03/2020  59,457579   09:35:00    NA     3  
AAPL    16/03/2020  58,092470   09:36:00    NA     4 
VISA    05/03/2020  186,960007  16:26:00    NA    NA 
VISA    05/03/2020  184,360001  16:27:00    NA    -4 
VISA    05/03/2020  171,130005  16:28:00    NA    -3 
VISA    05/03/2020  182,600006  16:29:00    NA    -2 
VISA    05/03/2020  172,949997  16:30:00    NA    -1 
VISA    06/03/2020  160,080002  09:32:00    1      0
VISA    06/03/2020  175,830002  09:33:00    NA     1 
VISA    06/03/2020  152,009995  09:34:00    NA     2 
VISA    06/03/2020  157,889999  09:35:00    NA     3 
VISA    06/03/2020  148,479996  09:36:00    NA     4 
VISA    06/03/2020  152,25      09:37:00    NA    NA 
VISA    06/03/2020  146,830002  09:38:00    NA    NA 
VISA    20/03/2020  192,203826  16:12:00    NA    NA 
VISA    20/03/2020  193,293752  16:13:00    NA    -4 
VISA    20/03/2020  192,204726  16:14:00    NA    -3 
VISA    20/03/2020  192,2396    16:15:00    NA    -2 
VISA    20/03/2020  194,185620  16:16:00    NA    -1 
VISA    20/03/2020  196,614289  16:17:00    1      0
VISA    20/03/2020  197,826200  16:18:00    NA     1 
VISA    21/03/2020  197,49176   09:29:00    NA     2 
VISA    21/03/2020  197,239230  09:30:00    NA     3 
VISA    21/03/2020  198,2300    09:31:00    NA     4 
VISA    21/03/2020  198,230028  09:32:00    NA    NA 
VISA    21/03/2020  197,247020  09:33:00    NA    NA 

我已经尝试过我发现的以下代码:

EventStudy <- EventStudy %>% group_by(Symbol) %>% mutate(Window = row_number() - match(1, Event), Window = ifelse(abs(Window) > 4, NA, Window)) %>% ungroup

不幸的是,它每个符号只给我一个窗口,但我的代码中每个符号有多个事件。例如,对于符号“AAPL”,我有两个事件。

我也尝试使用没有 group_by 的代码,但它也不能按预期工作。我也没有适合数据集的分组,之后我每组只有一个事件。

有没有办法为一个符号的多个事件修改代码?你能帮我创建变量“Window”吗?

提前谢谢了!

标签: reventswindow

解决方案


您可以借助辅助函数来实现此目的,以获得最接近的Event = 1值索引。

library(dplyr)

closest_index <- function(x, y) {
  y <- which(y == 1)
  y[sapply(x, function(i) which(abs(y - i) == min(abs(y - i)))[1])]
}

EventStudy %>%
  group_by(Symbol) %>%
  mutate(close_index = closest_index(row_number(), Event),  
         Window = row_number() - close_index, 
         Window = ifelse(abs(Window) > 4, NA, Window)) %>%
  ungroup %>%
  select(-close_index) 

这返回 -

#   Symbol       Date      Close     Time Event Window
#1    AAPL 09/03/2020  66,542503 16:25:00    NA     NA
#2    AAPL 09/03/2020  71,334999 16:26:00    NA     -4
#3    AAPL 09/03/2020  68,857498 16:27:00    NA     -3
#4    AAPL 09/03/2020  62,057499 16:28:00    NA     -2
#5    AAPL 09/03/2020    69,4925 16:29:00    NA     -1
#6    AAPL 09/03/2020  60,552502 16:30:00     1      0
#7    AAPL 09/03/2020     63,215 16:31:00    NA      1
#8    AAPL 10/03/2020    61,6675 09:30:00    NA      2
#9    AAPL 10/03/2020     61,195 09:31:00    NA      3
#10   AAPL 10/03/2020  57,310001 09:32:00    NA      4
#11   AAPL 10/03/2020  56,092499 09:33:00    NA     NA
#12   AAPL 15/03/2020  65,535603 15:45:00    NA     NA
#13   AAPL 15/03/2020  66,357545 15:46:00    NA     -4
#14   AAPL 15/03/2020  62,852345 15:47:00    NA     -3
#15   AAPL 15/03/2020  64,057325 15:48:00    NA     -2
#16   AAPL 16/03/2020  66,494545 09:30:00    NA     -1
#17   AAPL 16/03/2020  63,557967 09:31:00     1      0
#18   AAPL 16/03/2020  64,415454 09:32:00    NA      1
#19   AAPL 16/03/2020    62,2357 09:33:00    NA      2
#20   AAPL 16/03/2020    64,4576 09:34:00    NA      3
#21   AAPL 16/03/2020  59,457579 09:35:00    NA      4
#22   AAPL 16/03/2020  58,092470 09:36:00    NA     NA
#23   VISA 05/03/2020 186,960007 16:26:00    NA     NA
#24   VISA 05/03/2020 184,360001 16:27:00    NA     -4
#25   VISA 05/03/2020 171,130005 16:28:00    NA     -3
#26   VISA 05/03/2020 182,600006 16:29:00    NA     -2
#27   VISA 05/03/2020 172,949997 16:30:00    NA     -1
#28   VISA 06/03/2020 160,080002 09:32:00     1      0
#29   VISA 06/03/2020 175,830002 09:33:00    NA      1
#30   VISA 06/03/2020 152,009995 09:34:00    NA      2
#31   VISA 06/03/2020 157,889999 09:35:00    NA      3
#32   VISA 06/03/2020 148,479996 09:36:00    NA      4
#33   VISA 06/03/2020     152,25 09:37:00    NA     NA
#34   VISA 06/03/2020 146,830002 09:38:00    NA     NA
#35   VISA 20/03/2020 192,203826 16:12:00    NA     NA
#36   VISA 20/03/2020 193,293752 16:13:00    NA     -4
#37   VISA 20/03/2020 192,204726 16:14:00    NA     -3
#38   VISA 20/03/2020   192,2396 16:15:00    NA     -2
#39   VISA 20/03/2020 194,185620 16:16:00    NA     -1
#40   VISA 20/03/2020 196,614289 16:17:00     1      0
#41   VISA 20/03/2020 197,826200 16:18:00    NA      1
#42   VISA 21/03/2020  197,49176 09:29:00    NA      2
#43   VISA 21/03/2020 197,239230 09:30:00    NA      3
#44   VISA 21/03/2020   198,2300 09:31:00    NA      4
#45   VISA 21/03/2020 198,230028 09:32:00    NA     NA
#46   VISA 21/03/2020 197,247020 09:33:00    NA     NA

推荐阅读