Multiply values in a column over a date range in R

Problem description

I am new to the forum, and I hope this question is understandable.

I have a data frame (df) as follows:

id     date         announcement_date  ret
12055  2001-08-02   2001-08-03         1.0246
12055  2001-08-03   2001-08-03         1.123
12055  2001-08-04   2001-08-03         0.994
11033  2001-08-02   2001-08-05         1.020
11033  2001-08-03   2001-08-05         0.997
11033  2001-08-04   2001-08-05         0.949
11033  2001-08-05   2001-08-05         1.048
11033  2001-08-06   2001-08-05         1.060
11033  2001-08-07   2001-08-05         1.002

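How can I create a new column containing the cumulative product of 'ret', grouped by id, from announcement_date through the last available date? That is, for id = 11033, I want to create a new column "Product" as follows: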

id     date         announcement_date  ret    Product
11033  2001-08-02   2001-08-05         1.020  -
11033  2001-08-03   2001-08-05         0.997  -
11033  2001-08-04   2001-08-05         0.949  -
11033  2001-08-05   2001-08-05         1.048  1.048
11033  2001-08-06   2001-08-05         1.060  1.048*1.060
11033  2001-08-07   2001-08-05         1.002  1.048*1.060*1.002

I tried this code:

df$product <- aggregate(ret ~ id + ret, df, prod)

That runs, but I get the product of 'ret' for each 'id' over all dates, i.e. I do not know how to set the "start date" to announcement_date.

Tags: r, date, product

Solution


Does this work:

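library(purrr)
library(dplyr)
df %>%
  group_by(id) %>%
  filter(date >= announcement_date) %>%      # keep rows on or after the announcement date
  mutate(Product = accumulate(ret, `*`)) %>% # running product of ret within each id
  as.data.frame() %>%
  right_join(df) %>%                         # restore earlier rows; their Product is NA
  arrange(desc(id), date)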
Joining, by = c("id", "date", "announcement_date", "ret")
     id       date announcement_date    ret  Product
1 12055 2001-08-02        2001-08-03 1.0246       NA
2 12055 2001-08-03        2001-08-03 1.1230 1.123000
3 12055 2001-08-04        2001-08-03 0.9940 1.116262
4 11033 2001-08-02        2001-08-05 1.0200       NA
5 11033 2001-08-03        2001-08-05 0.9970       NA
6 11033 2001-08-04        2001-08-05 0.9490       NA
7 11033 2001-08-05        2001-08-05 1.0480 1.048000
8 11033 2001-08-06        2001-08-05 1.0600 1.110880
9 11033 2001-08-07        2001-08-05 1.0020 1.113102
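
For comparison, here is a minimal base R sketch of the same idea that avoids the filter-and-join round trip. It is not the approach from the answer above, just an alternative under two assumptions: the date columns are stored as Date values, and the rows are already sorted by date within each id. The helper names in_window and running are made up for illustration.

```r
# Rebuild the example data from the question.
df <- data.frame(
  id = c(rep(12055, 3), rep(11033, 6)),
  date = as.Date(c("2001-08-02", "2001-08-03", "2001-08-04",
                   "2001-08-02", "2001-08-03", "2001-08-04",
                   "2001-08-05", "2001-08-06", "2001-08-07")),
  announcement_date = as.Date(c(rep("2001-08-03", 3), rep("2001-08-05", 6))),
  ret = c(1.0246, 1.123, 0.994, 1.020, 0.997, 0.949, 1.048, 1.060, 1.002)
)

# Assumption: rows are sorted by date within each id.
# TRUE for rows on or after the announcement date.
in_window <- df$date >= df$announcement_date

# Before the announcement date, use a neutral factor of 1 so that
# cumprod() effectively starts at the announcement date.
running <- ave(ifelse(in_window, df$ret, 1), df$id, FUN = cumprod)

# Blank out the rows that precede the announcement date.
df$Product <- ifelse(in_window, running, NA_real_)
df
```

Replacing the pre-announcement returns with 1 keeps every row in place, so no join is needed to bring them back; the trade-off is that the result silently depends on the row order, which the dplyr version could make explicit with an arrange() step before the cumulative product.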
