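r - Computing Fama-French factors in data.table
Problem description
I am trying to compute the Fama-French factors in R. After several days of sweat and despair I managed to compute the returns of the 6 respective portfolios... only to run into a problem I cannot seem to solve.
My data looks roughly like this. It is only a simplified data set to illustrate my problem: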
> library(data.table)
> TestX = data.table(
+   Group = c("SM", "SM", "SM", "SH", "SH", "SH", "SL", "SL", "SL"),
+   Date  = as.Date(rep("1995-07-30", 9)),
+   Code  = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9"),
+   SMRet = c(2, 3, 3, NA, NA, NA, NA, NA, NA),
+   SHRet = c(NA, NA, NA, 5, 5, 5, NA, NA, NA),
+   SLRet = c(NA, NA, NA, NA, NA, NA, 0, 1, 2)
+ )
> TestX
Group Date Code SMRet SHRet SLRet
1: SM 1995-07-30 C1 2 NA NA
2: SM 1995-07-30 C2 3 NA NA
3: SM 1995-07-30 C3 3 NA NA
4: SH 1995-07-30 C4 NA 5 NA
5: SH 1995-07-30 C5 NA 5 NA
6: SH 1995-07-30 C6 NA 5 NA
7: SL 1995-07-30 C7 NA NA 0
8: SL 1995-07-30 C8 NA NA 1
9: SL 1995-07-30 C9 NA NA 2
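Group gives the portfolio group (SmallMedium, SmallHigh, SmallLow; my real data.table has further groups). Code gives the respective company code, and so on. What I want to do is create a new column containing the respective factor. To do that, I need the following calculation:
(SMRet + SHRet + SLRet)/3
But how do I do that?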
TestX[, Factor := (SMRet+SHRet+SLRet)/3, by = Date]
That did not work; I get NA everywhere.
Group Date Code SMRet SHRet SLRet Factor
1: SM 1995-07-30 C1 2 NA NA NA
2: SM 1995-07-30 C2 3 NA NA NA
3: SM 1995-07-30 C3 3 NA NA NA
4: SH 1995-07-30 C4 NA 5 NA NA
5: SH 1995-07-30 C5 NA 5 NA NA
6: SH 1995-07-30 C6 NA 5 NA NA
7: SL 1995-07-30 C7 NA NA 0 NA
8: SL 1995-07-30 C8 NA NA 1 NA
9: SL 1995-07-30 C9 NA NA 2 NA
I also need to group by date. The real data.table covers 402 months.
Thanks in advance.
Edit: here is a better data.table to illustrate my problem
TestX = data.table(
  Group = c("SM", "SM", "SH", "SH", "SL", "SL", "SM", "SM", "SH", "SH", "SL", "SL"),
  Date  = as.Date(c("1995-07-30", "1995-07-30", "1995-07-30", "1995-07-30", "1995-07-30", "1995-07-30",
                    "1995-08-30", "1995-08-30", "1995-08-30", "1995-08-30", "1995-08-30", "1995-08-30")),
  Code  = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "c10", "c11", "12"),
  SMRet = c(2, 3, NA, NA, NA, NA, 4, 5, NA, NA, NA, NA),
  SHRet = c(NA, NA, 5, 5, NA, NA, NA, NA, 3, 4, NA, NA),
  SLRet = c(NA, NA, NA, NA, 0, 1, NA, NA, NA, NA, 2, 3)
)
> TestX
Group Date Code SMRet SHRet SLRet
1: SM 1995-07-30 C1 2 NA NA
2: SM 1995-07-30 C2 3 NA NA
3: SH 1995-07-30 C3 NA 5 NA
4: SH 1995-07-30 C4 NA 5 NA
5: SL 1995-07-30 C5 NA NA 0
6: SL 1995-07-30 C6 NA NA 1
7: SM 1995-08-30 C7 4 NA NA
8: SM 1995-08-30 C8 5 NA NA
9: SH 1995-08-30 C9 NA 3 NA
10: SH 1995-08-30 c10 NA 4 NA
11: SL 1995-08-30 c11 NA NA 2
12: SL 1995-08-30 12 NA NA 3
This is the desired result:
Group Date Code SMRet SHRet SLRet Factor
1: SM 1995-07-30 C1 2 NA NA 5.333333
2: SM 1995-07-30 C2 3 NA NA 5.333333
3: SH 1995-07-30 C3 NA 5 NA 5.333333
4: SH 1995-07-30 C4 NA 5 NA 5.333333
5: SL 1995-07-30 C5 NA NA 0 5.333333
6: SL 1995-07-30 C6 NA NA 1 5.333333
7: SM 1995-08-30 C7 4 NA NA 7.000000
8: SM 1995-08-30 C8 5 NA NA 7.000000
9: SH 1995-08-30 C9 NA 3 NA 7.000000
10: SH 1995-08-30 c10 NA 4 NA 7.000000
11: SL 1995-08-30 c11 NA NA 2 7.000000
12: SL 1995-08-30 12 NA NA 3 7.000000
So, for each month: (SMRet + SHRet + SLRet)/3
Solution
The following line sums all non-missing returns within each date and divides by 3:
TestX[ , newvar := sum(SMRet, SHRet, SLRet, na.rm=TRUE)/3, by=Date]
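The attempt in the question returns NA because `+` works row by row, and every row has NA in two of the three return columns. `sum()` instead pools all values it is given, and `na.rm = TRUE` drops the missing ones. Below is a minimal sketch on the edited example data that reproduces the desired 5.333333 and 7. The column names `RowWise`, `Factor` and `FactorOfMeans` are chosen here for illustration only. The last column rests on an assumption that is not in the question: that each portfolio return should be the average of its member returns before the three are averaged. It gives different numbers from the desired output.

```r
library(data.table)

TestX <- data.table(
  Group = rep(rep(c("SM", "SH", "SL"), each = 2), times = 2),
  Date  = as.Date(rep(c("1995-07-30", "1995-08-30"), each = 6)),
  Code  = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "c10", "c11", "12"),
  SMRet = c(2, 3, NA, NA, NA, NA, 4, 5, NA, NA, NA, NA),
  SHRet = c(NA, NA, 5, 5, NA, NA, NA, NA, 3, 4, NA, NA),
  SLRet = c(NA, NA, NA, NA, 0, 1, NA, NA, NA, NA, 2, 3)
)

# Attempt from the question: element-wise addition, so any NA in a row
# makes that row's result NA. Every row has two NAs, hence all NA.
TestX[, RowWise := (SMRet + SHRet + SLRet) / 3, by = Date]

# Answer: pool all non-missing returns of the date, then divide by 3.
TestX[, Factor := sum(SMRet, SHRet, SLRet, na.rm = TRUE) / 3, by = Date]

# Assumption (not from the question): average each portfolio first,
# then average the three portfolio means.
TestX[, FactorOfMeans := (mean(SMRet, na.rm = TRUE) +
                          mean(SHRet, na.rm = TRUE) +
                          mean(SLRet, na.rm = TRUE)) / 3, by = Date]

# One row per date
unique(TestX[, .(Date, Factor, FactorOfMeans)])
#          Date   Factor FactorOfMeans
# 1: 1995-07-30 5.333333      2.666667
# 2: 1995-08-30 7.000000      3.500000
```

Because the assignment uses `:=` with `by = Date`, the per-date value is recycled to every row of that date, which matches the layout of the desired result.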