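r - Portfolio construction - factor model
Problem description
I am trying to build portfolios in R, and I need to sort the different stocks (PERMNO) into six different portfolios.
I want to create logic that first classifies stocks by whether their mkt.cap is above or below the median mkt.cap of all stocks in a given year (e.g., 2010).
In addition, the stocks in each of those two groups should be split into three groups based on BM (OBS).
The classification should look like this: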
                       Mkt. Cap
BM (OBS) percentile    Over yearly median    Under yearly median
>70%                   PF1                   PF2
30-70%                 PF3                   PF4
<30%                   PF5                   PF6
A sample of my data table looks like this:
PERMNO Date ret mkt.cap BM (OBS)
10001 2009-12 0,1626 44918,3008 0,00000000000000000000
75672 2009-12 -0,2062 43722,1389 0,00001104509093018260
80928 2009-12 0,1770 689062,2694 0,00000688713518454942
80912 2009-12 -0,0274 71494,3516 0,00000984511341873784
76261 2009-12 0,0315 382438,0821 0,00000213437164919912
90303 2009-12 0,1959 964578,8864 0,00000000000000000000
91161 2009-12 0,2808 371170,0671 0,00000504687787573149
89841 2009-12 0,0438 1235170,0000 0,00000000000000000000
82515 2009-12 0,0565 934767,3563 0,00002803828655806010
84330 2009-12 -0,1000 166769,8187 0,00014664615387307400
10001 2010-01 -0,0189 43871,6618 0,00000000000000000000
75672 2010-01 -0,0260 42586,5000 0,00001115063263397240
80928 2010-01 -0,0704 640548,3269 0,00000728527479914769
80912 2010-01 0,0256 73322,8542 0,00000943960571401137
76261 2010-01 -0,0334 369662,6679 0,00000217133254998311
90303 2010-01 -0,1095 858998,8864 0,00000000000000000000
91161 2010-01 -0,1217 325990,6705 0,00000565055792544003
89841 2010-01 -0,0480 1175881,8965 0,00000000000000000000
82515 2010-01 -0,0377 899493,1499 0,00002865219568686880
84330 2010-01 0,0873 181329,0906 0,00013295614165661100
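The sample above uses decimal commas, while R expects decimal points by default. A minimal sketch of reading such a table, assuming it is available as whitespace-separated text and that the `BM (OBS)` header has been renamed to `BM_OBS` (a renaming chosen here so the header contains no space):

```r
# First three rows of the sample, with the last header renamed to BM_OBS
raw <- "PERMNO Date ret mkt.cap BM_OBS
10001 2009-12 0,1626 44918,3008 0,00000000000000000000
75672 2009-12 -0,2062 43722,1389 0,00001104509093018260
80928 2009-12 0,1770 689062,2694 0,00000688713518454942"

# dec = "," tells read.table to treat the comma as the decimal mark
df <- read.table(text = raw, header = TRUE, dec = ",",
                 stringsAsFactors = FALSE)

str(df)  # ret, mkt.cap and BM_OBS should come back as numeric
```

For a file on disk, the same `dec = ","` argument applies; only the `text = raw` part would change to a file path.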
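My data set is fairly large, so the code should run quickly on large data.
I was thinking of creating six new binary variables for the portfolios, each equal to 0 or 1 depending on whether the stock meets the respective criteria, but I don't know how to do that.
Thanks
Solution
If you want to compute the new columns from yearly aggregates/quantiles, use this code (the BM (OBS) column is named BM_OBS in the input used here):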
# Extract the year from the "YYYY-MM" date string
df$YEAR <- substr(df$Date, 1, 4)
# ave() applies FUN within each year and returns 0/1 values aligned with the rows of df
# PF1/PF2: BM_OBS at or above the yearly 70% quantile; mkt.cap at or above / below the yearly median
df$PF1 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x >= quantile(x, 0.7)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x >= median(x)}))
df$PF2 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x >= quantile(x, 0.7)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x < median(x)}))
# PF3/PF4: BM_OBS between the yearly 30% and 70% quantiles
df$PF3 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.7) & x >= quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x >= median(x)}))
df$PF4 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.7) & x >= quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x < median(x)}))
# PF5/PF6: BM_OBS below the yearly 30% quantile
df$PF5 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x >= median(x)}))
df$PF6 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x < median(x)}))
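The six dummies can also be derived from a single portfolio label. The following is a sketch, not part of the original answer: it reuses the ten 2009-12 rows from the question, and `size_grp`, `bm_grp` and `portfolio` are names introduced here for illustration. As in the code above, the BM breakpoints are taken over all stocks in the year rather than separately within each size group.

```r
# The ten 2009-12 rows from the question (all fall in YEAR 2009)
df <- data.frame(
  PERMNO  = c(10001, 75672, 80928, 80912, 76261, 90303, 91161, 89841, 82515, 84330),
  YEAR    = "2009",
  mkt.cap = c(44918.3008, 43722.1389, 689062.2694, 71494.3516, 382438.0821,
              964578.8864, 371170.0671, 1235170, 934767.3563, 166769.8187),
  BM_OBS  = c(0, 1.10450909301826e-05, 6.88713518454942e-06,
              9.84511341873784e-06, 2.13437164919912e-06, 0,
              5.04687787573149e-06, 0, 2.80382865580601e-05,
              0.000146646153873074)
)

# Size group per year: 1 = at or above the yearly median, 2 = below it
size_grp <- ave(df$mkt.cap, df$YEAR,
                FUN = function(x) ifelse(x >= median(x), 1, 2))

# BM group per year: 1 = top 30%, 2 = middle 40%, 3 = bottom 30%
bm_grp <- ave(df$BM_OBS, df$YEAR, FUN = function(x) {
  q <- quantile(x, c(0.3, 0.7), names = FALSE)
  ifelse(x >= q[2], 1, ifelse(x >= q[1], 2, 3))
})

# Map the (bm_grp, size_grp) pair onto PF1..PF6 as laid out in the question
df$portfolio <- paste0("PF", 2 * (bm_grp - 1) + size_grp)

# One 0/1 column per portfolio, if the dummy layout is still needed
for (p in paste0("PF", 1:6)) df[[p]] <- as.numeric(df$portfolio == p)
```

A single label column tends to be easier to pass to grouping functions such as `aggregate()` or `tapply()` than six separate flags.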
Result:
> df
PERMNO Date ret mkt.cap BM_OBS YEAR PF1 PF2 PF3 PF4 PF5 PF6
1 10001 2009-12 0.1626 44918.30 0.000000e+00 2009 0 0 0 0 0 1
2 75672 2009-12 -0.2062 43722.14 1.104509e-05 2009 0 1 0 0 0 0
3 80928 2009-12 0.1770 689062.27 6.887135e-06 2009 0 0 1 0 0 0
4 80912 2009-12 -0.0274 71494.35 9.845113e-06 2009 0 0 0 1 0 0
5 76261 2009-12 0.0315 382438.08 2.134372e-06 2009 0 0 1 0 0 0
6 90303 2009-12 0.1959 964578.89 0.000000e+00 2009 0 0 0 0 1 0
7 91161 2009-12 0.2808 371170.07 5.046878e-06 2009 0 0 0 1 0 0
8 89841 2009-12 0.0438 1235170.00 0.000000e+00 2009 0 0 0 0 1 0
9 82515 2009-12 0.0565 934767.36 2.803829e-05 2009 1 0 0 0 0 0
10 84330 2009-12 -0.1000 166769.82 1.466462e-04 2009 0 1 0 0 0 0
11 10001 2010-01 -0.0189 43871.66 0.000000e+00 2010 0 0 0 0 0 1
12 75672 2010-01 -0.0260 42586.50 1.115063e-05 2010 0 1 0 0 0 0
13 80928 2010-01 -0.0704 640548.33 7.285275e-06 2010 0 0 1 0 0 0
14 80912 2010-01 0.0256 73322.85 9.439606e-06 2010 0 0 0 1 0 0
15 76261 2010-01 -0.0334 369662.67 2.171333e-06 2010 0 0 1 0 0 0
16 90303 2010-01 -0.1095 858998.89 0.000000e+00 2010 0 0 0 0 1 0
17 91161 2010-01 -0.1217 325990.67 5.650558e-06 2010 0 0 0 1 0 0
18 89841 2010-01 -0.0480 1175881.90 0.000000e+00 2010 0 0 0 0 1 0
19 82515 2010-01 -0.0377 899493.15 2.865220e-05 2010 1 0 0 0 0 0
20 84330 2010-01 0.0873 181329.09 1.329561e-04 2010 0 1 0 0 0 0
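A quick way to sanity-check output like this is to confirm that every row carries exactly one flag. The sketch below re-enters the six flag columns of the first ten rows shown above; `flags` and `label` are names introduced here for illustration.

```r
# PF1..PF6 flags for rows 1-10 of the printed output
flags <- matrix(c(
  0, 0, 0, 0, 0, 1,
  0, 1, 0, 0, 0, 0,
  0, 0, 1, 0, 0, 0,
  0, 0, 0, 1, 0, 0,
  0, 0, 1, 0, 0, 0,
  0, 0, 0, 0, 1, 0,
  0, 0, 0, 1, 0, 0,
  0, 0, 0, 0, 1, 0,
  1, 0, 0, 0, 0, 0,
  0, 1, 0, 0, 0, 0
), ncol = 6, byrow = TRUE, dimnames = list(NULL, paste0("PF", 1:6)))

# Each stock-month should belong to exactly one portfolio
stopifnot(all(rowSums(flags) == 1))

# Collapse the six dummies into one label per row
label <- colnames(flags)[max.col(flags, ties.method = "first")]

# Number of rows per portfolio
table(label)
```

On the full data frame, the matrix could be taken directly with `as.matrix(df[paste0("PF", 1:6)])` instead of being typed in.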
Input used
df <- structure(list(PERMNO = c(10001L, 75672L, 80928L, 80912L, 76261L,
90303L, 91161L, 89841L, 82515L, 84330L, 10001L, 75672L, 80928L,
80912L, 76261L, 90303L, 91161L, 89841L, 82515L, 84330L), Date = c("2009-12",
"2009-12", "2009-12", "2009-12", "2009-12", "2009-12", "2009-12",
"2009-12", "2009-12", "2009-12", "2010-01", "2010-01", "2010-01",
"2010-01", "2010-01", "2010-01", "2010-01", "2010-01", "2010-01",
"2010-01"), ret = c(0.1626, -0.2062, 0.177, -0.0274, 0.0315,
0.1959, 0.2808, 0.0438, 0.0565, -0.1, -0.0189, -0.026, -0.0704,
0.0256, -0.0334, -0.1095, -0.1217, -0.048, -0.0377, 0.0873),
mkt.cap = c(44918.3008, 43722.1389, 689062.2694, 71494.3516,
382438.0821, 964578.8864, 371170.0671, 1235170, 934767.3563,
166769.8187, 43871.6618, 42586.5, 640548.3269, 73322.8542,
369662.6679, 858998.8864, 325990.6705, 1175881.8965, 899493.1499,
181329.0906), BM_OBS = c(0, 1.10450909301826e-05, 6.88713518454942e-06,
9.84511341873784e-06, 2.13437164919912e-06, 0, 5.04687787573149e-06,
0, 2.80382865580601e-05, 0.000146646153873074, 0, 1.11506326339724e-05,
7.28527479914769e-06, 9.43960571401137e-06, 2.17133254998311e-06,
0, 5.65055792544003e-06, 0, 2.86521956868688e-05, 0.000132956141656611
)), class = "data.frame", row.names = c(NA, -20L))
PERMNO Date ret mkt.cap BM_OBS
1 10001 2009-12 0.1626 44918.30 0.000000e+00
2 75672 2009-12 -0.2062 43722.14 1.104509e-05
3 80928 2009-12 0.1770 689062.27 6.887135e-06
4 80912 2009-12 -0.0274 71494.35 9.845113e-06
5 76261 2009-12 0.0315 382438.08 2.134372e-06
6 90303 2009-12 0.1959 964578.89 0.000000e+00
7 91161 2009-12 0.2808 371170.07 5.046878e-06
8 89841 2009-12 0.0438 1235170.00 0.000000e+00
9 82515 2009-12 0.0565 934767.36 2.803829e-05
10 84330 2009-12 -0.1000 166769.82 1.466462e-04
11 10001 2010-01 -0.0189 43871.66 0.000000e+00
12 75672 2010-01 -0.0260 42586.50 1.115063e-05
13 80928 2010-01 -0.0704 640548.33 7.285275e-06
14 80912 2010-01 0.0256 73322.85 9.439606e-06
15 76261 2010-01 -0.0334 369662.67 2.171333e-06
16 90303 2010-01 -0.1095 858998.89 0.000000e+00
17 91161 2010-01 -0.1217 325990.67 5.650558e-06
18 89841 2010-01 -0.0480 1175881.90 0.000000e+00
19 82515 2010-01 -0.0377 899493.15 2.865220e-05
20 84330 2010-01 0.0873 181329.09 1.329561e-04
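Since the question asks for something that scales to a large data set, one option is to compute the three yearly breakpoints once and look them up by row, rather than recomputing them inside each `ave()` call. This is a sketch under that assumption, not a benchmarked result; `med`, `q30`, `q70`, `big`, `bm_grp` and `portfolio` are names introduced here, and the values are the `mkt.cap` and `BM_OBS` columns of the input above.

```r
df <- data.frame(
  YEAR = rep(c("2009", "2010"), each = 10),
  mkt.cap = c(44918.3008, 43722.1389, 689062.2694, 71494.3516, 382438.0821,
              964578.8864, 371170.0671, 1235170, 934767.3563, 166769.8187,
              43871.6618, 42586.5, 640548.3269, 73322.8542, 369662.6679,
              858998.8864, 325990.6705, 1175881.8965, 899493.1499, 181329.0906),
  BM_OBS = c(0, 1.10450909301826e-05, 6.88713518454942e-06,
             9.84511341873784e-06, 2.13437164919912e-06, 0,
             5.04687787573149e-06, 0, 2.80382865580601e-05,
             0.000146646153873074, 0, 1.11506326339724e-05,
             7.28527479914769e-06, 9.43960571401137e-06,
             2.17133254998311e-06, 0, 5.65055792544003e-06, 0,
             2.86521956868688e-05, 0.000132956141656611),
  stringsAsFactors = FALSE
)

# One breakpoint per year, stored as named vectors keyed by YEAR
med <- sapply(split(df$mkt.cap, df$YEAR), median)
q30 <- sapply(split(df$BM_OBS, df$YEAR), quantile, probs = 0.3, names = FALSE)
q70 <- sapply(split(df$BM_OBS, df$YEAR), quantile, probs = 0.7, names = FALSE)

# Look up each row's yearly breakpoints by name, then compare once
yr     <- as.character(df$YEAR)
big    <- df$mkt.cap >= unname(med[yr])
bm_grp <- ifelse(df$BM_OBS >= unname(q70[yr]), 1L,
          ifelse(df$BM_OBS >= unname(q30[yr]), 2L, 3L))

# PF1..PF6 in the question's layout: odd = large cap, even = small cap
df$portfolio <- paste0("PF", 2L * (bm_grp - 1L) + ifelse(big, 1L, 2L))
```

Whether this is noticeably faster than the `ave()` version on a given data set would need to be measured, for example with `system.time()`.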