r - Is there a faster way to avoid the for loop in K-fold cross-validation in R?
Question
I have the following code for implementing K-fold cross-validation in R:
set.seed(123)
y = rnorm(100,0,1)
x1 = rnorm(100,0,1)
x2 = rnorm(100,0,1)
x3 = rnorm(100,0,1)
x4 = rnorm(100,0,1)
x5 = rnorm(100,0,1)
data = data.frame(y,x1,x2,x3,x4,x5);head(data)
# create k = 10 fold cross validation
folds = cut(seq(1,nrow(data)),breaks=10,labels=FALSE)
# perform the cv
for(i in 1:10){
  fold = which(folds==i,arr.ind=TRUE)
  testData = data[fold, ]
  trainData = data[-fold, ]
}
Does R have a faster way to implement this k-fold CV and avoid the for loop?
Solution
Your test_data can be split into a list more efficiently:
split(data, folds)
#or better
split(data, ceiling(seq_len(nrow(data))/10))
$`1`
y x1 x2 x3 x4 x5
1 -0.56047565 -0.71040656 2.19881035 -0.7152422 -0.07355602 -0.60189285
2 -0.23017749 0.25688371 1.31241298 -0.7526890 -1.16865142 -0.99369859
3 1.55870831 -0.24669188 -0.26514506 -0.9385387 -0.63474826 1.02678506
4 0.07050839 -0.34754260 0.54319406 -1.0525133 -0.02884155 0.75106130
5 0.12928774 -0.95161857 -0.41433995 -0.4371595 0.67069597 -1.50916654
6 1.71506499 -0.04502772 -0.47624689 0.3311792 -1.65054654 -0.09514745
7 0.46091621 -0.78490447 -0.78860284 -2.0142105 -0.34975424 -0.89594782
8 -1.26506123 -1.66794194 -0.59461727 0.2119804 0.75640644 -2.07075107
9 -0.68685285 -0.38022652 1.65090747 1.2366750 -0.53880916 0.15012013
10 -0.44566197 0.91899661 -0.05402813 2.0375740 0.22729192 -0.07921171
$`2`
y x1 x2 x3 x4 x5
11 1.2240818 -0.57534696 0.1192452 1.3011760 0.49222857 -0.09736927
12 0.3598138 0.60796432 0.2436874 0.7567748 0.26783502 0.21615254
13 0.4007715 -1.61788271 1.2324759 -1.7267304 0.65325768 0.88246516
14 0.1106827 -0.05556197 -0.5160638 -0.6015067 -0.12270866 0.20559750
15 -0.5558411 0.51940720 -0.9925072 -0.3520465 -0.41367651 -0.61643584
16 1.7869131 0.30115336 1.6756969 0.7035239 -2.64314895 -0.73479925
17 0.4978505 0.10567619 -0.4411632 -0.1056713 -0.09294102 -0.13180279
18 -1.9666172 -0.64070601 -0.7230660 -1.2586486 0.43028470 0.31001699
19 0.7013559 -0.84970435 -1.2362731 1.6844357 0.53539884 -1.03968035
20 -0.4727914 -1.02412879 -1.2847157 0.9113913 -0.55527835 -0.18430887
$`3`
y x1 x2 x3 x4 x5
21 -1.0678237 0.11764660 -0.57397348 0.23743027 1.77950291 0.9672673
22 -0.2179749 -0.94747461 0.61798582 1.21810861 0.28642442 -0.1082801
23 -1.0260044 -0.49055744 1.10984814 -1.33877429 0.12631586 -0.6984207
24 -0.7288912 -0.25609219 0.70758835 0.66082030 1.27226678 -0.2759452
25 -0.6250393 1.84386201 -0.36365730 -0.52291238 -0.71846622 1.1146485
26 -1.6866933 -0.65194990 0.05974994 0.68374552 -0.45033862 0.5500440
27 0.8377870 0.23538657 -0.70459646 -0.06082195 2.39745248 1.2366758
28 0.1533731 0.07796085 -0.71721816 0.63296071 0.01112919 0.1390979
29 -1.1381369 -0.96185663 0.88465050 1.33551762 1.63356842 0.4102751
30 1.2538149 -0.07130809 -1.01559258 0.00729009 -1.43850664 -0.5584569
$`4`
y x1 x2 x3 x4 x5
31 0.42646422 1.44455086 1.95529397 1.0175586 -0.19051680 0.6053707
32 -0.29507148 0.45150405 -0.09031959 -1.1884340 0.37842390 -0.5063335
33 0.89512566 0.04123292 0.21453883 -0.7216044 0.30003855 -1.4205655
34 0.87813349 -0.42249683 -0.73852770 1.5192177 -1.00563626 0.1279930
35 0.82158108 -2.05324722 -0.57438869 0.3773880 0.01925927 1.9458512
36 0.68864025 1.13133721 -1.31701613 -2.0522228 -1.07742065 0.8009143
37 0.55391765 -1.46064007 -0.18292539 -1.3640375 0.71270333 1.1652534
38 -0.06191171 0.73994751 0.41898240 -0.2007810 1.08477509 0.3588557
39 -0.30596266 1.90910357 0.32430434 0.8657794 -2.22498770 -0.6085572
40 -0.38047100 -1.44389316 -0.78153649 -0.1018833 1.23569346 -0.2022409
$`5`
y x1 x2 x3 x4 x5
41 -0.69470698 0.7017843 -0.7886220 0.62418747 -1.2410445 -0.2732481
42 -0.20791728 -0.2621975 -0.5021987 0.95900538 0.4547693 -0.4686998
43 -1.26539635 -1.5721442 1.4960607 1.67105483 0.6599026 0.7041673
44 2.16895597 -1.5146677 -1.1373036 0.05601673 -0.1998898 -1.1973635
45 1.20796200 -1.6015362 -0.1790516 -0.05198191 -0.6451140 0.8663661
46 -1.12310858 -0.5309065 1.9023618 -1.75323736 0.1653210 0.8641525
47 -0.40288484 -1.4617556 -0.1009749 0.09932759 0.4388187 -1.1986224
48 -0.46665535 0.6879168 -1.3598407 -0.57185006 0.8833028 0.6394920
49 0.77996512 2.1001089 -0.6647694 -0.97400958 -2.0523370 2.4302267
50 -0.08336907 -1.2870305 0.4854600 -0.17990623 -1.6363793 -0.5572155
$`6`
y x1 x2 x3 x4 x5
51 0.25331851 0.7877388 -0.37560287 1.01494317 1.4304023 0.84490424
52 -0.02854676 0.7690422 -0.56187636 -1.99274849 1.0466288 -0.78220185
53 -0.04287046 0.3322026 -0.34391723 -0.42727929 0.4352889 1.11071142
54 1.36860228 -1.0083766 0.09049665 0.11663728 0.7151784 0.24982472
55 -0.22577099 -0.1194526 1.59850877 -0.89320757 0.9171749 1.65191539
56 1.51647060 -0.2803953 -0.08856511 0.33390294 -2.6609228 -1.45897073
57 -1.54875280 0.5629895 1.08079950 0.41142992 1.1102771 -0.05129789
58 0.58461375 -0.3724388 0.63075412 -0.03303616 -0.4849876 -0.52692518
59 0.12385424 0.9769734 -0.11363990 -2.46589819 0.2306168 -0.19726487
60 0.21594157 -0.3745809 -1.53290200 2.57145815 -0.2951578 -0.62957874
$`7`
y x1 x2 x3 x4 x5
61 0.37963948 1.0527115 -0.52111732 -0.2052993 0.87196495 -0.8338436
62 -0.50232345 -1.0491770 -0.48987045 0.6511933 -0.34847245 0.5787224
63 -0.33320738 -1.2601552 0.04715443 0.2737665 0.51850377 -1.0875807
64 -1.01857538 3.2410399 1.30019868 1.0246732 -0.39068498 1.4840309
65 -1.07179123 -0.4168576 2.29307897 0.8176594 -1.09278721 -1.1862066
66 0.30352864 0.2982276 1.54758106 -0.2097932 1.21001051 0.1010792
67 0.44820978 0.6365697 -0.13315096 0.3781678 0.74090001 0.5329893
68 0.05300423 -0.4837806 -1.75652740 -0.9454088 1.72426224 0.5867353
69 0.92226747 0.5168620 -0.38877986 0.8569230 0.06515393 -0.3017467
70 2.05008469 0.3689645 0.08920722 -0.4610383 1.12500275 0.0795020
$`8`
y x1 x2 x3 x4 x5
71 -0.4910312 -0.21538051 0.84501300 2.41677335 1.9754191 0.96126415
72 -2.3091689 0.06529303 0.96252797 -1.65104890 -0.2814821 -1.45646592
73 1.0057385 -0.03406725 0.68430943 -0.46398724 -1.3229511 -0.78173971
74 -0.7092008 2.12845190 -1.39527435 0.82537986 -0.2393516 0.32040231
75 -0.6880086 -0.74133610 0.84964305 0.51013255 -0.2140412 -0.44478198
76 1.0255714 -1.09599627 -0.44655722 -0.58948104 0.1516805 1.37000399
77 -0.2847730 0.03778840 0.17480270 -0.99678074 1.7123050 0.67325386
78 -1.2207177 0.31048075 0.07455118 0.14447570 -0.3261439 0.07216675
79 0.1813035 0.43652348 0.42816676 -0.01430741 0.3730047 -1.50775732
80 -0.1388914 -0.45836533 0.02467498 -1.79028124 -0.2276841 0.02610023
$`9`
y x1 x2 x3 x4 x5
81 0.005764186 -1.06332613 -1.6674751 0.03455107 0.02045071 -0.3164159
82 0.385280401 1.26318518 0.7364960 0.19023032 0.31405766 -0.1023465
83 -0.370660032 -0.34965039 0.3860266 0.17472640 1.32821470 -1.1815592
84 0.644376549 -0.86551286 -0.2656516 -1.05501704 0.12131838 0.4986580
85 -0.220486562 -0.23627957 0.1181445 0.47613328 0.71284232 -1.0389564
86 0.331781964 -0.19717589 0.1340386 1.37857014 0.77886003 -0.2262220
87 1.096839013 1.10992029 0.2210195 0.45623640 0.91477327 0.3814258
88 0.435181491 0.08473729 1.6408462 -1.13558847 -0.57439455 -0.7835158
89 -0.325931586 0.75405379 -0.2190504 -0.43564547 1.62688121 0.5829914
90 1.148807618 -0.49929202 0.1680654 0.34610362 -0.38095674 -1.3165104
$`10`
y x1 x2 x3 x4 x5
91 0.9935039 0.21444531 1.16838387 -0.6470456 -0.1057842 -2.8097747
92 0.5483970 -0.32468591 1.05418102 -2.1576463 1.4040503 0.4649680
93 0.2387317 0.09458353 1.14526311 0.8842508 1.2940839 0.8405398
94 -0.6279061 -0.89536336 -0.57746800 -0.8294776 -1.0899919 -0.2858454
95 1.3606524 -1.31080153 2.00248273 -0.5735603 -0.8730710 0.5041263
96 -0.6002596 1.99721338 0.06670087 1.5039006 -1.3580791 -1.1559165
97 2.1873330 0.60070882 1.86685184 -0.7741449 0.1818472 -0.1271486
98 1.5326106 -1.25127136 -1.35090269 0.8457315 0.1648409 -1.9415184
99 -0.2357004 -0.61116592 0.02098359 -1.2606829 0.3641147 1.1811809
100 -1.0264209 -1.18548008 1.24991457 -0.3545424 0.5521577 1.8599109
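The two split() calls above give the same ten folds here only because the data has exactly 100 rows. A minimal sketch of the difference, using a smaller stand-in data frame and a hypothetical row count n_big that are not from the original post: cut() fixes the number of folds, while ceiling(seq_len(n)/10) fixes the fold size at 10 rows.

```r
set.seed(123)
# Stand-in data with 100 rows, as in the question (fewer columns for brevity)
data <- data.frame(y = rnorm(100), x1 = rnorm(100))

# Fold labels from the question: always 10 groups
folds <- cut(seq_len(nrow(data)), breaks = 10, labels = FALSE)
test_list <- split(data, folds)

# Chunked variant from the answer: groups of 10 consecutive rows
chunk_list <- split(data, ceiling(seq_len(nrow(data)) / 10))

length(test_list)        # number of folds
sapply(test_list, nrow)  # rows per fold
all.equal(test_list, chunk_list)

# With a different row count (n_big is an assumption for illustration),
# the two labelings no longer agree on the number of groups
n_big <- 250
n_groups_cut   <- length(unique(cut(seq_len(n_big), breaks = 10, labels = FALSE)))
n_groups_chunk <- length(unique(ceiling(seq_len(n_big) / 10)))
```

If the goal is always k folds regardless of the number of rows, the cut()-based labels are the safer choice.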
Now your train_data can also be created as a list, using purrr::map together with anti_join (from dplyr):
map(split(data, ceiling(seq_len(nrow(data))/10)), ~ anti_join(data, .x))
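An equivalent using base R's Map should be the following (note that setdiff here must be the row-wise data frame version provided by dplyr; base::setdiff does not compare data frames row by row):
Map(function(x) setdiff(data, x), split(data, ceiling(seq_len(nrow(data))/10)))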
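A variant with no package dependencies can be sketched by splitting the row indices rather than the rows and then indexing the data frame. This is a sketch under assumptions, not the answer's own code: the names idx_list, train_list, test_list and fold_mse are hypothetical, and the lm() model is only a placeholder for whatever is being validated. lapply/vapply mainly makes the code more compact; they still iterate over the folds and are not necessarily faster than the for loop.

```r
set.seed(123)
# Stand-in data (fewer columns than in the question, for brevity)
data <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100))

k <- 10
folds <- cut(seq_len(nrow(data)), breaks = k, labels = FALSE)

# Split row indices, not rows: one integer vector per fold
idx_list <- split(seq_len(nrow(data)), folds)

# Training set = all rows except the fold; test set = the fold itself
train_list <- lapply(idx_list, function(idx) data[-idx, , drop = FALSE])
test_list  <- lapply(idx_list, function(idx) data[idx, , drop = FALSE])

# Placeholder model: linear regression of y on the other columns,
# scored by mean squared error on the held-out fold
fold_mse <- vapply(idx_list, function(idx) {
  fit  <- lm(y ~ ., data = data[-idx, , drop = FALSE])
  pred <- predict(fit, newdata = data[idx, , drop = FALSE])
  mean((data$y[idx] - pred)^2)
}, numeric(1))

cv_mse <- mean(fold_mse)  # cross-validated estimate of the MSE
```

Working with indices also avoids a subtle property of anti_join and setdiff: they match rows by value, so any training row that happens to duplicate a test row would be dropped as well.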