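r - Sampling from a normal distribution using a for loop
Problem description
I am trying to sample from a uniform distribution 1000 times, each time computing the mean of 20 random values drawn from that uniform distribution.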
Now let's loop through 1000 times, sampling 20 values from a uniform distribution and computing the mean of the sample, saving this mean to a variable called sampMean within a tibble called uniformSampleMeans.
{r 2c}
unif_sample_size = 20 # sample size
n_samples = 1000 # number of samples
# set up a data frame to contain the results
uniformSampleMeans <- tibble(sampMean = runif(n_samples, unif_sample_size))
# loop through all samples. for each one, take a new random sample,
# compute the mean, and store it in the data frame
for (i in 1:n_samples){
uniformSampleMeans$sampMean[i] = summarize(uniformSampleMeans = mean(unif_sample_size))
}
I do manage to generate a tibble, but its values are all "NaN". In addition, when I run my for loop, I get an error.
Error in summarise_(.data, .dots = compat_as_lazy_dots(...)) : argument ".data" is missing, with no default
Any insight would be greatly appreciated!
Solution
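Building a data.frame row by row performs terribly: every time a row is added, all existing rows are copied in full, so by row 900, adding one more row means holding the original 900 rows twice. This scales poorly.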
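For reference, the loop from the question can be made to work by preallocating a plain numeric vector, filling it in place, and building the tibble once at the end. The following is a minimal sketch that reuses the question's variable names. It assumes the default Uniform(0, 1) range, since the question does not state bounds, and the seed is arbitrary. Two things go wrong in the original code: runif(n_samples, unif_sample_size) passes 20 as min while max keeps its default of 1, which yields NaN, and summarize() is called without a data frame, which triggers the ".data is missing" error.

```r
library(tibble)

set.seed(42)                 # arbitrary seed, only for reproducibility
unif_sample_size <- 20       # sample size
n_samples <- 1000            # number of samples

# preallocate the result vector once instead of growing a data frame
sampMean <- numeric(n_samples)

for (i in seq_len(n_samples)) {
  # draw 20 values from Uniform(0, 1) (assumed range) and store their mean
  sampMean[i] <- mean(runif(unif_sample_size))
}

# build the tibble once, after the loop has finished
uniformSampleMeans <- tibble(sampMean = sampMean)
```

Writing into a preallocated vector avoids the repeated copying described above, because the vector already has its final length before the loop starts.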
Also, realize that drawing many small random samples is far more expensive than drawing one larger sample.
set.seed(42)
m <- matrix(rnorm(1000*20), ncol = 20)
head(m)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
# [1,] 1.371 2.325 0.251 -0.686 -0.142 0.0712 0.173 1.4163 -0.0575 -0.9221 1.163 -0.2945
# [2,] -0.565 0.524 -0.278 -0.793 -0.814 0.9703 -1.273 0.5572 -0.2490 -0.4958 -0.190 0.4641
# [3,] 0.363 0.971 -1.725 -0.407 -0.326 0.3100 -0.868 0.9812 -1.5242 -3.1105 -0.289 -1.5371
# [4,] 0.633 0.377 -2.007 -1.149 0.378 -0.1395 0.626 -0.5862 0.4636 -0.6928 -0.399 0.9862
# [5,] 0.404 -0.996 -1.292 1.116 -1.994 -0.3263 -0.106 0.9392 -1.1876 0.2989 0.709 0.6302
# [6,] -0.106 -0.597 0.366 -0.879 -0.999 -0.1188 -0.256 -0.0647 0.4941 -0.0687 -1.623 0.0573
# [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
# [1,] 0.0538 -1.80043 -2.29607 -1.020 0.496 0.110 1.0251 1.790
# [2,] 0.7534 -0.10643 0.00465 -0.754 0.519 -0.741 -1.4492 -0.262
# [3,] 0.2499 1.83347 -1.61634 -1.226 -0.422 -0.511 1.4175 -1.297
# [4,] -0.4441 1.02390 1.73313 -1.017 0.863 -0.912 -1.0353 0.618
# [5,] -0.0503 -0.00429 -0.67368 1.722 -0.778 -1.293 0.0853 -0.292
# [6,] -0.4678 2.27991 -0.09442 3.000 0.148 0.905 0.2451 -0.301
m2 <- apply(m, 1, mean)
length(m2)
# [1] 1000
head(m2)
# [1] 0.1513 -0.2089 -0.4366 -0.0339 -0.1544 0.0959
mean(m[1,])
# [1] 0.151
tibble(i = seq_along(m2), mu = m2)
# # A tibble: 1,000 x 2
# i mu
# <int> <dbl>
# 1 1 0.151
# 2 2 -0.209
# 3 3 -0.437
# 4 4 -0.0339
# 5 5 -0.154
# 6 6 0.0959
# 7 7 0.105
# 8 8 -0.503
# 9 9 0.0384
# 10 10 -0.175
# # ... with 990 more rows
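The example above draws from a normal distribution with rnorm(), while the question asks about a uniform distribution. The same approach might be adapted as in the sketch below. It assumes the default Uniform(0, 1) range and an arbitrary seed, and it uses the question's names uniformSampleMeans and sampMean. The name m_unif is introduced here only for illustration.

```r
library(tibble)

set.seed(42)                 # arbitrary seed, only for reproducibility
unif_sample_size <- 20       # sample size
n_samples <- 1000            # number of samples

# one call draws all 20,000 values; each row is one sample of size 20
m_unif <- matrix(runif(n_samples * unif_sample_size), ncol = unif_sample_size)

# rowMeans() is the base R shortcut for apply(m_unif, 1, mean)
uniformSampleMeans <- tibble(sampMean = rowMeans(m_unif))
```

Since every row holds 20 independent draws, each entry of sampMean is the mean of one sample, which matches what the loop in the question was meant to compute.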