r - R Automatically name results of Loop
Problem description
I have a list of dataframes:
df_DJF = data.frame(replicate(2,sample(0:130,30,rep=TRUE)))
df_JJA = data.frame(replicate(2,sample(0:130,20,rep=TRUE)))
df_MAM = data.frame(replicate(2,sample(0:130,25,rep=TRUE)))
df_SON = data.frame(replicate(2,sample(0:130,15,rep=TRUE)))
df_list = list(df_DJF, df_JJA, df_MAM, df_SON)
I want to randomly choose 80% of the rows of each data frame. I can do that manually like this, drawing sample_size row indices:
sample_size = floor(0.8*nrow(df_DJF))
picked_DJF = sample(seq_len(nrow(df_DJF)), size = sample_size)
My problem is that I have very many data frames with different numbers of rows, so I want to automate this process. In the end I want to have 4 sample sizes, each holding the correct number. The sample sizes should be named:
samplenames = paste("sample_size", c("DJF", "JJA", "MAM", "SON"), sep = "_")
The same goes for the "picked" objects: they should be named picked_DJF and so on.
Solution
Keep using lists rather than assign. Set names(df_list) = c("DJF", "JJA", "MAM", "SON"), and then the same names carry over to any subsequent lists, such as the picked list.
# for a single sample size
picked = lapply(df_list, function(x) x[sample(1:nrow(x), size = floor(0.8 * nrow(x))), ])
Using lapply preserves the names of the original list, so you don't have to worry about them.
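To see the names carry through, and to get the per-data-frame sample sizes the question asked about, here is a minimal sketch. It rebuilds the example data from the question so that it runs by itself; the seed and the object names sample_size and picked_idx are illustrative assumptions, not part of the original answer.

```r
set.seed(42)  # assumption: seed added only so the example is reproducible

# Rebuild the example data from the question as a named list
df_list = list(
  DJF = data.frame(replicate(2, sample(0:130, 30, replace = TRUE))),
  JJA = data.frame(replicate(2, sample(0:130, 20, replace = TRUE))),
  MAM = data.frame(replicate(2, sample(0:130, 25, replace = TRUE))),
  SON = data.frame(replicate(2, sample(0:130, 15, replace = TRUE)))
)

# One sample size per data frame; sapply keeps the list names
sample_size = sapply(df_list, function(x) floor(0.8 * nrow(x)))

# One vector of picked row indices per data frame; lapply keeps the names too
picked_idx = lapply(df_list, function(x) {
  sample(seq_len(nrow(x)), size = floor(0.8 * nrow(x)))
})

names(picked_idx)   # same names as df_list
sample_size["DJF"]  # look up a single sample size by name
picked_idx$DJF      # row indices picked from the DJF data frame
```

Looking elements up by name (sample_size["DJF"], picked_idx$DJF) plays the role that separate variables such as sample_size_DJF and picked_DJF would have played.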
For multiple sample sizes from each data frame, you can create a nested list with nested lapply calls:
names(df_list) = c("DJF", "JJA", "MAM", "SON")
sample_prop = list(s1 = 0.2, s2 = 0.4, s3 = 0.6, s4 = 0.8)
picked = lapply(df_list, function(df) lapply(sample_prop, function(sp) {
df[sample(nrow(df), size = floor(sp * nrow(df))), ]
}))
# then access individual data frames with `$` or `[[`
picked$JJA$s3
# X1 X2
# 17 70 128
# 7 94 121
# 1 57 125
# 8 32 75
# 9 15 8
# 19 58 15
# 20 55 17
# 10 42 15
# 4 51 67
# 12 89 13
# 2 74 50
# 14 77 36
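One way to check that every element of the nested list has the expected number of rows is to tabulate nrow over both levels. The sketch below rebuilds df_list and sample_prop so that it runs by itself; the seed and the name n_rows are assumptions added for illustration.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible

df_list = list(
  DJF = data.frame(replicate(2, sample(0:130, 30, replace = TRUE))),
  JJA = data.frame(replicate(2, sample(0:130, 20, replace = TRUE))),
  MAM = data.frame(replicate(2, sample(0:130, 25, replace = TRUE))),
  SON = data.frame(replicate(2, sample(0:130, 15, replace = TRUE)))
)
sample_prop = list(s1 = 0.2, s2 = 0.4, s3 = 0.6, s4 = 0.8)

picked = lapply(df_list, function(df) lapply(sample_prop, function(sp) {
  df[sample(nrow(df), size = floor(sp * nrow(df))), ]
}))

# Matrix of row counts: one row per sample proportion, one column per data frame
n_rows = sapply(picked, function(per_df) sapply(per_df, nrow))
n_rows

# A single entry, looked up by name
n_rows["s3", "JJA"]
```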
To divide each data frame into "picked" and "unpicked" parts, split makes sense, since it already returns a list. This gives a triply nested list as the result:
result = lapply(df_list, function(df) lapply(sample_prop, function(sp) {
n_pick = floor(sp * nrow(df))
n_unpick = nrow(df) - n_pick
split(df, f = c(rep("picked", n_pick), rep("unpicked", n_unpick))[sample(nrow(df))])
}))
result$JJA$s3$unpicked
# X1 X2
# 2 74 50
# 3 62 78
# 4 51 67
# 6 103 42
# 7 94 121
# 11 59 60
# 14 77 36
# 16 83 72
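As a sanity check on the split approach, the sketch below applies it to one data frame and confirms that the two parts do not overlap and together cover every row. The seed, the 0.6 proportion, and the names labels and parts are assumptions chosen for illustration.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible

df = data.frame(replicate(2, sample(0:130, 20, replace = TRUE)))

n_pick = floor(0.6 * nrow(df))
n_unpick = nrow(df) - n_pick

# Shuffle a vector of labels so each row gets "picked" or "unpicked" at random
labels = c(rep("picked", n_pick), rep("unpicked", n_unpick))[sample(nrow(df))]
parts = split(df, f = labels)

# The two parts have the requested sizes and share no rows
nrow(parts$picked) == n_pick
nrow(parts$unpicked) == n_unpick
length(intersect(rownames(parts$picked), rownames(parts$unpicked))) == 0
```

Because the label vector has exactly n_pick "picked" entries before shuffling, the sizes are fixed and only the assignment of rows to parts is random.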