How can I split, or apply a function to, interspersed groups in R when each group has a fixed chunk size?

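Problem description

I have a data frame that contains two groups, but nothing identifies them. The two groups are interleaved many times, always with the same chunk sizes. For example, in a data frame with 100 rows, the first 10 rows belong to group A, the next 6 rows to group B, the next 10 rows to group A again, and so on. In the snippet below, rows 1 to 10 belong to group A, rows 11 to 16 to group B, rows 17 to 26 to group A again, and so on.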

1,0.001284150523134
2,0.002207901328802
3,0.002915323944762
4,0.003469731891528
5,0.003921566996723
6,0.004299059510231
7,0.004616158548743
8,0.004884272348136
9,0.005112133454531
10,0.005309570115060
11,0.004684340208769
12,0.004182199947536
13,0.003777556587011
14,0.003452226985246
15,0.003190805669874
16,0.002980756806210
17,0.003067432902753
18,0.003176181111485
19,0.003286415245384
20,0.003386073280126
21,0.003470669966191
22,0.003541931044310
23,0.003600175259635
24,0.003642340423539
25,0.003669032361358
26,0.003684990806505
...

How can I split this data frame in two? Or, better still, how can I apply a calculation/function to each of these chunks, one at a time?

Tags: r, dataframe, split

Solution


I think you can create a group counter from a repeating sequence:

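# Example data frame with 32 rows
dat <- data.frame(id = 1:32)

# The inner rep() builds one 16-row cycle (ten 1s, then six 2s);
# the outer rep() recycles that cycle to the number of rows
dat$grp <- rep(rep(c(1, 2), c(10, 6)), length.out = nrow(dat))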
dat
#   id grp
#1   1   1
 ...
#10 10   1
#11 11   2
 ...
#16 16   2
#17 17   1
 ...
#26 26   1
#27 27   2
 ...
#32 32   2

Then you can apply whatever function you want within each group via aggregate / by / dplyr::group_by / data.table's by=, etc.
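
As a minimal sketch of that last step in base R: the `value` column and its contents are made up for illustration, and the `chunk` column is an extra assumption that is not part of the answer above. It numbers each consecutive run of rows, so a function can be applied to every block one at a time rather than to group A or B as a whole.

```r
# Hypothetical data: 32 rows with a made-up numeric column `value`
dat <- data.frame(id = 1:32, value = (1:32) / 10)

# Group label: 10 rows of "A", then 6 rows of "B", recycled to nrow(dat)
dat$grp <- rep(rep(c("A", "B"), c(10, 6)), length.out = nrow(dat))

# Chunk counter (assumption): increases by 1 whenever the group label changes
dat$chunk <- cumsum(c(TRUE, dat$grp[-1] != dat$grp[-nrow(dat)]))

# Split the data frame in two, one data frame per group
by_group <- split(dat, dat$grp)

# Apply a function (here: mean) once per group
group_means <- aggregate(value ~ grp, data = dat, FUN = mean)

# Apply a function once per individual chunk
chunk_means <- sapply(split(dat$value, dat$chunk), mean)
```

Here `by_group$A` and `by_group$B` are the two halves of the data frame, while `chunk_means` holds one result per block. Swap `mean` for whatever calculation you need.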

