Conditional difference between two data.frame columns

Question

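I have a tidy data.frame of experimental data in which subjects (ID) were measured three times (Trial) under two different conditions (Direction) on a continuous dependent variable (LC), at a different (!) number of time points (Session) per subject, say: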

set.seed(5)
nSubjects <- 4
nDirections <- 2
nTrials <- 3
# Between 1 and 3 sessions per subject:
nSessions <- round(runif(nSubjects,
                         min = 1, max = 3))
mydat <- data.frame(ID = do.call(rep, args = list(1:nSubjects,
                                                  times = nSessions * nDirections * nTrials)),
                    Session = rep(sequence(nSessions),
                                  each = nDirections * nTrials),
                    Trial = rep(rep(1:nTrials,
                                    each = nDirections),
                                times = sum(nSessions)),
                    Direction = rep(c("up", "down"),
                                    times = nTrials * sum(nSessions)),
                    LC = 1:(nDirections * nTrials * sum(nSessions)))

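What I would like to compute is a vector of length nrow(mydat) containing the difference in LC between the first and the current session for a given subject, trial and direction. In other words, from each (absolute) LC score for any ID, Session, Trial and Direction, subtract the (absolute) LC score at Session == 1 for the same ID, Trial and Direction, like this (for simplicity, I made LC monotonically increasing):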

#     ID Session Trial Direction LC LC_diff
#  7   2       1     1        up  7       0
#  8   2       1     1      down  8       0
#  9   2       1     2        up  9       0
# 10   2       1     2      down 10       0
# 11   2       1     3        up 11       0
# 12   2       1     3      down 12       0
# 13   2       2     1        up 13       6
# 14   2       2     1      down 14       6
# 15   2       2     2        up 15       6
# 16   2       2     2      down 16       6
# 17   2       2     3        up 17       6
# 18   2       2     3      down 18       6

I thought the following code would produce the expected result:

library(dplyr)
ordered <- group_by(mydat, ID, Session, Trial, Direction)
mydat$LC_diff <- summarise(ordered,
                           Diff = sum(abs(LC[Trial != 1]),
                                      - abs(LC[Trial == 1])))$Diff

Alas:

mydat[7:18, ]

#    ID Session Trial Direction LC LC_diff
# 7   2       1     1        up  7      -8
# 8   2       1     1      down  8      -7
# 9   2       1     2        up  9      10
# 10  2       1     2      down 10       9
# 11  2       1     3        up 11      12
# 12  2       1     3      down 12      11
# 13  2       2     1        up 13     -14
# 14  2       2     1      down 14     -13
# 15  2       2     2        up 15      16
# 16  2       2     2      down 16      15
# 17  2       2     3        up 17      18
# 18  2       2     3      down 18      17

I am at a complete loss here and would appreciate pointers to where my code goes wrong.

Tags: r

Answer


I'm not sure this is what you mean, but with data.table it would look like this:

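library(data.table)
# Group by subject, trial and direction (not Session), then subtract the first
# value in each group; this relies on rows being ordered by Session.
setDT(mydat)[, new := abs(LC) - abs(LC[1]), by = .(ID, Trial, Direction)]
mydat[ID == 2, ]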
    ID Session Trial Direction LC new
 1:  2       1     1        up  7   0
 2:  2       1     1      down  8   0
 3:  2       1     2        up  9   0
 4:  2       1     2      down 10   0
 5:  2       1     3        up 11   0
 6:  2       1     3      down 12   0
 7:  2       2     1        up 13   6
 8:  2       2     1      down 14   6
 9:  2       2     2        up 15   6
10:  2       2     2      down 16   6
11:  2       2     3        up 17   6
12:  2       2     3      down 18   6
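
As for where the dplyr attempt in the question goes wrong, two things seem to be at play. Grouping by ID, Session, Trial and Direction leaves exactly one row per group, so inside each group either LC[Trial != 1] or LC[Trial == 1] is empty and the sum() just returns +LC or -LC. In addition, summarise() returns one row per group sorted by the grouping keys ("down" before "up"), so assigning its result back to mydat misaligns the rows. A minimal sketch of a dplyr equivalent of the data.table code above, assuming every ID/Trial/Direction combination has exactly one Session == 1 row (the name result is only illustrative):

```r
library(dplyr)

# Rebuild the example data from the question (same seed, same design).
set.seed(5)
nSubjects <- 4
nDirections <- 2
nTrials <- 3
nSessions <- round(runif(nSubjects, min = 1, max = 3))
mydat <- data.frame(
  ID        = rep(1:nSubjects, times = nSessions * nDirections * nTrials),
  Session   = rep(sequence(nSessions), each = nDirections * nTrials),
  Trial     = rep(rep(1:nTrials, each = nDirections), times = sum(nSessions)),
  Direction = rep(c("up", "down"), times = nTrials * sum(nSessions)),
  LC        = 1:(nDirections * nTrials * sum(nSessions))
)

# Group WITHOUT Session so that each group spans all sessions of one
# ID/Trial/Direction combination. mutate() returns one value per input row
# and keeps the original row order, unlike summarise().
result <- mydat %>%
  group_by(ID, Trial, Direction) %>%
  # Pick the baseline explicitly by Session == 1 rather than by position.
  # Assumption: exactly one Session == 1 row exists per group.
  mutate(LC_diff = abs(LC) - abs(LC[Session == 1])) %>%
  ungroup()

print(as.data.frame(result[result$ID == 2, ]))
```

Selecting the baseline with Session == 1 instead of LC[1] means the result does not depend on the rows being sorted by session, at the cost of failing if a group has no first session or more than one.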
