How can I program efficiently in R without "for" loops?

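Problem description

I am trying to program more efficiently by getting rid of a "for" loop, but when I remove one loop the run time increases.

What am I doing wrong?

Please don't focus on the results: the numbers are placeholders, and my real code has more inside the "for". The line I need to improve is the one marked "problem line".

Attempt 1 takes 1.7 seconds.

Attempt 2 takes 9 seconds.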

nSteps = 200; p=0.45

v = data.frame(matrix(0,nrow=nSteps+1,ncol=nSteps+1))
v[nSteps+1,] <- rep(0.2,nSteps+1)

check = data.frame(matrix(15,nrow=nSteps+1,ncol=nSteps+1))

#################
### attempt 1 ###
#################

for (m in nSteps:0) {
    for (n in 1:(m+1)) {
        hold = (1-p)*v[m+1,n]+p*v[m+1,n+1] #### problem line
        v[m,n] = ifelse(check[m,n]>=0,max(check[m,n],hold),max(hold,0))
        # more code here...
    }
}

#################
### attempt 2 ###
#################

seq1 = 1:nSteps
seq2 = 2:(nSteps+1)
for ( m in (nSteps:1)){
    vec = (1-p)*v[m+1,seq1]+p*v[m+1,seq2] ##### problem line
    v[m,]<-c(t(vec),0)
    # more code here... 
}

Tags: r

Solution


From a quick look, it seems performance can be improved by converting the data.frame to a matrix with data.matrix().
In general, a matrix performs much better than a data frame; see:

https://csgillespie.github.io/efficientR/7-1-data-types.html#matrix
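
As a minimal sketch of why the container matters (the objects `df` and `mat` below are made up for illustration): a data.frame is stored as a list of column vectors and is indexed through the R-level method `[.data.frame`, while a numeric matrix is a single atomic vector with a `dim` attribute. The timings will vary by machine, so no expected numbers are given.

```r
# Hypothetical objects for illustration only
mat <- matrix(0, nrow = 201, ncol = 201)
df  <- data.frame(mat)

is.list(df)    # TRUE: a data.frame is a list of columns
is.list(mat)   # FALSE: a matrix is one atomic vector with dimensions
typeof(mat)    # "double"

# Time many single-element reads from each structure
n_reads <- 20000
t_df  <- system.time(for (i in seq_len(n_reads)) df[100, 100])["elapsed"]
t_mat <- system.time(for (i in seq_len(n_reads)) mat[100, 100])["elapsed"]
print(c(data.frame = unname(t_df), matrix = unname(t_mat)))
```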

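I'm not sure what you are trying to accomplish...

For example, the following operation (a scalar times a matrix times a vector) is noticeably faster with a matrix: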

DF <- data.frame(a = 1:3, b = 4:6,c = 7:9)
V <- data.frame(a = 10:12)

dm <- data.matrix(DF)
dv <- data.matrix(V)

DFl <- list()
dml <- list()

system.time(
  for ( m in 2500:1){
    DFl[[m]] <-( 3 * DF * V[,1])
  }
)

system.time(
  for ( m in 2500:1){
    dml[[m]] <- (3 * dm * dv[, 1])  # dv[, 1] is the whole column, matching V[,1] above
  }
)

Switching to a matrix greatly improves the performance of the first attempt (roughly 3 times faster).
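
One likely reason attempt 2 is slower on a data.frame can be sketched as follows (the objects `df` and `mat` and their fill value are made up for illustration): selecting a row of a data.frame returns a one-row data.frame rather than a numeric vector, so the arithmetic in the "problem line" is applied column by column across 200 columns. The same selection on a matrix gives a plain numeric vector.

```r
nSteps <- 200; p <- 0.45
mat <- matrix(0.2, nrow = nSteps + 1, ncol = nSteps + 1)  # illustrative data
df  <- data.frame(mat)

seq1 <- 1:nSteps
seq2 <- 2:(nSteps + 1)

row_df  <- df[nSteps + 1, seq1]    # one-row data.frame with 200 columns
row_mat <- mat[nSteps + 1, seq1]   # numeric vector of length 200

class(row_df)    # "data.frame"
class(row_mat)   # "numeric"

# The same recurrence step on each structure
vec_df  <- (1 - p) * df[nSteps + 1, seq1]  + p * df[nSteps + 1, seq2]
vec_mat <- (1 - p) * mat[nSteps + 1, seq1] + p * mat[nSteps + 1, seq2]

# Both hold the same numbers; only the container differs
all.equal(unname(unlist(vec_df)), vec_mat)
```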

Running the code below returns:

             user  system  elapsed
Attempt 1    2.11    0.00     2.11
Attempt 1a   0.69    0.00     0.69
Attempt 2    8.60    0.00     8.63
Attempt 3    0.02    0.00     0.02

Comparing the results with compare() returns TRUE.

library(compare)
nSteps = 200; p=0.45

v = data.frame(matrix(0,nrow=nSteps+2,ncol=nSteps+2))

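# CS: added an extra row and column. The loop reads v[m+1, n+1], one column past the original range;
# the original logic seems to assume an out-of-range data.frame lookup returns NULL,
# while the same lookup on a matrix fails with "subscript out of bounds".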

v[nSteps+1, 1:(nSteps+1)] <- rep(0.2,nSteps+1)  # fill only the original columns; the padding column stays 0
vtemp <- v

check = data.frame(matrix(15,nrow=nSteps+1,ncol=nSteps+1))


#################
### attempt 1 ###
#################
v<- vtemp 
system.time(
for ( m in nSteps:0){
  for (n in 1:(m+1)){
    hold = (1-p)*v[m+1,n]+p*v[m+1,n+1] #### problem line
    v[m,n] = ifelse(check[m,n]>=0,max(check[m,n],hold),max(hold,0))
    # more code here...     
  }
}
)
v1 <- v

#################
### attempt 1a ###
#################
v<- vtemp 
check2 = matrix(15,nrow=nSteps+1,ncol=nSteps+1)  # matrix copy of check; not used below, the loop still reads the data.frame check
v1a <- data.matrix(v) 
system.time(
  for ( m in nSteps:0){
    for (n in 1:(m+1)){
      hold = (1-p)*v1a[m+1,n]+p*v1a[m+1,n+1] #### problem line
      v1a[m,n] = ifelse(check[m,n]>=0,max(check[m,n],hold),max(hold,0))
      # more code here...     
    }
  }
)

v1a <- data.frame(v1a)
compare(v1,v1a)



#################
### attempt 2 ###
#################

v = data.frame(matrix(0,nrow=nSteps+1,ncol=nSteps+1))
v[nSteps+1,] <- rep(0.2,nSteps+1)
vtemp <- v


seq1 = 1:nSteps
seq2 = 2:(nSteps+1)
system.time(
for ( m in (nSteps:1)){
  vec = (1-p)*v[m+1,seq1]+p*v[m+1,seq2] ##### problem line
  v[m,]<-c(t(vec),0)
  # more code here... 
}
)
v2 <- v

#################
### attempt 3 ###
#################

seq1 = 1:nSteps
seq2 = 2:(nSteps+1)

v3 <- data.matrix(vtemp)
# note: R indices are 1-based; at m = 0 the assignment to v3[0, ] selects no rows and does nothing
system.time(
  for ( m in (nSteps:0)){
    vec = (1-p) * v3[m+1,seq1] + p * v3[m+1,seq2] ##### problem line
    v3[m,]<-c(t(vec),0)
    # more code here... 
  }
)
v3 <- data.frame(v3)

compare(v2,v3)
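
If the compare package is not installed, base R can do the same check. The sketch below wraps the attempt 3 recurrence in a function and checks the matrix result against the data.frame result with `all.equal()`. The function name `roll_back` and the small `nSteps` are assumptions made for illustration, and the "more code here" part of the original loop is left out.

```r
# Hypothetical helper: attempt-3-style recurrence on a matrix or data.frame
roll_back <- function(v, p) {
  nSteps <- nrow(v) - 1
  seq1 <- 1:nSteps
  seq2 <- 2:(nSteps + 1)
  for (m in nSteps:1) {
    vec <- (1 - p) * v[m + 1, seq1] + p * v[m + 1, seq2]
    v[m, ] <- c(unlist(vec), 0)   # unlist() flattens the data.frame case
  }
  v
}

nSteps <- 20; p <- 0.45           # small size so the check runs quickly
v_mat <- matrix(0, nrow = nSteps + 1, ncol = nSteps + 1)
v_mat[nSteps + 1, ] <- 0.2
v_df <- data.frame(v_mat)

res_mat <- roll_back(v_mat, p)
res_df  <- roll_back(v_df, p)

# Compare values only, ignoring row and column names
all.equal(res_mat, unname(as.matrix(res_df)))
```

Keeping the recurrence in one function means the matrix and data.frame versions run exactly the same steps, so any difference reported by `all.equal()` would come from the container and not from the loop logic.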
