R: custom function does not work on a single-column data frame

Problem description

I created a function that generates a weighted moving average for every value of i from 0.0 to 1.0 in steps of 0.1. See below:

ewmaFunc<- function(x){
  # create data frame to store results in
  result_df<- data.frame(x)
  # assign names to be applied as a column
  xnames<- names(x)
  # define range for exponential weighted moving average (ewma)
  exponent<- seq(0, 1, .1)
  # create function for ewma
  ewma<- function(x){
    x*(1-i)+dplyr::lag(x, n= 1, default = 0)*i
  }
  for(i in exponent){
    result_column<- apply(x, 2, ewma)
    result_column_name<- paste(xnames, i, sep= "_")
    result_df[result_column_name] <- result_column  
  }
  return(data.frame(result_df)) 
}
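
Written out for a single weight i, the inner ewma above computes the following. This is only a restatement of the code, with x_0 taken as 0 because of default = 0 in dplyr::lag:

```latex
y_t = (1 - i)\,x_t + i\,x_{t-1}, \qquad x_0 = 0, \quad i \in \{0, 0.1, \dots, 1\}
```

For instance, in the single-column output below with i = 0.1, row 2 is 0.9 × 4.246326 + 0.1 × 8.393294 ≈ 4.661023.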

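For some reason, when I run the function on a single-column data frame, it does not apply the custom names given by result_column_name<- paste(xnames, i, sep= "_"). If the data frame has more than one column, however, it works fine. See the following example: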

test<-data.frame(var=rnorm(10,5,2))
ewmaFunc(test)

            var      var      var      var      var      var      var      var      var      var
1  8.393294 8.393294 7.553964 6.714635 5.875306 5.035976 4.196647 3.357317 2.517988 1.678659
2  4.246326 4.246326 4.661023 5.075719 5.490416 5.905113 6.319810 6.734507 7.149203 7.563900
3  3.706380 3.706380 3.760374 3.814369 3.868364 3.922358 3.976353 4.030347 4.084342 4.138337
4  5.173313 5.173313 5.026620 4.879926 4.733233 4.586540 4.439846 4.293153 4.146460 3.999766
5  5.215499 5.215499 5.211280 5.207062 5.202843 5.198624 5.194406 5.190187 5.185969 5.181750
6  3.911693 3.911693 4.042074 4.172454 4.302835 4.433216 4.563596 4.693977 4.824357 4.954738
7  4.000666 4.000666 3.991769 3.982872 3.973974 3.965077 3.956180 3.947283 3.938385 3.929488
8  3.716434 3.716434 3.744857 3.773280 3.801704 3.830127 3.858550 3.886973 3.915397 3.943820
9  4.561364 4.561364 4.476871 4.392378 4.307885 4.223392 4.138899 4.054406 3.969913 3.885420
10 3.820445 3.820445 3.894537 3.968628 4.042720 4.116812 4.190904 4.264996 4.339088 4.413180

...

A two-column data frame works as expected:

test<-data.frame(var1=rnorm(10,5,2), var2= rnorm(10, 3, 5))
ewmaFunc(test)

        var1       var2    var1_0     var2_0  var1_0.1   var2_0.1 var1_0.2   var2_0.2 var1_0.3  var2_0.3
1   6.156138  8.0737011  6.156138  8.0737011  5.540524  7.2663310 4.924910  6.4589609 4.309297 5.6515908
2   5.020908  1.8764009  5.020908  1.8764009  5.134431  2.4961309 5.247954  3.1158609 5.361477 3.7355909
3   2.491374 -0.6065826  2.491374 -0.6065826  2.744327 -0.3582843 2.997281 -0.1099859 3.250234 0.1383124
4   3.986528  5.3498418  3.986528  5.3498418  3.837012  4.7541994 3.687497  4.1585569 3.537981 3.5629145
5   7.487246  0.5405067  7.487246  0.5405067  7.137174  1.0214402 6.787102  1.5023738 6.437031 1.9833073
6   3.368964  6.0020006  3.368964  6.0020006  3.780793  5.4558512 4.192621  4.9097018 4.604449 4.3635524
7   3.857049  9.2469373  3.857049  9.2469373  3.808241  8.9224436 3.759432  8.5979500 3.710624 8.2734563
8  10.864870  5.4223945 10.864870  5.4223945 10.164088  5.8048488 9.463306  6.1873031 8.762524 6.5697574
9   8.484475  0.4140111  8.484475  0.4140111  8.722515  0.9148494 8.960554  1.4156878 9.198594 1.9165261
10  6.520918  9.9092620  6.520918  9.9092620  6.717274  8.9597369 6.913630  8.0102118 7.109985 7.0606867

...

Any feedback on why this is happening?

Tags: r, function, for-loop

Solution


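Nice catch. I have not worked out exactly what happens to the column names when one column is assigned rather than several, but here is a solution that seems to work. The trick is to give the object being appended explicit column names.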

  for(i in exponent){
    result_column<- apply(x, 2, ewma)
    result_column_name<- paste(xnames, i, sep= "_")
    colnames(result_column) <- result_column_name
    result_df[, result_column_name] <- result_column
  }

> ewmaFunc(test1)
        var    var_0  var_0.1  var_0.2  var_0.3  var_0.4  var_0.5  var_0.6  var_0.7
1  6.123084 6.123084 5.510776 4.898467 4.286159 3.673850 3.061542 2.449234 1.836925
2  7.276368 7.276368 7.161040 7.045711 6.930383 6.815055 6.699726 6.584398 6.469069
3  5.767394 5.767394 5.918292 6.069189 6.220086 6.370984 6.521881 6.672779 6.823676

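I would also suggest making your ewma function more explicit about its arguments, for example by passing i in as an input argument as well (the call in the loop then becomes apply(x, 2, ewma, i = i)):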

ewma <- function(x, i) {
  x * (1 - i) + dplyr::lag(x, n = 1, default = 0) * i
}
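
Putting the two suggestions together, a minimal sketch might look like the following. It is not the original poster's function: `ewma_all` and `lag1` are hypothetical names, and `lag1` is a base-R stand-in assumed to behave like `dplyr::lag(x, n = 1, default = 0)` on a plain vector. Each result is assigned as a plain vector with `[[<-`, rather than as a matrix, so the column name is always given explicitly.

```r
# Base-R stand-in (assumption) for dplyr::lag(x, n = 1, default = 0).
lag1 <- function(x) c(0, x)[seq_along(x)]

# Weighted average of each value and its predecessor; the weight i is an explicit argument.
ewma <- function(x, i) {
  x * (1 - i) + lag1(x) * i
}

# Hypothetical wrapper: one new column per (input column, weight) pair.
ewma_all <- function(x, weights = seq(0, 1, 0.1)) {
  result_df <- data.frame(x)
  for (i in weights) {
    for (col in names(x)) {
      # Assign a plain vector under an explicit name, e.g. "var_0.1".
      result_df[[paste(col, i, sep = "_")]] <- ewma(x[[col]], i)
    }
  }
  result_df
}

test <- data.frame(var = c(2, 4, 6))
names(ewma_all(test))
```

With a single-column input this should give var, var_0, var_0.1 and so on, one column per weight, and the same loop covers data frames with several columns.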
