Using dplyr mutate_at to estimate the quantile of each value with respect to its vector

Problem description

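I am trying to modify the numeric variables of my data.frame by replacing each value with its associated quantile. This question has already been answered twice, here and here. However, I have a large data.frame with many variables, and I would like to process all of them with a single mutate_at call.

I managed to do it with a simple mutate call: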

library(dplyr)    
dd <- data.frame(id = letters[1:10], v1 = runif(10), v2 = runif(10))
dd %>% 
  mutate(v1_quantile = ecdf(v1)(v1))

Which works:

   id         v1         v2 v1_quantile
1   a 0.08301544 0.77170687         0.1
2   b 0.40879466 0.85685036         0.6
3   c 0.51528499 0.75810797         0.8
4   d 0.39688235 0.85030203         0.5
5   e 0.22272271 0.40929666         0.2
6   f 0.29234964 0.05501699         0.4
7   g 0.58406590 0.57812295         0.9
8   h 0.49091149 0.74505186         0.7
9   i 0.92299912 0.88631354         1.0
10  j 0.27979485 0.03078879         0.3
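
For reference, `ecdf(v1)` returns a function, and calling that function on `v1` gives, for each value, the share of observations less than or equal to it. A minimal base R check of that reading, where the seed and the names `v`, `by_ecdf` and `by_hand` are only for illustration:

```r
set.seed(1)                # assumption: fixed seed, only for repeatability
v <- runif(10)

Fn <- ecdf(v)              # ecdf() returns a step function
by_ecdf <- Fn(v)           # evaluate it at the original values

# Same quantity computed by hand: proportion of values <= each element
by_hand <- sapply(v, function(x) mean(v <= x))

all.equal(by_ecdf, by_hand)
```

With no ties, the result is simply each value's rank divided by the number of observations, which matches the `v1_quantile` column above.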

However, if I try to do the same thing in a mutate_at call, I get an error:

dd %>% 
   mutate_at(vars(one_of("v1", "v2")), list(quantile = ~ecdf(.)(.)))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : 
  undefined columns selected

Any idea what I am doing wrong and how to fix it?

My sessionInfo:

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] dplyr_0.8.0.1

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5      compiler_3.5.1        magrittr_1.5
 [4] assertthat_0.2.1      R6_2.4.0              pillar_1.4.2
 [7] glue_1.3.1            rstudioapi_0.8.0.9000 tibble_2.1.3
[10] crayon_1.3.4          Rcpp_1.0.2            pkgconfig_2.0.3
[13] rlang_0.4.0           purrr_0.3.0

Tags: r, dplyr, quantile

Solution
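
One thing worth trying, assuming the failure comes from how this dplyr version expands the `~` shorthand (an assumption, not something established in the question), is to pass an ordinary named function instead of a formula. The helper name `ecdf_quantile` below is hypothetical; the rest mirrors the code from the question:

```r
library(dplyr)

dd <- data.frame(id = letters[1:10], v1 = runif(10), v2 = runif(10))

# Hypothetical helper: empirical quantile of each value within its own vector
ecdf_quantile <- function(x) ecdf(x)(x)

# Same selection as in the question, but with a regular function
# in place of the ~ecdf(.)(.) formula
res <- dd %>%
  mutate_at(vars(one_of("v1", "v2")), list(quantile = ecdf_quantile))

names(res)
```

Because the list element is named `quantile` and more than one column is selected, the new columns should come out as `v1_quantile` and `v2_quantile`, with the original columns left untouched.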

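If upgrading is an option, dplyr 1.0.0 and later provide `across()`, which supersedes the scoped `mutate_at()` verbs. The sketch below assumes such a version is installed, which is not the 0.8.0.1 listed in the sessionInfo. It also shows `cume_dist()`, dplyr's own ranking helper, which computes the same proportion as `ecdf(x)(x)`:

```r
library(dplyr)  # assumption: dplyr >= 1.0.0 for across()

dd <- data.frame(id = letters[1:10], v1 = runif(10), v2 = runif(10))

# across() with the lambda shorthand; .names controls the new column names
res_across <- dd %>%
  mutate(across(all_of(c("v1", "v2")),
                ~ ecdf(.x)(.x),
                .names = "{.col}_quantile"))

# Same result without any lambda, using dplyr::cume_dist()
res_cume <- dd %>%
  mutate(across(all_of(c("v1", "v2")),
                cume_dist,
                .names = "{.col}_quantile"))

all.equal(res_across, res_cume)
```

Passing `cume_dist` directly avoids the anonymous function altogether, so it may be the simplest form when the empirical quantile is all that is needed.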
