How do I use dplyr::across with a multi-argument function on a grouped data frame?

Problem description

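I want to compute a weighted moving average for several columns, using the same weights for each column. The weighted moving average should be computed by group (in contrast to using `dplyr::across` with functions that take more than one argument).

In the example below, the grouping should make the weighted moving average "reset" every year, yielding missing values for the first two observations of each year.

How can I make this work?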

library(tidyverse)

weighted.filter <- function(x, wt, filter, ...) {
  filter <- filter / sum(filter)
  stats::filter(x * wt, filter, ...) / stats::filter(wt, filter, ...)
}

economics %>%
  group_by(year = lubridate::year(date)) %>%
  arrange(date) %>%
  mutate(across(
    c(pce, psavert, uempmed),
    list("moving_average_weighted" = weighted.filter),
    wt = pop, filter = rep(1, 3), sides = 1
  ))
#> Error: Problem with `mutate()` input `..1`.
#> x Input `..1` can't be recycled to size 12.
#> ℹ Input `..1` is `(function (.cols = everything(), .fns = NULL, ..., .names = NULL) ...`.
#> ℹ Input `..1` must be size 12 or 1, not 6.
#> ℹ The error occurred in group 2: year = 1968.

Created on 2021-03-31 by the reprex package (v1.0.0)
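To make the intended result concrete, here is a minimal base R sketch of what `weighted.filter()` computes on a single ungrouped vector. The values of `x` and `w` are hypothetical and not taken from `economics`; the third element is checked against a weighted mean computed by hand.

```r
weighted.filter <- function(x, wt, filter, ...) {
  filter <- filter / sum(filter)
  stats::filter(x * wt, filter, ...) / stats::filter(wt, filter, ...)
}

# Hypothetical values, for illustration only
x <- c(1, 2, 3, 4)
w <- c(1, 1, 2, 2)

# One-sided window of length 3: the first two positions have no full window
wma <- as.numeric(weighted.filter(x, wt = w, filter = rep(1, 3), sides = 1))

# Third element by hand: weighted mean of x[1:3] with weights w[1:3]
by_hand <- sum(x[1:3] * w[1:3]) / sum(w[1:3])

print(wma)
print(by_hand)
```

With `sides = 1` the window only looks backwards, which is why each group is expected to start with two missing values.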

Tags: r, dplyr, moving-average, across

Solution


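Try wrapping the call in a lambda, so that `pop` is looked up inside each group. Judging by the error message, the `wt = pop` passed through `...` was evaluated only once, for the first group (6 rows), and could not be recycled to the 12 rows of 1968.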

economics %>%
  group_by(year = lubridate::year(date)) %>%
  arrange(date) %>%
  mutate(across(
    c(pce, psavert, uempmed),
    list("moving_average_weighted" =
      ~ weighted.filter(., wt = pop, filter = rep(1, 3), sides = 1))
  ))
# # A tibble: 574 x 10
# # Groups:   year [49]
#    date         pce    pop psavert uempmed unemploy  year pce_moving_average_w~ psavert_moving_avera~ uempmed_moving_avera~
#    <date>     <dbl>  <dbl>   <dbl>   <dbl>    <dbl> <dbl>                 <dbl>                 <dbl>                 <dbl>
#  1 1967-07-01  507. 198712    12.6     4.5     2944  1967                   NA                   NA                   NA   
#  2 1967-08-01  510. 198911    12.6     4.7     2945  1967                   NA                   NA                   NA   
#  3 1967-09-01  516. 199113    11.9     4.6     2958  1967                  511.                  12.4                  4.60
#  4 1967-10-01  512. 199311    12.9     4.9     3143  1967                  513.                  12.5                  4.73
#  5 1967-11-01  517. 199498    12.8     4.7     3066  1967                  515.                  12.5                  4.73
#  6 1967-12-01  525. 199657    11.8     4.8     3018  1967                  518.                  12.5                  4.80
#  7 1968-01-01  531. 199808    11.7     5.1     2878  1968                   NA                   NA                   NA   
#  8 1968-02-01  534. 199920    12.3     4.5     3001  1968                   NA                   NA                   NA   
#  9 1968-03-01  544. 200056    11.7     4.1     2877  1968                  536.                  11.9                  4.57
# 10 1968-04-01  544  200208    12.3     4.6     2709  1968                  541.                  12.1                  4.40
# # ... with 564 more rows
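The per-group "reset" can also be checked on a small data set. The sketch below uses a hypothetical tibble `toy` with two groups of unequal size (the column names `g`, `x`, `w` are made up for illustration) and the same lambda form as above; `as.numeric()` is added only to drop the time-series class that `stats::filter()` returns.

```r
library(dplyr)

weighted.filter <- function(x, wt, filter, ...) {
  filter <- filter / sum(filter)
  stats::filter(x * wt, filter, ...) / stats::filter(wt, filter, ...)
}

# Hypothetical toy data: group "a" has 4 rows, group "b" has 3
toy <- tibble(
  g = c("a", "a", "a", "a", "b", "b", "b"),
  x = c(1, 2, 3, 4, 10, 20, 30),
  w = c(1, 1, 2, 2, 1, 3, 1)
)

res <- toy %>%
  group_by(g) %>%
  mutate(across(
    x,
    # The lambda is evaluated per group, so `w` has the group's length
    list(wma = ~ as.numeric(
      weighted.filter(.x, wt = w, filter = rep(1, 3), sides = 1)
    ))
  )) %>%
  ungroup()

print(res)
```

Each group should begin with two `NA` values in `x_wma`, regardless of the group's size, because the weights are taken from the current group rather than from the first one.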
