R conditional variable

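Problem description

I want to compute, for each "site", a new variable (mean.BC) that is conditional on x being BC5, BC6, or BC7. In other words, take mean(c(19, 70, 84)) and write the result into every row with site "a", then do the same for every row with site "b", "c", and so on, where the "y" values for BC5, BC6, and BC7 differ from site to site.

This may not be the best approach. I did try spreading the data with tidyr::spread(), using "x" as the key, but the result did not make sense to me.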

x <- c("A1", "B2", "C3", "D4", "BC5", "BC6", "BC7")
y <- c(34, 45, 11, 10, 19, 70, 84, 12, 45, 55, 67, 89, 23, 1)
site <- c(rep("a", 7), rep("b", 7))

test.data <- data.frame(site, x, y)

# site x  y   meanBC
# 1    a        A1 34   
# 2    a        B2 45
# 3    a        C3 11
# 4    a        D4 10
# 5    a       BC5 19
# 6    a       BC6 70

test.data %>% as.tibble() %>% 
  group_by(site) %>% 
  mutate(meanBC= if_else(test.data$x==c("BC5","BC6","BC7"), mean(y), 999))
#> Error in test.data %>% as.tibble() %>% group_by(site) %>% mutate(meanBC = if_else(test.data$x == : could not find function "%>%"

The expected result should look like this:

site rep.x..2.  y   meanBC
# 1    a        A1 34   57.6
# 2    a        B2 45   57.6
# 3    a        C3 11   57.6
# 4    a        D4 10   57.6
# 5    a       BC5 19   57.6
# 6    a       BC6 70   57.6

Tags: r

Solution


Using dplyr, we can group_by site and compute the mean of the y values whose corresponding x is one of c("BC5", "BC6", "BC7"):

library(dplyr)
test.data %>%
   group_by(site) %>%
   mutate(mean.BC = mean(y[x %in% c("BC5", "BC6","BC7")]))

#  site   x       y mean.BC
# <fct>  <fct> <dbl>   <dbl>
# 1 a     A1       34    57.7
# 2 a     B2       45    57.7
# 3 a     C3       11    57.7
# 4 a     D4       10    57.7
# 5 a     BC5      19    57.7
# 6 a     BC6      70    57.7
# 7 a     BC7      84    57.7
# 8 b     A1       12    37.7
# 9 b     B2       45    37.7
#10 b     C3       55    37.7
#11 b     D4       67    37.7
#12 b     BC5      89    37.7
#13 b     BC6      23    37.7
#14 b     BC7       1    37.7
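
As background on why the attempt in the question does not work, here is a minimal sketch; it is an illustration, not part of the original answer. The reported error itself only means that dplyr (which provides %>%) had not been loaded. Beyond that, == compares element by element and recycles the shorter vector, so it does not test set membership the way %in% does. Referring to test.data$x inside mutate() also bypasses the grouping, since it is the full column rather than the per-site slice.

```r
# Illustration only: compare `==` with `%in%` on the x values of one site.
x  <- c("A1", "B2", "C3", "D4", "BC5", "BC6", "BC7")
bc <- c("BC5", "BC6", "BC7")

# `==` recycles bc along x (and warns, because 7 is not a multiple of 3),
# so each x is compared with only one of the three labels.
wrong <- suppressWarnings(x == bc)

# `%in%` asks, for each element of x, whether it occurs anywhere in bc.
right <- x %in% bc

print(wrong)  # all FALSE for this particular ordering
print(right)  # TRUE only for the last three elements
```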

Or with data.table:

library(data.table)
setDT(test.data)[, mean.BC := mean(y[x %in% c("BC5", "BC6","BC7")]), by = site]
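
If neither package is available, the same idea can be sketched in base R. This is one possible approach under the assumption that the data look like the example in the question; the helper names is_bc and site_means are made up for this illustration. It computes one mean per site from the BC rows only, then looks that mean up for every row.

```r
# Rebuild the example data so the sketch is self-contained.
x    <- c("A1", "B2", "C3", "D4", "BC5", "BC6", "BC7")
y    <- c(34, 45, 11, 10, 19, 70, 84, 12, 45, 55, 67, 89, 23, 1)
site <- c(rep("a", 7), rep("b", 7))
test.data <- data.frame(site, x = rep(x, 2), y)

# Flag the rows whose x is one of the BC labels (hypothetical helper name).
is_bc <- test.data$x %in% c("BC5", "BC6", "BC7")

# One mean per site, computed from the flagged rows only.
site_means <- tapply(test.data$y[is_bc], test.data$site[is_bc], mean)

# Look up each row's site mean by name; as.vector() drops the names.
test.data$mean.BC <- as.vector(site_means[as.character(test.data$site)])

print(test.data)  # mean.BC is about 57.67 for site a and 37.67 for site b
```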
