If any element is in a vector, give every row of the group the same value in dplyr

Problem description

I have a dataset of countries:

disjoint_set <- c("CUW", "ARM")

id <- c(1,1,1,1,2,2,2,2,3,3,3,3)
period <- c(1,1,2,2,1,2,3,4,1,1,1,2)
iso <- c("CUW","USA","ARM","SPA","CUW","ARM","CHN","ARM","USA","CHN","ARM","GER")

countries <- data.frame(id, period, iso)

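Grouping by id and period, if any element of the variable iso is in the vector disjoint_set, I would like to assign 1, and 0 otherwise (every row of a given group should get the same value, 1 or 0). The new dataset would look like countries_not_appear: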

disjoint = c(1,1,1,1,1,1,0,1,1,1,1,0)
countries_not_appear <- data.frame(id, period, iso, disjoint)

I tried the following, but it did not work:

countries_not_appear <- countries %>% group_by(id, period) %>% mutate(disjoint = ifelse(iso %in% disjoint_set, 1, 0))

Any clues?

Tags: r, group-by, dplyr

Solution


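ifelse compares element by element, so each row gets its own result. If you want to check the condition once per group, wrap it in any.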
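
To make the difference concrete, here is a minimal base R sketch on a single group. The vector iso_group is a hypothetical name holding the two rows of the group id = 1, period = 1 from the question; it is not part of the original answer.

```r
disjoint_set <- c("CUW", "ARM")
iso_group <- c("CUW", "USA")   # hypothetical: rows of group id = 1, period = 1

# %in% is vectorised: one logical value per element
row_wise <- iso_group %in% disjoint_set
row_wise
#> [1]  TRUE FALSE

# ifelse() keeps that element-wise shape, so USA gets 0
ifelse(row_wise, 1, 0)
#> [1] 1 0

# any() collapses the whole group to a single TRUE/FALSE,
# which mutate() then recycles to every row of the group
group_flag <- as.numeric(any(row_wise))
group_flag
#> [1] 1
```

This is why the attempt in the question flags only the matching rows, while the any version flags the whole group.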

library(dplyr)

disjoint_set <- c("CUW", "ARM")

id <- c(1,1,1,1,2,2,2,2,3,3,3,3)
period <- c(1,1,2,2,1,2,3,4,1,1,1,2)
iso <- c("CUW","USA","ARM","SPA","CUW","ARM","CHN","ARM","USA","CHN","ARM","GER")

countries <- data.frame(id, period, iso)

# Expected result, as given in the question
disjoint <- c(1,1,1,1,1,1,0,1,1,1,1,0)
countries_not_appear <- data.frame(id, period, iso, disjoint)

countries %>% 
  group_by(id, period) %>% 
  mutate(disjoint = as.numeric(any(iso %in% disjoint_set)))
#> # A tibble: 12 x 4
#> # Groups:   id, period [8]
#>       id period iso   disjoint
#>    <dbl>  <dbl> <chr>    <dbl>
#>  1     1      1 CUW          1
#>  2     1      1 USA          1
#>  3     1      2 ARM          1
#>  4     1      2 SPA          1
#>  5     2      1 CUW          1
#>  6     2      2 ARM          1
#>  7     2      3 CHN          0
#>  8     2      4 ARM          1
#>  9     3      1 USA          1
#> 10     3      1 CHN          1
#> 11     3      1 ARM          1
#> 12     3      2 GER          0

Created on 2021-05-20 by the reprex package (v2.0.0)
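
As a possible follow-up, the grouped result can be checked against the expected column from the question. This is only a sketch: the names expected and result are introduced here for illustration, and the trailing ungroup() is an optional addition, not part of the original answer.

```r
library(dplyr)

disjoint_set <- c("CUW", "ARM")
countries <- data.frame(
  id     = c(1,1,1,1,2,2,2,2,3,3,3,3),
  period = c(1,1,2,2,1,2,3,4,1,1,1,2),
  iso    = c("CUW","USA","ARM","SPA","CUW","ARM","CHN","ARM","USA","CHN","ARM","GER")
)

# Expected column, copied from the question
expected <- c(1,1,1,1,1,1,0,1,1,1,1,0)

result <- countries %>%
  group_by(id, period) %>%
  mutate(disjoint = as.numeric(any(iso %in% disjoint_set))) %>%
  ungroup()   # drop the grouping so later steps are not applied per group

# TRUE if the computed flag matches the expected one
identical(result$disjoint, expected)
#> [1] TRUE
```

Dropping the grouping at the end is a matter of taste; it avoids surprises if further mutate or summarise calls are chained onto the result.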

