Showing empty groups when binning data with the cut function in R

Problem description

I have a data frame like this:

gender <- c("m","m","m","m","m","f","f","f","f","f")
age <- c(18,28,39,49,3,
         13,16,6,19,37)

df <- data.frame(gender, age, stringsAsFactors = FALSE)

I am trying to create an ageband column that bins ages from 0 to 50 into groups of width 5.

library(dplyr)

df %>%
  mutate(ageband = cut( age, breaks = seq(0, 50, 5), right = FALSE)) %>%
  group_by(gender, ageband) %>%
  mutate(population = 1)  %>%
  summarize(population = sum(population, na.rm = TRUE)) 

I get this output:

 gender ageband population
1 f      [5,10)           1
2 f      [10,15)          1
3 f      [15,20)          2
4 f      [35,40)          1
5 m      [0,5)            1
6 m      [15,20)          1
7 m      [25,30)          1
8 m      [35,40)          1
9 m      [45,50)          1

This does not show the groups that have no rows. I would like those empty groups to appear with population = 0.

The output I want is:

   gender ageband population
1       f   [0,5)          0
2       f  [5,10)          1
3       f [10,15)          1
4       f [15,20)          2
5       f [20,25)          0
6       f [25,30)          0
7       f [30,35)          0
8       f [35,40)          1
9       f [40,45)          0
10      f [45,50)          0
11      m   [0,5)          1
12      m  [5,10)          0
13      m [10,15)          0
14      m [15,20)          1
15      m [20,25)          0
16      m [25,30)          1
17      m [30,35)          0
18      m [35,40)          1
19      m [40,45)          0
20      m [45,50)          1

I tried the following, but it does not add the missing groups:

df %>%
  mutate(ageband = cut( age, breaks = seq(0, 50, 5), right = FALSE)) %>%
  group_by(gender, ageband) %>%
  mutate(population = 1)  %>%
  summarize(population = sum(population, na.rm = TRUE)) %>%
  mutate(population = coalesce(population, 0L))
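A minimal sketch of why this attempt changes nothing: coalesce() only replaces NA values in rows that already exist, whereas the empty bands are absent rows, not NA values. The name `counts` and the use of count() with `name = "population"` are illustrative choices, not part of the original attempt.

```r
library(dplyr)

gender <- c("m", "m", "m", "m", "m", "f", "f", "f", "f", "f")
age <- c(18, 28, 39, 49, 3, 13, 16, 6, 19, 37)
df <- data.frame(gender, age, stringsAsFactors = FALSE)

# Same binning as above; count() is shorthand for group_by() + summarize(n())
counts <- df %>%
  mutate(ageband = cut(age, breaks = seq(0, 50, 5), right = FALSE)) %>%
  count(gender, ageband, name = "population")

# There is no NA in the summary, so coalesce() has nothing to replace
anyNA(counts$population)   # FALSE

# The empty bands are simply missing rows
nrow(counts)                                          # 9 observed combinations
nlevels(counts$ageband) * n_distinct(counts$gender)   # 20 possible combinations
```

The factor returned by cut() still carries all ten levels, which is what makes it possible to restore the missing rows afterwards.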

Can someone point me in the right direction?

Tags: r, dataframe, dplyr

Solution


By adding tidyr, you can do the following:

library(dplyr)
library(tidyr)

df %>%
 mutate(ageband = cut(age, breaks = seq(0, 50, 5), right = FALSE)) %>%
 count(gender, ageband) %>%
 complete(ageband, nesting(gender), fill = list(n = 0)) %>%
 arrange(gender, ageband)

  ageband gender     n
   <fct>   <chr>  <dbl>
 1 [0,5)   f          0
 2 [5,10)  f          1
 3 [10,15) f          1
 4 [15,20) f          2
 5 [20,25) f          0
 6 [25,30) f          0
 7 [30,35) f          0
 8 [35,40) f          1
 9 [40,45) f          0
10 [45,50) f          0
11 [0,5)   m          1
12 [5,10)  m          0
13 [10,15) m          0
14 [15,20) m          1
15 [20,25) m          0
16 [25,30) m          1
17 [30,35) m          0
18 [35,40) m          1
19 [40,45) m          0
20 [45,50) m          1
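This works because cut() returns a factor that keeps all ten bands as levels, and complete() expands a factor to every level, filling the new rows with n = 0. As a possible alternative without tidyr, the sketch below assumes dplyr 0.8.0 or later, where count() accepts `.drop = FALSE` to keep groups for unused factor levels. The names `banded`, `counts` and `check` are illustrative and not from the original answer, and base R table() is included only as a cross-check.

```r
library(dplyr)

gender <- c("m", "m", "m", "m", "m", "f", "f", "f", "f", "f")
age <- c(18, 28, 39, 49, 3, 13, 16, 6, 19, 37)
df <- data.frame(gender, age, stringsAsFactors = FALSE)

# cut() yields a factor whose levels include the bands with no observations
banded <- df %>%
  mutate(ageband = cut(age, breaks = seq(0, 50, 5), right = FALSE))

# .drop = FALSE keeps the unused ageband levels within each observed gender
# (assumption: gender stays the first grouping variable, ageband the second)
counts <- banded %>%
  count(gender, ageband, .drop = FALSE, name = "population")

# Base R cross-check: table() also counts empty factor levels
check <- as.data.frame(
  table(gender = banded$gender, ageband = banded$ageband),
  responseName = "population"
)

print(counts, n = 20)
```

Unlike `fill = list(n = 0)`, which produces a double column, this variant should give an integer population column; whether that matters depends on what is done with the result afterwards.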
