Table manipulation in R - dplyr/tidyverse solution

Problem description

I am trying to convert a df into a different format.

Starting df:

df <- data.frame(id = c("1", "2", "3","4", "5", "6", "7", "8", "9", "10"),
    criteria_A = c("present", "present", "absent", "absent", "absent", "present", "absent", "present", "absent", "present"),
    criteria_B =c("absent", "absent", "present", "absent", "absent", "present", "absent", "absent", "present", "present"))

I want to count the present/absent values for each criterion and lay the result out like this:

df2 <- data.frame(criteria = c("criteria_A", "criteria_A", "criteria_B", "criteria_B"),
    count = c("5", "5", "4", "6"),
    status = c("present", "absent", "present", "absent"))

I considered counting each criterion separately, like this:

library(dplyr)
# Count present/absent for one criterion and label the rows with its name
tmp1 <- df %>% group_by(criteria_A) %>% count() %>% mutate(criteria="criteria_A")
tmp1 <- tmp1 %>% rename(status=criteria_A)  # rename() takes new_name = old_name
tmp2 <- df %>% group_by(criteria_B) %>% count() %>% mutate(criteria="criteria_B")
tmp2 <- tmp2 %>% rename(status=criteria_B)

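I suppose I could then stack the outputs vertically. But in reality I have hundreds of criteria, so this is not an efficient or clever approach...

I'm sure there is an elegant solution that I'm not smart enough to figure out!

Any help would be much appreciated, as always.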

Tags: r, dplyr

Solution

You can try dplyr::tally after pivoting the data to long format with tidyr::pivot_longer:

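library(dplyr)
library(tidyr)  # pivot_longer() comes from tidyr, not dplyr

df %>%
  # Stack every column except id into criteria/status pairs
  pivot_longer(-id,
               names_to = 'criteria',
               values_to = 'status') %>%
  # Count the rows in each criteria/status combination
  group_by(criteria, status) %>%
  tally()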

#----
  criteria   status      n
  <chr>      <chr>   <int>
1 criteria_A absent      5
2 criteria_A present     5
3 criteria_B absent      6
4 criteria_B present     4
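
As a variation on the same idea, here is a minimal sketch that folds the group_by/tally step into a single dplyr::count call and reorders the columns to match the df2 layout in the question. It assumes reasonably recent dplyr and tidyr versions (count() needs to support the name argument), and the object name result is only illustrative. Note that count comes back as an integer column rather than the character strings used in df2.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  id = as.character(1:10),
  criteria_A = c("present", "present", "absent", "absent", "absent",
                 "present", "absent", "present", "absent", "present"),
  criteria_B = c("absent", "absent", "present", "absent", "absent",
                 "present", "absent", "absent", "present", "present")
)

result <- df %>%
  # Stack every column except id, so this scales to any number of criteria
  pivot_longer(-id, names_to = "criteria", values_to = "status") %>%
  # count() groups by the given columns, tallies, and ungroups in one step
  count(criteria, status, name = "count") %>%
  # Reorder columns to match the layout of df2
  select(criteria, count, status)

print(result)
```

Because pivot_longer(-id, ...) selects every column other than id, nothing needs to change when there are hundreds of criteria columns instead of two.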

