Frequency table of multiple variables, grouped by a categorical variable

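Problem description

I want to create a frequency table of multiple variables (X1 to X4), grouped by a categorical variable (color). Here is the sample data: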

df <- data.frame(name = paste0("obj", 1:6),
                 X1 = c(0,1,1,1,0,1),
                 X2 = c(1,1,1,1,1,1),
                 X3 = c(0,1,1,0,0,0),
                 X4 = c(0,1,1,1,0,0),
                 color = c("red","red","blue","green","green","blue"),
                 other = c(5,3,1,8,4,3))

Ideally, the output would look like this:

\begin{table}[]
\begin{tabular}{lllll}
Var & red & blue & green & total \\
X1  & 1   & 2    & 1     & 4     \\
X2  & 2   & 2    & 2     & 6     \\
X3  & 1   & 1    & 0     & 2     \\
X4  & 1   & 1    & 1     & 3    
\end{tabular}
\end{table}

Thank you very much!

Tags: r

Solution


You can reshape the data to long format, sum the values for each color, reshape the result back to wide format, and add a Total column.

library(dplyr)
library(tidyr)

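df %>%
  # long format: one row per object and X variable
  pivot_longer(cols = starts_with('X'), names_to = 'col') %>%
  group_by(col, color) %>%
  # the X columns are 0/1 indicators, so the sum is the count of 1s
  summarise(n = sum(value)) %>%
  # wide format: one column per color
  pivot_wider(names_from = color, values_from = n) %>%
  ungroup() %>%
  # append a Total column holding the row sums
  janitor::adorn_totals(where = 'col')
  # Or use `rowSums` instead of janitor:
  # mutate(Total = rowSums(.[-1]))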

# col blue green red Total
#  X1    2     1   1     4
#  X2    2     2   2     6
#  X3    1     0   1     2
#  X4    1     1   1     3
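
As a cross-check, the same table can probably be built in base R without extra packages. This is only a minimal sketch, not part of the original answer: it assumes the X columns are 0/1 indicators as in the sample data, and the names `counts` and `result` are made up for illustration.

```r
# Sample data from the question
df <- data.frame(name = paste0("obj", 1:6),
                 X1 = c(0,1,1,1,0,1),
                 X2 = c(1,1,1,1,1,1),
                 X3 = c(0,1,1,0,0,0),
                 X4 = c(0,1,1,1,0,0),
                 color = c("red","red","blue","green","green","blue"),
                 other = c(5,3,1,8,4,3))

# rowsum() adds up each column within each group (one row per color);
# t() flips the result so that variables are rows and colors are columns
counts <- t(rowsum(as.matrix(df[paste0("X", 1:4)]), group = df$color))

# Append the row totals as a final column
result <- cbind(counts, Total = rowSums(counts))
result
```

The result here is a numeric matrix rather than a tibble, with the colors in alphabetical order as in the output above.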
 
