How to group by on all the columns in data.frame?

Problem description

I have the following data.frame in R:

  Introvert      Extrovert      Nature       Presence
     0              -1            3             Yes     
     1               3            2             No
     2               5            4             Yes
     1              -2            0             No

Now, I want to code the responses in the following manner:

    3,4 <- Positives
    0,1,2 <- Neutral
    < 0 <- Negatives

Then I want the count of Positives, Negatives, and Neutrals for each value of Presence (Yes and No).
I have 20 columns of responses like the ones above. How can I do this with simpler code in R?

Currently I am using ifelse and then group_by for every column.

My sample desired data.frame would be:

         Introvert_Positive      Introvert_Negative     Introvert_Neutral
  Yes        0                         0                      2
  No         0                         0                      2

Tags: r

Solution


How about this?

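library(tidyverse)

df %>%
    # Long format: one row per (Presence, question, response)
    gather(key, value, -Presence) %>%
    # Recode: <= -1 is Negatives, 0 to 2 is Neutral, 3 and above is Positives
    mutate(bin = cut(
        value,
        breaks = c(-Inf, -1, 2.5, Inf),
        labels = c("Negatives", "Neutral", "Positives"))) %>%
    select(-value) %>%
    # Build column names such as Introvert_Neutral
    unite(col, key, bin, sep = "_") %>%
    # Count occurrences per Presence and recoded column
    count(Presence, col) %>%
    # Back to wide format; combinations that never occur show up as NA
    spread(col, n)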
## A tibble: 2 x 6
#  Presence Extrovert_Negativ… Extrovert_Positi… Introvert_Neutr… Nature_Neutral
#  <fct>                 <int>             <int>            <int>          <int>
#1 No                        1                 1                2              2
#2 Yes                       1                 1                2             NA
## ... with 1 more variable: Nature_Positives <int>

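Explanation: we use cut() with labels to recode the responses; the rest is just gathering the data into long format, uniting the relevant columns, counting occurrences, and spreading from long back to wide.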
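To see the recoding step on its own, here is a minimal sketch that applies the same cut() call to a plain vector. The vector name scores and the second, stricter variant are illustrative assumptions, not part of the answer above. Note that cut() uses right-closed intervals by default, so the breaks c(-Inf, -1, 2.5, Inf) only match the "< 0" rule when the responses are integers.

```r
# Hypothetical response values covering all three categories
scores <- c(-2, -1, 0, 1, 2, 3, 4, 5)

# Same breaks and labels as in the answer: (-Inf, -1], (-1, 2.5], (2.5, Inf]
bins <- cut(
    scores,
    breaks = c(-Inf, -1, 2.5, Inf),
    labels = c("Negatives", "Neutral", "Positives"))

print(data.frame(scores, bins))

# With these breaks a fractional value such as -0.5 lands in "Neutral".
loose <- cut(-0.5, breaks = c(-Inf, -1, 2.5, Inf),
             labels = c("Negatives", "Neutral", "Positives"))

# Assumed variant for non-integer data: left-closed intervals
# [-Inf, 0), [0, 2.5), [2.5, Inf) treat every value below 0 as negative.
strict <- cut(-0.5, breaks = c(-Inf, 0, 2.5, Inf),
              labels = c("Negatives", "Neutral", "Positives"),
              right = FALSE)
```

Values above 4, such as the 5 in the sample data, fall into Positives because the last interval is open-ended.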


Sample data

df <- read.table(text =
    "Introvert      Extrovert      Nature       Presence
     0              -1            3             Yes
     1               3            2             No
     2               5            4             Yes
     1              -2            0             No", header = TRUE)
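In current tidyr releases gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a sketch of the same pipeline with those functions, assuming tidyr 1.0.0 or later is installed. The name result and the choice to fill missing combinations with 0 instead of NA are assumptions made for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

df <- read.table(text =
    "Introvert      Extrovert      Nature       Presence
     0              -1            3             Yes
     1               3            2             No
     2               5            4             Yes
     1              -2            0             No", header = TRUE)

result <- df %>%
    # Long format: one row per (Presence, question, response)
    pivot_longer(cols = -Presence, names_to = "key", values_to = "value") %>%
    # Same recoding as in the answer above
    mutate(bin = cut(
        value,
        breaks = c(-Inf, -1, 2.5, Inf),
        labels = c("Negatives", "Neutral", "Positives"))) %>%
    # Count occurrences per Presence, question, and category
    count(Presence, key, bin) %>%
    # Wide format with names such as Introvert_Neutral;
    # combinations that never occur are filled with 0 (an assumption)
    pivot_wider(
        names_from = c(key, bin),
        values_from = n,
        names_sep = "_",
        values_fill = 0L)

print(result)
```

As with the original answer, only the question and category combinations that actually occur in the data become columns, so Introvert_Positives, for example, does not appear for this sample.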
