How do I tally the counts associated with each type of record?

Problem description

I have some data:

dat <- structure(list(date = structure(c(17888, 17888, 17888, 17888, 
17889, 17889, 17891, 17891, 17891, 17891, 17891, 17892, 17894
), class = "Date"), type = structure(c(4L, 6L, 15L, 16L, 2L, 
5L, 2L, 3L, 5L, 6L, 8L, 2L, 2L), .Label = c("aborted-live-lead", 
"conversation-archived", "conversation-auto-archived", "conversation-auto-archived-store-offline-or-busy", 
"conversation-claimed", "conversation-created", "conversation-dropped", 
"conversation-restarted", "conversation-transfered", "cs-transfer-connected", 
"cs-transfer-ended", "cs-transfer-failed", "cs-transfer-initiate", 
"cs-transfer-request", "getnotified-requested", "lead-created", 
"lead-expired"), class = "factor"), count = c(1L, 1L, 1L, 1L, 
3L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L)), row.names = c(NA, -13L), class = c("tbl_df", 
"tbl", "data.frame"))

It looks like this:

> head(dat)
# A tibble: 6 x 3
  date       type                                             count
  <date>     <fct>                                            <int>
1 2018-12-23 conversation-auto-archived-store-offline-or-busy     1
2 2018-12-23 conversation-created                                 1
3 2018-12-23 getnotified-requested                                1
4 2018-12-23 lead-created                                         1
5 2018-12-24 conversation-archived                                3
6 2018-12-24 conversation-claimed                                 1

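For each unique type value, there is an associated count value for each day.

How can I add up all the count values for each type (regardless of date) and list the totals in a two-column data frame, formatted like this: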

type                   count
------                 ------
conversation-created   10
conversation-archived  4
lead-created           2
...

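The reason for doing this is to show the total count for each event type across the whole date range.

I think I have to use the select() function from dplyr, but I'm sure I'm missing something.

Here is what I have so far. It sums every value in the count column into a single total, which is not what I want, because I'd like it broken down by type: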

dat %>%
  select(type, count) %>% 
  summarise(count = sum(count)) %>%
  ungroup()

Tags: r

Solution


It seems a combination of group_by() and summarise() with sum() is what you need:

dat %>% group_by(type) %>% summarise(count = sum(count))
# A tibble: 8 x 2
#   type                                             count
#   <fct>                                            <int>
# 1 conversation-archived                                7
# 2 conversation-auto-archived                           1
# 3 conversation-auto-archived-store-offline-or-busy     1
# 4 conversation-claimed                                 3
# 5 conversation-created                                 3
# 6 conversation-restarted                               1
# 7 getnotified-requested                                1
# 8 lead-created                                         1

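There is no need for select() here anyway, since summarise() drops all the other variables. Perhaps you confused select() with group_by(), which is what we want in this case: summing the count values over the rows where type takes the same value.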
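
To make the difference concrete, here is a minimal sketch on a small stand-in data set. The tibble dat_small and its five rows are assumptions made up for illustration (loosely modelled on the question's data), not the original dat. It compares summarise() without grouping, summarise() after group_by(), and count() with a weight, which should be a shorthand for the grouped sum.

```r
library(dplyr)

# Hypothetical stand-in for `dat`: a few rows in the same shape as the question's data.
dat_small <- tibble(
  date  = as.Date(c("2018-12-23", "2018-12-24", "2018-12-24",
                    "2018-12-26", "2018-12-26")),
  type  = c("conversation-created", "conversation-archived",
            "conversation-claimed", "conversation-archived",
            "conversation-created"),
  count = c(1L, 3L, 1L, 1L, 2L)
)

# Without group_by(), summarise() collapses the whole table into one row.
overall <- dat_small %>%
  summarise(count = sum(count))

# With group_by(), the sum is computed once per type;
# arrange(desc()) then sorts the totals from largest to smallest.
by_type <- dat_small %>%
  group_by(type) %>%
  summarise(count = sum(count)) %>%
  arrange(desc(count))

# count() with wt = sums the weight column within each type;
# name = keeps the column called "count", sort = TRUE orders descending.
by_type_short <- dat_small %>%
  count(type, wt = count, name = "count", sort = TRUE)

print(overall)
print(by_type)
print(by_type_short)
```

Applied to the real dat, the same group_by(type) pipeline with arrange(desc(count)) appended would list the types from largest to smallest total, which is the layout asked for in the question.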

