r - How to sum data per month for a certain group?
Problem description
I have searched for similar questions, but none really asks what I need to know.
My question is: how can I sum the values of one column per month for each group? My data set has 3 columns: Date, Province and Total reported infections. I need the total reported infections per month per province, and I am not sure how to do this.
I hope this makes sense.
Here is a random sample of my data set:
ds <- structure(list(Date_of_publication = c("2021-04-05", "2020-09-16",
"2020-05-21", "2021-04-11", "2020-04-05", "2021-04-23", "2021-06-17",
"2021-07-25", "2021-02-08", "2021-01-17", "2021-02-25", "2021-01-16",
"2021-07-11", "2021-08-10", "2020-11-02", "2020-07-04", "2020-03-01",
"2021-01-22", "2021-07-25", "2021-01-14"), Province = c("Noord-Brabant",
"Limburg", "Flevoland", "Noord-Holland", "Noord-Holland", "Zuid-Holland",
"Utrecht", "Friesland", "Drenthe", "Flevoland", "Noord-Holland",
"Overijssel", "Zuid-Holland", "Zuid-Holland", "Utrecht", "Noord-Holland",
"Overijssel", "Limburg", "Gelderland", "Noord-Brabant"), Total_reported = c(66L,
3L, 0L, 26L, 1L, 16L, 0L, 18L, 6L, 15L, 24L, 19L, 1L, 8L, 0L,
0L, 0L, 18L, 6L, 12L)), class = "data.frame", row.names = c(NA,
-20L))
Solution
You can use format to extract the month and year from the date, then use sum to add up Total_reported for each month and Province.
Using dplyr -
library(dplyr)
ds %>%
group_by(year_month = format(as.Date(Date_of_publication), '%b %Y'), Province) %>%
summarise(Total_reported = sum(Total_reported, na.rm = TRUE)) %>%
ungroup()
# year_month Province Total_reported
# <chr> <chr> <int>
# 1 Apr 2020 Noord-Holland 1
# 2 Apr 2021 Noord-Brabant 66
# 3 Apr 2021 Noord-Holland 26
# 4 Apr 2021 Zuid-Holland 16
# 5 Aug 2021 Zuid-Holland 8
# 6 Feb 2021 Drenthe 6
# 7 Feb 2021 Noord-Holland 24
# 8 Jan 2021 Flevoland 15
# 9 Jan 2021 Limburg 18
#10 Jan 2021 Noord-Brabant 12
#11 Jan 2021 Overijssel 19
#12 Jul 2020 Noord-Holland 0
#13 Jul 2021 Friesland 18
#14 Jul 2021 Gelderland 6
#15 Jul 2021 Zuid-Holland 1
#16 Jun 2021 Utrecht 0
#17 Mar 2020 Overijssel 0
#18 May 2020 Flevoland 0
#19 Nov 2020 Utrecht 0
#20 Sep 2020 Limburg 3
Or in base R -
aggregate(Total_reported ~ year_month + Province,
transform(ds, year_month = format(as.Date(Date_of_publication), '%b %Y')),
sum, na.rm = TRUE)
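One thing to watch with the '%b %Y' key used above: it is a character string, so the result sorts alphabetically ("Apr 2020", "Apr 2021", "Aug 2021", ...) and the month abbreviations depend on the locale. If chronological order matters, a numeric key such as '%Y-%m' may be more convenient. The following is only a minimal sketch of that variant in base R; the toy data frame and the names toy and monthly are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data with the same column names as ds in the question.
# Two rows share April 2021 / Noord-Holland so the summing is visible.
toy <- data.frame(
  Date_of_publication = c("2021-04-05", "2021-04-11", "2021-04-20",
                          "2020-04-05", "2021-02-25"),
  Province = c("Noord-Brabant", "Noord-Holland", "Noord-Holland",
               "Noord-Holland", "Noord-Holland"),
  Total_reported = c(66L, 26L, 4L, 1L, 24L)
)

# Build a year-month key that sorts chronologically as plain text
toy$year_month <- format(as.Date(toy$Date_of_publication), "%Y-%m")

# Sum Total_reported within each year_month / Province combination
monthly <- aggregate(Total_reported ~ year_month + Province,
                     data = toy, FUN = sum)

# Order the rows by month first, then by province
monthly <- monthly[order(monthly$year_month, monthly$Province), ]
print(monthly)
```

Here the two April 2021 rows for Noord-Holland (26 and 4) collapse into a single row with a total of 30. The same '%Y-%m' format string can be dropped into the dplyr version in place of '%b %Y'.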