r - group_by(月和年)和 summarise_if 不会留下单个月的平均值
问题描述
我很抱歉我的标题含糊不清,但我不知道我需要如何解决这个问题。
我有一个每年每月政府回复的数据框,我想获得每月每列的平均值。
现在我认为我的语法是正确的,只是我的结果不会让我每年只有一个月,而是多个月,而且我一生都无法掌握自己做错了什么。
我正在运行的代码是:
test <- ds %>%
group_by(month, year, C1_calculated, C2_calculated, C3_calculated, C4_calculated, C5_calculated, C6_calculated, C7_calculated, C8_calculated, E1_calculated, E2_calculated) %>%
summarise_if(is.integer, mean) %>%
ungroup
结果如下:
test2 <- structure(list(month = c(1, 1, 1, 2, 2, 2), year = c(2020, 2021,
2021, 2020, 2021, 2021), C1_calculated = c(0, 11.1111111111111,
11.1111111111111, 0, 7.40740740740741, 11.1111111111111), C2_calculated = c(0,
11.1111111111111, 11.1111111111111, 0, 11.1111111111111, 11.1111111111111
), C3_calculated = c(0, 11.1111111111111, 11.1111111111111, 0,
11.1111111111111, 11.1111111111111), C4_calculated = c(0, 11.1111111111111,
11.1111111111111, 0, 11.1111111111111, 11.1111111111111), C5_calculated = c(0,
5.55555555555556, 5.55555555555556, 0, 5.55555555555556, 5.55555555555556
), C6_calculated = c(0, 3.7037037037037, 7.40740740740741, 0,
7.40740740740741, 7.40740740740741), C7_calculated = c(0, 5.55555555555556,
5.55555555555556, 0, 5.55555555555556, 5.55555555555556), C8_calculated = c(0,
9.72222222222222, 9.72222222222222, 0, 9.72222222222222, 9.72222222222222
), E1_calculated = c(0, 50, 50, 0, 50, 50), E2_calculated = c(0,
37.5, 37.5, 0, 37.5, 37.5)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
如您所见,我每年有多个同一月份的实例,这并没有真正的帮助。
这是我的数据集的 20 行的随机样本:
ds <- structure(list(C1_calculated = c(3.7037037037037, 7.40740740740741,
3.7037037037037, 7.40740740740741, 3.7037037037037, 3.7037037037037,
11.1111111111111, 11.1111111111111, 3.7037037037037, 7.40740740740741,
3.7037037037037, 3.7037037037037, 3.7037037037037, 3.7037037037037,
11.1111111111111, 7.40740740740741, 3.7037037037037, 7.40740740740741,
0, 7.40740740740741), C2_calculated = c(7.40740740740741, 7.40740740740741,
7.40740740740741, 11.1111111111111, 7.40740740740741, 7.40740740740741,
7.40740740740741, 7.40740740740741, 7.40740740740741, 7.40740740740741,
7.40740740740741, 3.7037037037037, 7.40740740740741, 7.40740740740741,
7.40740740740741, 7.40740740740741, 7.40740740740741, 11.1111111111111,
0, 7.40740740740741), C3_calculated = c(0, 11.1111111111111,
11.1111111111111, 11.1111111111111, 5.55555555555556, 2.77777777777778,
11.1111111111111, 11.1111111111111, 5.55555555555556, 11.1111111111111,
11.1111111111111, 5.55555555555556, 0, 11.1111111111111, 11.1111111111111,
11.1111111111111, 11.1111111111111, 11.1111111111111, 0, 11.1111111111111
), C4_calculated = c(5.55555555555556, 11.1111111111111, 11.1111111111111,
11.1111111111111, 0, 11.1111111111111, 11.1111111111111, 11.1111111111111,
0, 11.1111111111111, 11.1111111111111, 0, 5.55555555555556, 11.1111111111111,
11.1111111111111, 8.33333333333333, 11.1111111111111, 11.1111111111111,
0, 8.33333333333333), C5_calculated = c(0, 5.55555555555556,
5.55555555555556, 5.55555555555556, 5.55555555555556, 0, 5.55555555555556,
5.55555555555556, 0, 5.55555555555556, 5.55555555555556, 0, 0,
5.55555555555556, 5.55555555555556, 0, 5.55555555555556, 5.55555555555556,
0, 0), C6_calculated = c(3.7037037037037, 3.7037037037037, 3.7037037037037,
7.40740740740741, 0, 3.7037037037037, 7.40740740740741, 7.40740740740741,
0, 7.40740740740741, 3.7037037037037, 0, 3.7037037037037, 3.7037037037037,
7.40740740740741, 3.7037037037037, 7.40740740740741, 7.40740740740741,
0, 3.7037037037037), C7_calculated = c(0, 5.55555555555556, 5.55555555555556,
5.55555555555556, 0, 0, 5.55555555555556, 5.55555555555556, 0,
5.55555555555556, 5.55555555555556, 0, 0, 5.55555555555556, 5.55555555555556,
5.55555555555556, 5.55555555555556, 5.55555555555556, 0, 5.55555555555556
), C8_calculated = c(8.33333333333333, 8.33333333333333, 9.72222222222222,
9.72222222222222, 9.72222222222222, 8.33333333333333, 8.33333333333333,
8.33333333333333, 9.72222222222222, 9.72222222222222, 9.72222222222222,
9.72222222222222, 8.33333333333333, 9.72222222222222, 8.33333333333333,
8.33333333333333, 9.72222222222222, 9.72222222222222, 0, 8.33333333333333
), E1_calculated = c(50, 50, 50, 50, 50, 50, 50, 50, 50, 50,
50, 50, 50, 50, 50, 50, 50, 50, 0, 50), E2_calculated = c(62.5,
62.5, 37.5, 37.5, 12.5, 37.5, 62.5, 62.5, 37.5, 37.5, 37.5, 37.5,
62.5, 37.5, 62.5, 62.5, 37.5, 37.5, 0, 62.5), month = c(8, 5,
5, 2, 8, 9, 5, 4, 7, 4, 6, 7, 8, 6, 5, 6, 4, 2, 1, 6), year = c(2020,
2020, 2021, 2021, 2021, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
2020, 2021, 2020, 2020, 2021, 2021, 2020, 2020)), row.names = c(227L,
145L, 488L, 423L, 586L, 272L, 122L, 104L, 565L, 464L, 520L, 555L,
224L, 539L, 124L, 157L, 482L, 417L, 14L, 162L), class = "data.frame")
提前感谢您的所有帮助!
解决方案
你的数据:
test2 <- structure(list(month = c(1, 1, 1, 2, 2, 2), year = c(2020, 2021,
2021, 2020, 2021, 2021), C1_calculated = c(0, 11.1111111111111,
11.1111111111111, 0, 7.40740740740741, 11.1111111111111), C2_calculated = c(0,
11.1111111111111, 11.1111111111111, 0, 11.1111111111111, 11.1111111111111
), C3_calculated = c(0, 11.1111111111111, 11.1111111111111, 0,
11.1111111111111, 11.1111111111111), C4_calculated = c(0, 11.1111111111111,
11.1111111111111, 0, 11.1111111111111, 11.1111111111111), C5_calculated = c(0,
5.55555555555556, 5.55555555555556, 0, 5.55555555555556, 5.55555555555556
), C6_calculated = c(0, 3.7037037037037, 7.40740740740741, 0,
7.40740740740741, 7.40740740740741), C7_calculated = c(0, 5.55555555555556,
5.55555555555556, 0, 5.55555555555556, 5.55555555555556), C8_calculated = c(0,
9.72222222222222, 9.72222222222222, 0, 9.72222222222222, 9.72222222222222
), E1_calculated = c(0, 50, 50, 0, 50, 50), E2_calculated = c(0,
37.5, 37.5, 0, 37.5, 37.5)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
总结:
library(tidyverse)
test2 %>%
group_by(month, year) %>%
summarise_at(vars(starts_with("C")), mean)
输出
# A tibble: 4 x 10
# Groups: month [2]
month year C1_calculated C2_calculated C3_calculated C4_calculated C5_calculated C6_calculated C7_calculated C8_calculated
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2020 0 0 0 0 0 0 0 0
2 1 2021 11.1 11.1 11.1 11.1 5.56 5.56 5.56 9.72
3 2 2020 0 0 0 0 0 0 0 0
4 2 2021 9.26 11.1 11.1 11.1 5.56 7.41 5.56 9.72
推荐阅读
- sql - 我正在尝试使此功能和触发器工作:
- python - 在 unittest 中测试条件导入
- r - For循环交叉验证R
- javascript - 如何将单独的 JS 文件中的值传递给 Thymeleaf
- git - 让任何人都可以编辑 GitHub 存储库
- docker - 不同端口的环境变量如何在 docker-compose mariadb 中工作?
- google-translate - 谷歌翻译小部件从下拉列表中删除 pageLangguage
- css - 如何使用自定义 CSS 修改 Strapi 管理面板?
- python - 如何在 matplotlob 直方图中标记 bin(Python)
- laravel - 重定向标头在 Laravel 中可见而不是隐藏