Summing specific rows by multiple groups

Question

I have a data frame like the one below:

df <- data.frame(
  row.names = c(1, 2, 3, 4, 5, 6, 7, 8),
  Week = c(1, 1, 2, 2, 52, 52, 53, 53),
  State = c("Florida", "Georgia", "Florida", "Georgia",
            "Florida", "Georgia", "Florida", "Georgia"),
  Count_2001 = c(25, 16, 83, 45, 100, 98, 22, 34),
  Count_2002 = c(3, 78, 22, 5, 78, 6, 88, 97)
)

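I am trying to reshape this dataset so that, for each state, the rows for week 52 and week 53 are summed together across all of the Count columns. This is similar to the question "GROUP BY for specific rows".

The new dataset should combine those rows into a single week 52 row per state, as in the example below: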

df2 <- data.frame(
  row.names = c(1, 2, 3, 4, 5, 6),
  Week = c(1, 1, 2, 2, 52, 52),
  State = c("Florida", "Georgia", "Florida", "Georgia",
            "Florida", "Georgia"),
  Count_2001 = c(25, 16, 83, 45, 122, 132),
  Count_2002 = c(3, 78, 22, 5, 166, 103)
)

Is there a simple way to do this in R?

Tags: r, dataframe, group-by

Answer


Recode week 53 as week 52, then sum within each State/Week group:

library(dplyr)
df %>%
  mutate(Week = case_when(Week == 53 ~ 52, TRUE ~ Week)) %>%
  group_by(State, Week) %>%
  summarize(across(everything(), sum))
# # A tibble: 6 x 4
# # Groups:   State [2]
#   State    Week Count_2001 Count_2002
#   <chr>   <dbl>      <dbl>      <dbl>
# 1 Florida     1         25          3
# 2 Florida     2         83         22
# 3 Florida    52        122        166
# 4 Georgia     1         16         78
# 5 Georgia     2         45          5
# 6 Georgia    52        132        103
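
If you would rather not depend on dplyr, the same idea (recode week 53 as 52, then sum per group) can probably be expressed in base R with `aggregate()`. This is only a sketch and not part of the original answer. The variable names `merged` and `result` are made up for illustration, and it assumes every column other than `State` and `Week` is a numeric count that should be summed.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  Week = c(1, 1, 2, 2, 52, 52, 53, 53),
  State = c("Florida", "Georgia", "Florida", "Georgia",
            "Florida", "Georgia", "Florida", "Georgia"),
  Count_2001 = c(25, 16, 83, 45, 100, 98, 22, 34),
  Count_2002 = c(3, 78, 22, 5, 78, 6, 88, 97)
)

# Fold week 53 into week 52 on a copy (hypothetical name), leaving df untouched.
merged <- df
merged$Week[merged$Week == 53] <- 52

# Sum every remaining column within each State/Week pair.
# The "." on the left-hand side stands for all columns not named on the right.
result <- aggregate(. ~ State + Week, data = merged, FUN = sum)

# Order by State, then Week, to match the layout of the dplyr output.
result <- result[order(result$State, result$Week), ]
rownames(result) <- NULL
print(result)
```

Because the formula uses `.`, any additional Count columns (for example a `Count_2003`) should be picked up without changing the call, which mirrors what `across(everything(), sum)` does in the dplyr version.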
