How do I stack several repeated columns and sum them?

Question

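I want to combine 15 columns, which are the same 3 columns repeated (so there are 5 copies of each set). My data looks like this (for simplicity, this example has only 3 copies):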

   date     sku1  prod1  tot1  sku2  prod2  tot2  sku3  prod3  tot3
01/02/2019  100     a    100
01/02/2019  100     a    200    101    b     50
02/02/2019  101     b    100
02/02/2019  101     b     50    102    c    100   100     a     50
02/02/2019  102     c     50

and I want it to become this:

   date     sku  all_prod  total
01/02/2019  100     a       300
01/02/2019  101     b        50
02/02/2019  101     b       150
02/02/2019  102     c       150
02/02/2019  100     a        50

Does anyone know how to do this? Thanks very much in advance.

Tags: r, addition

Answer


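Using dplyr and tidyr, we can gather the data into long format, remove the digits from the column names, spread it back to wide format, group_by date and prod, and within each group take the first value of sku and the sum of tot.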

library(dplyr)
library(tidyr)

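df %>%
  # long format: one row per (date, original column, value); drop empty cells
  gather(key, value, -date, na.rm = TRUE) %>%
  # strip the trailing copy number: sku1 -> sku, prod2 -> prod
  mutate(key = sub("\\d+$", "", key)) %>%
  # number the entries within each key so spread() can pair them up
  group_by(key) %>%
  mutate(row = row_number()) %>%
  spread(key, value) %>%
  # gather() coerced the mixed columns to character; restore the numeric ones
  mutate_at(vars(sku, tot), as.numeric) %>%
  group_by(date, prod) %>%
  summarise(sku = sku[1L], 
            tot = sum(tot))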

#  date       prod    sku   tot
#  <fct>      <chr> <dbl> <dbl>
#1 01/02/2019 a       100   300
#2 01/02/2019 b       101    50
#3 02/02/2019 a       100    50
#4 02/02/2019 b       101   150
#5 02/02/2019 c       102   150
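
As a possible alternative: gather() and spread() are superseded from tidyr 1.0.0 onward, and pivot_longer() can do the reshaping in a single step. The following is a minimal sketch, not the answer's original method. It assumes tidyr >= 1.0.0 and dplyr >= 1.0.0, rebuilds the sample data with character columns instead of factors, and uses the illustrative names `copy`, `total` and `result`.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt with plain character columns
# (an assumption made here to keep the sketch self-contained).
df <- data.frame(
  date  = c("01/02/2019", "01/02/2019", "02/02/2019", "02/02/2019", "02/02/2019"),
  sku1  = c(100, 100, 101, 101, 102),
  prod1 = c("a", "a", "b", "b", "c"),
  tot1  = c(100, 200, 100, 50, 50),
  sku2  = c(NA, 101, NA, 102, NA),
  prod2 = c(NA, "b", NA, "c", NA),
  tot2  = c(NA, 50, NA, 100, NA),
  sku3  = c(NA, NA, NA, 100, NA),
  prod3 = c(NA, NA, NA, "a", NA),
  tot3  = c(NA, NA, NA, 50, NA)
)

result <- df %>%
  # ".value" keeps sku/prod/tot as separate columns; the trailing digits
  # of each name go into the helper column `copy` (hypothetical name).
  pivot_longer(-date,
               names_to = c(".value", "copy"),
               names_pattern = "([a-z]+)(\\d+)",
               values_drop_na = TRUE) %>%
  # one row per date/sku/prod combination, with tot summed
  group_by(date, sku, prod) %>%
  summarise(total = sum(tot), .groups = "drop")

result
```

For the sample data this should give the same five date/sku/prod totals as the output above. Because the copy number is captured by `(\\d+)`, the pattern should also cope with more than nine copies (for example sku10).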

Data

df <- structure(list(date = structure(c(1L, 1L, 2L, 2L, 2L), .Label = 
c("01/02/2019", "02/02/2019"), class = "factor"), sku1 = c(100, 100, 101, 101, 
102), prod1 = structure(c(1L, 1L, 2L, 2L, 3L), .Label = c("a", 
"b", "c"), class = "factor"), tot1 = c(100, 200, 100, 50, 50), 
sku2 = c(NA, 101, NA, 102, NA), prod2 = structure(c(NA, 1L, 
NA, 2L, NA), .Label = c("b", "c"), class = "factor"), tot2 = c(NA, 
50, NA, 100, NA), sku3 = c(NA, NA, NA, 100, NA), prod3 = 
structure(c(NA, NA, NA, 1L, NA), .Label = "a", class = "factor"), tot3 = c(NA, 
NA, NA, 50, NA)), row.names = c(NA, -5L), class = "data.frame")
