Trying to collapse multiple observations in a data frame without dropping the other columns

Problem description

> Total
                          Product Quantity Price Total
 1:                  tomatoes 1kg        1    16    16
 2:                small cucumber        1    10    10
 3:                  beetroot 1kg        1    15    15
 4:                 potatoes 1 kg        1    15    15
 5:                asparagus 200g        4    45   180
 6:           red apples 4 medium        1    10    10
 7:       beef fillet strips 500g        1    90    90
 8:               back bacon 200g        1    30    30
 9: chicken drums and thighs 1 kg        1    75    75
10:             kudu biltong 250g        2    80   160
11:                   t bone 500g        1    66    66
12:             free range eggs 6        1    15    15
13:                  tomatoes 1kg        1    16    16
14:  calistos jalape=c3=b1o salsa        1    40    40
15:               lean beef mince        1    54    54
16:            free range eggs 30        1    65    65
17:                 potatoes 1 kg        1    15    15
18:        strawberry punnet 250g        1    22    22
19:          chicken whole 1.4 kg        1    65    65
20:                small cucumber        4    10    40
21:                   swiss chard        2    14    28
22:                  tomatoes 1kg        3    16    48
23:                   carrot 1 kg        2    14    28
24:                          kale        2    14    28
25:               butternut cubes        2    14    28
26:                 potatoes 1 kg        2    15    30
27:                   onions 1 kg        1    15    15
28:  oyster mushrooms 200g punnet        1    35    35
29:        strawberry punnet 250g        2    22    44
30:            free range eggs 30        1    65    65
31:                small cucumber        2    10    20
32:                  tomatoes 1kg        1    16    16
33:                 broccoli head        1    25    25
34:        cauliflower whole head        2    25    50
35:                   carrot 1 kg        2    14    28
36:               butternut cubes        2    14    28
37:                          kale        2    14    28
38:                   butter 500g        1    57    57
39:  oyster mushrooms 200g punnet        2    35    70
40:                      coleslaw        1    15    15
                          Product Quantity Price Total

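I have a data frame with many duplicate products. I have tried different ways of collapsing them together, but every attempt drops the Price and Total columns.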

aggregate(Quantity~Product,data=Total,FUN=sum)

gives me:

> Total
                         Product Quantity
1                 asparagus 200g        4
2                back bacon 200g        1
3        beef fillet strips 500g        1
4                   beetroot 1kg        1
5                  broccoli head        1
6                    butter 500g        1
7                butternut cubes        4
8   calistos jalape=c3=b1o salsa        1
9                    carrot 1 kg        4
10        cauliflower whole head        2
11 chicken drums and thighs 1 kg        1
12          chicken whole 1.4 kg        1
13                      coleslaw        1
14            free range eggs 30        2
15             free range eggs 6        1
16                          kale        4
17             kudu biltong 250g        2
18               lean beef mince        1
19                   onions 1 kg        1
20  oyster mushrooms 200g punnet        3
21                 potatoes 1 kg        4
22           red apples 4 medium        1
23                small cucumber        7
24        strawberry punnet 250g        3
25                   swiss chard        2
26                   t bone 500g        1
27                  tomatoes 1kg        6

This does collapse the rows, but it drops the other columns.

Total %>% group_by(Product) %>%  summarise(quantity = sum(Quantity))

does the same thing.

The expected output should have Price and Total collapsed together with each Product.


Tags: r, dataframe, dplyr, tidyverse

Solution


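Use gather to turn the variables stored in columns into rows. Then use group_by and summarize to get the sum (or mean) for each group. Finally, spread puts the summarized values back into columns.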

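library(dplyr)
library(tidyr)

Total %>%
  # reshape to long form: one row per Product/variable pair
  gather(key = variable, value = value, c(Quantity, Price, Total)) %>%
  group_by(Product, variable) %>%
  # sum within each group; note that this also sums Price
  summarize(sum = sum(value)) %>%
  spread(variable, sum)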
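
Because the pipeline above sums every variable, the unit Price is added up across duplicate rows as well. The following is an alternative sketch, not part of the original answer, that assumes each Product always has the same Price (which holds in the data shown). It keeps Price as a grouping key and sums only Quantity and Total. The toy data frame is a four-row subset copied from the table in the question, and the name collapsed is introduced here only for illustration.

```r
library(dplyr)

# Toy subset of the question's data (values copied from the printed table)
Total <- data.frame(
  Product  = c("tomatoes 1kg", "small cucumber", "tomatoes 1kg", "small cucumber"),
  Quantity = c(1, 1, 3, 4),
  Price    = c(16, 10, 16, 10),
  Total    = c(16, 10, 48, 40)
)

collapsed <- Total %>%
  # Assumption: Price is constant per Product, so grouping by it keeps the column
  group_by(Product, Price) %>%
  # Sum only the columns that are meaningful to add up
  summarise(Quantity = sum(Quantity),
            Total    = sum(Total),
            .groups  = "drop")

print(collapsed)
```

If a Product ever appears with more than one Price, this gives one row per Product/Price pair rather than one row per Product. The same idea should also work in base R with aggregate(cbind(Quantity, Total) ~ Product + Price, data = Total, FUN = sum).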
