r - Collapsing multiple observations in a data frame without dropping the other columns
Problem description
> Total
    Product                       Quantity Price Total
 1: tomatoes 1kg                         1    16    16
 2: small cucumber                       1    10    10
 3: beetroot 1kg                         1    15    15
 4: potatoes 1 kg                        1    15    15
 5: asparagus 200g                       4    45   180
 6: red apples 4 medium                  1    10    10
 7: beef fillet strips 500g              1    90    90
 8: back bacon 200g                      1    30    30
 9: chicken drums and thighs 1 kg        1    75    75
10: kudu biltong 250g                    2    80   160
11: t bone 500g                          1    66    66
12: free range eggs 6                    1    15    15
13: tomatoes 1kg                         1    16    16
14: calistos jalape=c3=b1o salsa         1    40    40
15: lean beef mince                      1    54    54
16: free range eggs 30                   1    65    65
17: potatoes 1 kg                        1    15    15
18: strawberry punnet 250g               1    22    22
19: chicken whole 1.4 kg                 1    65    65
20: small cucumber                       4    10    40
21: swiss chard                          2    14    28
22: tomatoes 1kg                         3    16    48
23: carrot 1 kg                          2    14    28
24: kale                                 2    14    28
25: butternut cubes                      2    14    28
26: potatoes 1 kg                        2    15    30
27: onions 1 kg                          1    15    15
28: oyster mushrooms 200g punnet         1    35    35
29: strawberry punnet 250g               2    22    44
30: free range eggs 30                   1    65    65
31: small cucumber                       2    10    20
32: tomatoes 1kg                         1    16    16
33: broccoli head                        1    25    25
34: cauliflower whole head               2    25    50
35: carrot 1 kg                          2    14    28
36: butternut cubes                      2    14    28
37: kale                                 2    14    28
38: butter 500g                          1    57    57
39: oyster mushrooms 200g punnet         2    35    70
40: coleslaw                             1    15    15
    Product                       Quantity Price Total
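My data frame contains a lot of duplicate products. I have tried different ways of collapsing them together, but every attempt drops the Price and Total columns.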
aggregate(Quantity~Product,data=Total,FUN=sum)
gives me:
> Total
   Product                       Quantity
 1 asparagus 200g                       4
 2 back bacon 200g                      1
 3 beef fillet strips 500g              1
 4 beetroot 1kg                         1
 5 broccoli head                        1
 6 butter 500g                          1
 7 butternut cubes                      4
 8 calistos jalape=c3=b1o salsa         1
 9 carrot 1 kg                          4
10 cauliflower whole head               2
11 chicken drums and thighs 1 kg        1
12 chicken whole 1.4 kg                 1
13 coleslaw                             1
14 free range eggs 30                   2
15 free range eggs 6                    1
16 kale                                 4
17 kudu biltong 250g                    2
18 lean beef mince                      1
19 onions 1 kg                          1
20 oyster mushrooms 200g punnet         3
21 potatoes 1 kg                        4
22 red apples 4 medium                  1
23 small cucumber                       7
24 strawberry punnet 250g               3
25 swiss chard                          2
26 t bone 500g                          1
27 tomatoes 1kg                         6
This does collapse the rows, but it drops the other columns.
Total %>% group_by(Product) %>% summarise(quantity = sum(Quantity))
does the same thing.
The expected output should have Price and Total collapsed together with each Product, rather than dropped.
Solution
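Use gather to turn the value columns into rows, one variable/value pair per row. Then use group_by and summarize to get the sum (or mean) for each group. Finally, spread puts the data back into one column per variable.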
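library(dplyr)
library(tidyr)

Total %>%
  gather(key = variable, value = value, c(Quantity, Price, Total)) %>%  # wide to long
  group_by(Product, variable) %>%
  summarize(sum = sum(value)) %>%  # sums every variable, Price included; use mean() where that fits better
  spread(variable, sum)            # long back to wide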
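
Because the pipeline above sums Price along with everything else, a repeated product ends up with its unit price multiplied by the number of rows. If that is not what you want, one possible alternative is to treat Price as part of the grouping key and sum only Quantity and Total. The sketch below is not the approach from the answer above; it assumes each product has a single unit price (a product sold at two prices would come out as two rows), and the five-row sample is made up for illustration in the same shape as the question's data.

```r
library(dplyr)

# Made-up sample in the same shape as the question's data (not the full table).
Total <- data.frame(
  Product  = c("tomatoes 1kg", "small cucumber", "tomatoes 1kg", "small cucumber", "kale"),
  Quantity = c(1, 1, 3, 4, 2),
  Price    = c(16, 10, 16, 10, 14),
  Total    = c(16, 10, 48, 40, 28)
)

# Group by Product and Price so the unit price is carried through unchanged,
# then sum only the columns that are additive.
collapsed <- Total %>%
  group_by(Product, Price) %>%
  summarise(Quantity = sum(Quantity), Total = sum(Total), .groups = "drop")

# Base R equivalent: cbind() on the left-hand side of the formula
# aggregates several columns in one call.
collapsed_base <- aggregate(cbind(Quantity, Total) ~ Product + Price,
                            data = Total, FUN = sum)

print(collapsed)
```

Both versions return one row per product with Product, Price, Quantity and Total columns. The base R call also shows why the original aggregate attempt lost columns: only the variables named in the formula are kept.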