How to calculate a ratio by group based on a specific row in a data.table

Problem description

I have trade data with the following structure:

library(data.table)

MWE <- data.table(
  Exporter = rep(c("France", "Germany", "United States", "World"), each = 4),
  Importer = rep(c("France", "Germany", "United States", "World"), 4),
  Value = c(0, 20, 30, 50,
            30, 0, 40, 70,
            80, 80, 0, 160,
            110, 100, 70, 280)
)

MWE

> MWE
         Exporter      Importer Value
 1:        France        France     0
 2:        France       Germany    20
 3:        France United States    30
 4:        France         World    50
 5:       Germany        France    30
 6:       Germany       Germany     0
 7:       Germany United States    40
 8:       Germany         World    70
 9: United States        France    80
10: United States       Germany    80
11: United States United States     0
12: United States         World   160
13:         World        France   110
14:         World       Germany   100
15:         World United States    70
16:         World         World   280

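I would like to create a new variable that gives each importer's share of an exporter's total. I cannot simply use sum or .N, because my real data contain several country aggregates (such as World in my example) that would distort the totals.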

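So basically I want a new variable computed by Exporter as Percent = Value / Value(World), that is, each row's Value divided by the Value of the World row in the same Exporter group. How can I do this?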

 Desired_Output
         Exporter      Importer Value   Percent
 1:        France        France     0 0.0000000
 2:        France       Germany    20 0.4000000
 3:        France United States    30 0.6000000
 4:        France         World    50 1.0000000
 5:       Germany        France    30 0.4285714
 6:       Germany       Germany     0 0.0000000
 7:       Germany United States    40 0.5714286
 8:       Germany         World    70 1.0000000
 9: United States        France    80 0.5000000
10: United States       Germany    80 0.5000000
11: United States United States     0 0.0000000
12: United States         World   160 1.0000000
13:         World        France   110 0.3928571
14:         World       Germany   100 0.3571429
15:         World United States    70 0.2500000
16:         World         World   280 1.0000000

Tags: r, data.table

Solution


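How about dplyr? Within each Exporter group, sum(Value) counts the World row as well as the individual countries, and the World row equals the sum of the others, so the group sum is twice the World value. That is why the ratio has to be doubled. This only works if every group contains exactly one World row that equals the sum of the remaining rows. Otherwise, you can use an if_else statement.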

library(dplyr)

MWE %>% group_by(Exporter) %>%
  mutate(Percent = 2 * Value / sum(Value))
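
Since the question is tagged data.table, here is a minimal sketch of the same calculation that divides directly by the World row rather than relying on the doubling trick. It assumes every Exporter group contains exactly one row with Importer == "World"; that assumption comes from the example data and is not checked by the code.

```r
library(data.table)

# Same toy data as in the question
MWE <- data.table(
  Exporter = rep(c("France", "Germany", "United States", "World"), each = 4),
  Importer = rep(c("France", "Germany", "United States", "World"), 4),
  Value = c(0, 20, 30, 50,
            30, 0, 40, 70,
            80, 80, 0, 160,
            110, 100, 70, 280)
)

# Within each Exporter group, divide every Value by the Value of the
# row whose Importer is "World" (assumed to exist exactly once per group)
MWE[, Percent := Value / Value[Importer == "World"], by = Exporter]

print(MWE)
```

Because the denominator is picked by name, this version should not depend on the World value being equal to the sum of the other rows, which may matter when the real data contain several overlapping country groups.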
