首页 > 解决方案 > 创建循环以计算条件下的百分比并填写结果

问题描述

我试图找到一些信息,但并没有真正找到我想要的东西。

这是我的数据框full.data(摘录)

country year      sector    emissions iso2 PercentageDifference
....
Austria 2011       Total 7.011567e+07   AT                    0
Austria 2011   Regulated 4.214836e+07   AT                    0
Austria 2011 Unregulated 2.796732e+07   AT                    0
Austria 2011         ETS 3.059942e+07   AT                    0
Austria 2012       Total 6.766140e+07   AT                    0
Austria 2012   Regulated 3.949445e+07   AT                    0
Austria 2012 Unregulated 2.816695e+07   AT                    0
Austria 2012         ETS 2.838706e+07   AT                    0
Austria 2013       Total 6.800123e+07   AT                    0
Austria 2013   Regulated 3.857396e+07   AT                    0
Austria 2013 Unregulated 2.942727e+07   AT                    0
Austria 2013         ETS 2.980441e+07   AT                    0
Austria 2014       Total 6.425333e+07   AT                    0
Austria 2014   Regulated 3.631107e+07   AT                    0
Austria 2014 Unregulated 2.794225e+07   AT                    0
Austria 2014         ETS 2.805597e+07   AT                    0
Austria 2015       Total 6.670398e+07   AT                    0
Austria 2015   Regulated 3.800309e+07   AT                    0
Austria 2015 Unregulated 2.870090e+07   AT                    0
Austria 2015         ETS 2.949206e+07   AT                    0
Austria 2016       Total 6.740209e+07   AT                    0
Austria 2016   Regulated 3.765177e+07   AT                    0
Austria 2016 Unregulated 2.975031e+07   AT                    0
Austria 2016         ETS 2.900012e+07   AT                    0
Austria 2017         ETS 3.055523e+07   AT                    0
Belgium 1990       Total 1.204844e+08   BE                    0
Belgium 1990   Regulated 7.861411e+07   BE                    0
Belgium 1990 Unregulated 4.187029e+07   BE                    0
Belgium 1991       Total 1.235447e+08   BE                    0
Belgium 1991   Regulated 7.981152e+07   BE                    0
Belgium 1991 Unregulated 4.373319e+07   BE                    0
Belgium 1992       Total 1.226578e+08   BE                    0
Belgium 1992   Regulated 7.828396e+07   BE                    0
Belgium 1992 Unregulated 4.437385e+07   BE                    0
Belgium 1993       Total 1.215573e+08   BE                    0
Belgium 1993   Regulated 7.675229e+07   BE                    0
Belgium 1993 Unregulated 4.480499e+07   BE                    0
Belgium 1994       Total 1.249382e+08   BE                    0
Belgium 1994   Regulated 8.064799e+07   BE                    0
Belgium 1994 Unregulated 4.429020e+07   BE                    0
....

我正在尝试填写地点和地点full.data$PercentageDifference的百分比(排放部门=ETS 是部门的 xx.y%=监管)。该百分比值应归档到where 。这应该发生在每年和每个国家。我会假设我需要一个循环。我已经读过 dplyr 对此很有用,但并没有真正弄清楚如何做到这一点。但是,如果有比 dplyr 更好的方法,那对我来说很好。emissionssector=ETSsector=RegulatedPercentageDifferencesector=ETS

结果看起来像这样

country year      sector    emissions iso2 PercentageDifference
....
Austria 2011       Total 7.011567e+07   AT                    0
Austria 2011   Regulated 4.214836e+07   AT                    0
Austria 2011 Unregulated 2.796732e+07   AT                    0
Austria 2011         ETS 3.059942e+07   AT                72.6%
Austria 2012       Total 6.766140e+07   AT                    0
Austria 2012   Regulated 3.949445e+07   AT                    0
Austria 2012 Unregulated 2.816695e+07   AT                    0
Austria 2012         ETS 2.838706e+07   AT                71.9%
Austria 2013       Total 6.800123e+07   AT                    0
Austria 2013   Regulated 3.857396e+07   AT                    0
Austria 2013 Unregulated 2.942727e+07   AT                    0
Austria 2013         ETS 2.980441e+07   AT                77.3%
Austria 2014       Total 6.425333e+07   AT                    0
Austria 2014   Regulated 3.631107e+07   AT                    0
Austria 2014 Unregulated 2.794225e+07   AT                    0
Austria 2014         ETS 2.805597e+07   AT                77.3%

到目前为止,我还没有发布我所做的事情,因为我没有做很多事情......

感谢您提供任何形式的帮助。

北湖

标签: rdplyrpercentage

解决方案


这里有一个基于 tydiverse/dplyr 的解决方案。

文件 stack.txt 包含上面示例中的粘贴文本。

library(tidyverse)
full_data <- read.table("stack.txt", quote="\"", comment.char="") 
names(full_data) <-  c("country", "year", "sector", "emission", "iso", "perc")

full_data <- full_data %>% 
  select(-perc)
full_data %>% 
  select(-iso) %>% 
  spread(sector, emission) %>% 
  mutate(percentage = ETS/Regulated) %>% 
  select(country, year, percentage) %>%
  right_join(full_data) %>%
  select(country, year, sector, emission, iso, percentage) %>% 
  mutate(percentage = ifelse(sector == "ETS", percentage, 0))

结果:

   country year      sector  emission iso percentage
1  Austria 2011       Total  70115670  AT  0.0000000
2  Austria 2011   Regulated  42148360  AT  0.0000000
3  Austria 2011 Unregulated  27967320  AT  0.0000000
4  Austria 2011         ETS  30599420  AT  0.7259931
5  Austria 2012       Total  67661400  AT  0.0000000
6  Austria 2012   Regulated  39494450  AT  0.0000000
7  Austria 2012 Unregulated  28166950  AT  0.0000000
8  Austria 2012         ETS  28387060  AT  0.7187607
9  Austria 2013       Total  68001230  AT  0.0000000
10 Austria 2013   Regulated  38573960  AT  0.0000000
11 Austria 2013 Unregulated  29427270  AT  0.0000000

如果您想要更多解释,它是如何工作的,我建议打破管道并查看中间结果,即

full_data %>% 
  select(-iso) %>% 
  spread(sector, emission) %>%
  mutate(percentage = ETS/Regulated)

   country year      ETS Regulated     Total Unregulated percentage
1  Austria 2011 30599420  42148360  70115670    27967320  0.7259931
2  Austria 2012 28387060  39494450  67661400    28166950  0.7187607
3  Austria 2013 29804410  38573960  68001230    29427270  0.7726562
4  Austria 2014 28055970  36311070  64253330    27942250  0.7726561
5  Austria 2015 29492060  38003090  66703980    28700900  0.7760437
6  Austria 2016 29000120  37651770  67402090    29750310  0.7702193

问候帕维尔


推荐阅读