r - 计算 R 中 data.table 中的滞后百分比差异
问题描述
我正在尝试计算计数变量之间的滞后差异。但是,我的数据有一个组变量。我希望分别为每个组计算滞后。
到目前为止,我有以下内容:
dput(head(mydata,20))
structure(list(startYear = structure(c(1L, 2L, 3L, 4L, 5L, 1L,
2L, 3L, 4L, 5L, 1L, 1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 3L), .Label = c("2014",
"2015", "2016", "2017", "2018"), class = "factor"), groupID = c("AISAC-0000",
"AISAC-0000", "AISAC-0000", "AISAC-0000", "AISAC-0000", "ASSAT-0000",
"ASSAT-0000", "ASSAT-0000", "ASSAT-0000", "ASSAT-0000", "BAYER-0001",
"BAYSC-0002", "GECER-0002", "HANIN-0000", "HANIN-0000", "HOCED-0001",
"HOCEN-0000", "INDAL-0000", "INDAL-0000", "INDAL-0000"), N = c(82,
124, 60, 164, 65, 142, 183, 142, 75, 185, 145, 22, 162, 92, 4,
166, 57, 11, 199, 137)), row.names = c(NA, -20L), class = c("data.table",
"data.frame"))
mydata <- mydata[ ,var_calc := paste0(round((N/lag(N) - 1) * 100, digits = 3) , " %")]
所需的输出是:
mydata %>%
group_by(groupID) %>%
arrange(startYear, .by_group = TRUE) %>%
mutate(var_calc := paste0(round((N/lag(N) - 1) * 100, digits = 3) , " %"))
什么是.by_group = TRUE
替代品data.table
?
我怎样才能强迫积极的价值观有一个+
?
解决方案
您可以在 中执行以下操作data.table
,使用by
分组而shift
不是滞后:
编辑:添加功能以获得更好的百分比和加号输出
library(data.table)
mydata <- structure(list(startYear = structure(c(1L, 2L, 3L, 4L, 5L, 1L,
2L, 3L, 4L, 5L, 1L, 1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 3L), .Label = c("2014",
"2015", "2016", "2017", "2018"), class = "factor"), groupID = c("AISAC-0000",
"AISAC-0000", "AISAC-0000", "AISAC-0000", "AISAC-0000", "ASSAT-0000",
"ASSAT-0000", "ASSAT-0000", "ASSAT-0000", "ASSAT-0000", "BAYER-0001",
"BAYSC-0002", "GECER-0002", "HANIN-0000", "HANIN-0000", "HOCED-0001",
"HOCEN-0000", "INDAL-0000", "INDAL-0000", "INDAL-0000"), N = c(82,
124, 60, 164, 65, 142, 183, 142, 75, 185, 145, 22, 162, 92, 4,
166, 57, 11, 199, 137)), row.names = c(NA, -20L), class = c("data.table",
"data.frame"))
setDT(mydata)
addplus <- function(x, digits=3){
x <- setNames(x, round(100*x, digits = digits))
ifelse(is.na(x), x,
ifelse(sign(x) == 1, paste0("+", names(x), "%"), paste0(names(x), "%"))
)
}
mydata[ , var_calc := addplus(N/shift(N) - 1), by="groupID"][]
#> startYear groupID N var_calc
#> 1: 2014 AISAC-0000 82 <NA>
#> 2: 2015 AISAC-0000 124 +51.22%
#> 3: 2016 AISAC-0000 60 -51.613%
#> 4: 2017 AISAC-0000 164 +173.333%
#> 5: 2018 AISAC-0000 65 -60.366%
#> 6: 2014 ASSAT-0000 142 <NA>
#> 7: 2015 ASSAT-0000 183 +28.873%
#> 8: 2016 ASSAT-0000 142 -22.404%
#> 9: 2017 ASSAT-0000 75 -47.183%
#> 10: 2018 ASSAT-0000 185 +146.667%
#> 11: 2014 BAYER-0001 145 <NA>
#> 12: 2014 BAYSC-0002 22 <NA>
#> 13: 2016 GECER-0002 162 <NA>
#> 14: 2015 HANIN-0000 92 <NA>
#> 15: 2016 HANIN-0000 4 -95.652%
#> 16: 2014 HOCED-0001 166 <NA>
#> 17: 2014 HOCEN-0000 57 <NA>
#> 18: 2014 INDAL-0000 11 <NA>
#> 19: 2015 INDAL-0000 199 +1709.091%
#> 20: 2016 INDAL-0000 137 -31.156%
由reprex 包(v0.3.0)于 2020-05-12 创建
推荐阅读
- c# - 如何登录到浏览器开发者工具的控制台
- python - 在“例外”块中执行操作是一种好的编码习惯吗?
- c - 不能在程序集中调用“putchar”
- r - XML标记内的变量到R数据框中的列
- assembly - 测试所有相邻的真实位
- javascript - addEventListener e=> 是什么意思?
- c - 如何在C中标记要选择的字符
- protractor - Angular e2e 在简单的 by.css 期望中失败
- python - 使用 numpy 将 3d 点转换为新坐标系的函数
- android - 迁移到 androidx 后 Theme.AppCompat.Light.NoActionBar 无法正常工作