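r - Calculating values in a function after grouping by columns
Problem description
I have sampled the concentrations of gases x and y in jars with and without soil. The gas concentration is highest in the empty jars without soil, and now I want to know how much gas the soil has taken up.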
The concentrations were measured with jars under different conditions: source is either o or n, activity is either low or high, and sampling is either soil or blank. The jars are named jar1, jar2, jar3, jar4, blank1, and blank2.
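I now want to calculate the relative concentration of gas x and gas y for each unique measurement condition, e.g. source = o, activity = low. The calculation should be ((blank1+blank2)/2/jar1), that is, the mean of the two blanks divided by the reading of the soil jar. I have given the expected values in two columns named x_pct and y_pct.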
Any ideas on how to set up working code for this?
The data looks like this:
> df
source activty jar sampling x y x_pct y_pct
1 o low blank1 blank 34 46 1.00 0.99
2 o high blank1 blank 31 43 1.02 1.01
3 n low blank1 blank 32 44 0.98 1.01
4 n high blank1 blank 35 47 1.01 1.01
5 o low jar1 soil 21 33 1.62 1.38
6 o high jar1 soil 22 34 1.43 1.28
7 n low jar1 soil 23 34 1.37 1.31
8 n high jar1 soil 23 35 1.54 1.36
9 o low jar2 soil 27 39 1.26 1.17
10 o high jar2 soil 28 46 1.13 0.95
11 n low jar2 soil 29 41 1.09 1.09
12 n high jar2 soil 27 39 1.31 1.22
13 o low blank2 blank 34 45 1.00 1.01
14 o high blank2 blank 32 44 0.98 0.99
15 n low blank2 blank 31 45 1.02 0.99
16 n high blank2 blank 36 48 0.99 0.99
17 o low jar3 soil 25 37 1.36 1.23
18 o high jar3 soil 25 37 1.26 1.18
19 n low jar3 soil 26 38 1.21 1.17
20 n high jar3 soil 25 37 1.42 1.28
21 o low jar4 soil 19 34 1.79 1.34
22 o high jar4 soil 18 30 1.75 1.45
23 n low jar4 soil 20 34 1.58 1.31
24 n high jar4 soil 20 33 1.78 1.44
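As a quick check on the formula, here is the calculation done by hand for one condition (source = o, activity = low), using the x readings from rows 1, 13 and 5 of the table above. The variable names are only illustrative and do not appear in the data.

```r
# x readings for source = "o", activity = "low", copied from the table
blank1_x <- 34   # row 1
blank2_x <- 34   # row 13
jar1_x   <- 21   # row 5

# mean of the two blanks, then the ratio to the soil jar
blank_mean_x <- (blank1_x + blank2_x) / 2
x_pct_jar1   <- blank_mean_x / jar1_x

print(round(x_pct_jar1, 2))
```

The rounded result agrees with the expected x_pct of 1.62 given for jar1 in that condition.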
Input
df <- structure(list(source = c("o", "o", "n", "n", "o", "o", "n",
"n", "o", "o", "n", "n", "o", "o", "n", "n", "o", "o", "n", "n",
"o", "o", "n", "n"), activty = c("low", "high", "low", "high",
"low", "high", "low", "high", "low", "high", "low", "high", "low",
"high", "low", "high", "low", "high", "low", "high", "low", "high",
"low", "high"), jar = c("blank1", "blank1", "blank1", "blank1",
"jar1", "jar1", "jar1", "jar1", "jar2", "jar2", "jar2", "jar2",
"blank2", "blank2", "blank2", "blank2", "jar3", "jar3", "jar3",
"jar3", "jar4", "jar4", "jar4", "jar4"), sampling = c("blank",
"blank", "blank", "blank", "soil", "soil", "soil", "soil", "soil",
"soil", "soil", "soil", "blank", "blank", "blank", "blank", "soil",
"soil", "soil", "soil", "soil", "soil", "soil", "soil"), x = c(34L,
31L, 32L, 35L, 21L, 22L, 23L, 23L, 27L, 28L, 29L, 27L, 34L, 32L,
31L, 36L, 25L, 25L, 26L, 25L, 19L, 18L, 20L, 20L), y = c(46L,
43L, 44L, 47L, 33L, 34L, 34L, 35L, 39L, 46L, 41L, 39L, 45L, 44L,
45L, 48L, 37L, 37L, 38L, 37L, 34L, 30L, 34L, 33L), x_pct = c(1,
1.02, 0.98, 1.01, 1.62, 1.43, 1.37, 1.54, 1.26, 1.13, 1.09, 1.31,
1, 0.98, 1.02, 0.99, 1.36, 1.26, 1.21, 1.42, 1.79, 1.75, 1.58,
1.78), y_pct = c(0.99, 1.01, 1.01, 1.01, 1.38, 1.28, 1.31, 1.36,
1.17, 0.95, 1.09, 1.22, 1.01, 0.99, 0.99, 0.99, 1.23, 1.18, 1.17,
1.28, 1.34, 1.45, 1.31, 1.44)), class = "data.frame", row.names = c(NA,
-24L))
Solution
Perhaps something like this:
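library(dplyr)

## The question's data spells the column "activty"; rename it once so the
## grouping below can refer to "activity".
df <- df %>% rename( activity = activty )

calculate <- function(d, key) {
  ## average of the blank jars within this group:
  m <- colMeans( d[ grepl("blank", d$jar), c("x","y") ] )
  ## ratio of the blank average to each jar's reading:
  d %>% mutate( x_pct = m[["x"]] / x, y_pct = m[["y"]] / y )
}
df %>% group_by( source, activity ) %>%
  group_modify( calculate ) %>% print.data.frame
This gives me:
> df %>% group_by( source, activity ) %>%
+ group_modify( calculate ) %>% print.data.frame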
source activity jar sampling x y x_pct y_pct
1 n high blank1 blank 35 47 1.0142857 1.0106383
2 n high jar1 soil 23 35 1.5434783 1.3571429
3 n high jar2 soil 27 39 1.3148148 1.2179487
4 n high blank2 blank 36 48 0.9861111 0.9895833
5 n high jar3 soil 25 37 1.4200000 1.2837838
6 n high jar4 soil 20 33 1.7750000 1.4393939
7 n low blank1 blank 32 44 0.9843750 1.0113636
8 n low jar1 soil 23 34 1.3695652 1.3088235
9 n low jar2 soil 29 41 1.0862069 1.0853659
10 n low blank2 blank 31 45 1.0161290 0.9888889
11 n low jar3 soil 26 38 1.2115385 1.1710526
12 n low jar4 soil 20 34 1.5750000 1.3088235
13 o high blank1 blank 31 43 1.0161290 1.0116279
14 o high jar1 soil 22 34 1.4318182 1.2794118
15 o high jar2 soil 28 46 1.1250000 0.9456522
16 o high blank2 blank 32 44 0.9843750 0.9886364
17 o high jar3 soil 25 37 1.2600000 1.1756757
18 o high jar4 soil 18 30 1.7500000 1.4500000
19 o low blank1 blank 34 46 1.0000000 0.9891304
20 o low jar1 soil 21 33 1.6190476 1.3787879
21 o low jar2 soil 27 39 1.2592593 1.1666667
22 o low blank2 blank 34 45 1.0000000 1.0111111
23 o low jar3 soil 25 37 1.3600000 1.2297297
24 o low jar4 soil 19 34 1.7894737 1.3382353
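A variation on the same idea, offered only as a sketch: the blank average can also be computed inside a grouped mutate, which avoids the helper function. The block rebuilds the measurements from the values in the question so that it runs on its own. The names df2 and res are made up for this example, and the grouping column is assumed to be spelled activity (the question's df spells it activty, so it would need renaming first).

```r
library(dplyr)

# Rebuild the question's measurements (values copied from the dput output).
# Assumption: the grouping column is spelled "activity" here; in the
# question's df it is "activty".
df2 <- data.frame(
  source   = rep(c("o", "o", "n", "n"), times = 6),
  activity = rep(c("low", "high"), times = 12),
  jar      = rep(c("blank1", "jar1", "jar2", "blank2", "jar3", "jar4"), each = 4),
  x = c(34, 31, 32, 35, 21, 22, 23, 23, 27, 28, 29, 27,
        34, 32, 31, 36, 25, 25, 26, 25, 19, 18, 20, 20),
  y = c(46, 43, 44, 47, 33, 34, 34, 35, 39, 46, 41, 39,
        45, 44, 45, 48, 37, 37, 38, 37, 34, 30, 34, 33)
)
df2$sampling <- ifelse(grepl("blank", df2$jar), "blank", "soil")

# Within each source/activity group, divide the mean of the blank
# readings by every jar's own reading.
res <- df2 %>%
  group_by(source, activity) %>%
  mutate(
    x_pct = mean(x[sampling == "blank"]) / x,
    y_pct = mean(y[sampling == "blank"]) / y
  ) %>%
  ungroup()

print(as.data.frame(res))
```

Unlike group_modify, a grouped mutate keeps the rows in their original order, so the result lines up row by row with the expected x_pct and y_pct columns in the question.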