r - Calculating two different means in R
Problem description
I am trying to calculate two different means from the following dataset in R:
Plot   Date        Time  Canopyheight  mean  pre   post  Diff
103B1  11/12/2019  1     50
103B1  11/12/2019  4     50
103B1  11/12/2019  6     78
103B1  11/12/2019  22    100           69.5
103B1  11/13/2019  1     60
103B1  11/13/2019  4     70
103B1  11/13/2019  6     80
103B1  11/13/2019  22    100           77.5  73.5
103B1  11/14/2019  1     50
103B1  11/14/2019  4     50
103B1  11/14/2019  6     78
103B1  11/14/2019  22    100           69.5
103B1  11/15/2019  1     60
103B1  11/15/2019  4     80
103B1  11/15/2019  6     90
103B1  11/15/2019  22    120           87.5        78.5  5.0
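I was able to get the daily mean, but not the pre and post values.
Expected result
The code should produce the value 73.5, which is the mean of 69.5 and 77.5; the other values are calculated the same way. Diff is the difference between the pre and post values.
Code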
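The arithmetic behind the expected values can be checked by hand. The sketch below retypes the daily means from the table and assumes, based on the 5.0 shown in the Diff column, that Diff is post minus pre; the variable names are illustrative only.

```r
# Daily means copied from the "mean" column of the table above.
daily_mean <- c("11/12/2019" = 69.5, "11/13/2019" = 77.5,
                "11/14/2019" = 69.5, "11/15/2019" = 87.5)

pre  <- mean(daily_mean[1:2])  # mean of the first two days
post <- mean(daily_mean[3:4])  # mean of the last two days

# Assumption: Diff is post minus pre, which matches the 5.0 in the table.
diff_value <- post - pre

print(c(pre = pre, post = post, Diff = diff_value))
```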
Prepost <- Prepost %>% group_by(Plot, Date) %>%
mutate(meancanopyheight = mean(Canopyheight, na.rm = T))
Prepost$Preharvest <- lapply(Prepost$Date, function(m) mean(Prepost$meanCanopyheight[Prepost$Date >= m |Prepost$Date <= m+4| Prepost$Date == m+8], na.rm = TRUE))
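I tried to calculate this but could not get it to work; I have added my code here for reference.
Thank you for your help.
Solution
You can use dplyr like this: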
library(dplyr)
df %>%
group_by(Date) %>%
summarize(mean = mean(Canopyheight)) %>%
mutate(group = rep(c("pre", "post"), each = 2)) %>%
group_by(group) %>%
summarize(mean = mean(mean))
#> # A tibble: 2 x 2
#> group mean
#> <chr> <dbl>
#> 1 post 78.5
#> 2 pre 73.5
Created on 2020-02-20 by the reprex package (v0.3.0)
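The question also asks for Diff, which the snippet above does not compute. A possible extension is sketched here. It assumes the same split as above (first two dates are "pre", last two are "post"), assumes Diff is post minus pre, and retypes the question's table as `df`; the column name `daily_mean` is my own choice.

```r
library(dplyr)

# Example data retyped from the table in the question.
df <- data.frame(
  Plot = "103B1",
  Date = rep(c("11/12/2019", "11/13/2019", "11/14/2019", "11/15/2019"), each = 4),
  Time = rep(c(1, 4, 6, 22), times = 4),
  Canopyheight = c(50, 50, 78, 100, 60, 70, 80, 100,
                   50, 50, 78, 100, 60, 80, 90, 120)
)

result <- df %>%
  group_by(Date) %>%
  summarize(daily_mean = mean(Canopyheight)) %>%        # one row per date
  mutate(group = rep(c("pre", "post"), each = 2)) %>%   # label the dates in order
  summarize(
    pre  = mean(daily_mean[group == "pre"]),
    post = mean(daily_mean[group == "post"]),
    Diff = post - pre                                   # assumed sign: post minus pre
  )

print(result)
```

Because `Date` is stored as text here, the labelling relies on the dates sorting in chronological order, which holds for this small example but not for dates that span several months or years; converting to the `Date` class first is safer.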
Based on further data from the OP, here is a more general version of the solution:
library(dplyr)
df <- structure(list(Plot = c("TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1", "TF_103B1",
"TF_103B1", "TF_103B1"), Date = structure(c(18217, 18217, 18217,
18217, 18218, 18218, 18218, 18218, 18219, 18219, 18219, 18219,
18220, 18221, 18221, 18221, 18221, 18222, 18222, 18222, 18222,
18246, 18246, 18246, 18246, 18247, 18247, 18247, 18247, 18248,
18248, 18248, 18248, 18249, 18250, 18250, 18250, 18250, 18251,
18251, 18251, 18251), class = "Date"), Time = c("1", "4", "6",
"22", "1", "4", "6", "22", "1", "4", "6", "22", "22", "1", "4",
"6", "22", "1", "4", "6", "22", "1", "4", "6", "22", "1", "4",
"6", "22", "1", "4", "6", "22", "22", "1", "4", "6", "22", "1",
"4", "6", "22"), Canopyheight = c(2064.55, 2064.51, 2063.03,
2063.62, 2065.94, 2064.83, 2061.58, 2064.07, 2066.97, 2063.99,
2065.37, 2064.7, 2067.8, 2065.6, 2067.05, 2064.95, 2075.76, 2073.06,
2079.23, 2072.75, 2068.81, 2065.66, 2065.85, 2065.65, 2063.65,
2063.44, 2068.05, 2072.38, 2067.2, 2068.1, 2067.26, 2069.27,
2063.05, 2088.45, 2086.24, 2088.91, 2092.04, 2092, 2092.67, 2090.7,
2091.59, 2090.99)), row.names = c(NA, 42L), class = "data.frame")
df <- df %>%
group_by(Date) %>%
summarize(mean = mean(Canopyheight)) %>%
mutate(prepost = rep(rep(c("pre", "post"), each = 3), length.out = n()))
df$start_date <- rep(df$Date[seq(nrow(df)) %% 6 == 0], each = 6)
df %>%
group_by(start_date, prepost) %>%
summarize(mean = mean(mean))
#> # A tibble: 4 x 3
#> # Groups: start_date [2]
#> start_date prepost mean
#> <date> <chr> <dbl>
#> 1 2019-11-22 post 2070.
#> 2 2019-11-22 pre 2064.
#> 3 2019-12-21 post 2090.
#> 4 2019-12-21 pre 2067.
Created on 2020-02-21 by the reprex package (v0.3.0)
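The general version above labels blocks by position, so it needs the number of dates to be a multiple of 6, and its `start_date` is actually the sixth date of each block. One way to wrap the same idea in a function and add Diff is sketched below. `prepost_diff`, `days_per_phase`, and the toy data are hypothetical names and values introduced only for illustration, and Diff is again assumed to be post minus pre.

```r
library(dplyr)

# Hypothetical helper: every run of `days_per_phase` measured dates is one phase,
# and a "pre" phase followed by a "post" phase forms one block.
prepost_diff <- function(data, days_per_phase = 3) {
  data %>%
    group_by(Date) %>%
    summarize(daily_mean = mean(Canopyheight)) %>%   # one row per date
    arrange(Date) %>%
    mutate(
      day_index = row_number() - 1,
      block = day_index %/% (2 * days_per_phase) + 1,
      phase = if_else(day_index %% (2 * days_per_phase) < days_per_phase,
                      "pre", "post")
    ) %>%
    group_by(block) %>%
    summarize(
      start_date = min(Date),                        # first date of the block
      pre  = mean(daily_mean[phase == "pre"]),
      post = mean(daily_mean[phase == "post"]),
      Diff = post - pre                              # assumed sign: post minus pre
    )
}

# Toy data (made up): two blocks of six dates, two measurements per date.
toy <- data.frame(
  Date = rep(c(as.Date("2019-11-17") + 0:5, as.Date("2019-12-16") + 0:5), each = 2),
  Canopyheight = rep(c(10, 20, 30, 40, 50, 60, 1, 2, 3, 7, 8, 9), each = 2) + c(-1, 1)
)

res <- prepost_diff(toy)
print(res)
```

Like the answer, this counts measured dates rather than calendar days, so a missing day shifts every later label; if the phases are defined by calendar dates, the grouping would need to be built from the dates themselves.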