Calculating the mean and interquartile range of "cut" data for plotting

Problem description

Sorry, I'm new to R. I have a dataset containing tree heights and canopy densities, for example:

i_h100   i_cd
2.89     0.0198
2.88     0.0198
17.53    0.658
27.23    0.347

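I want to regroup "i_h100" into 2 m intervals, from a minimum of 2 m to a maximum of 30 m. Then I want to calculate the mean i_cd value and the interquartile range for each interval, so that I can plot them with a least-squares regression. The code I'm using to get the means doesn't work. This is what I have so far: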

mydata=read.csv("irelandish.csv")
height=mydata$i_h100
breaks=seq(2,30,by=2)  #2m intervals
height.cut=cut(height, breaks, right=TRUE)

#attempt at calculating means per group
install.packages("dplyr")
mean=summarise(group_by(cut(height, breaks, right=TRUE), 
mean(mydata$i_cd)))
install.packages("reshape2")
dcast(mean)

Thanks in advance for any advice.

Tags: r

Solution


Use aggregate() to calculate the groupwise means.

# Some example data
set.seed(1)

i_h100 <- round(runif(100, 2, 30), 2)
i_cd <- rexp(100, 1/i_h100)
mydata <- data.frame(i_cd, i_h100)

# Grouping i_h100
mydata$i_h100_2m <- cut(mydata$i_h100, seq(2, 30, by=2))
head(mydata)
#        i_cd i_h100 i_h100_2m
# 1  2.918093   9.43    (8,10]
# 2 13.735728  12.42   (12,14]
# 3 13.966347  18.04   (18,20]
# 4  2.459760  27.43   (26,28]
# 5  8.477551   7.65     (6,8]
# 6  6.713224  27.15   (26,28]

# Calculate groupwise means of i_cd
i_cd_2m_mean <- aggregate(i_cd ~ i_h100_2m, mydata, mean)

# And IQR
i_cd_2m_iqr <- aggregate(i_cd ~ i_h100_2m, mydata, IQR)

upper <- i_cd_2m_mean[,2]+(i_cd_2m_iqr[,2]/2)
lower <- i_cd_2m_mean[,2]-(i_cd_2m_iqr[,2]/2)

# Plotting the result
plot.default(i_cd_2m_mean, xaxt="n", ylim=range(c(upper, lower)),
  main="Groupwise means \U00B1 0.5 IQR", type="n")
points(upper, pch=2, col="lightblue", lwd=1.5)
points(lower, pch=6, col="pink", lwd=1.5)
points(i_cd_2m_mean, pch=16)

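# Label the x-axis with the bin names (the object is i_cd_2m_mean, not i_cd_2m)
axis(1, at=as.integer(i_cd_2m_mean[,1]), labels=as.character(i_cd_2m_mean[,1]),
  cex.axis=0.6, las=2)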
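
The question also asks for a least-squares regression through the binned values, which the code above does not cover. The following is only a sketch of one way this might be done in base R, not part of the original answer. It assumes the regression is fitted to the groupwise means against the bin midpoints. The names stats, mids and fit are made up for illustration, and include.lowest=TRUE is an added assumption so that a height of exactly 2 m is not dropped.

```r
# Example data, generated the same way as in the answer above
set.seed(1)
i_h100 <- round(runif(100, 2, 30), 2)
i_cd <- rexp(100, 1/i_h100)
mydata <- data.frame(i_cd, i_h100)

breaks <- seq(2, 30, by=2)
# Assumption: include.lowest=TRUE keeps a height of exactly 2 m in the first bin
mydata$i_h100_2m <- cut(mydata$i_h100, breaks, include.lowest=TRUE)

# Mean and IQR per bin in a single aggregate() call
stats <- aggregate(i_cd ~ i_h100_2m, mydata,
                   function(x) c(mean=mean(x), iqr=IQR(x)))
# aggregate() stores the two statistics as a matrix column; flatten it
stats <- do.call(data.frame, stats)
names(stats) <- c("bin", "mean", "iqr")

# Bin midpoints (3, 5, ..., 29) as a numeric predictor (illustrative choice)
mids <- head(breaks, -1) + diff(breaks)/2
stats$mid <- mids[match(as.character(stats$bin), levels(mydata$i_h100_2m))]

# Ordinary least-squares fit of the bin means on the bin midpoints
fit <- lm(mean ~ mid, data=stats)

# Plot the bin means and overlay the fitted line
plot(stats$mid, stats$mean, pch=16,
     xlab="i_h100 bin midpoint (m)", ylab="mean i_cd")
abline(fit)
```

coef(fit) returns the intercept and slope of the fitted line. Fitting to the bin means gives every bin equal weight regardless of how many trees fall into it. If that is not wanted, lm(i_cd ~ i_h100, data=mydata) on the raw data is a possible alternative.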


