Performing a one-way ANOVA

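Problem description

I have a dataset of mesh-opening measurements together with the tool used to take each measurement. I want to run a one-way ANOVA on the data. Here is my code: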

df<-structure(list(MeasurementTool = c("Wedge", "Wedge", "Wedge", 
                                   "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", 
                                   "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", 
                                   "Wedge", "Wedge", "Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge"), 
               MeshOpening = c(157L, 155L, 160L, 160L, 161L, 160L, 158L, 161L, 162L, 162L, 160L, 163L, 
                                158L, 160L, 161L, 165L, 164L, 158L, 164L, 163L, 159L, 158L, 165L, 
                                164L, 159L, 160L, 158L, 159L, 160L, 163L, 159L, 160L, 158L, 158L, 
                                158L, 162L, 160L, 159L, 159L, 159L, 159L, 159L, 159L, 155L, 156L, 
                                156L, 158L, 160L, 156L, 155L, 160L, 160L, 157L, 159L, 158L, 155L, 
                                158L, 157L, 156L, 158L)), row.names = c(NA, -60L), class = "data.frame") 

library(dplyr)  # provides group_by(), summarise(), n() and %>%

df$MeasurementTool <- as.factor(df$MeasurementTool)

group_by(df, 'MeasurementTool') %>% summarise(count = n(), mean = mean('MeshOpening', na.rm = TRUE), sd = sd('MeshOpening', na.rm = TRUE))

It gives me these warning messages:

Warning messages:

1: In mean.default("MeshOpening", na.rm = TRUE) : argument is not numeric or logical: returning NA

2: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) : NAs introduced by coercion

Tags: r, anova

Solution


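You were tripped up by the way dplyr::summarise works. It expects an R name (also known as a symbol), that is, the column name written without quotes. With quotes, mean() and sd() receive the literal string "MeshOpening" rather than the column, which is what triggers the warnings. Removing the quotes inside summarise makes the warnings go away: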

group_by(df, 'MeasurementTool') %>% summarise(count = n(), mean = mean(MeshOpening, na.rm = TRUE), sd = sd(MeshOpening, na.rm = TRUE))
# A tibble: 1 × 4
  `"MeasurementTool"` count  mean    sd
  <chr>               <int> <dbl> <dbl>
1 MeasurementTool        60  159.  2.48
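
Both symptoms (the NA warnings and the single-row result above) come from quoted names being treated as plain character constants. A minimal sketch of that behaviour, assuming dplyr is installed; the data frame `toy` and its values are made up purely for illustration:

```r
library(dplyr)

# Hypothetical toy data, only to illustrate the behaviour
toy <- data.frame(
  tool  = c("A", "A", "B", "B"),
  value = c(1, 2, 3, 4)
)

# A quoted name is a character constant, so mean() never sees the column
quoted_mean <- suppressWarnings(mean("value", na.rm = TRUE))
is.na(quoted_mean)

# Grouping by a string creates one constant group instead of one group per tool
n_groups_quoted <- n_groups(group_by(toy, "tool"))
n_groups_bare   <- n_groups(group_by(toy, tool))
c(quoted = n_groups_quoted, bare = n_groups_bare)
```

The quoted grouping yields a single group because the string is recycled as a constant column, which is why the output above has only one row covering all 60 observations.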

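In the days before the tidyverse we often referred to columns by character name, as you did, but many people seem to prefer treating column names as first-class objects, and that is now the norm in the tidyverse.

Better still, fix not only the cause of the warnings but also get what you actually wanted, which is one row per measurement tool: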
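
If the column names really do arrive as strings (for example from a function argument), they can still be used. The following is a sketch rather than the only way to do it: it assumes dplyr 1.0 or later, and `toy`, `group_col`, and `value_col` are hypothetical names introduced for the example.

```r
library(dplyr)

# Hypothetical toy data and column names held as strings
toy <- data.frame(
  tool  = c("A", "A", "B", "B"),
  value = c(1, 2, 3, 5)
)
group_col <- "tool"
value_col <- "value"

# Base R: index columns by name with [[ ]]
base_means <- tapply(toy[[value_col]], toy[[group_col]], mean)

# dplyr: all_of() selects the grouping column by string,
# and .data[[ ]] looks up the value column by string
tidy_means <- toy %>%
  group_by(across(all_of(group_col))) %>%
  summarise(count = n(), mean = mean(.data[[value_col]], na.rm = TRUE))

base_means
tidy_means
```

Both approaches return one mean per level of `tool`; the dplyr version keeps the result as a tibble whose first column is named after the grouping column.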

group_by(df, MeasurementTool) %>% summarise(count = n(), 
                                          mean = mean(MeshOpening, na.rm = TRUE), 
                                          sd = sd(MeshOpening, na.rm = TRUE))
# A tibble: 3 × 4
  MeasurementTool count  mean    sd
  <fct>           <int> <dbl> <dbl>
1 ICES Gauge         20  158.  1.73
2 Wedge              20  161.  2.56
3 Weighted Wedge     20  160.  2.06

Arguably, group_by should throw an error, or at least a warning, when the value of its second argument is not something that can be matched to a column name.
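
Since the original goal was a one-way ANOVA, here is a minimal sketch of that step using base R's aov(). It rebuilds the question's data in compact form (same 60 values, same group order) and assumes the usual ANOVA conditions, such as roughly normal residuals and similar group variances, are acceptable for these data; that assumption is not checked here.

```r
# The question's data, rebuilt compactly: 20 measurements per tool
df <- data.frame(
  MeasurementTool = factor(rep(c("Wedge", "Weighted Wedge", "ICES Gauge"), each = 20)),
  MeshOpening = c(
    157, 155, 160, 160, 161, 160, 158, 161, 162, 162,
    160, 163, 158, 160, 161, 165, 164, 158, 164, 163,
    159, 158, 165, 164, 159, 160, 158, 159, 160, 163,
    159, 160, 158, 158, 158, 162, 160, 159, 159, 159,
    159, 159, 159, 155, 156, 156, 158, 160, 156, 155,
    160, 160, 157, 159, 158, 155, 158, 157, 156, 158
  )
)

# One-way ANOVA: does mean mesh opening differ between measurement tools?
fit <- aov(MeshOpening ~ MeasurementTool, data = df)
summary(fit)

# Optional follow-up: pairwise comparisons between tools
TukeyHSD(fit)
```

summary(fit) reports the F test for the tool effect, and TukeyHSD(fit) shows which pairs of tools differ if that test suggests a difference.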

