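r - Performing a one-way ANOVA
Problem description
I have a dataset of mesh opening measurements together with the tool used to take each measurement. I want to run a one-way ANOVA on the data. Here is my code: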
df<-structure(list(MeasurementTool = c("Wedge", "Wedge", "Wedge",
"Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge",
"Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge",
"Wedge", "Wedge", "Wedge", "Weighted Wedge", "Weighted Wedge",
"Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge",
"Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge",
"Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge",
"Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge",
"Weighted Wedge", "Weighted Wedge", "ICES Gauge", "ICES Gauge",
"ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge",
"ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge",
"ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge",
"ICES Gauge", "ICES Gauge", "ICES Gauge"),
MeshOpening = c(157L, 155L, 160L, 160L, 161L, 160L, 158L, 161L, 162L, 162L, 160L, 163L,
158L, 160L, 161L, 165L, 164L, 158L, 164L, 163L, 159L, 158L, 165L,
164L, 159L, 160L, 158L, 159L, 160L, 163L, 159L, 160L, 158L, 158L,
158L, 162L, 160L, 159L, 159L, 159L, 159L, 159L, 159L, 155L, 156L,
156L, 158L, 160L, 156L, 155L, 160L, 160L, 157L, 159L, 158L, 155L,
158L, 157L, 156L, 158L)), row.names = c(NA, -60L), class = "data.frame")
df$`MeasurementTool`<- as.factor(df$`MeasurementTool`)
group_by(df, 'MeasurementTool') %>% summarise(count = n(), mean = mean('MeshOpening', na.rm = TRUE), sd = sd('MeshOpening', na.rm = TRUE))
It gives me these warning messages:
Warning messages:
1: In mean.default("MeshOpening", na.rm = TRUE) : argument is not numeric or logical: returning NA
2: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) : NAs introduced by coercion
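Solution
You were tripped up by the way dplyr::summarise works. It expects an R name (also known as a symbol), that is, the column name written without quotes around it: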
group_by(df, 'MeasurementTool') %>% summarise(count = n(), mean = mean(MeshOpening, na.rm = TRUE), sd = sd(MeshOpening, na.rm = TRUE))
# A tibble: 1 × 4
  `"MeasurementTool"` count  mean    sd
  <chr>               <int> <dbl> <dbl>
1 MeasurementTool        60  159.  2.48
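In the days before the tidyverse we often referred to columns by character-valued names, as you did, but many people seem to prefer treating column names as first-class objects, and that is now the norm in the tidyverse.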
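If the column names really do arrive as character strings, dplyr still offers ways to use them. The sketch below is one possible approach, assuming dplyr 1.0.0 or later: it wraps the grouping column in across(all_of()) and reads the measured column through the .data pronoun. The variable names group_col and value_col are made up for illustration, and the data frame is rebuilt compactly from the values in the question so the snippet runs on its own.

```r
library(dplyr)

# Same 60 observations as in the question, rebuilt compactly.
df <- data.frame(
  MeasurementTool = rep(c("Wedge", "Weighted Wedge", "ICES Gauge"), each = 20),
  MeshOpening = c(
    157, 155, 160, 160, 161, 160, 158, 161, 162, 162,
    160, 163, 158, 160, 161, 165, 164, 158, 164, 163,
    159, 158, 165, 164, 159, 160, 158, 159, 160, 163,
    159, 160, 158, 158, 158, 162, 160, 159, 159, 159,
    159, 159, 159, 155, 156, 156, 158, 160, 156, 155,
    160, 160, 157, 159, 158, 155, 158, 157, 156, 158
  )
)

# Hypothetical variables holding the column names as strings.
group_col <- "MeasurementTool"
value_col <- "MeshOpening"

res <- df %>%
  # all_of() selects the column named by the string (and errors if it is missing)
  group_by(across(all_of(group_col))) %>%
  # .data[[...]] looks the string up as a column of the current data
  summarise(count = n(),
            mean  = mean(.data[[value_col]], na.rm = TRUE),
            sd    = sd(.data[[value_col]], na.rm = TRUE))
res
```

Unlike a bare quoted string, all_of() fails with an error when the named column does not exist, which guards against the silent single-group result shown above.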
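Better still is to fix not only the cause of the warnings but also the grouping, so that you get what you actually wanted. The quoted 'MeasurementTool' in group_by is evaluated as a constant string rather than as a column, so all 60 rows fall into a single group; pass the bare column name instead: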
group_by(df, MeasurementTool) %>% summarise(count = n(),
mean = mean(MeshOpening, na.rm = TRUE),
sd = sd(MeshOpening, na.rm = TRUE))
# A tibble: 3 × 4
  MeasurementTool count  mean    sd
  <fct>           <int> <dbl> <dbl>
1 ICES Gauge         20  158.  1.73
2 Wedge              20  161.  2.56
3 Weighted Wedge     20  160.  2.06
Arguably, group_by should throw an error, or at least a warning, when the value of its second argument cannot be interpreted as matching a column name.
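The answer stops at the group summaries, but the question's stated goal was a one-way ANOVA. A minimal sketch of that step, using base R's aov() with MeshOpening as the response and MeasurementTool as the single factor, might look like the following. This is one common way to fit the model, not necessarily what the original poster ended up using; the data frame is again rebuilt from the question's values so the snippet is self-contained.

```r
# Same 60 observations as in the question, with the tool stored as a factor.
df <- data.frame(
  MeasurementTool = factor(rep(c("Wedge", "Weighted Wedge", "ICES Gauge"), each = 20)),
  MeshOpening = c(
    157, 155, 160, 160, 161, 160, 158, 161, 162, 162,
    160, 163, 158, 160, 161, 165, 164, 158, 164, 163,
    159, 158, 165, 164, 159, 160, 158, 159, 160, 163,
    159, 160, 158, 158, 158, 162, 160, 159, 159, 159,
    159, 159, 159, 155, 156, 156, 158, 160, 156, 155,
    160, 160, 157, 159, 158, 155, 158, 157, 156, 158
  )
)

# One-way ANOVA: does mean mesh opening differ between measurement tools?
fit <- aov(MeshOpening ~ MeasurementTool, data = df)

# ANOVA table with degrees of freedom, sums of squares, F value and p-value.
anova_table <- summary(fit)
anova_table
```

Note that the formula interface takes bare column names, just like the corrected summarise call.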