Plotting residuals (continuous) vs. explanatory variable (categorical) gives warning: "In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion"

Question

Long-time Googler, first-time asker, so apologies if my question is not well formatted.

I have a tibble called daily; here is its dput output:

structure(list(Moon_Phase = c("mid", "mid", "mid", "mid", "mid", 
"new", "new", "new", "new", "new", "new", "new", "new", "new", 
"new", "new", "new", "new", "new", "new", "new", "new", "new", 
"new", "new", "new"), name = c("Al_Capone", "Al_Capone", "Bonnie", 
"Clyde", "Clyde", "Al_Capone", "Al_Capone", "Barb", "Barb", "Biggie", 
"Biggie", "Bonnie", "Bowser", "Bowser", "Doe", "Doe", "Jesse", 
"Jesse", "Lizzie", "Lizzie", "Louise", "Louise", "Roxy", "Roxy", 
"Sue", "Sue"), `date(DateTime)` = structure(c(17215, 17216, 17156, 
17155, 17156, 17133, 17134, 17161, 17162, 17157, 17158, 17156, 
17216, 17217, 17199, 17200, 17161, 17162, 17185, 17186, 17133, 
17134, 17196, 17197, 17193, 17194), class = "Date"), count = c(60970.2127659574, 
47145.2054794521, 66323.6514522822, 51168.932038835, 64211.673151751, 
75354.5454545455, 76069.5652173913, 52992, 42865.1162790698, 
63810.6870229008, 70530.612244898, 54834.2379958246, 60198.4962406015, 
56254.2056074766, 70338.4615384615, 64800, 44400, 57466.6666666667, 
54477.8761061947, 46423.8805970149, 58830.7692307692, 70478.0487804878, 
62786.4406779661, 66541.935483871, 58493.4306569343, 60781.3953488372
), avg = c(0.167566808400667, 0.0916716980460977, 0.169983135592288, 
0.0950009067366473, 0.172076034264729, 0.195215802633862, 0.213308643950517, 
0.160601492425918, 0.0352463837761031, 0.181835110358351, 0.175611555735529, 
0.102218432032213, 0.141489253083123, 0.129562604439575, 0.169391188107789, 
0.148380507250866, 0.158557388456314, 0.146077250703009, 0.120220050003983, 
0.0801402704143268, 0.15458396257616, 0.192381143851207, 0.165149903514201, 
0.138869248196884, 0.137792634329098, 0.15698540693065)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -26L), vars = c("Moon_Phase", 
"name"), drop = TRUE)

I fitted a linear model:

m0 <- lm(count ~ Moon_Phase, data = daily)

I want to check my model's independence assumption, so I plotted the residuals against the explanatory variable:

plot(x = daily$Moon_Phase,
     y = E1,
     xlab = "Moon Phase",
     ylab = "Normalized residuals",
     xlim = c(0,nrow(daily))
)

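I also wanted to check whether I need to include daily$name as a random effect, so I compared the residuals of the linear model without random effects against the potential random effect: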

lm.test <- lm(count ~ Moon_Phase, data = daily)
lm.test.resid <- rstandard(lm.test)

and plotted them:

plot(lm.test.resid ~ daily$name, 
     xlab = "Name",
     ylab = "Standardized residuals",
     xlim = c(0,nrow(daily))) 

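Every time I call plot(), I get the following warning message:

Warning message:
In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion

Does anyone know why this message appears? I have seen other people run into this here and on other forums, and converting things to factor and/or numeric seemed to help them, so I tried as.numeric(E1), as.factor(daily$Moon_Phase), and as.factor(daily$name), but that did not help.

Thanks!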

Tags: r, plot, statistics, mixed-models, coercion

Solution
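
One likely explanation, offered as a sketch rather than a confirmed answer: the dput output shows Moon_Phase and name stored as character vectors. When plot() receives a character x, it falls through to the default method, which hands x to xy.coords(); strings such as "mid" or "Al_Capone" cannot be converted to numbers, so they become NA and the warning appears. The example below uses small made-up vectors (phase, res, coerced and phase_f are illustrative names, not objects from the question) to reproduce the coercion and to show a factor version of the plot. Note that as.factor() only helps if its result is actually used, either stored back into the data or passed directly as x, and that a numeric xlim such as c(0, nrow(daily)) is probably not what you want once the x axis is categorical.

```r
# Toy stand-ins for daily$Moon_Phase and the model residuals (values are made up).
phase <- c("mid", "mid", "new", "new", "new")
res   <- c(0.4, -0.7, 1.1, -0.2, -0.6)

# Converting text labels to numbers yields NA; the warning is silenced here
# only to keep the output clean.
coerced <- suppressWarnings(as.numeric(phase))
print(coerced)

# With a factor as x, plot() draws one boxplot of the residuals per level.
phase_f <- factor(phase)
plot(x = phase_f, y = res,
     xlab = "Moon Phase",
     ylab = "Standardized residuals")
```

Applied to the question's objects, the same idea would be to pass factor(daily$Moon_Phase) or factor(daily$name) as x and to drop the xlim argument.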

