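r - Plotting residuals (continuous) against an explanatory variable (categorical) gives the warning: "In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion"
Problem description
Long-time googler, first-time asker, so apologies if my question is not formatted well.
I have a tibble called daily; here is its dput output: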
structure(list(Moon_Phase = c("mid", "mid", "mid", "mid", "mid",
"new", "new", "new", "new", "new", "new", "new", "new", "new",
"new", "new", "new", "new", "new", "new", "new", "new", "new",
"new", "new", "new"), name = c("Al_Capone", "Al_Capone", "Bonnie",
"Clyde", "Clyde", "Al_Capone", "Al_Capone", "Barb", "Barb", "Biggie",
"Biggie", "Bonnie", "Bowser", "Bowser", "Doe", "Doe", "Jesse",
"Jesse", "Lizzie", "Lizzie", "Louise", "Louise", "Roxy", "Roxy",
"Sue", "Sue"), `date(DateTime)` = structure(c(17215, 17216, 17156,
17155, 17156, 17133, 17134, 17161, 17162, 17157, 17158, 17156,
17216, 17217, 17199, 17200, 17161, 17162, 17185, 17186, 17133,
17134, 17196, 17197, 17193, 17194), class = "Date"), count = c(60970.2127659574,
47145.2054794521, 66323.6514522822, 51168.932038835, 64211.673151751,
75354.5454545455, 76069.5652173913, 52992, 42865.1162790698,
63810.6870229008, 70530.612244898, 54834.2379958246, 60198.4962406015,
56254.2056074766, 70338.4615384615, 64800, 44400, 57466.6666666667,
54477.8761061947, 46423.8805970149, 58830.7692307692, 70478.0487804878,
62786.4406779661, 66541.935483871, 58493.4306569343, 60781.3953488372
), avg = c(0.167566808400667, 0.0916716980460977, 0.169983135592288,
0.0950009067366473, 0.172076034264729, 0.195215802633862, 0.213308643950517,
0.160601492425918, 0.0352463837761031, 0.181835110358351, 0.175611555735529,
0.102218432032213, 0.141489253083123, 0.129562604439575, 0.169391188107789,
0.148380507250866, 0.158557388456314, 0.146077250703009, 0.120220050003983,
0.0801402704143268, 0.15458396257616, 0.192381143851207, 0.165149903514201,
0.138869248196884, 0.137792634329098, 0.15698540693065)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -26L), vars = c("Moon_Phase",
"name"), drop = TRUE)
I created a linear model:
m0 <- lm(count ~ Moon_Phase, data = daily)
I want to check my model for independence, so I plotted the residuals against the explanatory variable:
# Note: E1 is not defined anywhere in the question; it presumably holds the residuals of m0.
plot(x = daily$Moon_Phase,
     y = E1,
     xlab = "Moon Phase",
     ylab = "Normalized residuals",
     xlim = c(0, nrow(daily))
)
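I also want to check whether daily$name needs to be included as a random effect, so I compared the residuals of the linear model without random effects against the potential random effect: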
lm.test <- lm(count ~ Moon_Phase, data = daily)
lm.test.resid <- rstandard(lm.test)
and plotted them:
plot(lm.test.resid ~ daily$name,
     xlab = "Name",
     ylab = "Standardized residuals",
     xlim = c(0, nrow(daily)))
Every time I call plot(), I get the following warning message:
Warning message: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
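Does anyone know why this message appears? I have seen other people run into this problem here and on other forums, and converting things to factor and/or numeric seemed to help them, so I tried as.numeric(E1), as.factor(daily$Moon_Phase), and as.factor(daily$name), but that did not seem to help.
Thanks!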
Solution
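The dput output shows that Moon_Phase and name are character vectors. A likely cause, sketched here under that assumption: when plot() receives a character x, it falls through to the default scatter-plot method, and xy.coords() tries to convert x to numbers. Strings such as "mid" become NA, which triggers the warning. Converting the grouping variable to a factor and keeping the result makes plot() draw box plots instead. One possible reason the earlier attempts did not help is that as.factor() returns a new vector and does not modify daily in place. The data frame daily_demo and the names m_demo, res_demo, and coerced are made up for illustration and are not the asker's data.

```r
# Small stand-in for `daily` (hypothetical values, not the asker's data).
daily_demo <- data.frame(
  Moon_Phase = c("mid", "mid", "mid", "new", "new", "new"),
  name       = c("Al_Capone", "Bonnie", "Clyde", "Barb", "Biggie", "Doe"),
  count      = c(60970, 66324, 51169, 52992, 63811, 70338),
  stringsAsFactors = FALSE
)

m_demo   <- lm(count ~ Moon_Phase, data = daily_demo)
res_demo <- rstandard(m_demo)

# What the default plot method effectively does with a character x:
# every value becomes NA (the warning is suppressed here on purpose).
coerced <- suppressWarnings(as.numeric(daily_demo$Moon_Phase))
print(coerced)

# Store the factors back in the data frame; calling as.factor() without
# assigning or using its result leaves the original column unchanged.
daily_demo$Moon_Phase <- factor(daily_demo$Moon_Phase)
daily_demo$name       <- factor(daily_demo$name)

# With a factor x, plot() dispatches to the box-plot method.
# xlim is left out on purpose; see the note after the code.
plot(x = daily_demo$Moon_Phase,
     y = res_demo,
     xlab = "Moon Phase",
     ylab = "Standardized residuals")

# The same check against the candidate random effect.
boxplot(res_demo ~ daily_demo$name,
        xlab = "Name",
        ylab = "Standardized residuals")
```

The same conversion can be applied to the real data, for example daily$name <- factor(daily$name) before plotting. Leaving out xlim matters as well: the boxes are drawn at positions 1 up to the number of levels, so xlim = c(0, nrow(daily)) would crowd them toward the left and leave much of the axis empty.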