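r - Subsetting data by specific values inside the glm function
Problem description
I need to examine expenditure at specific income levels of $20K and $40K. My glm() call works, but when I add a subset, I get an error: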
glm(district21$expend ~ 1 + income, family = gaussian(link = "identity"),data = district21, subset = income == 20000)
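glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, : object 'fit' not found
In addition: Warning messages:
1: In glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, : no observations informative at iteration 1
2: glm.fit: algorithm did not converge
I would also like to know how to specify incomes above or below the median in the subset argument, i.e.: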
glm(district21$expend ~ 1 + income, family = gaussian(link = "identity"),data = district21, subset = income > median())
Solution
The subset needs to be a logical vector. So try this:
glm(expend ~ 1 + income, family = gaussian(link = "identity"), data = district21, subset = district21$income == 20000)
To subset on values greater than the median, try:
subset = district21$income > median(district21$income)
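The numeric(0) values in the error message suggest that no rows were left after subsetting, which can happen when no income is exactly equal to the requested value. Below is a minimal sketch for checking how many rows a subset keeps before fitting. The data frame is simulated and only stands in for the real district21; its values are assumptions made up for illustration.

```r
# Simulated stand-in for district21 (hypothetical values, not the real data)
set.seed(1)
district21 <- data.frame(income = rep(c(20000, 30000, 40000, 50000), each = 5))
district21$expend <- 1000 + 0.1 * district21$income +
  rnorm(nrow(district21), sd = 50)

# The subset is a logical vector with one entry per row of the data
keep <- district21$income > median(district21$income)
is.logical(keep)
sum(keep)  # number of rows the model will actually use

# An exact match on a value that does not occur selects zero rows,
# which leaves glm() with nothing to fit
n_match <- sum(district21$income == 25000)

# Fit only on the rows above the median income
fit_high <- glm(expend ~ 1 + income, family = gaussian(link = "identity"),
                data = district21, subset = keep)
nobs(fit_high)
```

If sum() of the condition returns 0, the income values are probably not stored exactly as 20000, so it is worth inspecting them (for example with table() or summary()) before fitting.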
Alternatively, you can use a tidyverse pipe to subset the data beforehand:
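library(tidyverse)
# Keep only the rows with income exactly 20000, then fit on the filtered data
dplyr::filter(district21, income == 20000) %>%
  glm(expend ~ 1 + income, family = gaussian(link = "identity"), data = .)
# Keep only the rows with income above the median, then fit on the filtered data
dplyr::filter(district21, income > median(income)) %>%
  glm(expend ~ 1 + income, family = gaussian(link = "identity"), data = .)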
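One thing to watch when subsetting to a single income level: inside that subset income is constant, so its slope cannot be estimated and only the intercept (the group mean of expend) is identified. The sketch below illustrates this for the $20K and $40K levels from the question. It uses simulated data as a stand-in for district21, so the values are assumptions and not the asker's real data.

```r
# Simulated stand-in for district21 (hypothetical values, not the real data)
set.seed(1)
district21 <- data.frame(income = rep(c(20000, 30000, 40000, 50000), each = 5))
district21$expend <- 1000 + 0.1 * district21$income +
  rnorm(nrow(district21), sd = 50)

# With income fixed at one value, the income coefficient is reported as NA
fit_slope <- glm(expend ~ 1 + income, family = gaussian(link = "identity"),
                 data = district21, subset = income == 20000)
coef(fit_slope)

# An intercept-only model gives the mean expenditure at each income level
fit_20k <- glm(expend ~ 1, family = gaussian(link = "identity"),
               data = district21, subset = income == 20000)
fit_40k <- glm(expend ~ 1, family = gaussian(link = "identity"),
               data = district21, subset = income == 40000)
coef(fit_20k)
coef(fit_40k)
```

Subsetting on a range such as income > median(income) does not have this limitation, because income still varies within the selected rows.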