首页 > 解决方案 > 使用 VGLM 将零截断负二项式拟合到某些数据集失败

问题描述

我一直在尝试将零截断负二项式拟合到一些计数数据。在某些情况下,它可以很好地工作,例如:

library(ggplot2)
library(VGAM)
df <- data.frame(n = 1:7, freq = c(0.32, 0.286, 0.19, 0.10, 0.05, 0.02, 0.01))
fit = vglm(df$n ~ 1, posnegbinomial, weights=df$freq)
ggplot(df, aes(x = n, y = freq)) +
  geom_point(colour= "red") +
  geom_point(aes(x = n, 
                 y = dnbinom(n, size=Coef(fit)[2], 
                             mu = Coef(fit)[1])/(1-dnbinom(0, size=Coef(fit)[2], 
                                                           mu = Coef(fit)[1]))),
                 colour = "blue")

1

但是,在其他情况下,它会失败:

df2 <- data.frame(n = 1:4, freq = c(0.87, 0.11, 0.01, 0.0005))
fit2 = vglm(df2$n ~ 1, posnegbinomial, weights=df2$freq)
ggplot(df2, aes(x = n, y = freq)) +
  geom_point(colour= "red") +
  geom_point(aes(x = n, 
                 y = dnbinom(n, size=Coef(fit2)[2], 
                             mu = Coef(fit2)[1])/(1-dnbinom(0, size=Coef(fit2)[2], 
                                                           mu = Coef(fit2)[1]))),
                 colour = "blue")

在此处输入图像描述

在这种情况下,泊松拟合效果很好:

fit_p = vglm(df2$n ~ 1, pospoisson, weights=df2$freq)

ggplot(df2, aes(x = n, y = freq)) +
  geom_point(colour= "red") +
  geom_point(aes(x = n, 
                 y = dpois(n, Coef(fit_p)[1])/(1-dpois(0, Coef(fit_p)[1]))),
             colour = "blue")

在此处输入图像描述

有什么办法可以解决这个问题吗?

谢谢!

标签: rvgam

解决方案


你确定你没有在你的第二个模型中混入一些东西吗?我越来越适合了。为了避免重复代码中的错误,您可能需要使用函数 for y,如下所示:

library(VGAM)
fit1 <- vglm(df1$n ~ 1, posnegbinomial, weights=df1$freq)
fit2 <- vglm(df2$n ~ 1, posnegbinomial, weights=df2$freq)

## helper fun to get y
gy <- function(x) {
  cf <- Coef(x)
  dnbinom(seq(nrow(x@residuals)), size=cf[2], mu=cf[1]) / 
    (1-dnbinom(0, size=cf[2], mu=cf[1]))
}

## plot
op <- par(mfrow=c(1, 2))

plot(df1$n, df1$freq, main="fit1")
points(gy(fit1), col=2)

plot(df2$n, df2$freq, main="fit2")
points(gy(fit2), col=2)

par(op)

在此处输入图像描述


数据:

df1 <- data.frame(n=1:7, freq=c(0.32, 0.286, 0.19, 0.10, 0.05, 0.02, 0.01))
df2 <- data.frame(n=1:4, freq=c(0.87, 0.11, 0.01, 0.0005))

推荐阅读