r - 如何在 survreg 区间回归中模拟乘法异方差?
问题描述
我想比较 R 中的两个区间回归模型,使用survival::survival
. 第二个模型由相同的基本模型组成,但也应该对条件方差进行建模。我已经知道如何在 Stata 中计算这个,但我想知道如何在 R 中做到这一点。
在 R 中,为了获得第一个模型,我在下限和上限之外创建了一个生存对象,y.min
并将y.max
其用作因变量。
library(survival)
intdat$y <- with(intdat, Surv(y.min, y.max, type="interval2"))
summary(sfit <- survreg(y ~ x1 + x2, data=intdat, dist="gaussian"))
# Call:
# survreg(formula = y ~ x1 + x2, data = intdat, dist = "gaussian")
# Value Std. Error z p
# (Intercept) 230.4794 11.2796 20.43 <2e-16
# x11 17.0302 9.4905 1.79 0.073
# x2 -4.9073 2.8457 -1.72 0.085
# Log(scale) 3.5762 0.0728 49.14 <2e-16
#
# Scale= 35.7
#
# Gaussian distribution
# Loglik(model)= -470.5 Loglik(intercept only)= -473.1
# Chisq= 5.2 on 2 degrees of freedom, p= 0.074
# Number of Newton-Raphson Iterations: 3
# n= 100
在 Stata 中,我得到完全相同的结果intreg
:
. use intdat, clear
. intreg y_min y_max i.x1 x2
现在我想将条件方差指定为乘性异方差并添加自变量来对方差建模。在 Stata 中,我可以使用该het()
选项。
. intreg y_min y_max i.x1 x2, het(i.cl y_min)
[...]
Interval regression Number of obs = 100
Wald chi2(2) = 1.30
Log likelihood = -436.8589 Prob > chi2 = 0.5222
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
model |
x1 |
1 | 5.186822 5.18936 1.00 0.318 -4.984136 15.35778
x2 | -1.259959 1.803787 -0.70 0.485 -4.795317 2.275399
_cons | 237.1218 7.644733 31.02 0.000 222.1384 252.1052
-------------+----------------------------------------------------------------
lnsigma |
cl |
B | -.2222544 .5215087 -0.43 0.670 -1.244393 .7998839
C | -1.601658 .447561 -3.58 0.000 -2.478861 -.7244541
D | -1.063609 .4233763 -2.51 0.012 -1.893411 -.2338066
E | -.8029018 .4324614 -1.86 0.063 -1.65051 .0447069
F | -1.698248 .4749063 -3.58 0.000 -2.629047 -.767449
G | -.9595613 .4689475 -2.05 0.041 -1.878681 -.0404412
|
y_min | -.002804 .0023935 -1.17 0.241 -.0074952 .0018873
_cons | 5.096241 .4189039 12.17 0.000 4.275205 5.917278
------------------------------------------------------------------------------
0 left-censored observations
94 uncensored observations
0 right-censored observations
6 interval observations
我现在想在 R 中复制这个结果。但是我还没有找到一种方法来添加自变量来模拟方差。
一个r-help 帖子建议包含+ strata()
类似的问题,我尝试了:
survreg(y ~ x1 + x2 + strata(cl) + strata(y.min), data=intdat, dist="gaussian")
但是,结果与预期的结果完全不同。进一步的搜索结果也将我引向了 GARCH 和其他不同的方法,但我找不到解决这个看似简单的问题的方法。有谁知道如何为 R 解决这个问题?
数据:
intdat <- structure(list(x1 = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0",
"1"), class = "factor"), x2 = c(8.45510499910282, 5.91889385427315,
4.17438726989564, 2.89037175789616, 3.95124371858143, 5.32300997913841,
3.58351893845611, 4.02535169073515, 4.48863636973214, 5.20948615284142,
2.30258509299405, 5.74300318780948, 1.94591014905531, 4.969813299576,
4.24849524204936, 2.83321334405622, 3.93182563272433, 3.36729582998647,
2.99573227355399, 2.99573227355399, 3.66356164612965, 3.76120011569356,
3.80666248977032, 3.87120101090789, 5.08759633523238, 0.693147180559945,
5.08140436498446, 2.07944154167984, 5.2040066870768, 4.39444915467244,
4.51085950651685, 1.79175946922805, 3.73766961828337, 4.56434819146784,
0.693147180559945, 4.20469261939097, 2.30258509299405, 4.31748811353631,
3.09104245335832, 2.07944154167984, 4.35670882668959, 5.12989871492307,
3.58351893845611, 4.20469261939097, 4.81218435537242, 4.15888308335967,
3.17805383034795, 3.58351893845611, 4.57471097850338, 3.2188758248682,
1.38629436111989, 4.44265125649032, 5.91889385427315, 3.98898404656427,
6.34388043412633, 2.99573227355399, 3.61091791264422, 0.693147180559945,
5.92425579741453, 4.23410650459726, 3.09104245335832, 3.09104245335832,
1.09861228866811, 3.78418963391826, 6.83840520084734, 3.85014760171006,
2.07944154167984, 5.2040066870768, 2.89037175789616, 5.30330490805908,
8.45489216521886, 2.39789527279837, 3.61091791264422, 2.63905732961526,
2.83321334405622, 4.15888308335967, 1.94591014905531, 1.79175946922805,
3.2188758248682, 5.11198778835654, 4.53259949315326, 4.56434819146784,
4.11087386417331, 5.50125821054473, 4.06044301054642, 4.33073334028633,
3.71357206670431, 2.39789527279837, 2.39789527279837, 1.09861228866811,
5.61677109766657, 4.76217393479776, 4.59511985013459, 4.48863636973214,
3.29583686600433, 2.77258872223978, 5.97126183979046, 3.17805383034795,
4.04305126783455, 3.3322045101752), cl = structure(c(6L, 2L,
4L, 3L, 6L, 3L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 3L, 3L, 3L, 5L, 5L,
1L, 4L, 3L, 3L, 3L, 3L, 7L, 3L, 1L, 3L, 3L, 3L, 5L, 7L, 5L, 5L,
5L, 7L, 6L, 1L, 3L, 5L, 6L, 3L, 6L, 5L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 2L, 2L, 6L, 3L, 1L, 3L, 1L, 3L, 3L, 4L, 3L, 4L, 7L, 3L,
4L, 3L, 6L, 4L, 6L, 6L, 7L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 5L, 5L,
6L, 3L, 4L, 3L, 3L, 6L, 7L, 3L, 4L, 3L, 3L, 3L, 7L, 3L, 3L, 5L,
4L, 3L), .Label = c("A", "B", "C", "D", "E", "F", "G"), class = "factor"),
y.min = c(0, 131, 234.33, 230.9, 252, 245.09, 224.11, 235.52,
236, 234.04, 237, 247, 234.5, 251.02, 223, 230.8, 200, 181.01,
120, 204, 232.5, 222, 225, 200.04, 206.99, 266, 101, 220.87,
225.5, 206, 200.81, 191, 200.99, 200, 160.98, 200.5, 230,
120, 0, 193, 246, 235, 248, 221, 231, 236.55, 223, 240.5,
226, 246, 231.99, 251.01, 146, 151.5, 231.02, 250, 124, 0,
96, 255, 251, 221, 241, 236, 201.51, 235.5, 239, 230, 263,
191, 0, 0, 222, 232.77, 219.55, 207.77, 215, 251, 191.88,
237, 207.67, 180, 218, 233, 201, 272, 261, 0, 171.53, 200.99,
153.99, 243.23, 233.99, 231, 201, 281, 221.5, 178, 201, 231
), y.max = c(279, 131, 234.33, 230.9, 252, 245.09, 224.11,
235.52, 236, 234.04, 237, 247, 234.5, 251.02, 223, 230.8,
200, 181.01, 120, 204, 232.5, 222, 225, 200.04, 206.99, 266,
101, 220.87, 225.5, 206, 200.81, 191, 200.99, 200, 160.98,
200.5, 230, 120, 349, 193, 246, 235, 248, 221, 231, 236.55,
223, 240.5, 226, 246, 231.99, 251.01, 146, 151.5, 231.02,
250, 124, 210, 96, 255, 251, 221, 241, 236, 201.51, 235.5,
239, 230, 263, 191, 279, 310, 222, 232.77, 219.55, 207.77,
215, 251, 191.88, 237, 207.67, 180, 218, 233, 201, 272, 261,
310, 171.53, 200.99, 153.99, 243.23, 233.99, 231, 201, 281,
221.5, 178, 201, 231)), row.names = c(4673L, 1092L, 2140L,
3803L, 3096L, 257L, 3936L, 467L, 5061L, 2913L, 1104L, 4386L,
4579L, 1142L, 2907L, 5082L, 5396L, 3155L, 176L, 1810L, 3447L,
2256L, 3980L, 2496L, 72L, 505L, 3397L, 531L, 3845L, 3858L, 1138L,
3191L, 293L, 2338L, 2903L, 5468L, 2534L, 5256L, 4640L, 1438L,
4166L, 4300L, 2935L, 5368L, 4488L, 4342L, 2150L, 4513L, 2904L,
459L, 3307L, 5357L, 1090L, 3625L, 554L, 871L, 4545L, 2937L, 3513L,
4517L, 4008L, 2912L, 484L, 735L, 4874L, 4177L, 965L, 1765L, 1487L,
1898L, 4674L, 941L, 4424L, 4968L, 2755L, 4722L, 4260L, 1013L,
762L, 3399L, 2835L, 3704L, 4936L, 2014L, 2126L, 4041L, 1556L,
945L, 1963L, 3366L, 2278L, 2112L, 3844L, 5476L, 782L, 4742L,
2440L, 2021L, 3168L, 4960L), class = "data.frame")
readstata13::save.dta13(intdat, file="intdat.dta", convert.underscore=TRUE)
解决方案
推荐阅读
- postgresql - 将雪花中的数据卸载到 Postgres 中?
- javascript - Vue 中的动态输入修饰符
- apache-flink - 使用 Apache Flink 流处理缓冲转换后的消息(例如,1000 条)
- reactjs - 我是 React Redux 的初学者。我能否获得一些建议来开发一些示例应用程序以提高自己在该主题上的水平?
- c# - 为什么我的秒表/时间跨度没有在整个方法中保留其信息?
- ios - 关于在 Swift 中设置闹钟日期的问题
- python - 如果线程在 n 秒后变为非活动状态,如何停止线程
- csv - 加载 CSV Neo4j “Neo.ClientError.Statement.SemanticError:无法使用 Test1 的空属性值合并节点”
- magento - Magento 1.9 以编程方式更改产品状态
- android - 设备未在 android studio for mac 中显示