首页 > 解决方案 > Potential issue in svylogrank test results

问题描述

I am running a survival model with sampling weights and was hoping to get the weighted logrank test results. When I use the svylogrank() function from the survey package, the results were puzzling. If I run the default svylogrank test, the returned chisq value seemed way too high, and when I run the function with the method set to "score", the results were much more reasonable and as expected. Based on my understanding of the parameter method, it should only matter for performance and should not affect the results, and I believe my model matrix is also relatively small.

Could someone help advise on this issue?

D = data.table(unique_id = 1:135,
               weights = rep(1,135),
               event_time = c(0.53512437, 1.35655869, 2.00414189, 2.37276648, 3.20343526, 0.96618494, 2.57894309, 0.94575080, 1.25347833, 1.44416450, 5.04038200, 7.80587169 ,
                        6.53631154, 6.31914568, 7.00146597, 9.67616088, 7.94212358, 9.70693890, 10.67575835, 10.06764688, 12.29175616, 13.60092871, 13.12508566, 14.66417522,
                        15.35250691, 0.93368707, 0.19087611, 3.15533767, 4.40821633, 17.54334957, 17.95177642, 15.50903946, 16.48376185, 20.87956697, 21.24571398, 22.34297263,
                        23.36042629, 21.01760215, 23.84785038, 26.06105822, 4.16866350, 1.96922485, 0.66199008, 6.76987830, 1.55617685, 0.19095871, 3.13291784, 5.43159409,
                        9.55805671, 4.31437322, 0.78259860, 5.26415156, 3.45095686, 1.69128712, 8.41942426, 3.33748695, 6.08516173, 2.72897404, 0.22789783, 0.86348009,
                        2.35707587, 2.97477615, 12.33273800, 0.58532123, 0.14586238, 10.67948547, 4.07655972, 3.94405136, 0.37226898, 1.42558725, 1.47680658, 4.22506540,
                        1.56703478, 8.37484756, 12.54015087, 1.80994787, 3.66453633, 1.02834532, 1.99065652, 1.23577436, 16.21981618, 14.35039798, 4.15321606, 2.79740679,
                        0.35538726, 7.46823358, 1.66329088, 7.46525382, 2.62734831, 3.19057957, 0.33317193, 0.09122886, 9.14616245, 2.48542578, 2.37569263, 5.48499630,
                        2.22749399, 2.64816296, 0.97101545, 1.42468625, 1.27668904, 0.03692447, 1.98783210, 5.47692729, 3.88316178, 0.32921277, 1.77225345, 9.33268901,
                        2.44517775, 1.49813702, 2.56059172, 3.43194832, 1.22955630, 3.56263947, 9.07060099, 3.58312362, 2.22755370, 4.24783776, 3.46364804, 1.61671354,
                        11.10973565, 7.18764270, 1.80400046, 6.39833474, 6.72825192, 6.46063344, 5.76855531, 5.27157807, 4.66154734, 3.50019718, 2.27156678, 3.28531594,
                        2.35699896, 2.94956000, 8.85381736),
               event_flag = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 
                              1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 
                              1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0),
               group = c(rep("group1", 40), rep("group2", 95)))

svykm_formula = "Surv(event_time, event_flag) ~ group" %>% as.formula
svy_design = svydesign( ids = D[ , unique_id ], weights = D[ , weights], data = D )

svylogrank(formula = svykm_formula, design = svy_design)
svylogrank(formula = svykm_formula, design = svy_design, method = "score")

Results:

> svylogrank(formula = svykm_formula, design = svy_design)
[[1]]
         score                               
[1,] -21.31458 3.99272 -5.338361 9.379041e-08

[[2]]
       chisq            p 
2.849810e+01 9.379041e-08 

attr(,"class")
[1] "svylogrank"
> svylogrank(formula = svykm_formula, design = svy_design, method = "score")
z.groupgroup2         Chisq             p 
   1.85490656    3.44067835    0.06360957 
attr(,"class")
[1] "svylogrank"

标签: rstatisticssurveysurvival-analysis

解决方案


是的,这是一个错误(将在下一个版本中修复)。两者method="large"method="score"都是正确的,但默认method="small"值不是。

问题是它survival::coxph.detail并没有完全按照风险矩阵的记录(或者我误解了它)。但是,如果观察结果的顺序正确,您就无法判断。我最初用一个数据集对此进行了测试,它意外地给出了正确的答案。

更新:这使用原始数据。和同意method="large"method="score"如果您对数据进行排序以使时间按升序排列,则默认值method="small"也同意(将来的版本不需要这样做)

> ii<-with(D, order(event_time, event_flag))
> 
> svy_design2 = svydesign( ids = ~unique_id , weights = ~weights, data = D[ii,] )
> 
> svylogrank(formula = svykm_formula, design = svy_design2)
[[1]]
        score                             
[1,] 6.994881 3.771016 1.854907 0.06360957

[[2]]
     chisq          p 
3.44067835 0.06360957 

attr(,"class")
[1] "svylogrank"
> svylogrank(formula = svykm_formula, design = svy_design, method = "large")
[[1]]
     score       se        z          p
1 6.994881 3.771016 1.854907 0.06360957

[[2]]
     chisq          p 
3.44067835 0.06360957 

attr(,"class")
[1] "svylogrank"
> 
> svylogrank(formula = svykm_formula, design = svy_design, method = "score")
z.groupgroup2         Chisq             p 
   1.85490656    3.44067835    0.06360957 
attr(,"class")
[1] "svylogrank"

推荐阅读