首页 > 解决方案 > 在 R 中,如何仅使用加法公式在分类变量之间添加手动交互?

问题描述

我想拟合 Cox 回归模型并通过 SurvMiner::ggforest() 使用森林图绘制模型系数。不幸的是,情节不支持交互。他们被跳过。作者提供了一个临时解决方案,但仅适用于两个数值变量。他们通过将数字相乘来生成交互项,然后以加法的方式将该项添加到模型中。

因此,他们不是输入 Y ~ X1 * X2,而是输入 Y ~ X1 + X2 + X1_X2。现在,这个附加项被估计并且可以被绘制出来。

好的,现在我有 3 个分类变量。如何以这种方式手动添加交互项?

数据:

data <- structure(list(time = c(3, 19, 13, 45, 87, 90, 24, 75, 17, 4, 
109, 16, 14, 45, 27, 48, 9, 40, 28, 16, 8, 32, 21, 15, 12, 15, 
8, 6, 35, 43, 63, 97, 22, 34, 16, 23, 70, 9, 28, 60, 105, 24, 
32, 41, 89, 38, 24, 82, 17, 79, 62, 17, 14, 68, 66, 5, 53, 30, 
14, 66, 59, 34, 53, 17, 15, 39, 20, 14, 10, 43, 11, 12, 47, 105, 
34, 111, 84, 39, 25, 17, 44, 19, 17, 113, 31, 26, 9, 81, 9, 77, 
38, 19, 84, 22, 86, 18, 56, 113, 13, 45), status = c(1L, 1L, 
1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 
1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 
1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 
0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 
1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), Gender = structure(c(2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 
1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 
1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 
2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 
2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L), .Label = c("Female", 
"Male"), class = "factor"), Branch = structure(c(1L, 2L, 2L, 
1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 
2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 
1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 
1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 
2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 
1L), .Label = c("X", "Y"), class = "factor"), Type = structure(c(2L, 
2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 
1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 
1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 
2L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 
1L, 1L, 2L), .Label = c("A", "B"), class = "factor")), row.names = c(NA, 
100L), class = "data.frame")

结果:

> summary(coxph(Surv(time, status) ~ Type * Branch*Gender, data = data))
Call:
coxph(formula = Surv(time, status) ~ Type * Branch * Gender, 
    data = data)

  n= 100, number of events= 74 

                             coef exp(coef) se(coef)      z Pr(>|z|)  
TypeB                     1.96981   7.16933  0.77083  2.555   0.0106 *
BranchY                   0.64430   1.90466  0.60113  1.072   0.2838  
GenderMale                0.66762   1.94960  0.59199  1.128   0.2594  
TypeB:BranchY             0.25488   1.29030  0.98364  0.259   0.7955  
TypeB:GenderMale         -0.87810   0.41557  0.88206 -0.996   0.3195  
BranchY:GenderMale        0.07818   1.08132  0.73736  0.106   0.9156  
TypeB:BranchY:GenderMale -0.66212   0.51576  1.13590 -0.583   0.5600 

现在我想用手动添加的术语替换交互。它不能是(A+B+C)^3or ...A:B + A:C + B:C,因为情节不承认这一点。

我需要使用交互创建其他列。

怎么做?我以一种棘手的方式进行了管理,但它仅适用于最简单的部分。

我将其简化为 2 个因素:Type x Branch

这就是我期望得到的:

> summary(coxph(Surv(time, status) ~ Type * Branch, data = data))
Call:
coxph(formula = Surv(time, status) ~ Type * Branch, data = data)

  n= 100, number of events= 74 

                 coef exp(coef) se(coef)      z Pr(>|z|)    
TypeB          1.3439    3.8341   0.3834  3.506 0.000456 ***
BranchY        0.5761    1.7791   0.3430  1.680 0.093022 .  
TypeB:BranchY -0.2145    0.8070   0.4838 -0.443 0.657529    

我可以复制它:

> m <- as.data.frame(model.matrix(Surv(time, status) ~ Type * Branch, data=data))
> summary(coxph(Surv(time, status) ~ Type + Branch + m$`TypeB:BranchY`, data = data))
Call:
coxph(formula = Surv(time, status) ~ Type + Branch + m$`TypeB:BranchY`, 
    data = data)

  n= 100, number of events= 74 

                     coef exp(coef) se(coef)      z Pr(>|z|)    
TypeB              1.3439    3.8341   0.3834  3.506 0.000456 ***
BranchY            0.5761    1.7791   0.3430  1.680 0.093022 .  
m$`TypeB:BranchY` -0.2145    0.8070   0.4838 -0.443 0.657529    

现在森林地块找到了它,但没有正确处理它。

在此处输入图像描述

当我添加 3 级交互时,即使模型拟合也是错误的:

我期望看到的:

> summary(coxph(Surv(time, status) ~ Type * Branch * Gender, data = data))
Call:
coxph(formula = Surv(time, status) ~ Type * Branch * Gender, 
    data = data)

  n= 100, number of events= 74 

                             coef exp(coef) se(coef)      z Pr(>|z|)  
TypeB                     1.96981   7.16933  0.77083  2.555   0.0106 *
BranchY                   0.64430   1.90466  0.60113  1.072   0.2838  
GenderMale                0.66762   1.94960  0.59199  1.128   0.2594  
TypeB:BranchY             0.25488   1.29030  0.98364  0.259   0.7955  
TypeB:GenderMale         -0.87810   0.41557  0.88206 -0.996   0.3195  
BranchY:GenderMale        0.07818   1.08132  0.73736  0.106   0.9156  
TypeB:BranchY:GenderMale -0.66212   0.51576  1.13590 -0.583   0.5600  
---

我所看到的:

> m <- as.data.frame(model.matrix(Surv(time, status) ~ Type * Branch * Gender, data=data))
> summary(coxph(Surv(time, status) ~ Type + Branch + m$`TypeB:BranchY` + 
+                 m$`TypeB:GenderMale` + 
+                 m$`BranchY:GenderMale` + 
+                 m$`TypeB:BranchY:GenderMale`, data = data))
Call:
coxph(formula = Surv(time, status) ~ Type + Branch + m$`TypeB:BranchY` + 
    m$`TypeB:GenderMale` + m$`BranchY:GenderMale` + m$`TypeB:BranchY:GenderMale`, 
    data = data)

  n= 100, number of events= 74 

                                coef exp(coef) se(coef)      z Pr(>|z|)  
TypeB                         1.5403    4.6659   0.6437  2.393   0.0167 *
BranchY                       0.2168    1.2420   0.4274  0.507   0.6120  
m$`TypeB:BranchY`             0.6797    1.9732   0.8889  0.765   0.4445  
m$`TypeB:GenderMale`         -0.2101    0.8105   0.6536 -0.321   0.7479  
m$`BranchY:GenderMale`        0.7439    2.1042   0.4416  1.685   0.0921 .
m$`TypeB:BranchY:GenderMale` -1.3253    0.2657   0.9715 -1.364   0.1725  

有没有什么聪明的方法可以手动生成交互项以将它们添加到森林图中?我想我很接近,但不知道有多接近......

标签: rsurvival-analysisinteractioncox-regressionforestplot

解决方案


也许使用该gtsummary软件包会有所帮助

library(dplyr)
library(gtsummary)
library(survival)

#plotting betas

data %>%
  coxph(Surv(time, status) ~ Type * Branch*Gender, data = .) %>% 
  tbl_regression() %>%
  plot()

贝塔图

#plotting hazard ratios
    data %>%
      coxph(Surv(time, status) ~ Type * Branch*Gender, data = .) %>% 
      tbl_regression(exponentiate = TRUE) %>%
      plot()

小时图


推荐阅读