Unable to set the reference level for a linear regression

Problem description

My problem is that R automatically picks a reference level for the factors in my linear regression, and I cannot change it.

To illustrate, I am using the solution Gavin Simpson posted in another question (How to force R to use a specified factor level as reference in a regression?).

When I run the following code:

set.seed(123)
x <- rnorm(100)
DF <- data.frame(x = x,
                 y = 4 + (1.5*x) + rnorm(100, sd = 2),
                 b = gl(5, 20))
head(DF)
str(DF)

m1 <- lm(y ~ x + b, data = DF)
summary(m1)

R uses the fifth level of 'b' as the baseline:

Call:
lm(formula = y ~ x + b, data = DF)

Residuals:
   Min     1Q Median     3Q    Max 
-3.974 -1.301 -0.164  1.053  6.091 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.7907     0.1962  19.323  < 2e-16 ***
x             1.4359     0.2189   6.561 2.89e-09 ***
b1           -0.5004     0.3905  -1.281    0.203    
b2            0.1293     0.3916   0.330    0.742    
b3           -0.1305     0.3904  -0.334    0.739    
b4            0.5354     0.3931   1.362    0.176    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.952 on 94 degrees of freedom
Multiple R-squared:  0.3243,    Adjusted R-squared:  0.2883 
F-statistic: 9.022 on 5 and 94 DF,  p-value: 4.954e-07

When I try to relevel the factor with the following code, I get exactly the same result:

DF <- within(DF, b <- relevel(b, ref = 3))
m2 <- lm(y ~ x + b, data = DF)
summary(m2)

So the coefficients of both m1 and m2 are:

coef(m1)
(Intercept)           x          b1          b2          b3          b4 
  3.7907058   1.4358520  -0.5003818   0.1293078  -0.1305475   0.5353815

I don't know how to change this, or why my R behaves this way. I am using RStudio "Apricot Nasturtium" (aee44535, 2020-09-17) for macOS on macOS Catalina 10.15.7.

Session info:

R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] car_3.0-10        carData_3.0-4     rcompanion_2.3.25 tidyr_1.1.2       ez_4.4-0          ggplot2_3.3.2     readr_1.4.0       dplyr_1.0.2      

loaded via a namespace (and not attached):
 [1] splines_4.0.3      assertthat_0.2.1   expm_0.999-5       statmod_1.4.35     gld_2.6.2          lmom_2.8           stats4_4.0.3       coin_1.3-1         cellranger_1.1.0   pillar_1.4.6      
[11] lattice_0.20-41    glue_1.4.2         minqa_1.2.4        colorspace_1.4-1   sandwich_3.0-0     Matrix_1.2-18      plyr_1.8.6         pkgconfig_2.0.3    haven_2.3.1        EMT_1.1           
[21] purrr_0.3.4        mvtnorm_1.1-1      scales_1.1.1       openxlsx_4.2.2     rootSolve_1.8.2.1  rio_0.5.16         lme4_1.1-23        tibble_3.0.3       mgcv_1.8-33        generics_0.0.2    
[31] ellipsis_0.3.1     TH.data_1.0-10     pacman_0.5.1       withr_2.3.0        cli_2.0.2          survival_3.2-7     magrittr_1.5       crayon_1.3.4       readxl_1.3.1       fansi_0.4.1       
[41] nlme_3.1-149       MASS_7.3-53        forcats_0.5.0      foreign_0.8-80     class_7.3-17       tools_4.0.3        data.table_1.13.2  hms_0.5.3          lifecycle_0.2.0    matrixStats_0.57.0
[51] multcomp_1.4-14    stringr_1.4.0      Exact_2.1          munsell_0.5.0      zip_2.1.1          compiler_4.0.3     e1071_1.7-4        multcompView_0.1-8 rlang_0.4.8        grid_4.0.3        
[61] nloptr_1.2.2.2     rstudioapi_0.11    boot_1.3-25        DescTools_0.99.38  gtable_0.3.0       codetools_0.2-16   abind_1.4-5        curl_4.3           reshape2_1.4.4     R6_2.4.1          
[71] zoo_1.8-8          lubridate_1.7.9    utf8_1.1.4         nortest_1.0-4      libcoin_1.0-6      modeltools_0.2-23  stringi_1.5.3      parallel_4.0.3     Rcpp_1.0.5         vctrs_0.3.4       
[81] tidyselect_1.1.0   lmtest_0.9-38     

Tags: r, regression, lm

Solution
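
No answer text survives on this page, so what follows is only a hedged diagnostic sketch, not the original answer. It rests on one assumption: that the session's `contrasts` option was switched from R's default treatment coding to sum-to-zero coding (`contr.sum`). Two details in the output above are consistent with that: the coefficients are named b1 to b4 rather than b2 to b5, and the intercept's standard error is close to the residual standard error divided by sqrt(100), as expected for a grand mean. The sketch reproduces that state on purpose, then restores the default and relevels. The names `m_sum` and `m_trt` are illustrative only.

```r
# Inspect the contrast coding currently in effect.
# R's default is c(unordered = "contr.treatment", ordered = "contr.poly").
getOption("contrasts")

set.seed(123)
x <- rnorm(100)
DF <- data.frame(x = x,
                 y = 4 + (1.5 * x) + rnorm(100, sd = 2),
                 b = gl(5, 20))

# ASSUMPTION: emulate a session in which sum-to-zero contrasts are active.
options(contrasts = c("contr.sum", "contr.poly"))
m_sum <- lm(y ~ x + b, data = DF)
names(coef(m_sum))
# "(Intercept)" "x" "b1" "b2" "b3" "b4"   (the last level has no coefficient)

# Restore R's default treatment coding, then choose level "3" as the reference.
options(contrasts = c("contr.treatment", "contr.poly"))
DF$b <- relevel(DF$b, ref = "3")
m_trt <- lm(y ~ x + b, data = DF)
names(coef(m_trt))
# "(Intercept)" "x" "b1" "b2" "b4" "b5"   (level 3 is now the baseline)
```

If changing the global option is undesirable, `lm()` also accepts a per-model `contrasts` argument, for example `lm(y ~ x + b, data = DF, contrasts = list(b = "contr.treatment"))`.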
