Calculating marginal effects in Python with statsmodels' get_margeff command in logit models with interaction terms

Problem description

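I'm having a problem with statsmodels' get_margeff command on a logit model with interaction terms. In a main-effects model the effects are calculated correctly and match the Stata and R results, but this is not the case once an interaction term is involved. There the effects are wrong, and a marginal effect is reported for the interaction term itself, which does not make sense. The following code illustrates this: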

import pandas as pd
import statsmodels.formula.api as sm
import statsmodels.api as sm2

df=sm2.datasets.heart.load_pandas().data

regression = sm.logit(formula='censors~survival+age', data=df).fit()   
#only for illustration purposes; does not make real sense 

print(regression.get_margeff().summary()) 
# the calculation of marginal effects here is correct and corresponds to the Stata and R results

==============================================================================
                dy/dx    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
survival      -0.0004   7.95e-05     -4.672      0.000      -0.001      -0.000
age            0.0148      0.005      3.262      0.001       0.006       0.024
==============================================================================

regression = sm.logit(formula='censors~survival+age+survival*age', data=df).fit()
print(regression.get_margeff().summary()) 
## effects for survival and age are not correct and a marginal effect for survival:age is reported which does not make sense
================================================================================
                  dy/dx    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------
survival        -0.0009      0.001     -1.040      0.298      -0.003       0.001
age              0.0120      0.006      1.857      0.063      -0.001       0.025
survival:age   1.08e-05    1.8e-05      0.599      0.549   -2.45e-05    4.61e-05
================================================================================

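Does anyone know how to solve this, so that the marginal effects of survival and age in the second model [for illustration purposes only] correspond to the Stata and R results?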

Edit, April 11:

In response to user "StupidWolf", here are the corresponding Stata results:

use "heart.dta"
qui logit censors survival age
margins, dydx(*)


Average marginal effects                        Number of obs     =         69
Model VCE    : OIM

Expression   : Pr(censors), predict()
dy/dx w.r.t. : survival age

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    survival |  -.0003716   .0000795    -4.67   0.000    -.0005275   -.0002157
         age |    .014813   .0045409     3.26   0.001     .0059131    .0237129
------------------------------------------------------------------------------


qui logit censors survival age c.survival#c.age
margins, dydx(*)


Average marginal effects                        Number of obs     =         69
Model VCE    : OIM

Expression   : Pr(censors), predict()
dy/dx w.r.t. : survival age

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    survival |  -.0003816   .0000814    -4.68   0.000    -.0005412   -.0002219
         age |   .0162289   .0051163     3.17   0.002     .0062012    .0262567
------------------------------------------------------------------------------

For extensive discussion of why marginal effects should not be calculated for interaction terms, see for example: https://www3.nd.edu/~rwilliam/stats/Margins01.pdf and https://www.stata.com/statalist/archive/2013-01/msg00293.html
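
To make the target calculation concrete: with linear predictor z = b0 + b1*survival + b2*age + b12*survival*age and p = 1/(1+exp(-z)), the chain rule gives dP/dsurvival = p(1-p)(b1 + b12*age) and dP/dage = p(1-p)(b2 + b12*survival). The interaction coefficient therefore enters the effect of each variable and has no marginal effect of its own, which, as far as I understand it, is what Stata's margins takes into account when the interaction is written as c.survival#c.age. Below is a minimal sketch of that average marginal effect computed by hand with NumPy. It is not a statsmodels feature; the function name ame_with_interaction and its argument names are made up for illustration, and it returns point estimates only (no delta-method standard errors).

```python
import numpy as np


def logistic_density(z):
    """Derivative of the logistic CDF with respect to z, i.e. p * (1 - p)."""
    p = 1.0 / (1.0 + np.exp(-z))
    return p * (1.0 - p)


def ame_with_interaction(b0, b1, b2, b12, x1, x2):
    """Average marginal effects of x1 and x2 in a logit model whose
    linear predictor is b0 + b1*x1 + b2*x2 + b12*x1*x2.

    Hypothetical helper for illustration; returns (ame_x1, ame_x2).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)

    # Linear predictor, including the interaction term, per observation.
    z = b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
    density = logistic_density(z)

    # Chain rule: the interaction coefficient is part of each derivative.
    effect_x1 = density * (b1 + b12 * x2)
    effect_x2 = density * (b2 + b12 * x1)

    # Average over the sample to get average marginal effects.
    return float(np.mean(effect_x1)), float(np.mean(effect_x2))
```

For the second model in the question, the coefficients should be available as regression.params['Intercept'], regression.params['survival'], regression.params['age'] and regression.params['survival:age'], with df['survival'] and df['age'] as x1 and x2. The two averages returned can then be compared with the dy/dx column of the Stata output above.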

Tags: python, statsmodels, interaction
