ValueError: NaN, inf or invalid value detected in weights, estimation infeasible

Problem description

I am trying to train a model to predict football scores, using a Poisson regression fitted with the statsmodels glm function. The code looks like this:

import statsmodels.api as sm  # needed for sm.families.Poisson()
import statsmodels.formula.api as smf

declared_inj_time_model = smf.glm(formula="team1_goals ~ competition_name + referee + team1_name + team2_name", data=df_train, family=sm.families.Poisson()).fit()
declared_inj_time_model.summary()

Running the code produces the following error:

/usr/local/lib/python3.7/dist-packages/statsmodels/genmod/families/links.py:521: RuntimeWarning: overflow encountered in exp
  return np.exp(z)
/usr/local/lib/python3.7/dist-packages/statsmodels/genmod/families/family.py:428: RuntimeWarning: invalid value encountered in true_divide
  endog_mu = self._clean(endog / mu)
/usr/local/lib/python3.7/dist-packages/statsmodels/genmod/families/family.py:134: RuntimeWarning: invalid value encountered in multiply
  return 1. / (self.link.deriv(mu)**2 * self.variance(mu))
/usr/local/lib/python3.7/dist-packages/statsmodels/genmod/families/family.py:134: RuntimeWarning: divide by zero encountered in true_divide
  return 1. / (self.link.deriv(mu)**2 * self.variance(mu))
/usr/local/lib/python3.7/dist-packages/statsmodels/genmod/generalized_linear_model.py:1163: RuntimeWarning: invalid value encountered in multiply
  - self._offset_exposure)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-4a8de44e0149> in <module>()
----> 1 declared_inj_time_model = smf.glm(formula="team1_goals ~ competition_name + referee + team1_name + team2_name", data=df_train, family=sm.families.Poisson()).fit()
      2 declared_inj_time_model.summary()
      3 

2 frames
/usr/local/lib/python3.7/dist-packages/statsmodels/regression/_tools.py in __init__(self, endog, exog, weights, check_endog, check_weights)
     46         if check_weights:
     47             if not np.all(np.isfinite(w_half)):
---> 48                 raise ValueError(self.msg.format('weights'))
     49 
     50         if check_endog:

ValueError: NaN, inf or invalid value detected in weights, estimation infeasible.

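I have already removed all NaN values from the dataset. What else can I do?

This code used to run fine and predict goals, but it has suddenly stopped working. Also, this code predicts first-half goals; I wrote exactly the same code for the second half, and that version still runs without problems.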

Tags: python, statsmodels, poisson

Solution
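
The "overflow encountered in exp" warning in the traceback suggests the linear predictor is diverging during fitting, not that a NaN is still present in the data. One possible cause, which is only an assumption here since the data is not shown, is a categorical level (a team, referee, or competition) whose goals in df_train are all zero. With a log link, the estimated mean for such a level has to go to zero, so its coefficient has no finite maximum likelihood estimate. Levels that appear only once or twice can cause similar instability. The sketch below is a minimal diagnostic under that assumption; the helper name find_suspect_levels and the min_count threshold are hypothetical and not part of statsmodels.

```python
import math
import numbers
from collections import defaultdict


def find_suspect_levels(rows, response, factors, min_count=2):
    """Flag rows and factor levels that can make a Poisson GLM fit diverge.

    rows is an iterable of dicts, one per observation.
    Returns (bad_rows, suspect):
      bad_rows - indices whose response is missing, non-finite or negative
      suspect  - maps (factor, level) to (count, total) for levels seen
                 fewer than min_count times or whose response sums to zero
    """
    bad_rows = []
    stats = defaultdict(lambda: [0, 0.0])  # (factor, level) -> [count, total]
    for i, row in enumerate(rows):
        y = row.get(response)
        # A Poisson response must be a finite, non-negative number.
        if not isinstance(y, numbers.Real) or not math.isfinite(y) or y < 0:
            bad_rows.append(i)
            continue
        for factor in factors:
            entry = stats[(factor, row.get(factor))]
            entry[0] += 1
            entry[1] += y
    suspect = {
        key: (count, total)
        for key, (count, total) in stats.items()
        if count < min_count or total == 0
    }
    return bad_rows, suspect
```

With the data frame from the question, this could be called as find_suspect_levels(df_train.to_dict("records"), "team1_goals", ["competition_name", "referee", "team1_name", "team2_name"]). If it reports levels for the first-half data but not for the second-half data, that would be consistent with one model failing while the other still fits. Merging or dropping the flagged levels is one thing to try, though this check does not prove they are the cause.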

