首页 > 解决方案 > 子类化 rv_continuous 时的溢出错误

问题描述

import scipy.stats as st
import numpy as np  # generic math functions


# https://scicomp.stackexchange.com/q/1658
class LorentzGen(st.rv_continuous):
    """Lorentz distribution"""

    def _pdf(self, x):
        gamma = 0.27
        return 2 * gamma / (np.pi * (gamma ** 2 + x ** 2))


transverse_fields = LorentzGen(a=0)
gaussian_gen = st.norm()

L = 2

list_of_temps = np.linspace(1, 10, 40)

for T, temp in enumerate(list_of_temps):
    print(f"Run {T}")
    for t in range(5000):
        if t%500==0:
            print(f"Trial {t}")
        h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # OverflowError: (34, 'Result too large')
        # h_y = [[-gaussian_gen.rvs(), xx] for xx in range(L)]  # Works

在上面的代码中,我已经实现了我自己的概率分布(基本上是半洛伦兹,x∈[0,∞]),从我称之为scicomp.SEtransverse_fields的答案修改而来。

我需要从中生成一大堆值transverse_fields,在嵌套的 For 循环中使用它们。问题是,超过一定数量的运行,这里“运行 1 次试用 ~3500”,我得到一堆错误:

C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py:385: IntegrationWarning: The integral is probably divergent, or slowly convergent.
  warnings.warn(msg, IntegrationWarning)
C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py:2831: RuntimeWarning: overflow encountered in ? (vectorized)
  outputs = ufunc(*inputs)
Traceback (most recent call last):
  File "C:/<redacted>/stackoverflow.py", line 26, in <module>
    h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # Result Overflow
  File "C:/<redacted>/stackoverflow.py", line 26, in <listcomp>
    h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # Result Overflow
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 954, in rvs
    vals = self._rvs(*args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 889, in _rvs
    Y = self._ppf(U, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 902, in _ppf
    return self._ppfvec(q, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2755, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2831, in _vectorize_call
    outputs = ufunc(*inputs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1587, in _ppf_single
    while self._ppf_to_solve(right, q, *args) < 0.:
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1569, in _ppf_to_solve
    return self.cdf(*(x, )+args)-q
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1745, in cdf
    place(output, cond, self._cdf(*goodargs))
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1621, in _cdf
    return self._cdfvec(x, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2755, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2831, in _vectorize_call
    outputs = ufunc(*inputs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1618, in _cdf_single
    return integrate.quad(self._pdf, self.a, x, args=args)[0]
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 341, in quad
    points)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 448, in _quad
    return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
  File "C:/<redacted>/stackoverflow.py", line 11, in _pdf
    return 2 * gamma / (np.pi * (gamma ** 2 + x ** 2))
OverflowError: (34, 'Result too large')

Process finished with exit code 1

请注意,如果我将试验次数提高t到较小的数字(例如 50),则不会发生错误,如果list_of_temps值较少(例如np.linspace(1,10,4). 即使在原始问题中,使用np.linspace(1,10,40),在运行 1 期间弹出错误。

使用原始设置,当我使用标准高斯分布函数时也没有溢出错误scipy.stats

我看到的关于 SO的类似问题将此归因于 for 循环运行的范围太大。但我在这里看不到这一点?而且我不太明白如何按照链接问题的答案中的建议实施 Decimal ,无论如何。

我怎样才能解决这个问题?

我在 64 位 Windows 10 上运行 Python 3.6.5 和 Anaconda。

标签: python-3.x

解决方案


我用transverse_fields/声明概率分布的方式似乎是一个问题LorentzGen

我的解决方案是使用 中的内置cauchy发行版scipy.stats,并修改比例。另外,因为我想要一个半洛伦兹,所以我在np.abs(...)transverse_fields.

import scipy.stats as st
import numpy as np  # generic math functions

transverse_fields = st.cauchy(scale=0.27)
L = 2

list_of_temps = np.linspace(1, 10, 40)

for T, temp in enumerate(list_of_temps):
    print(f"Run {T}")
    for t in range(5000):
        if t%500==0:
            print(f"Trial {t}")
        h_x = [[-np.abs(transverse_fields.rvs()), xx] for xx in range(L)]  # Now works

现在这对我来说已经足够令人满意了,但我仍然希望有人解释为什么我的子类化方式rv_continuous给了我上述错误。


推荐阅读