python - ARIMA model throws LinAlgError("SVD did not converge")
Problem description
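My end goal is to use machine learning to predict the disk usage percentage of a given server at some point in the future (that is, the Use% column shown by df -h):
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 39G 31G 6.9G 82% /
I am using the ARIMA model from the statsmodels Python library for time series forecasting.
I have an array of daily tuples, recorded over many days, in the format (epoch_timestamp, used_pct):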
results = [(1545346993, 80), (1545433403, 79), (1545519793, 80), (1545606202, 82), (1545692596, 80), (1545779002, 83), (1545865397, 77), (1545951799, 76), (1546038202, 75), (1546124601, 73), (1546210994, 73), (1546297394, 73), (1546383797, 73), (1546470197, 74), (1546556595, 74)]
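As a quick sanity check on this input, the sketch below converts the epoch timestamps to datetimes and measures the gap between consecutive samples, using only the standard library. The names SECONDS_PER_DAY, gaps, and max_drift are introduced here for illustration and are not part of the original code; interpreting the timestamps as UTC is also an assumption.

```python
from datetime import datetime, timezone

results = [(1545346993, 80), (1545433403, 79), (1545519793, 80), (1545606202, 82),
           (1545692596, 80), (1545779002, 83), (1545865397, 77), (1545951799, 76),
           (1546038202, 75), (1546124601, 73), (1546210994, 73), (1546297394, 73),
           (1546383797, 73), (1546470197, 74), (1546556595, 74)]

SECONDS_PER_DAY = 86400

timestamps = [ts for ts, _ in results]

# Assumption: the epoch timestamps are interpreted as UTC
dates = [datetime.fromtimestamp(ts, tz=timezone.utc) for ts in timestamps]

# Seconds between each sample and the one before it
gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]

# Largest deviation from an exact 24-hour spacing
max_drift = max(abs(g - SECONDS_PER_DAY) for g in gaps)

print(dates[0].date(), "to", dates[-1].date())
print("largest deviation from a 24-hour gap: %d seconds" % max_drift)
```

If every gap is close to 86400 seconds, the series can be treated as evenly spaced daily data, which is what an ARIMA model assumes.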
If you run the minimal code example below, it fails with "LinAlgError: SVD did not converge".
Interestingly, if you change this particular tuple, results[6], from:
(1545865397, 77)
to:
(1545865397, 80)
then the code works as expected, with the following output:
predicted=77.635415, expected=73.000000
predicted=69.932872, expected=73.000000
predicted=76.074475, expected=73.000000
predicted=73.698213, expected=73.000000
predicted=72.721116, expected=74.000000
predicted=73.054932, expected=74.000000
Test MSE: 7.227
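So this tells me that something specific to the data is causing the failure. After reading other questions, I made sure there are no np.nan values or other bad data. In any case, here is the full example using the "bad" tuple: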
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima_model import ARIMA

time_list = []
used_pct_list = []
results = [(1545346993, 80), (1545433403, 79), (1545519793, 80), (1545606202, 82), (1545692596, 80), (1545779002, 83), (1545865397, 77), (1545951799, 76), (1546038202, 75), (1546124601, 73), (1546210994, 73), (1546297394, 73), (1546383797, 73), (1546470197, 74), (1546556595, 74)]

# Split the (epoch_timestamp, used_pct) tuples into two parallel lists
for time, used_pct in results:
    time_list.append(time)
    used_pct_list.append(used_pct)

data = np.array(used_pct_list)
series = pd.Series(data, index=time_list)
X = series.values

# Use the first 66% of the observations for training and the rest for testing
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()

# Walk-forward validation: refit on everything seen so far, then forecast one step
for t in range(len(test)):
    model = ARIMA(history, order=(5,1,0))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))

error = mean_squared_error(test, predictions)
print('Test MSE: %.3f' % error)
And the error it throws:
runfile('C:/Users/user/.spyder-py3/blarg_bad.py', wdir='C:/Users/user/.spyder-py3')
C:\Users\user\Anaconda3\lib\site-packages\scipy\signal\signaltools.py:1341: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
out_full[ind] += zi
C:\Users\user\Anaconda3\lib\site-packages\scipy\signal\signaltools.py:1344: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
out = out_full[ind]
C:\Users\user\Anaconda3\lib\site-packages\scipy\signal\signaltools.py:1350: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
zf = out_full[ind]
C:\Users\user\Anaconda3\lib\site-packages\statsmodels\tsa\tsatools.py:607: RuntimeWarning: invalid value encountered in true_divide
....
File "C:\Users\user\Anaconda3\lib\site-packages\numpy\linalg\linalg.py", line 1562, in svd
u, s, vh = gufunc(a, signature=signature, extobj=extobj)
File "C:\Users\user\Anaconda3\lib\site-packages\numpy\linalg\linalg.py", line 98, in _raise_linalgerror_svd_nonconvergence
raise LinAlgError("SVD did not converge")
LinAlgError: SVD did not converge
I am new to machine learning and have been using this site as a guide for ARIMA models: https://machinelearningmastery.com/arima-for-time-series-forecasting-with-python/
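One thing worth checking, as a possible contributing factor rather than a confirmed diagnosis, is how little data the first fit in the loop actually sees. The sketch below repeats the 66% split from the example using only the standard library, confirms that the training values are finite, and applies the first difference implied by d=1. The names p, d, diffs, and rows_for_ar are introduced here for illustration.

```python
import math

used_pct = [80, 79, 80, 82, 80, 83, 77, 76, 75, 73, 73, 73, 73, 74, 74]
p, d = 5, 1  # AR order and differencing order taken from order=(5,1,0)

# Same split as the example: the first 66% of the points are training data
size = int(len(used_pct) * 0.66)
train = used_pct[:size]

# Confirm there are no NaN or infinite values in the training window
all_finite = all(math.isfinite(v) for v in train)

# First difference (d=1): each value minus the one before it
diffs = [b - a for a, b in zip(train, train[1:])]

# An AR(p) regression needs p earlier values for every target value,
# so only len(diffs) - p targets have a full set of lags
rows_for_ar = len(diffs) - p

print("training observations:", len(train))
print("values finite:", all_finite)
print("differenced series:", diffs)
print("targets with %d full lags: %d" % (p, rows_for_ar))
```

With 9 training points, differencing leaves 8 values, and only 3 of them have five preceding lags, so the first fit has to estimate five AR coefficients from very little data. Whether this is what triggers the SVD failure for this particular series is an assumption; trying a smaller order such as (1,1,0), or a longer history, would be one way to test it.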
Solution