How to interpret autocorrelation and partial autocorrelation plots using Python

Question

I am trying to use ARIMA. My data frame has two columns: a monthly date and sales.

time         sell
1/31/2014   273033
2/29/2014   203019
3/31/2014   225844
4/30/2014   236374
5/31/2014   189666
6/30/2014   242742
7/31/2014   191682
8/31/2014   208270
9/30/2014   236533
10/31/2014  188010
11/30/2014  245185
12/31/2014  224990
1/31/2015   186733
2/28/2015   296641
3/31/2015   234317
4/30/2015   160818
5/31/2015   214937
6/30/2015   226710
7/31/2015   176030
8/31/2015   160991
9/30/2015   205668
10/31/2015  183680
11/30/2015  194428
12/31/2015  643302
1/31/2016   1306566
2/28/2016   2031110
3/31/2016   1756328
4/30/2016   1703885
5/31/2016   1620547
6/30/2016   1862650
7/31/2016   1742188
8/31/2016   1441375
9/30/2016   1666798
10/31/2016  1992165
11/30/2016  1965643
12/31/2016  1315753
1/31/2017   1676141
2/28/2017   1572417
3/31/2017   1442843
4/30/2017   1337359
5/31/2017   1350256
6/30/2017   1090291
7/31/2017   1329138
8/31/2017   1245024
9/30/2017   1246177
10/31/2017  1361814
11/30/2017  1574517
12/31/2017  1035892
1/31/2018   1358912
2/29/2018   1408371
3/31/2018   1239371
4/30/2018   874519
5/31/2018   1025873

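Before running an ARIMA model, I need to work out its parameters: ARIMA(p,d,q) takes three parameters, which are traditionally configured manually.

I started by plotting the ACF and PACF in Python, and this is the output. I don't understand what it shows. How do we use these plots to build the ARIMA model?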

[Images: ACF and PACF plots]

Many textbooks put it like this:

Autoregression Intuition

Consider a time series that was generated by an autoregression (AR) process with a lag of k.

We know that the ACF describes the autocorrelation between an observation and another observation at a prior time step that includes direct and indirect dependence information.

This means we would expect the ACF for the AR(k) time series to be strong to a lag of k and the inertia of that relationship would carry on to subsequent lag values, trailing off at some point as the effect was weakened.

We know that the PACF only describes the direct relationship between an observation and its lag. This would suggest that there would be no correlation for lag values beyond k.

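This is hard to understand. Can it be explained in plain language?

How do I interpret the plots above? How can I find the optimal p, d, q parameters using Python?

Tags: python-3.x, time-series, arima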

Answer


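If you want to use the ACF and PACF to determine the lag orders, choose the number of AR terms (p) from the lag at which the PACF cuts off, and the number of MA terms (q) from the lag at which the ACF cuts off. Be careful not to select too many AR and MA terms, though.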
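
To make the cut-off rule concrete, here is a minimal sketch in plain Python that simulates an AR(2) series and computes its sample ACF and PACF. The coefficients (0.5 and 0.3), the series length, and the helper names `sample_acf` and `sample_pacf` are illustrative assumptions, not part of the original question. The band ±1.96/√n is the usual large-sample approximation of the 95% limits drawn on such plots. In practice, `plot_acf` and `plot_pacf` from `statsmodels.graphics.tsaplots` draw the same quantities.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a sequence."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(max_lag + 1)]


def sample_pacf(acf):
    """Partial autocorrelations from an ACF via the Durbin-Levinson recursion."""
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, len(acf)):
        num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulate x_t = 0.5*x_{t-1} + 0.3*x_{t-2} + noise (assumed example values).
random.seed(0)
n, burn_in = 5000, 200
x = [0.0, 0.0]
for _ in range(n + burn_in):
    x.append(0.5 * x[-1] + 0.3 * x[-2] + random.gauss(0.0, 1.0))
x = x[-n:]  # discard the start-up values

acf = sample_acf(x, 10)
pacf = sample_pacf(acf)
band = 1.96 / math.sqrt(n)  # approximate 95% significance band

print(f"band = +/-{band:.3f}")
for lag in range(1, 11):
    flag = "*" if abs(pacf[lag]) > band else " "
    print(f"lag {lag:2d}  acf={acf[lag]:+.3f}  pacf={pacf[lag]:+.3f} {flag}")
```

In the printed table, the ACF should decay gradually over many lags, while the PACF should be clearly non-zero at lags 1 and 2 and close to zero afterwards, which points to p = 2 for this simulated series. A lag or two may still land just outside the band by chance.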

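A similar question has already been answered here and here. There is also a good free online resource.

Another way to find the ARIMA parameters is to use an information criterion: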

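import statsmodels.api as sm

# y is assumed to be the sales series (the "sell" column), indexed by month.
# d is fixed at 0 here, so only p and q are searched.
result = {}
for p in range(5):
    for q in range(5):
        arma = sm.tsa.ARIMA(y, order=(p, 0, q))
        arma_fit = arma.fit()
        result[(p, q)] = arma_fit.aic  # lower AIC is better

# Pick the (p, q) pair with the smallest AIC.
p, q = min(result, key=result.get)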
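
For reference, the `aic` value compared in the loop is Akaike's information criterion. In its standard form, with $k$ the number of estimated parameters and $\hat{L}$ the maximized likelihood of the fitted model:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Each extra AR or MA term raises $k$, so a larger (p, q) only wins if it improves the likelihood enough to offset the penalty. This is how the criterion guards against selecting too many terms. Exactly which parameters are counted in $k$ is an implementation detail of the library.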
