Forecasting.ForecastBySsa with multiple variables as input

Problem description

I have this code to forecast a time series. I want to forecast based on the price time series and a related indicator.

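So, along with the value to forecast, I want to pass a side value, but I can't tell whether it is taken into account, because the forecast does not change with or without it. How do I tell the algorithm to take these parameters into account?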

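using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Transforms.TimeSeries;

public static TimeSeriesForecast PerformTimeSeriesProductForecasting(List<TimeSeriesData> listToForecast)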
{
    var mlContext = new MLContext(seed: 1);  //Seed set to any number so you have a deterministic environment
    var productModelPath = "product_month_timeSeriesSSA.zip";

    if (File.Exists(productModelPath))
    {
        File.Delete(productModelPath);
    }

    IDataView productDataView = mlContext.Data.LoadFromEnumerable<TimeSeriesData>(listToForecast);
    var singleProductDataSeries = mlContext.Data.CreateEnumerable<TimeSeriesData>(productDataView, false).OrderBy(p => p.Date);
    TimeSeriesData lastMonthProductData = singleProductDataSeries.Last();

    const int numSeriesDataPoints = 2500; // Number of data points used for the series buffer and for training; should not exceed the length of listToForecast

    // Create and add the forecast estimator to the pipeline.
    IEstimator<ITransformer> forecastEstimator = mlContext.Forecasting.ForecastBySsa(
        outputColumnName: nameof(TimeSeriesForecast.NextClose),
        inputColumnName: nameof(TimeSeriesData.Close), // This is the column being forecasted.
        windowSize: 22, // Length of the window used to build the SSA trajectory matrix; usually chosen to reflect the seasonal period of the data.
        seriesLength: numSeriesDataPoints, // This parameter specifies the number of data points that are used when performing a forecast.
        trainSize: numSeriesDataPoints, // This parameter specifies the total number of data points in the input time series, starting from the beginning.
        horizon: 5, // Indicates the number of values to forecast; 5 means the next 5 values of Close will be forecasted.
        confidenceLevel: 0.98f, // Indicates the likelihood the real observed value will fall within the specified interval bounds.
        confidenceLowerBoundColumn: nameof(TimeSeriesForecast.ConfidenceLowerBound), //This is the name of the column that will be used to store the lower interval bound for each forecasted value.
        confidenceUpperBoundColumn: nameof(TimeSeriesForecast.ConfidenceUpperBound)); //This is the name of the column that will be used to store the upper interval bound for each forecasted value.

    // Fit the forecasting model to the specified product's data series.
    ITransformer forecastTransformer = forecastEstimator.Fit(productDataView);

    // Create the forecast engine used for creating predictions.
    TimeSeriesPredictionEngine<TimeSeriesData, TimeSeriesForecast> forecastEngine = forecastTransformer.CreateTimeSeriesEngine<TimeSeriesData, TimeSeriesForecast>(mlContext);

    // Save the forecasting model so that it can be loaded within an end-user app.
    forecastEngine.CheckPoint(mlContext, productModelPath);
    ITransformer forecaster;
    using (var file = File.OpenRead(productModelPath))
    {
        forecaster = mlContext.Model.Load(file, out DataViewSchema schema);
    }

    // We must create a new prediction engine from the persisted model.
    TimeSeriesPredictionEngine<TimeSeriesData, TimeSeriesForecast> forecastEngine2 = forecaster.CreateTimeSeriesEngine<TimeSeriesData, TimeSeriesForecast>(mlContext);

    // Get the prediction from the reloaded engine; it contains the next 5 forecasted values, since 5 is the `horizon` specified when the forecast estimator was created.
    TimeSeriesForecast prediction = forecastEngine2.Predict();
    return prediction;
}

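TimeSeriesData has several properties, not only the values of the series I want to forecast. I just want to know whether they are taken into account when forecasting. Is there a better way to forecast this kind of series, such as LSTM? Is that approach available in ML.NET?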
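
The TimeSeriesData and TimeSeriesForecast classes are not shown in the question. The following is only a sketch of shapes that would fit the method above, assuming the side value is a single float. The `Indicator` property name is hypothetical; `Date`, `Close`, `NextClose`, `ConfidenceLowerBound` and `ConfidenceUpperBound` are the names the code references. The output properties are arrays because the forecast returns one value per step of the horizon.

```csharp
using System;

// Input row. Only Close is named in inputColumnName, so it is the only
// column handed to ForecastBySsa in the question's code.
public class TimeSeriesData
{
    public DateTime Date { get; set; }

    // The series being forecasted.
    public float Close { get; set; }

    // Hypothetical side value (the "related indicator" from the question).
    public float Indicator { get; set; }
}

// Output row. Each array holds `horizon` elements (5 in the question).
public class TimeSeriesForecast
{
    public float[] NextClose { get; set; }
    public float[] ConfidenceLowerBound { get; set; }
    public float[] ConfidenceUpperBound { get; set; }
}
```

Since ForecastBySsa accepts a single inputColumnName, extra properties such as `Indicator` can be present on the input class without being passed to the estimator.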

Tags: machine-learning, ml.net

Solution


There is a new enhancement request: "Multivariate Time based series forecasting to ML.Net".

See the issue: github.com/dotnet/machinelearning/issues/5638
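
ForecastBySsa takes one inputColumnName, so only that column is modeled, which would explain why the forecast is the same with or without the side value. One possible workaround, which the issue itself does not propose, is to recast the problem as regression on lagged values, so that the indicator becomes an ordinary feature. The sketch below only builds such rows in plain C#; `LaggedRow`, `LagFeatures` and the lag of 1 are illustrative assumptions, and the choice of regression trainer is left open.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type: the previous close and previous indicator value
// are the features, and the current close is the label to predict.
public class LaggedRow
{
    public float PrevClose { get; set; }
    public float PrevIndicator { get; set; }
    public float Close { get; set; }
}

public static class LagFeatures
{
    // Pairs each observation with the one before it (a lag of 1).
    // Both series must already be sorted by date and have equal length.
    public static List<LaggedRow> Build(IReadOnlyList<float> close, IReadOnlyList<float> indicator)
    {
        if (close.Count != indicator.Count)
            throw new ArgumentException("Series must have the same length.");

        var rows = new List<LaggedRow>();
        for (int i = 1; i < close.Count; i++)
        {
            rows.Add(new LaggedRow
            {
                PrevClose = close[i - 1],
                PrevIndicator = indicator[i - 1],
                Close = close[i]
            });
        }
        return rows;
    }
}
```

Rows like these could be loaded with mlContext.Data.LoadFromEnumerable and passed to a regression pipeline. Unlike the SSA forecast, this predicts one step at a time, so a multi-step horizon would need repeated prediction or one model per step.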

