Is there a way to generate stock price predictions beyond the range of the test dataset?

Problem description

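I am trying to generate predicted stock prices. The test dataset contains about 15 values, but I do not want to rely on the values in the test dataset. I only want to predict beyond the range of the test dataset.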

I am using a dataset from Quandl.

I tried changing the range of the loop, but that only caused the reshape to fail with a tuple index out of range error.
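One plausible reading of that error, offered as an assumption rather than a diagnosis: `inputs` in the code below holds only 75 rows (60 look-back rows plus 15 test rows), so any loop index past 75 slices a window shorter than 60. The windows then have unequal lengths, `np.array` can no longer stack them into a 2-D array, and `X_test.shape[1]` has no second entry to read. The sketch uses a synthetic 75-row array in place of the real data, and the upper bound of 105 (60 + 15 + 30) is chosen purely for illustration.

```python
import numpy as np

# Synthetic stand-in for `inputs`: 60 look-back rows + 15 test rows (assumption).
inputs = np.arange(75, dtype=float).reshape(-1, 1)

# Extending the loop past the end of `inputs` to try to cover 30 extra days.
windows = [inputs[i - 60:i, 0] for i in range(60, 105)]

# Slicing past the end of an array silently returns a shorter slice,
# so the later windows no longer contain 60 values each.
lengths = sorted({len(w) for w in windows})
print("number of windows:", len(windows))
print("shortest window:", lengths[0], "longest window:", lengths[-1])
```

Windows of unequal length cannot be stacked into the (samples, 60, 1) array the LSTM expects, which is why widening the loop range alone does not produce future predictions.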

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    dataset_train = pd.read_csv('EOD-AAPL.csv')
    training_set = dataset_train.iloc[:, 1:2].values

    print(dataset_train.head())

    from sklearn.preprocessing import MinMaxScaler
    sc = MinMaxScaler(feature_range = (0, 1))
    training_set_scaled = sc.fit_transform(training_set)


    X_train = []
    y_train = []
    for i in range(60, 2047):
        X_train.append(training_set_scaled[i-60:i, 0])
        y_train.append(training_set_scaled[i, 0])
    X_train, y_train = np.array(X_train), np.array(y_train)

    X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.layers import LSTM
    from keras.layers import Dropout

    regressor = Sequential()

    regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
    regressor.add(Dropout(0.2))

    regressor.add(LSTM(units = 50, return_sequences = True))
    regressor.add(Dropout(0.2))

    regressor.add(LSTM(units = 50, return_sequences = True))
    regressor.add(Dropout(0.2))

    regressor.add(LSTM(units = 50))
    regressor.add(Dropout(0.2))

    regressor.add(Dense(units = 1))

    regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

    regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)


    dataset_test = pd.read_csv('EOD-AAPLtest.csv')
    real_stock_price = dataset_test.iloc[:, 1:2].values


    dataset_total = pd.concat((dataset_train['Open'], dataset_test['Open']), axis = 0)
    inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
    inputs = inputs.reshape(-1,1)
    inputs = sc.transform(inputs)
    X_test = []

This is the part I want to extend:

    for i in range(60, 75):
        X_test.append(inputs[i-60:i, 0])
    X_test = np.array(X_test)
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
    predicted_stock_price = regressor.predict(X_test)
    predicted_stock_price = sc.inverse_transform(predicted_stock_price)


    plt.plot(real_stock_price, color = 'black', label = 'APPLE Stock Price')
    plt.plot(predicted_stock_price, color = 'green', label = 'Predicted APPLE Stock Price')
    plt.title('APPLE Stock Price Prediction')
    plt.xlabel('Time')
    plt.ylabel('APPLE Stock Price')
    plt.legend()
    plt.show()

I intend to generate output for the next 30 days without relying on the data values present in the test dataset.

The test dataset contains:

    Date,Open,High,Low,Close,Volume,Dividend,Split,Adj_Open,Adj_High,Adj_Low,Adj_Close,Adj_Volume
    2019-04-05,196.45,197.1,195.93,197.0,18526644.0,0.0,1.0,196.45,197.1,195.93,197.0,18526644.0
    2019-04-04,194.79,196.37,193.14,195.69,19114275.0,0.0,1.0,194.79,196.37,193.14,195.69,19114275.0
    2019-04-03,193.25,196.5,193.15,195.35,23271830.0,0.0,1.0,193.25,196.5,193.15,195.35,23271830.0
    2019-04-02,191.09,194.46,191.05,194.02,22765732.0,0.0,1.0,191.09,194.46,191.05,194.02,22765732.0
    2019-04-01,191.64,191.68,188.38,191.24,27861964.0,0.0,1.0,191.64,191.68,188.38,191.24,27861964.0
    2019-03-29,189.83,190.08,188.54,189.95,23563961.0,0.0,1.0,189.83,190.08,188.54,189.95,23563961.0
    2019-03-28,188.95,189.559,187.53,188.72,20780363.0,0.0,1.0,188.95,189.559,187.53,188.72,20780363.0
    2019-03-27,188.75,189.76,186.55,188.47,29848427.0,0.0,1.0,188.75,189.76,186.55,188.47,29848427.0
    2019-03-26,191.664,192.88,184.58,186.79,49800538.0,0.0,1.0,191.664,192.88,184.58,186.79,49800538.0
    2019-03-25,191.51,191.98,186.6,188.74,43845293.0,0.0,1.0,191.51,191.98,186.6,188.74,43845293.0
    2019-03-22,195.34,197.69,190.78,191.05,42407666.0,0.0,1.0,195.34,197.69,190.78,191.05,42407666.0
    2019-03-21,190.02,196.33,189.81,195.09,51034237.0,0.0,1.0,190.02,196.33,189.81,195.09,51034237.0
    2019-03-20,186.23,189.49,184.73,188.16,31035231.0,0.0,1.0,186.23,189.49,184.73,188.16,31035231.0
    2019-03-19,188.35,188.99,185.92,186.53,31646369.0,0.0,1.0,188.35,188.99,185.92,186.53,31646369.0
    2019-03-18,185.8,188.39,185.79,188.02,26219832.0,0.0,1.0,185.8,188.39,185.79,188.02,26219832.0

So I want to predict values up to April 30, 2019, but X_test currently takes 'EOD-AAPLtest.csv' as its input.

Tags: python, tensorflow, keras, lstm

Solution
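One possible approach, sketched here under stated assumptions and not as a definitive answer, is recursive (iterated) one-step forecasting: start from the last 60 known scaled values, predict the next value, append that prediction to the window, drop the oldest value, and repeat. The helper `forecast_recursive` and the stand-in `dummy_predict` are hypothetical names introduced for illustration; in the question's code the callable would be `regressor.predict`, and the result would be passed through `sc.inverse_transform`.

```python
import numpy as np

def forecast_recursive(predict_fn, last_window, n_steps):
    """Roll a one-step-ahead model forward, feeding each prediction back in.

    predict_fn  : callable mapping an array of shape (1, window, 1) to (1, 1),
                  for example regressor.predict (assumption about the model).
    last_window : 1-D array of the most recent scaled values, oldest first.
    n_steps     : number of future steps to generate.
    """
    window = np.asarray(last_window, dtype=float).copy()
    preds = []
    for _ in range(n_steps):
        x = window.reshape(1, -1, 1)               # (samples, timesteps, features)
        next_val = float(np.asarray(predict_fn(x)).reshape(-1)[0])
        preds.append(next_val)
        window = np.append(window[1:], next_val)   # slide the window forward
    return np.array(preds).reshape(-1, 1)

# Stand-in for regressor.predict so the sketch runs without Keras:
# it returns the mean of the last three values in the window.
def dummy_predict(x):
    return x[:, -3:, 0].mean(axis=1, keepdims=True)

scaled_history = np.linspace(0.0, 1.0, 60)         # placeholder for real scaled prices
future_scaled = forecast_recursive(dummy_predict, scaled_history, n_steps=30)
print(future_scaled.shape)                         # (30, 1)
```

With the question's objects this would look roughly like `forecast_recursive(regressor.predict, inputs[-60:, 0], 30)` followed by `sc.inverse_transform(...)`, assuming the rows are ordered oldest to newest. The sample test rows shown in the question are newest-first, so the data may need sorting by Date before windows are built. Because every step consumes earlier predictions, errors accumulate, and forecasts 30 steps out are usually much less reliable than the one-step predictions on the test set.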

