首页 > 解决方案 > 时间序列预测中的数据形状不匹配

问题描述

使用 LSTM 模型来预测股票价格,我按照教程完成了每一步,但与教程不同的是,我的代码遇到了错误。

这是我正在使用的代码:

df = pd.read_csv(f'D:\\algo\\all\\EURUSD_15M.csv')
df = df.loc[:, ~df.columns.str.contains('^Unnamed',)]
training_set = df.iloc[:-int(len(df)/10), 4:5].values

sc = MinMaxScaler(feature_range= (0, 1))
training_set_scaled = sc.fit_transform(training_set)
x_train , y_train = [], []

for i in range(60, len(training_set)):
    x_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])

x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

model  = Sequential()

model.add(LSTM(units=50, return_sequences=True, input_shape = (x_train.shape[1],1)))
model.add(Dropout(0.2))

model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))

model.add(Dense(units=1))

model.compile(optimizer='adam', loss='mean_squared_error', metrics=['Accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=64)

所以应该发生的事情是取一个价格的 60 个周期并预测第 61 个周期。

但我最终面临以下错误:

ValueError: A target array with shape (379319, 1) was passed for an output of shape (None, 60, 1) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.

我究竟做错了什么?

标签: pythontensorflowmachine-learningkeraslstm

解决方案


作为数据集尝试使用 TimeseriesGenerator(tf.keras.preprocessing.sequence.TimeseriesGenerator) 而不是您的自定义列表 - > https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/sequence/TimeseriesGenerator

前任。

train_gen = TimeseriesGenerator(Xtrain, Xtrain, n_steps, batch_size=24*7)
valid_gen = TimeseriesGenerator(Xvalid, Xvalid, n_steps, batch_size=24*7)
test_gen = TimeseriesGenerator(Xtest, Xtest, n_steps, batch_size=24*7)

推荐阅读