LSTM with multiple features

Problem description

I have a working example that uses a single column, like this.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dataset_train = pd.read_csv('train.csv')
training_set = dataset_train.iloc[:, 3:4].values

from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)

X_train = []
y_train = []
for i in range(60, 180):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout

regressor = Sequential()

regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

regressor.add(Dense(units = 1))

regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

regressor.fit(X_train, y_train, epochs = 5, batch_size = 32)

dataset_test = pd.read_csv('test.csv')
real_value = dataset_test.iloc[:, 3:4].values

dataset_total = pd.concat((dataset_train['Price'], dataset_test['Price']), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values
inputs = inputs.reshape(-1,1)
inputs = sc.transform(inputs)
X_test = []
for i in range(60, 70):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)

X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predicted_value = regressor.predict(X_test)
predicted_value = sc.inverse_transform(predicted_value)

When I add more columns, I get an error. I have tried different approaches but cannot get it to work. The error occurs when reshaping inputs:

ValueError: non-broadcastable output operand with shape (225,1) doesn't match the broadcast shape (225,3)

dataset_train = pd.read_csv('train.csv')
training_set = dataset_train.iloc[:, 1:4].values

#
#
#

dataset_test = pd.read_csv('test.csv')
real_value = dataset_test.iloc[:, 3:4].values

dataset_total = pd.concat((dataset_train.iloc[:, 1:4], dataset_test.iloc[:, 1:4]), axis = 0)
inputs = dataset_total[len(dataset_total) - len(dataset_test) - 60:].values

inputs = inputs.reshape(-1,1)
inputs = sc.transform(dataset_train.iloc[:, 1:4].values)
X_test = []
for i in range(60, 70):
    X_test.append(inputs[i-60:i, 0])
X_test = np.array(X_test)

X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
predicted_value = regressor.predict(X_test)
predicted_value = sc.inverse_transform(predicted_value)

Update: the training set looks like this.

[[ 9.585000e-01  1.000000e+00  6.325263e+04]
 [ 9.674000e-01  1.000000e+00  6.296912e+04]
 [ 9.245000e-01  1.000000e+00  6.355444e+04]
 [ 9.928000e-01  1.000000e+00  5.983474e+04]
 [ 9.195000e-01  1.000000e+00  5.996487e+04]
 [ 9.654000e-01  1.000000e+00  5.977400e+04]
 [ 9.925000e-01  1.000000e+00  5.810258e+04]
 [ 9.492000e-01  1.000000e+00  5.804859e+04]
....
 [-9.262000e-01 -1.000000e+00  1.307637e+04]
 [ 9.325000e-01  1.000000e+00  1.303677e+04]
 [ 9.460000e-01  1.000000e+00  1.312846e+04]
 [-8.176000e-01 -1.000000e+00  1.294452e+04]
 [ 9.577000e-01  1.000000e+00  1.299025e+04]
 [ 7.845000e-01  1.000000e+00  1.283156e+04]
 [ 7.269000e-01  1.000000e+00  1.192546e+04]
 [ 3.612000e-01  1.000000e+00  1.175816e+04]
 [-2.580000e-02 -1.000000e+00  1.150820e+04]]

Tags: python, pandas, numpy, tensorflow

Solution


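Without your actual data it is hard to pin down the exact cause, but the network must be fed a three-dimensional array with the dimensions [samples, time steps, features].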
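The [samples, time steps, features] layout can be sketched with NumPy alone. This is a minimal illustration, not the asker's actual pipeline: the helper name make_windows, the random stand-in data, and the choice of the last column as the target are all assumptions.

```python
import numpy as np

def make_windows(data, n_steps, target_col):
    """Slice a 2-D array (rows, features) into LSTM-ready windows.

    Returns X with shape (samples, n_steps, features) and y with shape
    (samples,), where y holds the target_col value that follows each window.
    """
    data = np.asarray(data, dtype=float)
    X, y = [], []
    for i in range(n_steps, len(data)):
        X.append(data[i - n_steps:i, :])  # keep every feature column
        y.append(data[i, target_col])     # value right after the window
    return np.array(X), np.array(y)

# Hypothetical stand-in for the scaled training set: 200 rows, 3 features.
rng = np.random.default_rng(0)
scaled = rng.random((200, 3))

X_train, y_train = make_windows(scaled, n_steps=60, target_col=2)
print(X_train.shape)  # (140, 60, 3)
print(y_train.shape)  # (140,)
```

Because the windows already carry all three feature columns, no extra np.reshape call is needed before passing X_train to the network; the last axis is the feature axis.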

The first LSTM layer should look like this:

LSTM(units = 50, return_sequences = True, input_shape = (num_time_steps, num_features))

For more help, see this link: https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/
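The ValueError quoted in the question is consistent with a scaler fitted on three columns being applied to a single-column array, and the same mismatch would arise when inverse-transforming the one-column predictions. One possible way around it, sketched here in plain NumPy rather than with MinMaxScaler, is to keep the per-column minimum and range and invert only the target column. The sample data, the helper names scale and unscale_target, and the use of column 2 as the target are assumptions for illustration.

```python
import numpy as np

# Hypothetical training data: 3 feature columns, the last one is the target.
train = np.array([[ 0.9,  1.0, 60000.0],
                  [ 0.5, -1.0, 30000.0],
                  [-0.9,  1.0, 10000.0]])

# Per-column statistics, equivalent in spirit to min-max scaling to [0, 1].
col_min = train.min(axis=0)
col_range = train.max(axis=0) - col_min  # assumes no column is constant

def scale(x):
    """Scale an array with 3 columns using the training statistics."""
    return (np.asarray(x, dtype=float) - col_min) / col_range

def unscale_target(pred_scaled, target_col=2):
    """Map scaled predictions for one column back to original units."""
    pred_scaled = np.asarray(pred_scaled, dtype=float)
    return pred_scaled * col_range[target_col] + col_min[target_col]

scaled = scale(train)

# Stand-in for the (n, 1) array that regressor.predict would return.
preds = np.array([[0.0], [0.5], [1.0]])
restored = unscale_target(preds)
print(restored.ravel())  # [10000. 35000. 60000.]
```

A commonly used alternative with the same effect is to fit a second scaler on the target column only and use that one for inverse_transform.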

