python - Multivariate, multi-step LSTM model for time series forecasting
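Problem description
I have 10 indicators that I want to use to predict 1 indicator. The data is a time series with monthly observations.
I think I should use an LSTM.
Here is how the stacked data is split: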
from numpy import array, hstack

# stack the 10 input series and the target column-wise (target goes last)
stacked = hstack((x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,yy))
print ("stacked.shape" , stacked.shape)
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
n_steps_in, n_steps_out = 24 , 12
# convert into input/output
X, yy = split_sequences(stacked, n_steps_in, n_steps_out)
print ("X.shape" , X.shape)
n_datasets,n_steps_in,n_features = X.shape
print ("y.shape" , yy.shape)
n_datasets,n_steps_out = yy.shape
# splitting data
# 30 years train : 5 years valid : remaining 4 years for test
split_point = 12*30
split_point2 = 12*5+split_point
train_X , train_y = X[:split_point, :] , yy[:split_point, :]
valid_X , valid_y = X[split_point:split_point2, :] , yy[split_point:split_point2, :]
#test_X, test_y = X[split_point2:, :], yy[split_point2:, :]
So I plan to use 24 months of data from the 10 variables (X) to predict 12 months of one variable (y). The resulting shapes are:
stacked.shape (597, 11)
X.shape (563, 24, 10)
y.shape (563, 12)
train_x shape: (360, 24, 10)
train_y shape: (360, 12)
valid_x shape: (0, 24, 10)
valid_y shape: (0, 12)
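The reported sample count can be checked by hand. The loop in split_sequences keeps window i as long as i + n_steps_in + n_steps_out - 1 <= len(sequences), which gives len(sequences) - n_steps_in - n_steps_out + 2 windows. The sketch below reproduces that count and the split sizes. The helper name count_windows is hypothetical (it is not part of the original code), and the split sizes assume split_point2 = 12*5 + split_point = 420. Under that assumption the validation slice would hold 60 samples. A slice X[360:b] is empty only when b <= 360, so the empty validation shapes printed above suggest split_point2 did not have that value when the slicing ran.

```python
def count_windows(n_rows, n_steps_in, n_steps_out):
    """Count the samples produced by the windowing loop in split_sequences."""
    count = 0
    for i in range(n_rows):
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out - 1
        if out_end_ix > n_rows:
            break
        count += 1
    return count

n_samples = count_windows(597, 24, 12)
closed_form = 597 - 24 - 12 + 2

split_point = 12 * 30                 # 360 training samples
split_point2 = 12 * 5 + split_point   # assumed value: 420

# Slice a range of sample indices the same way X and yy are sliced.
n_train = len(range(n_samples)[:split_point])
n_valid = len(range(n_samples)[split_point:split_point2])
n_test = len(range(n_samples)[split_point2:])

print(n_samples, closed_form)    # 563 563
print(n_train, n_valid, n_test)  # 360 60 143
```

Checking that valid_X is non-empty before calling model.fit is a cheap way to catch this kind of slicing problem early.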
Then I train an LSTM model with the Adam optimizer:
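import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# fix the random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)
# optimizer learning rate
opt = keras.optimizers.Adam(learning_rate=0.001)
# define model: two stacked LSTM layers, then one Dense output per forecast step
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(loss='mse' , optimizer=opt , metrics=['mse'])
model.summary()
# fit network
history = model.fit(train_X , train_y ,
                    epochs=40 , verbose=0 ,
                    validation_data=(valid_X, valid_y) , shuffle=False)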
Then I run the prediction inside a function:
from sklearn.preprocessing import MinMaxScaler

def prep_data(x_test, y_test , start , end , last):
    # prepare test data X: one input window, reshaped to (1, n_steps_in, n_features)
    dataset_test_X = x_test[start:end, :]
    print("dataset_test_X :",dataset_test_X.shape)
    test_X_new = dataset_test_X.reshape(1,dataset_test_X.shape[0],dataset_test_X.shape[1])
    print("test_X_new :",test_X_new.shape)
    # prepare past and ground truth
    past_data = y_test[:end, :]
    dataset_test_y = y_test[end-1:last-1 , :]
    # scaler fitted on the ground-truth window, used to invert the prediction
    scaler1 = MinMaxScaler(feature_range=(0, 1))
    scaler1.fit(dataset_test_y)
    print("dataset_test_y :",dataset_test_y.shape)
    print("past_data :",past_data.shape)
    # predictions
    y_pred = model.predict(test_X_new)
    y_pred_inv = scaler1.inverse_transform(y_pred)
    y_pred_inv = y_pred_inv.reshape(n_steps_out,1)
    y_pred_inv = y_pred_inv[:,0]
    print("y_pred :",y_pred.shape)
    print("y_pred_inv :",y_pred_inv.shape)
    return y_pred_inv , dataset_test_y , past_data
# start can be any point in the 4 years of test data, 0-24
start = 20
end = start + n_steps_in
last = end + n_steps_out
stacked_x = hstack((x1,x2,x3,x4,x5,x6,x7,x8,x9,x10))
stacked_x = stacked_x[split_point2:, :]
print('stacked_x shape: ',stacked_x.shape)
y_test = y[split_point2:, :]
print('y test shape', y_test.shape)
y_pred_inv , dataset_test_y , past_data = prep_data(stacked_x , y_test , start , end , last)
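To see how the slices in prep_data line up, the index arithmetic can be replayed on a toy target whose value equals its own row index. This is only an illustration with made-up data (the array toy_y is hypothetical); it reuses the start, n_steps_in and n_steps_out values from above.

```python
import numpy as np

n_steps_in, n_steps_out = 24, 12
start = 20
end = start + n_steps_in      # 44
last = end + n_steps_out      # 56

# Hypothetical target column: each value is simply its own row index.
toy_y = np.arange(100, dtype=float).reshape(-1, 1)

past_data = toy_y[:end, :]                    # rows 0..43
dataset_test_y = toy_y[end - 1:last - 1, :]   # rows 43..54
input_rows = np.arange(start, end)            # rows fed to the model: 20..43

print(past_data[-1, 0], dataset_test_y[0, 0])  # 43.0 43.0
print(dataset_test_y.shape)                    # (12, 1)
print(input_rows[-1])                          # 43
```

The first ground-truth value is the same row as the last past value and the last input row, which matches `sequences[end_ix-1:out_end_ix, -1]` in split_sequences. The model output is a regression estimate for those 12 rows, so nothing in the network forces its first element to coincide with the last observed value.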
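I can train the model and use it successfully, but the output is not very good. It is usually just a nearly horizontal line with hardly any turns. Even if the training went poorly, shouldn't the prediction at least start where past_data ends? What am I doing wrong here?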
Solution
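One thing worth checking, offered as a hedged sketch rather than a confirmed diagnosis: prep_data fits a new MinMaxScaler on the 12 ground-truth test values and uses it to invert the prediction. If the target was scaled before training (the question does not show that step, so this is an assumption), the inverse transform has to use the minimum and maximum learned from the training data. A scaler fitted on a different window maps the same model output to different values. The example below uses plain NumPy min-max arithmetic as a stand-in for MinMaxScaler, with made-up numbers; fit_minmax and inverse_minmax are hypothetical helper names.

```python
import numpy as np

def fit_minmax(values):
    """Return the (min, max) pair that defines a 0-1 min-max scaling."""
    return float(values.min()), float(values.max())

def inverse_minmax(scaled, lo, hi):
    """Map values from the 0-1 range back to original units."""
    return scaled * (hi - lo) + lo

# Made-up target values in original units.
train_y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
test_window_y = np.array([42.0, 44.0, 46.0])

# Hypothetical model output in the scaled (0-1) space used for training.
y_pred_scaled = np.array([0.8, 0.85, 0.9])

train_lo, train_hi = fit_minmax(train_y)
test_lo, test_hi = fit_minmax(test_window_y)

# Inverse with the training statistics versus with the test-window statistics.
consistent = inverse_minmax(y_pred_scaled, train_lo, train_hi)
mismatched = inverse_minmax(y_pred_scaled, test_lo, test_hi)

print(consistent)
print(mismatched)
```

In this toy case the training-based inverse returns values spanning 42 to 46, while the inverse based on the test window squeezes the same output into roughly 45.2 to 45.6. Fitting on the ground-truth window also relies on values that would not be available when making a real forecast.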