Training an LSTM is faster than training a GRU, but shouldn't it be the other way around?

Problem description

I have implemented a simple LSTM and a simple GRU network for time-series forecasting:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, LSTM, GRU, Dense
from timeit import default_timer as timer

def LSTM1(T0, tau0, tau1, optimizer):
    model = Sequential()
    model.add(Input(shape=(T0,tau0), dtype="float32", name="Input"))
    model.add(LSTM(units=tau1, activation="tanh", recurrent_activation="tanh", name="LSTM1"))
    model.add(Dense(units=1, activation="exponential", name="Output"))
    model.compile(optimizer=optimizer, loss="mse")
    return model

def GRU1(T0, tau0, tau1, optimizer):
    model = Sequential()
    model.add(Input(shape=(T0,tau0), dtype="float32", name="Input"))
    model.add(GRU(units=tau1, activation="tanh", recurrent_activation="tanh", reset_after=False, name="GRU1"))
    model.add(Dense(units=1, activation="exponential", name="Output"))
    model.compile(optimizer=optimizer, loss="mse")
    return model

The LSTM model has noticeably more parameters than the GRU model:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
LSTM1 (LSTM)                 (None, 5)                 180       
_________________________________________________________________
Output (Dense)               (None, 1)                 6         
=================================================================
Total params: 186
Trainable params: 186
Non-trainable params: 0
_________________________________________________________________

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
GRU1 (GRU)                   (None, 5)                 135      
_________________________________________________________________
Output (Dense)               (None, 1)                 6         
=================================================================
Total params: 141
Trainable params: 141
Non-trainable params: 0
_________________________________________________________________

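For reference, the two parameter counts in the summaries follow from the standard gate formulas (this assumes the usual Keras parameterization, but the numbers match): every gate owns a kernel, a recurrent kernel, and a bias, and the LSTM has four gates versus three for a GRU with `reset_after=False`.

```python
tau0, tau1 = 3, 5  # input dimension, hidden units

# Per gate: kernel (tau0*tau1) + recurrent kernel (tau1*tau1) + bias (tau1).
per_gate = tau0 * tau1 + tau1 * tau1 + tau1

lstm_params = 4 * per_gate  # LSTM: input, forget, cell, output gates
gru_params = 3 * per_gate   # GRU (reset_after=False): update, reset, candidate

print(lstm_params, gru_params)  # 180 135
```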
I therefore expected training the GRU model to take less time.

T0        = 10     # lookback period
tau0      = 3      # dimension of x_t 
tau1      = 5      # dimension of the output of the first RNN layer
optimizer = "Adam"

# Create model
model_gru1 = GRU1(T0, tau0, tau1, optimizer)
model_lstm1 = LSTM1(T0, tau0, tau1, optimizer)

However, with the following training data:

x_train = np.random.rand(100,T0,tau0)
x_valid = np.random.rand(100,T0,tau0)
y_train = np.random.rand(100)
y_valid = np.random.rand(100)

and training my models

# Train LSTM1 model
tf.random.set_seed(32)

start = timer()
model_lstm1.fit(x=x_train, y=y_train, 
          validation_data=(x_valid,y_valid), 
          verbose=1, 
          batch_size=10, epochs=500
         )
end = timer()
time_lstm1 = round(end-start,0)


# Train GRU1 model
tf.random.set_seed(32)

start = timer()
model_gru1.fit(x=x_train, y=y_train, 
          validation_data=(x_valid,y_valid), 
          verbose=1, 
          batch_size=10, epochs=500
         )
end = timer()
time_gru1 = round(end-start,0)

the LSTM takes less time:

print("training time GRU1 {} vs. training time LSTM1 {}".format(time_gru1,time_lstm1))

training time GRU1 80.0 vs. training time LSTM1 62.0
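As an aside on methodology: each model is timed with a single wall-clock measurement, which is noisy. A minimal sketch of averaging with `timeit.repeat` from the standard library (the `workload` function here is a stand-in, not the actual `fit` call):

```python
import timeit

def workload():
    # Stand-in for one training run; in practice this would wrap model.fit(...).
    sum(i * i for i in range(10_000))

# repeat() returns several independent measurements; the minimum is the
# least noise-contaminated estimate of the true cost.
times = timeit.repeat(workload, number=10, repeat=5)
best = min(times)
print(f"best of 5 runs: {best:.4f}s")
```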

I am using TensorFlow version 2.0.0 on a CPU.

Any ideas?

Tags: python, performance, tensorflow, keras, lstm
