How to extract the predicted classes when using the return_sequences and TimeDistributed parameters in an LSTM model?

Problem description

I am using this LSTM model to classify my data into two classes:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed
from tensorflow.keras.optimizers import Adam

model = Sequential()
# return_sequences=True keeps the full sequence of hidden states, shape (402, 120)
model.add(LSTM(units=120, activation='tanh', return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])))
# TimeDistributed applies the same Dense(1, sigmoid) to every timestep -> one decision per timestep
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, amsgrad=False), metrics=['accuracy'])
model.fit(X_train, train_target, batch_size=64, epochs=1000, validation_split=0.2)
loss, acc = model.evaluate(X_test, test_target)

My data shape is [168 (samples), 402 (timesteps), 1000 (features)], split into a train set of [134 (samples), 402 (timesteps), 1000 (features)] and a test set of [34 (samples), 402 (timesteps), 1000 (features)]. I am using return_sequences=True and TimeDistributed to obtain the classification decision for each timestep.
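For reference, here is a minimal sketch of placeholder arrays matching these shapes; the random data and the (samples, timesteps, 1) target shape are assumptions purely to make the snippet reproducible, not my actual dataset:

import numpy as np

# Hypothetical placeholder data with the shapes described above.
X_train = np.random.rand(134, 402, 1000).astype('float32')
train_target = np.random.randint(0, 2, size=(134, 402, 1)).astype('float32')  # one label per timestep
X_test = np.random.rand(34, 402, 1000).astype('float32')
test_target = np.random.randint(0, 2, size=(34, 402, 1)).astype('float32')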

model.predict_classes(X_test) gives a decision matrix of shape (34 (samples), 402 (timesteps)), which corresponds to the decision at each timestep.
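Note that predict_classes only exists on Sequential models and has been removed in recent TensorFlow/Keras versions; an equivalent using model.predict with the same 0.5 threshold on the sigmoid outputs looks like this:

# Per-timestep probabilities, shape (34, 402, 1)
probs = model.predict(X_test)
# Threshold at 0.5 and drop the last axis -> per-timestep classes, shape (34, 402)
per_step_classes = (probs > 0.5).astype('int32').squeeze(-1)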

How, in addition to the decision for each timestep, can I obtain a global decision for each of the 34 samples with this same model?

model.summary()

Layer (type)                 Output Shape              Param #   
=================================================================
lstm_24 (LSTM)               (None, 402, 120)          538080    
_________________________________________________________________
time_distributed_24 (TimeDis (None, 402, 1)            121       
=================================================================
Total params: 538,201
Trainable params: 538,201
Non-trainable params: 0

Tags: python, keras, deep-learning, lstm

Solution


IIUC, you are trying to map (34, 402, 1000) -> (34, 1), as opposed to (34, 402, 1000) -> (34, 402, 1) with a time-distributed output Dense layer.

You can make this mapping by using an additional LSTM with return_sequences=False and a standard Dense layer (no TimeDistributed wrapper) during training. Then model.predict should give you what you need.

from tensorflow.keras import layers, Model

inp = layers.Input((402, 1000))
# First LSTM returns the full sequence so the second LSTM sees all 402 timesteps
x = layers.LSTM(120, activation='tanh', return_sequences=True)(inp)
# Second LSTM returns only its final hidden state, collapsing the time dimension
x = layers.LSTM(120, activation='tanh')(x)
# Single sigmoid unit -> one global decision per sample
out = layers.Dense(1, activation='sigmoid')(x)


model = Model(inp, out)
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.summary()
Model: "model_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_10 (InputLayer)        [(None, 402, 1000)]       0         
_________________________________________________________________
lstm_10 (LSTM)               (None, 402, 120)          538080    
_________________________________________________________________
lstm_11 (LSTM)               (None, 120)               115680    
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 121       
=================================================================
Total params: 653,881
Trainable params: 653,881
Non-trainable params: 0
_________________________________________________________________
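A sketch of how this model could then be fit and used, assuming one global label per sample; train_target_global and test_target_global below are hypothetical arrays of shape (samples, 1) that you would need to derive from your per-timestep labels:

# Train on per-sample (global) labels rather than per-timestep labels.
model.fit(X_train, train_target_global, batch_size=64, epochs=1000, validation_split=0.2)
loss, acc = model.evaluate(X_test, test_target_global)

# One sigmoid probability per sample, shape (34, 1); threshold at 0.5 for the global class.
global_probs = model.predict(X_test)
global_classes = (global_probs > 0.5).astype('int32')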
