LSTM model: Invalid input_h shape: [1,10,128] [1,4,128]

Problem description

I am trying to build a phoneme classifier with an LSTM (CuDNNLSTM) in Keras. Every time I try to train the model I get this error:

InvalidArgumentError:  Invalid input_h shape: [1,10,128] [1,4,128]
     [[node sequential_11/cu_dnnlstm_33/CudnnRNNV2 (defined at <ipython-input-26-8a407dd29428>:47) ]] [Op:__inference_test_function_71890]

These are the shapes of my dataset:

x_train.shape= (10, 100, 16)
y_train.shape= (10, 100)
x_test.shape= (6, 100, 16)
y_test.shape= (6, 100)
x_validation.shape= (4, 100, 16)
y_validation.shape= (4, 100)

Each y label corresponds to a vector of 16 floats.

This is my build_model function:

def build_model(input_shape=None, LR=.001, phone_count=61):
  
  #build the network
  model= Sequential()
  #RNN layer 1
  model.add(CuDNNLSTM(128, batch_input_shape=(input_shape), return_sequences=True))
  model.add(Dropout(0.2))
  model.add(BatchNormalization())
  
  #RNN layer 2
  model.add(CuDNNLSTM(128, input_shape=(input_shape), return_sequences=True))
  model.add(Dropout(0.2))
  model.add(BatchNormalization())
  
  #RNN layer 3
  model.add(CuDNNLSTM(128, input_shape=(input_shape), return_sequences=True))
  model.add(Dropout(0.2))
  model.add(BatchNormalization())
  
  
  model.add(keras.layers.Dense(32, activation='relu'))
  model.add(keras.layers.Dropout(0.3))

  #softmax layer
  model.add(keras.layers.Dense(phone_count, activation='softmax'))

  #compile the model
  the_optimizer= keras.optimizers.Adam(learning_rate=LR)
  model.compile(optimizer=the_optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
  # model.compile(optimizer=the_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

  model.summary()

  return model

This is the summary of my model:

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
cu_dnnlstm_33 (CuDNNLSTM)    (10, 100, 128)            74752     
_________________________________________________________________
dropout_44 (Dropout)         (10, 100, 128)            0         
_________________________________________________________________
batch_normalization_33 (Batc (10, 100, 128)            512       
_________________________________________________________________
cu_dnnlstm_34 (CuDNNLSTM)    (10, 100, 128)            132096    
_________________________________________________________________
dropout_45 (Dropout)         (10, 100, 128)            0         
_________________________________________________________________
batch_normalization_34 (Batc (10, 100, 128)            512       
_________________________________________________________________
cu_dnnlstm_35 (CuDNNLSTM)    (10, 100, 128)            132096    
_________________________________________________________________
dropout_46 (Dropout)         (10, 100, 128)            0         
_________________________________________________________________
batch_normalization_35 (Batc (10, 100, 128)            512       
_________________________________________________________________
dense_22 (Dense)             (10, 100, 32)             4128      
_________________________________________________________________
dropout_47 (Dropout)         (10, 100, 32)             0         
_________________________________________________________________
dense_23 (Dense)             (10, 100, 61)             2013      
=================================================================
Total params: 346,621
Trainable params: 345,853
Non-trainable params: 768
_________________________________________________________________

This is my main function:

x_train, y_train, x_test, y_test, x_validation, y_validation= prep_data(Temp=True)


for i in range(y_train.__len__()):
  y_train[i]=y_train[i][0:100]
  x_train[i]=x_train[i][0:100]
  x_train[i]= np.array(x_train[i])
  y_train[i]= np.array(y_train[i])
for i in range(y_test.__len__()):
  y_test[i]=y_test[i][0:100]
  x_test[i]=x_test[i][0:100]
  x_test[i]= np.array(x_test[i])
  y_test[i]= np.array(y_test[i])
for i in range(y_validation.__len__()):
  y_validation[i]=y_validation[i][0:100]
  x_validation[i]=x_validation[i][0:100]
  x_validation[i]= np.array(x_validation[i])
  y_validation[i]= np.array(y_validation[i])

x_train= np.array(x_train)
y_train= np.array(y_train)
x_test= np.array(x_test)
y_test= np.array(y_test)
x_validation= np.array(x_validation)
y_validation= np.array(y_validation)

model= build_model(input_shape, phone_count=61)

#train the model  
model.fit(x_train, y_train, epochs=40, batch_size=10, validation_data=(x_validation, y_validation))  
# model.fit(x_train, y_train, epochs=40, batch_size=1, validation_split=0.2)  

#evaluate the model
error, accuracy= model.evaluate(x_test, y_test)
print(f"Test error: {error}, Test accuracy: {accuracy}")

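The problem occurs when I try to train the model (model.fit). If I change the batch size in model.fit to 1, the model trains successfully, but a similar error occurs when I try to evaluate the model at the line error, accuracy= model.evaluate(x_test, y_test):

Invalid input_h shape: [1,10,128] [1,6,128]

My data shapes and my parameters seem consistent with each other, so I cannot see where the problem is. If anyone has any ideas, please let me know.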

Tags: python, tensorflow, machine-learning, keras, lstm

Solution


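One error I see is that your input_shape=(10,100,16) tells the model to expect 100 time steps with 16 values each, but your first layer has 128 neurons. You will want to add an Input layer, or reduce the number of neurons in the first layer to 16, since each time step has 16 features.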
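
A further observation, offered as a reading of the error message rather than something this thread confirms: the middle numbers in [1,10,128] [1,4,128] match the batch size of 10 baked into the model (see the (10, 100, 128) output shapes in the summary) and the 4 validation samples, and the later [1,6,128] matches the 6 test samples. Under that assumption, one possible sketch is to declare only the per-sample shape (100, 16) and leave the batch dimension open. The function name build_model_any_batch is hypothetical, random arrays stand in for the real data, the network is cut down to a single recurrent layer, and keras.layers.LSTM is swapped in for CuDNNLSTM.

```python
import numpy as np
from tensorflow import keras


def build_model_any_batch(timesteps=100, features=16, phone_count=61, lr=0.001):
    """Hypothetical variant of build_model with no fixed batch size."""
    model = keras.Sequential([
        # Only (timesteps, features) is declared; the batch dimension stays open.
        keras.Input(shape=(timesteps, features)),
        # Assumption: the standard LSTM layer replaces CuDNNLSTM here.
        keras.layers.LSTM(128, return_sequences=True),
        keras.layers.Dropout(0.2),
        keras.layers.BatchNormalization(),
        keras.layers.Dense(32, activation="relu"),
        # One softmax over the phone classes per time step.
        keras.layers.Dense(phone_count, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=lr),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def fake_split(n_samples, timesteps=100, features=16, phone_count=61):
    """Random stand-in data with the same shapes as in the question."""
    x = np.random.rand(n_samples, timesteps, features).astype("float32")
    y = np.random.randint(0, phone_count, size=(n_samples, timesteps))
    return x, y


x_train, y_train = fake_split(10)
x_validation, y_validation = fake_split(4)
x_test, y_test = fake_split(6)

model = build_model_any_batch()

# Training batches hold 10 samples and the validation set holds 4.
model.fit(x_train, y_train, epochs=1, batch_size=10,
          validation_data=(x_validation, y_validation), verbose=0)

# The test set holds 6 samples, a third distinct batch size.
error, accuracy = model.evaluate(x_test, y_test, verbose=0)
preds = model.predict(x_test, verbose=0)
print(f"Test error: {error}, Test accuracy: {accuracy}")
print("Prediction shape:", preds.shape)
```

With the batch dimension left unspecified, model.summary() would report None instead of 10 as the first dimension of every output shape. Whether this resolves the original error on the asker's setup is not confirmed by the thread.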

