python - LSTM model; Invalid input_h shape: [1,10,128] [1,4,128]
Problem description
I am trying to build a phoneme classifier with an LSTM (CuDNNLSTM) in Keras. Every time I try to train my model, I get this error:
InvalidArgumentError: Invalid input_h shape: [1,10,128] [1,4,128]
[[node sequential_11/cu_dnnlstm_33/CudnnRNNV2 (defined at <ipython-input-26-8a407dd29428>:47) ]] [Op:__inference_test_function_71890]
Here are the shapes of my dataset:
x_train.shape= (10, 100, 16)
y_train.shape= (10, 100)
x_test.shape= (6, 100, 16)
y_test.shape= (6, 100)
x_validation.shape= (4, 100, 16)
y_validation.shape= (4, 100)
Each y label corresponds to a vector of 16 floats.
Here is my build_model function:
def build_model(input_shape=None, LR=.001, phone_count=61):
    #build the network
    model= Sequential()
    #RNN layer 1
    model.add(CuDNNLSTM(128, batch_input_shape=(input_shape), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(BatchNormalization())
    #RNN layer 2
    model.add(CuDNNLSTM(128, input_shape=(input_shape), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(BatchNormalization())
    #RNN layer 3
    model.add(CuDNNLSTM(128, input_shape=(input_shape), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(BatchNormalization())
    model.add(keras.layers.Dense(32, activation='relu'))
    model.add(keras.layers.Dropout(0.3))
    #softmax layer
    model.add(keras.layers.Dense(phone_count, activation='softmax'))
    #compile the model
    the_optimizer= keras.optimizers.Adam(learning_rate=LR)
    model.compile(optimizer=the_optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    # model.compile(optimizer=the_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    model.summary()
    return model
Here is the summary of my model:
Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
cu_dnnlstm_33 (CuDNNLSTM)    (10, 100, 128)            74752
_________________________________________________________________
dropout_44 (Dropout)         (10, 100, 128)            0
_________________________________________________________________
batch_normalization_33 (Batc (10, 100, 128)            512
_________________________________________________________________
cu_dnnlstm_34 (CuDNNLSTM)    (10, 100, 128)            132096
_________________________________________________________________
dropout_45 (Dropout)         (10, 100, 128)            0
_________________________________________________________________
batch_normalization_34 (Batc (10, 100, 128)            512
_________________________________________________________________
cu_dnnlstm_35 (CuDNNLSTM)    (10, 100, 128)            132096
_________________________________________________________________
dropout_46 (Dropout)         (10, 100, 128)            0
_________________________________________________________________
batch_normalization_35 (Batc (10, 100, 128)            512
_________________________________________________________________
dense_22 (Dense)             (10, 100, 32)             4128
_________________________________________________________________
dropout_47 (Dropout)         (10, 100, 32)             0
_________________________________________________________________
dense_23 (Dense)             (10, 100, 61)             2013
=================================================================
Total params: 346,621
Trainable params: 345,853
Non-trainable params: 768
_________________________________________________________________
Here is my main function:
x_train, y_train, x_test, y_test, x_validation, y_validation= prep_data(Temp=True)
for i in range(y_train.__len__()):
    y_train[i]=y_train[i][0:100]
    x_train[i]=x_train[i][0:100]
    x_train[i]= np.array(x_train[i])
    y_train[i]= np.array(y_train[i])
for i in range(y_test.__len__()):
    y_test[i]=y_test[i][0:100]
    x_test[i]=x_test[i][0:100]
    x_test[i]= np.array(x_test[i])
    y_test[i]= np.array(y_test[i])
for i in range(y_validation.__len__()):
    y_validation[i]=y_validation[i][0:100]
    x_validation[i]=x_validation[i][0:100]
    x_validation[i]= np.array(x_validation[i])
    y_validation[i]= np.array(y_validation[i])
x_train= np.array(x_train)
y_train= np.array(y_train)
x_test= np.array(x_test)
y_test= np.array(y_test)
x_validation= np.array(x_validation)
y_validation= np.array(y_validation)
model= build_model(input_shape, phone_count=61)
#train the model
model.fit(x_train, y_train, epochs=40, batch_size=10, validation_data=(x_validation, y_validation))
# model.fit(x_train, y_train, epochs=40, batch_size=1, validation_split=0.2)
#evaluate the model
error, accuracy= model.evaluate(x_test, y_test)
print(f"Test error: {error}, Test accuracy: {accuracy}")
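The three truncation loops in the main function repeat the same steps for each split. A minimal sketch of a shared helper, assuming each split is a list of per-utterance sequences with at least 100 steps each; the name truncate_and_stack and the random placeholder data are hypothetical and not part of the original code:

```python
import numpy as np

def truncate_and_stack(sequences, max_len=100):
    """Cut every sequence to its first max_len steps and stack into one array.

    Hypothetical helper: assumes every sequence has at least max_len steps,
    otherwise np.stack raises because the shapes differ.
    """
    return np.stack([np.asarray(seq)[:max_len] for seq in sequences])

# Placeholder data standing in for the output of prep_data (an assumption):
# 4 utterances of varying length, 16 features per time step.
rng = np.random.default_rng(0)
lengths = (120, 150, 100, 130)
x_demo = [rng.normal(size=(n, 16)) for n in lengths]
y_demo = [rng.integers(0, 61, size=n) for n in lengths]

x_arr = truncate_and_stack(x_demo)
y_arr = truncate_and_stack(y_demo)
print(x_arr.shape, y_arr.shape)
```

Calling the helper once per split gives arrays shaped like the ones listed in the question, with the number of samples as the first dimension.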
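The problem occurs when I try to train the model (model_fit). If I change the batch size in model_fit to 1, the model trains successfully, but a similar error occurs when I try to evaluate the model at the line
error, accuracy= model.evaluate(x_test, y_test)
Invalid input_h shape: [1,10,128] [1,6,128]
My data shapes and my parameters seem consistent with each other, so I can't see where the problem is. If anyone has any ideas, please let me know.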
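One way to read the two error messages: the first shape is [1,10,128] both times, while the middle value of the second shape (4, then 6) matches the number of validation and test samples. A minimal sketch of the batch arithmetic, assuming Keras splits each dataset into consecutive batches and leaves the last one short, that validation during fit reuses batch_size=10, and that evaluate falls back to its default batch_size of 32; the helper batch_sizes is hypothetical:

```python
def batch_sizes(n_samples, batch_size):
    """Sizes of consecutive batches when the last batch may be short."""
    full, rest = divmod(n_samples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])

print(batch_sizes(10, 10))  # training set           -> [10]
print(batch_sizes(4, 10))   # validation during fit  -> [4]
print(batch_sizes(6, 32))   # test set in evaluate   -> [6]
```

Under that reading, a layer built for a fixed batch of 10 would receive batches of 4 and 6, which lines up with [1,4,128] and [1,6,128] in the errors.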
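Solution
One mistake I can see is that your input_shape=(10,100,16) says the model expects 100 time steps with 16 values per time step, but your first layer has 128 neurons. You would want to add an Input_Layer, or reduce the number of neurons in the first layer to 16, since each time step has 16 features.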
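A minimal sketch of the Input-layer option, under the assumption (a reading of the error shapes, not something the answer states) that the batch dimension fixed by batch_input_shape is what conflicts with the 4-sample validation set. It uses tf.keras.layers.LSTM in place of CuDNNLSTM so that it runs without a GPU, leaves out the Dropout and BatchNormalization layers for brevity, and feeds random placeholder data:

```python
import numpy as np
import tensorflow as tf

def build_model(timesteps=100, features=16, phone_count=61):
    # Input declares only (timesteps, features); the batch dimension stays
    # unspecified, so batches of any size are accepted.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, features)),
        tf.keras.layers.LSTM(128, return_sequences=True),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(phone_count, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_model()

# Random placeholder inputs with the sample counts from the question.
rng = np.random.default_rng(0)
x_val = rng.normal(size=(4, 100, 16)).astype("float32")
x_test = rng.normal(size=(6, 100, 16)).astype("float32")

val_pred = model.predict(x_val, verbose=0)
test_pred = model.predict(x_test, verbose=0)
print(val_pred.shape, test_pred.shape)
```

In this sketch the same model accepts both the 4-sample and the 6-sample input, and each prediction has one 61-way distribution per time step.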