Error in a recurrent neural network (LSTM) for tweet classification in Python

Problem description

I am trying to improve my results with an LSTM. As part of my project, I did the following for the RNN:

The following is a quick method for training the model:

    # Imports assumed by this snippet (standalone Keras; swap in tensorflow.keras if that is what the project uses).
    # DEFAULT_BATCH_SIZE, plot_confusion_matrix and encoder are defined elsewhere in the project.
    import matplotlib.pyplot as plt
    from keras.callbacks import ModelCheckpoint, EarlyStopping
    from sklearn.metrics import f1_score, log_loss

    def threshold_search(y_true, y_proba, average=None):
        """Scan thresholds in [0, 1) and return the one that gives the best F1 score."""
        best_threshold = 0
        best_score = 0
        for threshold in [i * 0.01 for i in range(100)]:
            score = f1_score(y_true=y_true, y_pred=y_proba > threshold, average=average)
            if score > best_score:
                best_threshold = threshold
                best_score = score
        search_result = {'threshold': best_threshold, 'f1': best_score}
        return search_result

    def train(model,
              X_train, y_train, X_test, y_test,
              checkpoint_path='model.hdf5',
              epochs=25,
              batch_size=DEFAULT_BATCH_SIZE,
              class_weights=None,
              fit_verbose=2,
              print_summary=True
             ):
        m = model()
        if print_summary:
            print(m.summary())
        m.fit(
            X_train,
            y_train,
            # bad practice: using the test data for validation; a real project would use a separate validation set
            validation_data=(X_test, y_test),
            epochs=epochs,
            batch_size=batch_size,
            class_weight=class_weights,
            callbacks=[
                # saves the most accurate model; usually you would save the one with the lowest loss
                ModelCheckpoint(checkpoint_path, monitor='val_acc', verbose=1, save_best_only=True),
                EarlyStopping(patience=2)
            ],
            verbose=fit_verbose
        )
        print("\n\n****************************\n\n")
        print('Loading Best Model...')
        m.load_weights(checkpoint_path)
        predictions = m.predict(X_test, verbose=1)
        print('Validation Loss:', log_loss(y_test, predictions))
        print('Test Accuracy', (predictions.argmax(axis=1) == y_test.argmax(axis=1)).mean())
        print('F1 Score:', f1_score(y_test.argmax(axis=1), predictions.argmax(axis=1), average='weighted'))
        plot_confusion_matrix(y_test.argmax(axis=1), predictions.argmax(axis=1), classes=encoder.classes_)
        plt.show()
        return m  # return the best-performing model

I then used a simple implementation of an LSTM, with the following layers:

    def model_1():
        model = Sequential()
        model.add(Embedding(input_dim=(len(tokenizer.word_counts) + 1), output_dim=128, input_length=MAX_SEQ_LEN))
        model.add(LSTM(128))
        model.add(Dense(64, activation='relu'))
        model.add(Dense(3, activation='softmax'))
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model

    m1 = train(model_1,
               train_text_vec,
               y_train,
               test_text_vec,
               y_test,
               checkpoint_path='model_1.h5',
               class_weights=model.any(cws))

But I get the following output and error:

[Screenshot of the error]

As you can see in the screenshot, the error is:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Can you help me fix this error?

Tags: python, machine-learning, deep-learning, nlp, lstm

Solution


According to the Keras documentation and this question, class_weights expects a dictionary that maps integer class indices to weights expressed as floats.
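
For example, one common way to build such a dictionary is with scikit-learn's compute_class_weight helper. This is only a minimal sketch: the 'balanced' heuristic and the variable names are my own illustration, and it assumes y_train is one-hot encoded, as the argmax calls in your train function suggest.

    import numpy as np
    from sklearn.utils.class_weight import compute_class_weight

    # Recover integer labels from the one-hot encoded targets.
    y_train_labels = y_train.argmax(axis=1)
    classes = np.unique(y_train_labels)

    # Weights inversely proportional to class frequency.
    weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train_labels)

    # Keras wants {class_index: float_weight}, e.g. {0: 1.3, 1: 0.7, 2: 1.1}
    cws = {int(c): float(w) for c, w in zip(classes, weights)}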

I'm not sure what the line model.any(cws) is supposed to do, but an .any() call generally returns either a single boolean or an array of booleans. Since class_weights expects a dict, Keras balks at that value and throws a ValueError.
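
To see where that message comes from, here is a small standalone demonstration, independent of Keras, of how NumPy raises exactly this ValueError when a multi-element array ends up somewhere that expects a single True/False. The array values are made up for illustration.

    import numpy as np

    cws = np.array([1.3, 0.7, 1.1])   # hypothetical class weights stored as an array

    print(cws.any())                  # .any() collapses the array to a single bool: True

    # Treating a multi-element boolean array as one truth value raises the error
    # from the screenshot:
    try:
        if cws > 1.0:                 # elementwise comparison -> array([True, False, True])
            pass
    except ValueError as err:
        print(err)  # "The truth value of an array with more than one element is ambiguous..."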

My guess is that you are confusing model weights (the numbers that make up the model itself) with class weights (the relative importance of the classes you are trying to predict). If that is the case, leaving the class_weights argument at its default value should solve your problem.
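
Concretely, that means calling your train function either without the argument or with a real dictionary. A sketch, reusing your own names and the cws dict built above:

    # Option 1: leave class weights at their default (None)
    m1 = train(model_1,
               train_text_vec, y_train,
               test_text_vec, y_test,
               checkpoint_path='model_1.h5')

    # Option 2: pass an explicit {class_index: float} mapping, e.g. the cws dict from above
    m1 = train(model_1,
               train_text_vec, y_train,
               test_text_vec, y_test,
               checkpoint_path='model_1.h5',
               class_weights=cws)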

