What does this error mean when using ktrain's preprocess_train function for deep learning in Python?

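Problem description

I am trying to build a sentiment analysis project with a deep learning model. I am using the ktrain package for this, but I run into a problem in preprocess_train().

The function has the following signature: def preprocess_train(texts, y=None, mode='train', verbose=1)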

Args:
    texts (list of strings): text of documents
    y: labels
    mode (str):  If 'train' and prepare_for_learner=False,
                 a tf.Dataset will be returned with repeat enabled
                 for training with fit_generator
    verbose(bool): verbosity
Returns:
  TransformerDataset if self.use_with_learner = True else tf.Dataset

Following the ktrain user guide, I did the following:

Code:

import ktrain
from ktrain import text
from sklearn.metrics import accuracy_score,classification_report,confusion_matrix
from sklearn import metrics

MODEL_NAME = 'aubmindlab/bert-base-arabertv01'
t = text.Transformer(MODEL_NAME, maxlen=128)
trn = t.preprocess_train(X_train_smote.Tweet.values, y_train_smote)
val = t.preprocess_test(X_test.Tweet.values, y_test)
model = t.get_classifier()
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=32)

where:

X_train_smote.Tweet.values --> array([1830, 471, 1100, ..., 1308, 930, 868])

type(X_train_smote.Tweet.values) --> numpy ndarray

y_train_smote --> array(['NEGATIVE', 'NEGATIVE', 'POSITIVE', ..., 'POSITIVE', 'POSITIVE', 'POSITIVE'], dtype=object)

type(y_train_smote) --> numpy ndarray

The call fails with the following error:

preprocessing train...
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-81-78dde2289830> in <module>()
      6 MODEL_NAME = 'aubmindlab/bert-base-arabertv01'# using the Arabert
      7 t = text.Transformer(MODEL_NAME, maxlen=128)
----> 8 trn = t.preprocess_train(X_train_smote.Tweet.values, y_train_smote)
      9 val = t.preprocess_test(X_test.Tweet.values, y_test)
     10 model = t.get_classifier()

2 frames
/usr/local/lib/python3.7/dist-packages/ktrain/text/preprocessor.py in detect_text_format(texts)
    231         is_pair = _is_sentence_pair(peek)
    232         if not is_pair and not isinstance(peek, str):
--> 233             raise ValueError(err_msg)
    234     return is_array, is_pair
    235 

ValueError: invalid text format: texts should be list of strings or list of sentence pairs in form of tuples (str, str)

Tags: python, deep-learning, ktrain

Solution
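Going by the values printed in the question, X_train_smote.Tweet.values holds integers (1830, 471, 1100, ...), not tweet text, while the error message says texts must be a list of strings or (str, str) tuples. That mismatch is the likely cause of the ValueError. The variable name suggests the data was oversampled with SMOTE after the text was converted to numbers; this is an assumption, since the question does not show that step. If so, the original tweet strings would need to be passed to preprocess_train() instead. The sketch below is not part of ktrain; check_texts is a hypothetical helper that reports the first non-string entry before the data is handed to the preprocessor.

```python
def check_texts(texts):
    """Hypothetical helper (not a ktrain function).

    Returns the input as a plain list if every item is a str,
    otherwise raises ValueError naming the first offending item.
    """
    items = list(texts)
    for i, item in enumerate(items):
        if not isinstance(item, str):
            raise ValueError(
                f"texts[{i}] is {type(item).__name__}, expected str: {item!r}"
            )
    return items


# Mimics the integer values shown in the question.
encoded = [1830, 471, 1100]
try:
    check_texts(encoded)
except ValueError as err:
    print("rejected:", err)

# Placeholder strings standing in for the original tweets.
raw = ["first tweet", "second tweet"]
print(check_texts(raw))
```

Running a check like this on the training texts before calling t.preprocess_train() makes it clear whether the column still contains text or has already been encoded.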

