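python - What does this error mean when using ktrain's preprocess_train function in a deep learning project?
Problem description
I am trying to build a sentiment analysis project with a deep learning model. I am using the ktrain package for this, but the problem occurs in preprocess_train().
The function takes the following arguments: def preprocess_train(texts, y=None, mode='train', verbose=1)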
Args:
texts (list of strings): text of documents
y: labels
mode (str): If 'train' and prepare_for_learner=False,
a tf.Dataset will be returned with repeat enabled
for training with fit_generator
verbose(bool): verbosity
Returns:
TransformerDataset if self.use_with_learner = True else tf.Dataset
Following the ktrain user guide, I did the following:
Code:
import ktrain
from ktrain import text
from sklearn.metrics import accuracy_score,classification_report,confusion_matrix
from sklearn import metrics
MODEL_NAME = 'aubmindlab/bert-base-arabertv01'
t = text.Transformer(MODEL_NAME, maxlen=128)
trn = t.preprocess_train(X_train_smote.Tweet.values, y_train_smote)
val = t.preprocess_test(X_test.Tweet.values, y_test)
model = t.get_classifier()
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=32)
where:
X_train_smote.Tweet.values
--> array([1830, 471, 1100, ..., 1308, 930, 868])
type(X_train_smote.Tweet.values)
--> numpy ndarray
y_train_smote
--> array(['NEGATIVE', 'NEGATIVE', 'POSITIVE', ..., 'POSITIVE', 'POSITIVE', 'POSITIVE'], dtype=object)
type(y_train_smote)
--> numpy ndarray
The system crashes with the following error:
preprocessing train...
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-81-78dde2289830> in <module>()
6 MODEL_NAME = 'aubmindlab/bert-base-arabertv01'# using the Arabert
7 t = text.Transformer(MODEL_NAME, maxlen=128)
----> 8 trn = t.preprocess_train(X_train_smote.Tweet.values, y_train_smote)
9 val = t.preprocess_test(X_test.Tweet.values, y_test)
10 model = t.get_classifier()
2 frames
/usr/local/lib/python3.7/dist-packages/ktrain/text/preprocessor.py in detect_text_format(texts)
231 is_pair = _is_sentence_pair(peek)
232 if not is_pair and not isinstance(peek, str):
--> 233 raise ValueError(err_msg)
234 return is_array, is_pair
235
ValueError: invalid text format: texts should be list of strings or list of sentence pairs in form of tuples (str, str)
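The traceback shows that the check looks at one element of texts and rejects it when it is neither a str nor a sentence pair. The sketch below approximates that check in plain Python so the behaviour can be reproduced without ktrain. The function name check_text_format and the exact pair test are assumptions for illustration, not ktrain's actual code.

```python
def check_text_format(texts):
    """Approximate the check behind the ValueError: look at the first element
    and require a str or a (str, str) tuple. Not ktrain's actual code."""
    peek = texts[0]
    is_pair = (
        isinstance(peek, tuple)
        and len(peek) == 2
        and all(isinstance(part, str) for part in peek)
    )
    if not is_pair and not isinstance(peek, str):
        raise ValueError(
            "invalid text format: expected str or (str, str), got "
            + type(peek).__name__
        )
    return is_pair


# A list of strings passes the check (and is not a sentence pair).
ok = check_text_format(["first tweet", "second tweet"])

# The integers shown in the question fail it.
try:
    check_text_format([1830, 471, 1100])
    failed = False
except ValueError:
    failed = True
```

With the values from the question, X_train_smote.Tweet.values contains integers rather than tweet text, which is what triggers the error.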
Solution
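The error means that the elements of texts are not strings. Here X_train_smote.Tweet.values holds integers, so the raw tweet text has to be passed instead. If the integers came from encoding the tweets so that SMOTE could run (an assumption; the question does not say), one option is to balance the classes on the raw strings. The sketch below uses plain random oversampling, which is a different technique from SMOTE, and oversample_texts, raw_tweets, and raw_labels are hypothetical names.

```python
import random
from collections import defaultdict


def oversample_texts(texts, labels, seed=0):
    """Randomly duplicate minority-class texts until every class has as many
    examples as the largest class. Hypothetical helper, not part of ktrain."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Draw extra samples (with replacement) only for under-represented classes.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Illustrative data only; in the question these would be the raw tweet strings.
raw_tweets = ["good", "bad", "great", "nice"]
raw_labels = ["POSITIVE", "NEGATIVE", "POSITIVE", "POSITIVE"]
balanced_texts, balanced_labels = oversample_texts(raw_tweets, raw_labels)

# balanced_texts is a list of str, the format the error message asks for:
# trn = t.preprocess_train(balanced_texts, balanced_labels)
```

Casting the integers with str() would silence the error but would feed the model strings such as "1830" instead of tweets, so the original text column is what needs to reach preprocess_train.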