python - Python "Can't pickle local object" exception during BertModel training
Problem description
I am using simpletransformers.classification to train a BERT model that classifies text inputs. Here is my code.
import torch
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
from simpletransformers.classification import ClassificationModel
from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM, BertForSequenceClassification
import parallelTestModule
# Let's import the CSV file into a pandas DataFrame first
train_df = pd.read_csv('D:\\7allV03Small.csv', encoding='utf-8', header=None, names=['cat', 'text'])
# Check the df
print(train_df.head())
# unique categories
print(train_df.cat.unique())
print("Total categories",len(train_df.cat.unique()))
# convert string labels to integers
train_df['labels'] = pd.factorize(train_df.cat)[0]
print(train_df.head())
# Let's create a train and test set
from sklearn.model_selection import train_test_split
train, test = train_test_split(train_df, test_size=0.2, random_state=42)
print('Training set size: ' + str(train.shape), ' Test set size: ' + str(test.shape))
if __name__ == "__main__":
    from multiprocessing import freeze_support
    freeze_support()  # only has an effect in frozen Windows executables; harmless otherwise
    model = ClassificationModel(
        'bert', 'bert-base-multilingual-uncased', use_cuda=False, num_labels=8,
        args={'reprocess_input_data': True, 'overwrite_output_dir': True,
              'num_train_epochs': 1, 'train_batch_size': 1})
    # Now let's fine-tune BERT on the training set
    model.train_model(train)
Everything looks fine and training starts. But at the end of training, it fails with the following error.
Traceback (most recent call last):
  File "c:/Users/arslanom/Desktop/text/try.py", line 45, in <module>
    model.train_model(train)
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\simpletransformers\classification\classification_model.py", line 269, in train_model
    **kwargs,
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\simpletransformers\classification\classification_model.py", line 544, in train
    self._save_model(output_dir_current, optimizer, scheduler, model=model)
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\simpletransformers\classification\classification_model.py", line 1113, in _save_model
    torch.save(scheduler.state_dict(), os.path.join(output_dir, "scheduler.pt"))
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\torch\serialization.py", line 209, in save
    return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\torch\serialization.py", line 134, in _with_file_like
    return body(f)
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\torch\serialization.py", line 209, in <lambda>
    return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
  File "C:\Users\arslanom\AppData\Roaming\Python\Python36\site-packages\torch\serialization.py", line 282, in _save
    pickler.dump(obj)
AttributeError: Can't pickle local object 'get_linear_schedule_with_warmup.<locals>.lr_lambda'
It sounds like this problem is related to worker_count, since training runs with multiple threads. But I could not find any solution.
OS: Windows 10
RAM: 16 GB
Solution
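The traceback ends inside torch.save(scheduler.state_dict(), ...), so the failure appears to come from pickling the scheduler state rather than from worker processes. The state seems to contain lr_lambda, a function defined inside get_linear_schedule_with_warmup, and pickle cannot serialize a nested function because it can only store functions by their importable name. The sketch below reproduces that behavior with the standard library only. Note that make_linear_schedule and the state dictionary are hypothetical stand-ins chosen for illustration, not the actual transformers or PyTorch code.

```python
import pickle


def make_linear_schedule(warmup_steps, total_steps):
    """Hypothetical stand-in for get_linear_schedule_with_warmup."""

    def lr_lambda(step):
        # Nested function: its qualified name is
        # make_linear_schedule.<locals>.lr_lambda, which pickle cannot look up.
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

    return lr_lambda


# Illustrative scheduler state; the keys are assumptions, not read from PyTorch.
state = {
    "last_epoch": 3,
    "base_lrs": [4e-5],
    "lr_lambdas": [make_linear_schedule(10, 100)],
}

try:
    pickle.dumps(state)
    error = None
except (AttributeError, pickle.PicklingError) as exc:
    # Same class of failure as in the traceback: a local object cannot be pickled.
    error = exc

# Dropping the function entries leaves plain data, which pickles without trouble.
safe_state = {key: value for key, value in state.items() if key != "lr_lambdas"}
restored = pickle.loads(pickle.dumps(safe_state))
```

If this is indeed the cause, upgrading PyTorch may help: the current LambdaLR documentation states that the lambda functions are only saved in state_dict() when they are callable objects, not when they are plain functions or lambdas. Which release introduced that behavior has not been checked against the versions used in the question.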