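python - AssertionError: Padding_idx must be within num_embeddings
Problem description
Some old code of mine had been running fine for the past 2 months, until today, when it suddenly started raising this error. I don't know what changed, since I haven't touched the code, and I don't understand this new error: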
INFO:pytorch_transformers.tokenization_utils:loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-clm-ende-1024-vocab.json from cache at /root/.cache/torch/pytorch_transformers/6e42a59f5e60f1efc6116fd1a2c05a72ecf713a3022b9c274b727ed6469e6ac1.2c29a4b393decdd458e6a9744fa1d6b533212e4003a4012731d2bc2261dc35f3
INFO:pytorch_transformers.tokenization_utils:loading file https://s3.amazonaws.com/models.huggingface.co/bert/xlm-mlm-ende-1024-merges.txt from cache at /root/.cache/torch/pytorch_transformers/85d878ffb1bc2c3395b785d10ce7fc91452780316140d7a26201d7a912483e44.42fa32826c068642fdcf24adbf3ef8158b3b81e210a3d03f3102cf5a899f92a0
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-31-c365f437b895> in <module>()
9 tokenizer = tokenizer_class.from_pretrained(args['model_name'])
10
---> 11 model = model_class.from_pretrained(args['model_name'])
12 model.to(device);
13
3 frames
/usr/local/lib/python3.6/dist-packages/pytorch_transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
534
535 # Instantiate model.
--> 536 model = cls(config, *model_args, **model_kwargs)
537
538 if state_dict is None and not from_tf:
/usr/local/lib/python3.6/dist-packages/pytorch_transformers/modeling_xlm.py in __init__(self, config)
842 self.num_labels = config.num_labels
843
--> 844 self.transformer = XLMModel(config)
845 self.sequence_summary = SequenceSummary(config)
846
/usr/local/lib/python3.6/dist-packages/pytorch_transformers/modeling_xlm.py in __init__(self, config)
543 if config.n_langs > 1 and config.use_lang_emb:
544 self.lang_embeddings = nn.Embedding(self.n_langs, self.dim)
--> 545 self.embeddings = nn.Embedding(self.n_words, self.dim, padding_idx=self.pad_index)
546 self.layer_norm_emb = nn.LayerNorm(self.dim, eps=config.layer_norm_eps)
547
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py in __init__(self, num_embeddings, embedding_dim, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse, _weight)
86 if padding_idx is not None:
87 if padding_idx > 0:
---> 88 assert padding_idx < self.num_embeddings, 'Padding_idx must be within num_embeddings'
89 elif padding_idx < 0:
90 assert padding_idx >= -self.num_embeddings, 'Padding_idx must be within num_embeddings'
AssertionError: Padding_idx must be within num_embeddings
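The check that fails in the last frame can be reproduced without loading a model. Below is a minimal sketch in plain Python that mirrors the validation shown in the torch/nn/modules/sparse.py frame above. The names check_padding_idx and is_valid_padding_idx are hypothetical helpers for illustration, not part of PyTorch, and the numbers in the comments are made up rather than taken from the actual model config.

```python
def check_padding_idx(num_embeddings, padding_idx):
    """Mirror the padding_idx validation seen in the traceback (illustrative only)."""
    if padding_idx is not None:
        if padding_idx > 0:
            # A positive index must point at an existing embedding row.
            assert padding_idx < num_embeddings, 'Padding_idx must be within num_embeddings'
        elif padding_idx < 0:
            # A negative index counts from the end, so it is bounded the other way.
            assert padding_idx >= -num_embeddings, 'Padding_idx must be within num_embeddings'
    return padding_idx


def is_valid_padding_idx(num_embeddings, padding_idx):
    """Return True if the pair would pass the check, False if it would raise."""
    try:
        check_padding_idx(num_embeddings, padding_idx)
        return True
    except AssertionError:
        return False


# Hypothetical values: a pad index equal to the vocabulary size is rejected.
# is_valid_padding_idx(2, 2)   fails the check
# is_valid_padding_idx(10, 2)  passes the check
```

In the traceback the two arguments are self.n_words and self.pad_index, so the assertion means the pad index in the config that was loaded is not smaller than the vocabulary size the model was constructed with.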
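Can anyone shed light on what might have happened?
Many thanks!
Solution
It turns out that some classes have been moved out of the pytorch_transformers package and into transformers. I still had to use pytorch_transformers for the other classes. I only needed to replace the package name, changing pytorch_transformers to transformers, in the import statement below: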
from pytorch_transformers import XLMConfig, XLMTokenizer, ...
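If the same code has to run in environments where either package may be installed, one option is to try the new package name first and fall back to the old one. This is only a sketch and not part of the original answer: import_first_available is a hypothetical helper, and it assumes the classes you need are exposed at the top level of whichever package ends up being imported.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that imports successfully."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed and try the next one.
            errors.append("{}: {}".format(name, exc))
    raise ImportError(
        "none of the candidate packages could be imported: " + "; ".join(errors)
    )


# Assumed usage: prefer the newer package name, fall back to the older one.
# hf = import_first_available(["transformers", "pytorch_transformers"])
# XLMConfig, XLMTokenizer = hf.XLMConfig, hf.XLMTokenizer
```

Because the two packages are different releases, a fallback like this only hides the import difference. Pinning one package version in the environment is the more predictable fix when that is possible.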