nlp - How to use Sentence-BERT with transformers and torch
Problem Description
I want to use sentence_transformers,
but due to a policy restriction I cannot install the package sentence-transformers.
I do have the transformers and torch packages.
I went to this page and tried to run the following code.
Before that, I went to the page and downloaded all of the files to a local path.
import os
path="/yz/sentence-transformers/multi-qa-mpnet-base-dot-v1/" #local path where I have stored files
os.listdir(path)
['.dominokeep',
'config.json',
'data_config.json',
'modules.json',
'sentence_bert_config.json',
'special_tokens_map.json',
'tokenizer_config.json',
'train_script.py',
'vocab.txt',
'tokenizer.json',
'config_sentence_transformers.json',
'README.md',
'gitattributes',
'9e1e76b7a067f72e49c7f571cd8e811f7a1567bec49f17e5eaaea899e7bc2c9e']
The code I ran is:
from transformers import AutoTokenizer, AutoModel
import torch
# Load model from HuggingFace Hub
path="/yz/sentence-transformers/multi-qa-mpnet-base-dot-v1/"
"""tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")
model = AutoModel.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")"""
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModel.from_pretrained(path)
The error I get is as follows:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-18-bb33f7c519e0> in <module>
32 model = AutoModel.from_pretrained("sentence-transformers/multi-qa-mpnet-base-dot-v1")"""
33
---> 34 tokenizer = AutoTokenizer.from_pretrained(path)
35 model = AutoModel.from_pretrained(path)
36
/usr/local/anaconda/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
308 config = kwargs.pop("config", None)
309 if not isinstance(config, PretrainedConfig):
--> 310 config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
311
312 if "bert-base-japanese" in str(pretrained_model_name_or_path):
/usr/local/anaconda/lib/python3.6/site-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
342
343 if "model_type" in config_dict:
--> 344 config_class = CONFIG_MAPPING[config_dict["model_type"]]
345 return config_class.from_dict(config_dict, **kwargs)
346 else:
KeyError: 'mpnet'
My questions:
- How should I fix this error?
- Is there a way to use the same approach for MiniLM-L6-H384-uncased? I would like to use it because it appears to be faster.
=============================== Package versions are as follows -
transformers - 4.0.0
torch - 1.4.0
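To see what the KeyError points at, a small diagnostic can check whether the 'mpnet' model type is registered in this transformers install (a sketch; the CONFIG_MAPPING import path is taken from the traceback above):
from transformers.models.auto.configuration_auto import CONFIG_MAPPING
# 'mpnet' missing from this mapping is exactly what raises the KeyError above
print("mpnet" in CONFIG_MAPPING)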
Solution
The answer is simple: you cannot use the "MiniLM-L6-H384-uncased" model with pytorch 1.4.0.
print(torch.__version__)
# 1.4.0
torch.load("/content/MiniLM-L6-H384-uncased/pytorch_model.bin", map_location="cpu")
"""RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED
at /pytorch/caffe2/serialize/inline_container.cc:132, please report a bug to
PyTorch. Attempted to read a PyTorch file with version 3, but the maximum
supported version for reading is 2. Your PyTorch installation may be too old.
(init at /pytorch/caffe2/serialize/inline_container.cc:132)"""
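For completeness: once the model does load on a compatible transformers/torch setup, sentence embeddings can be produced with plain transformers by pooling the token outputs yourself. The sketch below assumes CLS pooling, which the dot-product models in this family typically use; verify the expected pooling on the model's own page.
from transformers import AutoTokenizer, AutoModel
import torch

path = "/yz/sentence-transformers/multi-qa-mpnet-base-dot-v1/"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModel.from_pretrained(path)

def encode(texts):
    # Tokenize, run the transformer, and take the [CLS] token embedding (CLS pooling)
    encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        output = model(**encoded)
    return output.last_hidden_state[:, 0]

embeddings = encode(["How do I use Sentence-BERT without sentence-transformers?"])
print(embeddings.shape)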