nlp - RuntimeError: The size of tensor a (546) must match the size of tensor b (512) at non-singleton dimension 1
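Problem description
I am using BertForQuestionAnswering from Hugging Face transformers and I am running into a tensor size mismatch. I tried setting the configuration with BertConfig, but that did not solve the problem.
Here is my code: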
import torch
from transformers import BertForQuestionAnswering
from transformers import BertTokenizer, BertConfig
import time
# Model
config = BertConfig(vocab_size_or_config_json_file=32000, hidden_size=768,
                    num_hidden_layers=12, num_attention_heads=12, intermediate_size=6072,
                    torchscript=True, max_position_embeddings=6144)
model_bert = BertForQuestionAnswering(config)
model_bert = model_bert.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
# Tokenizer
tokenizer_bert = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
encoding = tokenizer_bert.encode_plus(text=question, text_pair=paragraph, add_special_tokens=True)
inputs = encoding['input_ids']  # token ids
sentence_embedding = encoding['token_type_ids']  # segment ids
tokens = tokenizer_bert.convert_ids_to_tokens(inputs)  # input tokens
start_scores, end_scores = model_bert(input_ids=torch.tensor([inputs]),
                                      token_type_ids=torch.tensor([sentence_embedding]))
start_index = torch.argmax(start_scores)
end_index = torch.argmax(end_scores)
answer = ' '.join(tokens[start_index:end_index+1])
Data (question and text):
question = '''who is Sundar Pichai'''
paragraph = ''' Google, LLC is an American multinational technology company that specializes in Internet-related services and products, which include online advertising technologies, a search engine, cloud computing, software, and hardware. It is considered one of the Big Five technology companies in the U.S. information technology industry, alongside Amazon, Facebook, Apple, and Microsoft. Google was founded in September 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University in California. Together they own about 14 percent of its shares and control 56 percent of the stockholder voting power through supervoting stock. They incorporated Google as a California privately held company on September 4, 1998, in California. Google was then reincorporated in Delaware on October 22, 2002.[12] An initial public offering (IPO) took place on August 19, 2004, and Google moved to its headquarters in Mountain View, California, nicknamed the Googleplex. In August 2015, Google announced plans to reorganize its various interests as a conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and will continue to be the umbrella company for Alphabet's Internet interests. Sundar Pichai was appointed CEO of Google, replacing Larry Page, who became the CEO of Alphabet. The company's rapid growth since incorporation has triggered a chain of products, acquisitions, and partnerships beyond Google's core search engine (Google Search). 
It offers services designed for work and productivity (Google Docs, Google Sheets, and Google Slides), email (Gmail), scheduling and time management (Google Calendar), cloud storage (Google Drive), instant messaging and video chat (Duo, Hangouts, Chat, and Meet), language translation (Google Translate), mapping and navigation (Google Maps, Waze, Google Earth, and Street View), podcast hosting (Google Podcasts), video sharing (YouTube), blog publishing (Blogger), note-taking (Google Keep and Google Jamboard), and photo organizing and editing (Google Photos). The company leads the development of the Android mobile operating system, the Google Chrome web browser, and Chrome OS, a lightweight operating system based on the Chrome browser. Google has moved increasingly into hardware; from 2010 to 2015, it partnered with major electronics manufacturers in the production of its Nexus devices, and it released multiple hardware products in October 2016, including the Google Pixel smartphone, Google Home smart speaker, Google Wifi mesh wireless router, and Google Daydream virtual reality headset. Google has also experimented with becoming an Internet carrier (Google Fiber, Google Fi, and Google Station)'''
Error:
RuntimeError Traceback (most recent call last)
<ipython-input-5-48d08888656c> in <module>()
6 tokens = tokenizer_bert.convert_ids_to_tokens(inputs) #input tokens
7
----> 8 start_scores, end_scores = model_bert(input_ids=torch.tensor([inputs]),token_type_ids=torch.tensor([sentence_embedding]))
9
10
/usr/local/lib/python3.6/dist-packages/transformers/modeling_bert.py in forward(self, input_ids, token_type_ids, position_ids, inputs_embeds)
199 token_type_embeddings = self.token_type_embeddings(token_type_ids)
200
--> 201 embeddings = inputs_embeds + position_embeddings + token_type_embeddings
202 embeddings = self.LayerNorm(embeddings)
203 embeddings = self.dropout(embeddings)
RuntimeError: The size of tensor a (546) must match the size of tensor b (512) at non-singleton dimension 1
I understand that the input text is longer than the default tensor size of 512, but I don't know how to set that value manually.
Solution
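The 546 in the error is the number of tokens in the encoded question plus paragraph, and the 512 is the number of positions the pretrained checkpoint has embeddings for. Calling from_pretrained loads the checkpoint's own configuration, so the hand-built BertConfig with max_position_embeddings=6144 does not end up being used, and the learned position embeddings cannot simply be stretched past 512. The input therefore has to be shortened, either by truncating it (encode_plus accepts a max_length argument) or by splitting the paragraph into overlapping windows. The following is a minimal sketch of the windowing idea, not the library's own implementation. The helper name build_windows, its stride parameter, and the toy ids are hypothetical, and the default [CLS]/[SEP] ids (101 and 102) are an assumption about the uncased BERT vocabulary that should be checked against tokenizer_bert.cls_token_id and tokenizer_bert.sep_token_id.

```python
from typing import Dict, List


def build_windows(question_ids: List[int], context_ids: List[int],
                  max_len: int = 512, stride: int = 128,
                  cls_id: int = 101, sep_id: int = 102) -> List[Dict]:
    """Split a long context into overlapping windows that each fit max_len.

    Each window is laid out as [CLS] question [SEP] context-chunk [SEP].
    cls_id and sep_id are assumed values; pass the tokenizer's own ids.
    """
    # Room left for context after the question and the three special tokens.
    budget = max_len - len(question_ids) - 3
    if budget <= 0:
        raise ValueError("question alone does not fit in max_len")
    if not 0 <= stride < budget:
        raise ValueError("stride must be >= 0 and smaller than the context budget")

    windows = []
    start = 0
    while True:
        chunk = context_ids[start:start + budget]
        input_ids = [cls_id] + question_ids + [sep_id] + chunk + [sep_id]
        # Segment 0 covers [CLS] + question + [SEP]; segment 1 covers the rest.
        token_type_ids = [0] * (len(question_ids) + 2) + [1] * (len(chunk) + 1)
        windows.append({"input_ids": input_ids,
                        "token_type_ids": token_type_ids,
                        "context_offset": start})
        if start + budget >= len(context_ids):
            break
        # Advance so that consecutive windows share `stride` context tokens.
        start += budget - stride
    return windows


# Toy ids standing in for tokenizer output (hypothetical values).
question_ids = list(range(1000, 1006))   # 6 question tokens
context_ids = list(range(2000, 2540))    # 540 context tokens
windows = build_windows(question_ids, context_ids)
print([len(w["input_ids"]) for w in windows])  # prints [512, 174]
```

Each window could then be passed to model_bert separately, keeping the answer span with the highest combined start and end score. With real text, question_ids and context_ids would come from encoding the question and the paragraph separately with add_special_tokens=False.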