RuntimeError: The size of tensor a (546) must match the size of tensor b (512) at non-singleton dimension 1

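Problem description

I am using BertForQuestionAnswering from Hugging Face Transformers and I am running into a tensor size mismatch. I tried setting the configuration with BertConfig, but that did not solve the problem.

Here is my code: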

import torch
from transformers import BertForQuestionAnswering
from transformers import BertTokenizer, BertConfig
import time
#Model
config = BertConfig(vocab_size_or_config_json_file=32000, hidden_size=768,
      num_hidden_layers=12, num_attention_heads=12, intermediate_size=6072, 
      torchscript=True, max_position_embeddings=6144)
model_bert = BertForQuestionAnswering(config)
model_bert = model_bert.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

#Tokenizer
tokenizer_bert = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
encoding = tokenizer_bert.encode_plus(text=question,text_pair=paragraph, add_special=True)
inputs = encoding['input_ids']  #Token embeddings
sentence_embedding = encoding['token_type_ids']  #Segment embeddings
tokens = tokenizer_bert.convert_ids_to_tokens(inputs) #input tokens

start_scores, end_scores = model_bert(input_ids=torch.tensor([inputs]), 
token_type_ids=torch.tensor([sentence_embedding]))

start_index = torch.argmax(start_scores)
end_index = torch.argmax(end_scores)

answer = ' '.join(tokens[start_index:end_index+1])

Data (question and text):

question = '''who is Sundar Pichai'''

paragraph = ''' Google, LLC is an American multinational technology company that specializes in Internet-related services and products, which include online advertising technologies, a search engine, cloud computing, software, and hardware. It is considered one of the Big Five technology companies in the U.S. information technology industry, alongside Amazon, Facebook, Apple, and Microsoft. Google was founded in September 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University in California. Together they own about 14 percent of its shares and control 56 percent of the stockholder voting power through supervoting stock. They incorporated Google as a California privately held company on September 4, 1998, in California. Google was then reincorporated in Delaware on October 22, 2002.[12] An initial public offering (IPO) took place on August 19, 2004, and Google moved to its headquarters in Mountain View, California, nicknamed the Googleplex. In August 2015, Google announced plans to reorganize its various interests as a conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and will continue to be the umbrella company for Alphabet's Internet interests. Sundar Pichai was appointed CEO of Google, replacing Larry Page, who became the CEO of Alphabet. The company's rapid growth since incorporation has triggered a chain of products, acquisitions, and partnerships beyond Google's core search engine (Google Search). 
It offers services designed for work and productivity (Google Docs, Google Sheets, and Google Slides), email (Gmail), scheduling and time management (Google Calendar), cloud storage (Google Drive), instant messaging and video chat (Duo, Hangouts, Chat, and Meet), language translation (Google Translate), mapping and navigation (Google Maps, Waze, Google Earth, and Street View), podcast hosting (Google Podcasts), video sharing (YouTube), blog publishing (Blogger), note-taking (Google Keep and Google Jamboard), and photo organizing and editing (Google Photos). The company leads the development of the Android mobile operating system, the Google Chrome web browser, and Chrome OS, a lightweight operating system based on the Chrome browser. Google has moved increasingly into hardware; from 2010 to 2015, it partnered with major electronics manufacturers in the production of its Nexus devices, and it released multiple hardware products in October 2016, including the Google Pixel smartphone, Google Home smart speaker, Google Wifi mesh wireless router, and Google Daydream virtual reality headset. Google has also experimented with becoming an Internet carrier (Google Fiber, Google Fi, and Google Station)'''

Error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-5-48d08888656c> in <module>()
  6 tokens = tokenizer_bert.convert_ids_to_tokens(inputs) #input tokens
  7 
----> 8 start_scores, end_scores = model_bert(input_ids=torch.tensor([inputs]),token_type_ids=torch.tensor([sentence_embedding]))
  9 
 10 

/usr/local/lib/python3.6/dist-packages/transformers/modeling_bert.py in forward(self, input_ids, token_type_ids, position_ids, inputs_embeds)
199         token_type_embeddings = self.token_type_embeddings(token_type_ids)
200 
--> 201         embeddings = inputs_embeds + position_embeddings + token_type_embeddings
202         embeddings = self.LayerNorm(embeddings)
203         embeddings = self.dropout(embeddings)

RuntimeError: The size of tensor a (546) must match the size of tensor b (512) at non-singleton dimension 1

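I understand that the tokenized input (546 tokens) is longer than the default maximum of 512, but I do not know how to set this value manually.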

Tags: nlp, bert-language-model, huggingface-transformers, question-answering

Solution
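
`from_pretrained` is a class method that builds a new model from the checkpoint's own configuration, so the `BertConfig` created beforehand is discarded and the loaded model still has only 512 position embeddings. Those embeddings were learned during pretraining, so the usual approach is not to raise the limit but to keep every input at or below 512 tokens, either by truncating the paragraph or by splitting it into overlapping windows and running the model on each window. Below is a minimal sketch of the windowing step in plain Python, not the library's implementation. The function name `split_into_windows`, the `stride` value, and the default ids 101 and 102 for `[CLS]` and `[SEP]` are assumptions for illustration; the real ids should be read from `tokenizer_bert.cls_token_id` and `tokenizer_bert.sep_token_id`.

```python
from typing import List, Tuple


def split_into_windows(
    question_ids: List[int],
    context_ids: List[int],
    max_len: int = 512,   # position-embedding limit of the pretrained checkpoint
    stride: int = 128,    # assumed overlap (in context tokens) between windows
    cls_id: int = 101,    # assumed [CLS] id; prefer tokenizer_bert.cls_token_id
    sep_id: int = 102,    # assumed [SEP] id; prefer tokenizer_bert.sep_token_id
) -> List[Tuple[List[int], List[int]]]:
    """Split a long context into windows of at most max_len tokens.

    question_ids and context_ids are token ids WITHOUT special tokens, e.g.
    from tokenizer_bert.encode(text, add_special_tokens=False).
    Each window is laid out as: [CLS] question [SEP] context_chunk [SEP].
    Returns a list of (input_ids, token_type_ids) pairs.
    """
    # Tokens left for the context after [CLS], the question, and two [SEP]s.
    room = max_len - len(question_ids) - 3
    if room <= stride:
        raise ValueError("question is too long for this max_len and stride")

    windows = []
    start = 0
    while True:
        chunk = context_ids[start:start + room]
        input_ids = [cls_id] + question_ids + [sep_id] + chunk + [sep_id]
        # Segment 0 covers [CLS] question [SEP]; segment 1 covers chunk [SEP].
        token_type_ids = [0] * (len(question_ids) + 2) + [1] * (len(chunk) + 1)
        windows.append((input_ids, token_type_ids))
        if start + room >= len(context_ids):
            break  # this window reached the end of the context
        start += room - stride  # step forward, keeping `stride` tokens of overlap
    return windows


# Dummy ids standing in for a 5-token question and a 538-token paragraph,
# which would give 546 tokens in a single sequence, as in the error above.
demo_question = list(range(1000, 1005))
demo_context = list(range(2000, 2538))
demo_windows = split_into_windows(demo_question, demo_context)
for ids, type_ids in demo_windows:
    print(len(ids), len(type_ids))
```

Each window can then be wrapped with `torch.tensor([input_ids])` and `torch.tensor([token_type_ids])` and passed to `model_bert` as in the question, with the answer taken from the window whose start and end scores are highest. The tokenizer itself also accepts `max_length`, `stride`, and `return_overflowing_tokens` arguments that can produce such windows directly, though their exact behavior depends on the installed transformers version.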

