How does nn.Linear work with an input of shape (batch_size, seq_length, hidden_size)?

Problem description

In self.classifier, I would have expected different weights to be applied to each token. Below is the Hugging Face implementation.

# Imports and class header added so the snippet stands alone; they assume
# the pytorch-pretrained-bert package this implementation comes from.
import torch.nn as nn
from torch.nn import CrossEntropyLoss
from pytorch_pretrained_bert.modeling import BertModel, BertPreTrainedModel


class BertForTokenClassification(BertPreTrainedModel):
    def __init__(self, config, num_labels=2):
        super(BertForTokenClassification, self).__init__(config)
        self.num_labels = num_labels
        self.bert = BertModel(config)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        # A single Linear layer, shared by every token position
        self.classifier = nn.Linear(config.hidden_size, num_labels)
        self.apply(self.init_bert_weights)

    def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None):
        # sequence_output: (batch_size, seq_length, hidden_size)
        sequence_output, _ = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False)
        sequence_output = self.dropout(sequence_output)
        # logits: (batch_size, seq_length, num_labels)
        logits = self.classifier(sequence_output)

        if labels is not None:
            loss_fct = CrossEntropyLoss()
            # Only keep active parts of the loss
            if attention_mask is not None:
                active_loss = attention_mask.view(-1) == 1
                active_logits = logits.view(-1, self.num_labels)[active_loss]
                active_labels = labels.view(-1)[active_loss]
                loss = loss_fct(active_logits, active_labels)
            else:
                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
            return loss
        else:
            return logits
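
As a side note, the loss computation above flattens the batch and sequence dimensions together and then uses the attention mask to drop padding positions before computing cross-entropy. A minimal sketch of that indexing trick, with made-up shapes and mask values:

import torch

batch_size, seq_length, num_labels = 2, 4, 3
logits = torch.randn(batch_size, seq_length, num_labels)
labels = torch.randint(0, num_labels, (batch_size, seq_length))
# 1 = real token, 0 = padding
attention_mask = torch.tensor([[1, 1, 1, 0],
                               [1, 1, 0, 0]])

# Flatten (batch_size, seq_length, ...) into (batch_size * seq_length, ...)
active_loss = attention_mask.view(-1) == 1                # (8,) boolean mask
active_logits = logits.view(-1, num_labels)[active_loss]  # (5, 3): real tokens only
active_labels = labels.view(-1)[active_loss]              # (5,)

loss = torch.nn.CrossEntropyLoss()(active_logits, active_labels)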

Tags: python, pytorch

Solution
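
nn.Linear applies its single weight matrix to the last dimension of the input and broadcasts over all leading dimensions. With an input of shape (batch_size, seq_length, hidden_size), every token position is treated as an independent row, so the classifier's weights are shared across tokens rather than being different per token, and the output has shape (batch_size, seq_length, num_labels). A quick sketch to verify this, using hypothetical sizes:

import torch
import torch.nn as nn

batch_size, seq_length, hidden_size, num_labels = 2, 5, 768, 2
classifier = nn.Linear(hidden_size, num_labels)
x = torch.randn(batch_size, seq_length, hidden_size)

out = classifier(x)
print(out.shape)  # torch.Size([2, 5, 2])

# The same weights are applied at every token position: feeding one
# token's hidden vector through the layer gives an identical result.
print(torch.allclose(out[0, 3], classifier(x[0, 3])))  # True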
