python - Getting the sequence output from the BERT encoder (TensorFlow)

Problem description

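I am trying to follow the official guide to fine-tune BERT in TensorFlow, with the aim of feeding its output into an LSTM/GRU. Fine-tuning works, but the outputs I get from bert_encoder have shapes [num_samples, hidden_units] and [num_samples, 1, 768]. I believe these are the pooled output and the sequence output respectively, but I don't understand why the sequence output is not [num_samples, max_seq_length, hidden_units].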

Running this code after compile and fit, with bert_classifier replaced by bert_encoder:

bert_encoder([glue_train["input_word_ids"][0:10],
              glue_train["input_mask"][0:10],
              glue_train["input_type_ids"][0:10]])

yields:

[<tf.Tensor: shape=(10, 1, 768), dtype=float32, numpy= ...>, <tf.Tensor: shape=(10, 768), dtype=float32, numpy= ...>]

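Since I want to pass the result to a sequence model, I need the sequence output, but the sequence dimension I get always has length 1. I have been trying to understand why, but could not find anything. Any help or clarification would be greatly appreciated. Thanks!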
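To make the shape expectation concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of the check I have in mind. The helper name label_bert_outputs is hypothetical, and max_seq_length=128 is an assumed value for illustration only; the shapes fed in are the ones reported above.

```python
def label_bert_outputs(shapes, max_seq_length, hidden_units):
    """Label each output shape as 'sequence', 'pooled', or 'unexpected'.

    Hypothetical helper for illustration; not part of any BERT library.
    """
    labels = []
    for shape in shapes:
        shape = tuple(shape)
        if len(shape) == 3 and shape[1:] == (max_seq_length, hidden_units):
            # [num_samples, max_seq_length, hidden_units]
            labels.append("sequence")
        elif len(shape) == 2 and shape[1] == hidden_units:
            # [num_samples, hidden_units]
            labels.append("pooled")
        else:
            labels.append("unexpected")
    return labels


# Shapes reported in the question; max_seq_length=128 is an assumption.
observed = [(10, 1, 768), (10, 768)]
print(label_bert_outputs(observed, max_seq_length=128, hidden_units=768))
```

Under that assumed max_seq_length, the first output is flagged as unexpected: it has three dimensions, but its middle dimension is 1 rather than the sequence length, which is exactly the mismatch I am asking about.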

Tags: python, tensorflow, bert-language-model
