NotFoundError: [_Derived_] No gradient defined for op: StatefulPartitionedCall on TensorFlow 1.15.0

Problem description

I am running a TensorFlow model with a BERT embedding layer. I found this similar question, which has no answer. Honestly, I don't understand why the error occurs, because the same model runs fine on another dataset.

When I call model.fit:

    train_history = model.fit(
        train_input, train_labels,
        validation_split=0.2,
        epochs=3,
        batch_size=8
    )

I get this error:

  NotFoundError:  [_Derived_]No gradient defined for op: StatefulPartitionedCall
     [[{{node Func/_4}}]]
     [[PartitionedCall/gradients/StatefulPartitionedCall_grad/PartitionedCall/gradients/StatefulPartitionedCall_grad/SymbolicGradient]] [Op:__inference_distributed_function_28080]

Function call stack:
distributed_function

The model:

    import tensorflow as tf
    from tensorflow.keras.layers import Input, Dense
    from tensorflow.keras.models import Model
    from tensorflow.keras.optimizers import Adam

    def build_model(bert_layer, max_len=512):
        input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
        input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
        segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")

        _, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
        clf_output = sequence_output[:, 0, :]  # [CLS] token representation
        # 8 output units, matching the one-hot labels and the summary below
        out = Dense(8, activation='sigmoid')(clf_output)

        model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
        model.compile(Adam(lr=2e-6), loss='categorical_crossentropy', metrics=['accuracy'])

        return model

Loading BERT from TensorFlow Hub

    %%time
    module_url = "https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/2"
    bert_layer = hub.KerasLayer(module_url, trainable=True)
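
This module on TF Hub is packaged as a TF2-style SavedModel, so it is worth checking which TensorFlow and tensorflow_hub versions are actually running; a quick sanity check using the standard version attributes:

    import tensorflow as tf
    import tensorflow_hub as hub

    # The question's environment is TensorFlow 1.15.0 (see the title),
    # which turns out to matter for fine-tuning this TF2 SavedModel.
    print(tf.__version__)
    print(hub.__version__)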

Loading the tokenizer and encoding the text

    import numpy as np
    from tensorflow.keras.utils import to_categorical
    # `tokenization` is assumed to be tokenization.py from the official BERT repo
    import tokenization

    vocab_file = bert_layer.resolved_object.vocab_file.asset_path.numpy()
    do_lower_case = bert_layer.resolved_object.do_lower_case.numpy()
    tokenizer = tokenization.FullTokenizer(vocab_file, do_lower_case)

    train_input = bert_encode(train.text.values, tokenizer, max_len=160)
    test_input = bert_encode(test.text.values, tokenizer, max_len=160)
    train_labels = train['label']

    train_labels = to_categorical(np.asarray(train_labels.factorize()[0]))

    type(train_input)
    >> tuple

    type(train_labels)
    >> numpy.ndarray
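
The bert_encode helper is not shown in the question. A plausible reconstruction, following the common pattern for this FullTokenizer (an assumption, not the asker's actual code), returns the token-id, mask, and segment arrays as a tuple, consistent with the type checks above:

    import numpy as np

    def bert_encode(texts, tokenizer, max_len=512):
        # Hypothetical reconstruction -- the asker's helper is not shown.
        all_tokens, all_masks, all_segments = [], [], []
        for text in texts:
            tokens = tokenizer.tokenize(text)[:max_len - 2]       # room for [CLS]/[SEP]
            input_sequence = ["[CLS]"] + tokens + ["[SEP]"]
            pad_len = max_len - len(input_sequence)
            ids = tokenizer.convert_tokens_to_ids(input_sequence) + [0] * pad_len
            masks = [1] * len(input_sequence) + [0] * pad_len     # 1 = token, 0 = padding
            segments = [0] * max_len                              # single-segment input
            all_tokens.append(ids)
            all_masks.append(masks)
            all_segments.append(segments)
        # A tuple of three arrays -- matching type(train_input) == tuple above
        return np.array(all_tokens), np.array(all_masks), np.array(all_segments)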

Running the model

    model = build_model(bert_layer, max_len=160)
    model.summary()


 Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_word_ids (InputLayer)     [(None, 160)]        0                                            
__________________________________________________________________________________________________
input_mask (InputLayer)         [(None, 160)]        0                                            
__________________________________________________________________________________________________
segment_ids (InputLayer)        [(None, 160)]        0                                            
__________________________________________________________________________________________________
keras_layer (KerasLayer)        [(None, 768), (None, 177853441   input_word_ids[0][0]             
                                                                 input_mask[0][0]                 
                                                                 segment_ids[0][0]                
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [(None, 768)]        0           keras_layer[0][1]                
__________________________________________________________________________________________________
dense (Dense)                   (None, 8)            6152        tf_op_layer_strided_slice[0][0]  
==================================================================================================
Total params: 177,859,593
Trainable params: 177,859,592
Non-trainable params: 1
__________________________________________________________________________________________________

Tags: tensorflow, nlp

Solution


This happens because an older version of TensorFlow is being used with BERT: the module loaded from TF Hub is a TF2 SavedModel whose computation runs through StatefulPartitionedCall ops, and TensorFlow 1.15 has no gradient registered for them, so the layer cannot be fine-tuned. I had missed this question, which answers it.
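
Concretely, the fix is to move to TensorFlow 2.x, where gradients through the hub layer's StatefulPartitionedCall ops are available. A minimal sketch of the upgraded setup (the pip line and version assert are illustrative):

    # pip install --upgrade "tensorflow>=2.0" tensorflow-hub

    import tensorflow as tf
    import tensorflow_hub as hub

    # Fine-tuning a TF2 SavedModel from TF Hub requires TensorFlow 2.x;
    # on 1.15 the backward pass hits the missing-gradient error above.
    assert tf.__version__.startswith("2."), "upgrade TensorFlow to 2.x"

    module_url = "https://tfhub.dev/tensorflow/bert_multi_cased_L-12_H-768_A-12/2"
    bert_layer = hub.KerasLayer(module_url, trainable=True)

With that in place, the same build_model and model.fit calls should run without the NotFoundError.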

