How to debug Keras ValueError: No gradients provided for any variable?

Problem description

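How do I deal with this error, and which part of my code might be causing it? I searched existing issues and SO threads, but they mostly point to a failing custom loss-computation layer, whereas I am using one of Keras's built-in losses.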

My code:

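import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.keras.layers import Concatenate, Dense, Embedding, Flatten, Input, Reshape
from tensorflow.keras.models import Model

# CandidateGenerator, Neighbors, Vectorize, NeighborhoodEncoding, SliceLayer, TransposeLayer,
# CosineSimilarityLayer, Output, get_normalized_cand_vertices, data, field_embedded and
# PATH_TO_DATA_FOLDER are not shown in the question.

dimension = 300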
n_neighbors = 10
n_unique_candidate_pos = 500 # Defines number of unique positions a candidate can take in a given invoice. Hyper-parameter
running_first_time = True

tf.keras.backend.set_floatx('float64')
tf.keras.backend.clear_session()

for index, row in data.iterrows():
    
    print('Epoch {} started.'.format(index))
    
    image_path = row.pngFileLoc # getting path to image
    image_json = eval(row.json_file) # getting json
    
    image = Image.open(PATH_TO_DATA_FOLDER + image_path)

    # Getting actual class label    
    actual_invoice_date = row.InvDate

    # 1. Getting candidates
    candidate_gen = CandidateGenerator(image, row)
    candidates = candidate_gen.generate_date()
    
    for c in candidates:
        '''
        Feed each candidate through the model by incrementally training it.
        '''
        
        # 2. Getting neighbors
        neighbors_gen = Neighbors(image, row, n_neighbors = n_neighbors)
        neighbors = neighbors_gen.get_neighbors(c)
        
        # 3. vectorizing the neighbors
        vect = Vectorize(dimension=300)
        neighbors_embedded = vect.vectorize(neighbors)
        
        # 4. getting absolute candidate position
        #absolute_cand_pos = get_absolute_cand_pos(c, image_json)
        candidate_normalized_vertices = get_normalized_cand_vertices(c, image_json, image)
        absolute_candidate_pos = neighbors_gen.get_centroid(candidate_normalized_vertices)
        
        # 5. is already done
        
        # 6. Maxpooling the neighbors
        neighborhood_encoding = NeighborhoodEncoding()
        neighbors_encoded = neighborhood_encoding.encode_neighbors(np.array(neighbors_embedded))
            
        # Trainable layers start here
        
        # 7. Embedding candidate absolute position
        input_candidate_pos_layer = Input(shape=(2,), name='input_layer')
        embedding_candidate_pos_layer = Embedding(input_dim = n_unique_candidate_pos, output_dim = dimension // 2, name='candidate_position_embedding_layer')(input_candidate_pos_layer)
        #We will get [2, dimension/2] output above as we are using 2d co-ordinates so flatten it out into [dimension]
        flatten_cand_pos_layer = Flatten(name = 'flatten_candidate_position_embedding_layer')(embedding_candidate_pos_layer)

        # 8. Concatenate neighborhood encoding and candidate position embedding
        sliced_flatten_cand_pos_layer = SliceLayer(index=0)(flatten_cand_pos_layer)
        concat_neighbor_candidate = Concatenate(name='concat_neighbors_candpos_layer')([tf.convert_to_tensor(neighbors_encoded, dtype='float64'), sliced_flatten_cand_pos_layer]) #I honestly have no idea why it requires me to slice the tensor
        
        #reshaping this
        reshape_concat_neighbor = Reshape((1, ), input_shape=concat_neighbor_candidate.shape)(concat_neighbor_candidate)
        transposed_reshape_concat_neighbor = TransposeLayer()(reshape_concat_neighbor)
        
        # 9. Reduce dimensionality of candidate encoding
        dense_dim_reduce_layer = Dense(units = dimension, activation = 'relu', name='dense_dim_reduc_layer')(transposed_reshape_concat_neighbor)
        flatten_dense_dim_reduce_layer = Flatten(name='flatten_dense_dim_reduc_layer')(dense_dim_reduce_layer)

        # 10. Compute cosine similarity between field_embedding and candidate_encoding and 11. do sigmoid
        sliced_flatten_dense_dim_layer = SliceLayer(index=0)(flatten_dense_dim_reduce_layer)
        cosine_sim_layer = CosineSimilarityLayer(name='cosine_sim_layer')(sliced_flatten_dense_dim_layer, field_embedded[0])
        
        # 12. Compute loss
        #y_pred = Output(name='output_layer')(tf.convert_to_tensor(cosine_sim_layer, dtype='float64'))
        y_pred = Output(name='output_layer')(tf.convert_to_tensor([cosine_sim_layer], dtype='float64'))
        y_actual = int(actual_invoice_date == c)
        
        if running_first_time:
            model = Model(inputs = input_candidate_pos_layer, outputs = y_pred)
            model.compile(loss='binary_crossentropy')
            running_first_time = False
            print('model initialized successfully.')
            print(model.summary())
        
        model.fit([np.asarray([absolute_candidate_pos]).astype('float64'), np.asarray([y_actual]).astype('float64')])

What it does:

This is the error raised when I call model.fit():

ValueError: No gradients provided for any variable: ['candidate_position_embedding_layer/embeddings:0', 'dense_dim_reduc_layer/kernel:0', 'dense_dim_reduc_layer/bias:0'].

Tags: python, keras, deep-learning

Solution
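Two things in the question's code are worth checking first, although neither can be confirmed as the cause without the custom layers, which are not shown. First, `model.fit` takes the inputs as `x` and the targets as `y`. The call in the question passes both arrays inside one list as the first argument, so the label array is treated as a second input and no target reaches `binary_crossentropy`. Second, any step that leaves the TensorFlow graph between the trainable layers and the loss (for example a round trip through NumPy) leaves the variables without gradients. The sketch below shows one way to test for that with `tf.GradientTape`. The tiny one-layer model, the sample values and the helper `variables_without_gradient` are hypothetical stand-ins, not the asker's architecture.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the model in the question: one input, one trainable layer.
inputs = tf.keras.Input(shape=(2,), name="input_layer")
outputs = tf.keras.layers.Dense(1, activation="sigmoid", name="dense")(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

x = np.array([[0.25, 0.75]], dtype="float32")  # one candidate position (made-up values)
y = np.array([[1.0]], dtype="float32")         # its label
bce = tf.keras.losses.BinaryCrossentropy()


def variables_without_gradient(predict):
    """Return the names of trainable variables whose gradient is None."""
    with tf.GradientTape() as tape:
        loss = bce(y, predict(x))
    grads = tape.gradient(loss, model.trainable_variables)
    return [v.name for v, g in zip(model.trainable_variables, grads) if g is None]


# 1. The prediction stays inside TensorFlow ops, so every variable gets a gradient.
connected = variables_without_gradient(lambda batch: model(batch, training=True))

# 2. The prediction round-trips through NumPy, so the tape cannot trace it back
#    to the variables and every gradient comes back as None.
detached = variables_without_gradient(
    lambda batch: tf.constant(model(batch, training=True).numpy())
)

print("no gradient (connected):", connected)
print("no gradient (detached): ", detached)

# 3. Pass the labels as y, not as a second element of the inputs list.
model.compile(optimizer="sgd", loss="binary_crossentropy")
history = model.fit(x, y, epochs=1, verbose=0)
print("loss:", history.history["loss"][0])
```

If the second list is non-empty for the real model, the place to look is wherever the forward pass stops being a chain of TensorFlow operations on the layer outputs, such as the `tf.convert_to_tensor` calls or the custom layers.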

