Deeplabv3: restoring from checkpoint fails during visualization

Problem description

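I am training on a custom dataset using deeplabv3 with xception_71 as the backbone. Training runs without problems, but I cannot visualize the results with vis.py. The problem appears to be in restoring the weights, but I don't know why. Any ideas? Thanks!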

Training configuration:

python deeplab/train.py \
    --logtostderr \
    --num_clones=1 \
    --training_number_of_steps=50000 \
    --learning_rate_decay_step=500 \
    --train_split="train" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size="321,321" \
    --train_batch_size=5 \
    --dataset="mydata" \
    --fine_tune_batch_norm=False \
    --initialize_last_layer=False \
    --last_layers_contain_logits_only=True \
    --tf_initial_checkpoint="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/deeplab/datasets/xception_71/model.ckpt" \
    --train_logdir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/train_log5" \
    --dataset_dir='/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/data/tfrecord'

vis.py configuration:

python deeplab/vis.py \
    --logtostderr \
    --vis_split="val" \
    --model_variant="xception_71" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --vis_crop_size="513,513" \
    --dataset="mydata" \
    --colormap_type="pascal" \
    --checkpoint_dir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/train_log5" \
    --vis_logdir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/vis_log5" \
    --dataset_dir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/data/tfrecord"

Error message:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

2 root error(s) found.
  (0) Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,256] rhs shape= [1,1,2048,256]
         [[node save/Assign_39 (defined at /home/medicine/Envs/tf_segmentation/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
         [[save/RestoreV2/_720]]
  (1) Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,256] rhs shape= [1,1,2048,256]
         [[node save/Assign_39 (defined at /home/medicine/Envs/tf_segmentation/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Tags: tensorflow

Solution
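
The error says that a variable in the graph built by vis.py has shape [1,1,1280,256] (the "lhs"), while the value stored under the same name in the checkpoint has shape [1,1,2048,256] (the "rhs"). A reasonable first step is to find out which checkpoint variable carries the unexpected shape. The following is a minimal diagnostic sketch, not a confirmed fix. It assumes TensorFlow is installed and uses tf.train.list_variables to read the names and shapes stored in the checkpoint directory. The helper names find_by_shape and checkpoint_variables are hypothetical and are not part of deeplab.

```python
def find_by_shape(variables, shape):
    """Return the names of all (name, shape) pairs whose shape equals `shape`."""
    target = list(shape)
    return [name for name, var_shape in variables if list(var_shape) == target]


def checkpoint_variables(checkpoint_dir):
    """Return (name, shape) pairs stored in a checkpoint directory or file."""
    # Imported lazily so that find_by_shape can be used without TensorFlow.
    import tensorflow as tf
    return tf.train.list_variables(checkpoint_dir)


# Hypothetical usage; replace the placeholder with your own --checkpoint_dir:
#
#   ckpt_vars = checkpoint_variables("/path/to/train_log5")
#   for name in find_by_shape(ckpt_vars, (1, 1, 2048, 256)):
#       print(name)
```

If the printed names belong to a layer whose shape depends on flags such as --model_variant or --atrous_rates, compare the flags that produced the checkpoint with the ones passed to vis.py. It is also worth checking whether the train_logdir already contained checkpoints from an earlier run with a different configuration.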

