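tensorflow - Deeplabv3: restoring from checkpoint fails during visualization
Problem description
I am training on a custom dataset using deeplabv3 with xception_71 as the backbone. Training runs without any problems, but I cannot visualize the results with vis.py. The failure seems to occur while restoring the weights, but I don't know why. Any ideas? Thanks!
Training configuration: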
python deeplab/train.py \
--logtostderr \
--num_clones=1 \
--training_number_of_steps=50000 \
--learning_rate_decay_step=500 \
--train_split="train" \
--model_variant="xception_71" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size="321,321" \
--train_batch_size=5 \
--dataset="mydata" \
--fine_tune_batch_norm=False \
--initialize_last_layer=False \
--last_layers_contain_logits_only=True \
--tf_initial_checkpoint="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/deeplab/datasets/xception_71/model.ckpt" \
--train_logdir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/train_log5" \
--dataset_dir='/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/data/tfrecord'
vis.py configuration:
python deeplab/vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_71" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size="513,513" \
--dataset="mydata" \
--colormap_type="pascal" \
--checkpoint_dir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/train_log5" \
--vis_logdir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/vis_log5" \
--dataset_dir="/mnt/af13d04a-f110-4c8a-a92f-a25808fe65a6/yukming/Tensorflow_ecg_deeplab/data/tfrecord"
Error message:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
2 root error(s) found.
(0) Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,256] rhs shape= [1,1,2048,256]
[[node save/Assign_39 (defined at /home/medicine/Envs/tf_segmentation/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
[[save/RestoreV2/_720]]
(1) Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [1,1,1280,256] rhs shape= [1,1,2048,256]
[[node save/Assign_39 (defined at /home/medicine/Envs/tf_segmentation/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.
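To narrow down which variable is affected, here is a small diagnostic sketch. It is not part of the DeepLab code, and the helper names (parse_shape_mismatch, differing_axes, checkpoint_variables_with_shape) are made up for illustration. It pulls the two shapes out of the error text and shows which axis differs. As far as I understand, lhs is the variable in the graph built by vis.py and rhs is the tensor stored in the checkpoint, but treat that as an assumption. The last helper relies on tf.train.list_variables to look up checkpoint variables with a given shape.

```python
import re

# Matches the "lhs shape= [...] rhs shape= [...]" part of the Assign error.
_SHAPE_RE = re.compile(
    r"lhs shape=\s*\[([\d,\s]*)\]\s*rhs shape=\s*\[([\d,\s]*)\]"
)


def parse_shape_mismatch(error_text):
    """Return (lhs_shape, rhs_shape) from the first mismatch found, or None."""
    match = _SHAPE_RE.search(error_text)
    if match is None:
        return None
    lhs, rhs = (
        [int(dim) for dim in group.split(",") if dim.strip()]
        for group in match.groups()
    )
    return lhs, rhs


def differing_axes(lhs, rhs):
    """Indices of the axes whose sizes differ (assumes equal rank)."""
    return [i for i, (a, b) in enumerate(zip(lhs, rhs)) if a != b]


def checkpoint_variables_with_shape(checkpoint_dir, shape):
    """Names of checkpoint variables whose stored shape equals `shape`.

    Needs TensorFlow; it is imported lazily so the parsing helpers
    above can be used without it.
    """
    import tensorflow as tf

    return [
        name
        for name, stored in tf.train.list_variables(checkpoint_dir)
        if list(stored) == list(shape)
    ]


error_line = (
    "Assign requires shapes of both tensors to match. "
    "lhs shape= [1,1,1280,256] rhs shape= [1,1,2048,256]"
)
lhs_shape, rhs_shape = parse_shape_mismatch(error_line)
print(lhs_shape, rhs_shape, differing_axes(lhs_shape, rhs_shape))
# prints: [1, 1, 1280, 256] [1, 1, 2048, 256] [2]
```

With TensorFlow installed, calling checkpoint_variables_with_shape with the directory passed as --checkpoint_dir and rhs_shape should list the checkpoint variables that could be the one failing to restore. Only axis 2 (the input channels of a 1x1 convolution kernel) differs, so the graph built by vis.py and the graph that wrote the checkpoint seem to disagree on the number of channels feeding that layer.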