Problems encountered while reproducing SSD

benbencoding798, 2018-04-22 17:34

First, the data_dir path in create_list.sh and create_data.sh has to be changed to point at the local dataset directory.

Next, create_data.sh calls create_annoset.py from the scripts directory under $caffe_root, and that call failed with the following error:

Traceback (most recent call last):
File "/opt/xuben-project/caffe/data/VOC0712/../../scripts/create_annoset.py", line 105, in <module>
label_map = caffe_pb2.LabelMap()
AttributeError: 'module' object has no attribute 'LabelMap'

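The likely cause is that the python directory of the SSD Caffe build had not been added to PYTHONPATH, so Python presumably imported a caffe_pb2 from another Caffe that does not define LabelMap.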

Reference: https://github.com/manutdzou/KITTI_SSD/issues/5

For how to set PYTHONPATH, see: https://blog.csdn.net/jasonzzj/article/details/53941147

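I chose to set PYTHONPATH in ~/.bashrc. Other directories on my machine also contain Caffe projects, so the next time I work with a different Caffe directory I may need to change the PYTHONPATH entry.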
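
Because the error depends on which caffe package Python finds first, it can help to list every candidate on the search path. The following is a minimal sketch, assuming the usual pycaffe layout in which the generated module sits at caffe/proto/caffe_pb2.py under each python directory. The helper name caffe_dirs_on_path is made up for this illustration, and the text search for "LabelMap" is only a heuristic.

```python
import io
import os
import sys


def caffe_dirs_on_path(search_path):
    """List (directory, has_label_map) for each path entry holding a caffe package.

    Hypothetical helper: it only inspects files and never imports caffe.
    """
    found = []
    for entry in search_path:
        # Assumed pycaffe layout: <entry>/caffe/proto/caffe_pb2.py
        pb2 = os.path.join(entry or ".", "caffe", "proto", "caffe_pb2.py")
        if os.path.isfile(pb2):
            with io.open(pb2, encoding="utf-8", errors="ignore") as f:
                # Heuristic: a caffe_pb2 generated from the SSD caffe.proto
                # should mention the LabelMap message by name.
                found.append((entry, "LabelMap" in f.read()))
    return found


if __name__ == "__main__":
    # Entries come out in search order; normally the first one wins on import.
    for directory, has_label_map in caffe_dirs_on_path(sys.path):
        print(directory, "LabelMap found" if has_label_map else "no LabelMap")
```

If the first directory listed reports no LabelMap, the PYTHONPATH entry for the SSD build is either missing or comes after another Caffe.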

After that, processing ran normally. The log is as follows:

/opt/xuben-project/caffe/build/tools/convert_annoset --anno_type=detection --label_type=xml --label_map_file=/opt/xuben-project/caffe/data/VOC0712/../../data/VOC0712/labelmap_voc.prototxt --check_label=True --min_dim=0 --max_dim=0 --resize_height=0 --resize_width=0 --backend=lmdb --shuffle=False --check_size=False --encode_type=jpg --encoded=True --gray=False /opt/xuben-data/VOCdevkit/ /opt/xuben-project/caffe/data/VOC0712/../../data/VOC0712/test.txt /opt/xuben-data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
I0422 17:20:58.777124 25860 convert_annoset.cpp:122] A total of 4952 images.
I0422 17:20:58.777395 25860 db_lmdb.cpp:35] Opened lmdb /opt/xuben-data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
I0422 17:21:03.382318 25860 convert_annoset.cpp:195] Processed 1000 files.
I0422 17:21:07.988387 25860 convert_annoset.cpp:195] Processed 2000 files.
I0422 17:21:12.813705 25860 convert_annoset.cpp:195] Processed 3000 files.
I0422 17:21:17.298377 25860 convert_annoset.cpp:195] Processed 4000 files.
I0422 17:21:22.664110 25860 convert_annoset.cpp:201] Processed 4952 files.
link_dir:examples/VOC0712/VOC0712_test_lmdb
/opt/xuben-project/caffe/build/tools/convert_annoset --anno_type=detection --label_type=xml --label_map_file=/opt/xuben-project/caffe/data/VOC0712/../../data/VOC0712/labelmap_voc.prototxt --check_label=True --min_dim=0 --max_dim=0 --resize_height=0 --resize_width=0 --backend=lmdb --shuffle=False --check_size=False --encode_type=jpg --encoded=True --gray=False /opt/xuben-data/VOCdevkit/ /opt/xuben-project/caffe/data/VOC0712/../../data/VOC0712/trainval.txt /opt/xuben-data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
I0422 17:21:23.231978 25883 convert_annoset.cpp:122] A total of 16551 images.
I0422 17:21:23.232414 25883 db_lmdb.cpp:35] Opened lmdb /opt/xuben-data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
I0422 17:21:58.782371 25883 convert_annoset.cpp:195] Processed 1000 files.
I0422 17:22:39.531497 25883 convert_annoset.cpp:195] Processed 2000 files.
I0422 17:23:21.844856 25883 convert_annoset.cpp:195] Processed 3000 files.
I0422 17:24:00.439805 25883 convert_annoset.cpp:195] Processed 4000 files.
I0422 17:24:36.319861 25883 convert_annoset.cpp:195] Processed 5000 files.
I0422 17:25:12.599020 25883 convert_annoset.cpp:195] Processed 6000 files.
I0422 17:25:52.925842 25883 convert_annoset.cpp:195] Processed 7000 files.
I0422 17:26:35.024026 25883 convert_annoset.cpp:195] Processed 8000 files.
I0422 17:27:20.739751 25883 convert_annoset.cpp:195] Processed 9000 files.
I0422 17:28:06.118722 25883 convert_annoset.cpp:195] Processed 10000 files.
I0422 17:28:45.578575 25883 convert_annoset.cpp:195] Processed 11000 files.
I0422 17:29:17.399873 25883 convert_annoset.cpp:195] Processed 12000 files.
I0422 17:29:56.108283 25883 convert_annoset.cpp:195] Processed 13000 files.
I0422 17:30:34.113029 25883 convert_annoset.cpp:195] Processed 14000 files.
I0422 17:31:14.184615 25883 convert_annoset.cpp:195] Processed 15000 files.
I0422 17:31:54.871651 25883 convert_annoset.cpp:195] Processed 16000 files.
I0422 17:32:17.246522 25883 convert_annoset.cpp:201] Processed 16551 files.
link_dir:examples/VOC0712/VOC0712_trainval_lmdb

With that, the Preparation stage completed successfully.
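
The log above can also be checked mechanically: each convert_annoset run first prints the total number of images and then a running "Processed N files" count, so the last count of a run should equal its total. The sketch below is one way to do that; the function name conversion_summary is hypothetical and the regular expressions are based only on the log lines shown in this post.

```python
import re

TOTAL_RE = re.compile(r"A total of (\d+) images")
DONE_RE = re.compile(r"Processed (\d+) files")


def conversion_summary(log_text):
    """Return one (expected, processed) pair per convert_annoset run in the log."""
    runs = []
    for line in log_text.splitlines():
        total = TOTAL_RE.search(line)
        if total:
            # A new run starts whenever a "total" line appears.
            runs.append([int(total.group(1)), 0])
            continue
        done = DONE_RE.search(line)
        if done and runs:
            # Keep only the latest progress count for the current run.
            runs[-1][1] = int(done.group(1))
    return [tuple(run) for run in runs]
```

For the log in this post, both runs should report matching pairs (4952 for the test set and 16551 for trainval).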

 

2. In the Train/Eval stage, the first step failed with:

F0422 20:59:11.852633 4724 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7f1c0a97f5cd google::LogMessage::Fail()
@ 0x7f1c0a981433 google::LogMessage::SendToLog()
@ 0x7f1c0a97f15b google::LogMessage::Flush()
@ 0x7f1c0a981e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f1c0b0e52c0 caffe::SyncedMemory::to_gpu()
@ 0x7f1c0b0e4289 caffe::SyncedMemory::mutable_gpu_data()
@ 0x7f1c0b278f12 caffe::Blob<>::mutable_gpu_data()
@ 0x7f1c0b2e3b18 caffe::CuDNNConvolutionLayer<>::Forward_gpu()
@ 0x7f1c0b0a91b2 caffe::Net<>::ForwardFromTo()
@ 0x7f1c0b0a92d7 caffe::Net<>::Forward()
@ 0x7f1c0b28f960 caffe::Solver<>::Step()
@ 0x7f1c0b2903ee caffe::Solver<>::Solve()
@ 0x40b9c4 train()
@ 0x407590 main
@ 0x7f1c098ef830 __libc_start_main
@ 0x407db9 _start
@ (nil) (unknown)
Aborted (core dumped)

This error is probably caused by the GPU not having enough memory.

Reference: https://github.com/BVLC/caffe/issues/5353

In the training script ssd_pascal.py, setting these two variables as follows allowed training to run:

batch_size = 8
accum_batch_size = 16
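
The two values interact through gradient accumulation. Assuming ssd_pascal.py splits batch_size across the GPUs and sets the solver's iter_size so that the accumulated batch reaches accum_batch_size, the arithmetic can be sketched as below. The function name accumulation_plan is hypothetical, and the exact formula should be checked against the script itself.

```python
import math


def accumulation_plan(batch_size, accum_batch_size, num_gpus=1):
    """Return (batch_size_per_device, iter_size) for the given settings.

    Assumed formula, modelled on ssd_pascal.py; verify against the script.
    """
    batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(
        math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus))
    )
    return batch_size_per_device, iter_size


# Settings from this post on a single GPU: 8 images per forward/backward pass,
# with gradients accumulated over 2 passes before each weight update.
print(accumulation_plan(8, 16))  # prints (8, 2)
```

Under this assumption, it is the smaller batch_size that reduces memory use per pass, while accum_batch_size only determines how many passes are accumulated before an update.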

 
