How do I prevent this segfault when freeing memory?

Question

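Note: this question turned out a bit longer than I planned, so here is a summary:

  1. This is a "please help me debug this code" question.
  2. The first part of my question is why checking if (ptr) before freeing does not prevent the segfault.
  3. The second part is how I can actually stop the segfault.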

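I first present the stack trace from the segfault. Then I show the methods and type definitions I traced back through. Hopefully that makes the question and my thought process easy to follow (and to answer). Now, on to the actual question...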

I am currently running the unit tests for deeplab-public-version2, which is built on Caffe. When I run one of the unit tests, I get the following segfault:

$ ./.build_release/test/test_layer_factory.testbin 
Cuda number of devices: 1
Current device id: 0
Current device name: GeForce GTX 1080 with Max-Q Design
[==========] Running 4 tests from 4 test cases.
[----------] Global test environment set-up.
[----------] 1 test from LayerFactoryTest/0, where TypeParam = caffe::CPUDevice<float>
[ RUN      ] LayerFactoryTest/0.TestCreateLayer
*** Aborted at 1580356126 (unix time) try "date -d @1580356126" if you are using GNU date ***
PC: @     0x7f74c84d298d cfree
*** SIGSEGV (@0xe28) received by PID 17549 (TID 0x7f74cacda680) from PID 3624; stack trace: ***
    @     0x7f74c883e890 (unknown)
    @     0x7f74c84d298d cfree
    @     0x7f74c8e00ff1 deallocate()
    @     0x7f74c8f3367c caffe::DenseCRFLayer<>::DeAllocateAllData()
    @     0x7f74c8f36738 caffe::DenseCRFLayer<>::~DenseCRFLayer()
    @     0x7f74c8f36c09 caffe::DenseCRFLayer<>::~DenseCRFLayer()
    @     0x55f92259abc7 caffe::LayerFactoryTest_TestCreateLayer_Test<>::TestBody()
    @     0x55f9225b892a testing::internal::HandleExceptionsInMethodIfSupported<>()
    @     0x55f9225b10ca testing::Test::Run()
    @     0x55f9225b11ac testing::TestInfo::Run()
    @     0x55f9225b12e5 testing::TestCase::Run()
    @     0x55f9225b17a0 testing::internal::UnitTestImpl::RunAllTests()
    @     0x55f9225b18e7 testing::UnitTest::Run()
    @     0x55f9225967c1 main
    @     0x7f74c845cb97 __libc_start_main
    @     0x55f922596bba _start
Segmentation fault (core dumped)

I also get this error when running the full unit test suite.

Looking at the stack trace, I believe the problem occurs while the DenseCRFLayer object is being destroyed. The deallocate() function is as follows:

void deallocate(float*& ptr) {
  if (ptr)
#ifdef SSE_DENSE_CRF
    _mm_free( ptr );
#else
  delete[] ptr;
#endif
  ptr = NULL;
}

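So I would have thought that the if (ptr) statement ensures the code only tries to free pointers that actually exist. How can that not be the case?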
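For reference, a null check can only tell a null pointer from a non-null one. It cannot tell whether a non-null value refers to memory that was actually allocated. The sketch below is a simplified, non-SSE version of deallocate() under a hypothetical name (deallocate_sketch), together with a hypothetical helper (null_check_demo) that walks through the cases the check does handle. The case it cannot handle, a pointer that was never initialized, appears only as a comment because exercising it is undefined behavior.

```cpp
#include <cassert>

// Same shape as the deallocate() in the question, non-SSE branch only.
// The name deallocate_sketch is hypothetical.
void deallocate_sketch(float*& ptr) {
  if (ptr) {
    delete[] ptr;
  }
  ptr = nullptr;  // reset so that a later call sees a null pointer
}

// Hypothetical helper: exercises the cases a null check can handle.
bool null_check_demo() {
  float* never_allocated = nullptr;    // explicitly initialized to null
  deallocate_sketch(never_allocated);  // no-op: the check sees null

  float* buffer = new float[8];
  deallocate_sketch(buffer);  // frees the array and resets the pointer
  deallocate_sketch(buffer);  // no-op: the pointer was reset above

  // Not executed, because it is undefined behavior:
  //   float* garbage;              // indeterminate value, usually non-null
  //   deallocate_sketch(garbage);  // check passes, delete[] gets a bogus address
  return never_allocated == nullptr && buffer == nullptr;
}
```

If the four members are raw pointers that are only assigned in AllocateAllData(), and the object is destroyed before that method runs, the commented-out case is the one that applies. The excerpts here do not confirm that, so treat it as a hypothesis to check.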

In case it helps, the DeAllocateAllData method looks like this:

template <typename Dtype>
void DenseCRFLayer<Dtype>::DeAllocateAllData() {
  deallocate(unary_);
  deallocate(current_);
  deallocate(next_);
  deallocate(tmp_);
}

These 4 variables are created as follows:

template <typename Dtype>
void DenseCRFLayer<Dtype>::AllocateAllData() {
  unary_   = allocate(unary_element_);
  current_ = allocate(unary_element_);
  next_    = allocate(unary_element_);
  tmp_     = allocate(unary_element_);  
}

And allocate is as follows:

float* allocate(size_t N) {
  float * r = NULL;
  if (N>0) {
#ifdef SSE_DENSE_CRF
    r = (float*)_mm_malloc( N*sizeof(float)+16, 16 );
#else
    r = new float[N];
#endif
  }

  memset( r, 0, sizeof(float)*N);
  return r;
}
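A side observation on allocate(): when N is 0, r stays NULL and is still passed to memset. Passing a null pointer to memset is formally undefined even with a size of zero, although it is usually harmless in practice. Below is a minimal sketch of a variant that avoids this, covering only the non-SSE branch. The name allocate_zeroed is hypothetical and not part of the project.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant of allocate() for the non-SSE branch only.
// Returns nullptr for an empty request and a zero-filled array otherwise.
float* allocate_zeroed(std::size_t n) {
  if (n == 0) {
    return nullptr;  // nothing to allocate, nothing to clear
  }
  // The trailing () value-initializes the array, so every element is 0.0f
  // and no separate memset call is needed.
  return new float[n]();
}
```

An array returned by this sketch would be released with delete[], matching the non-SSE branch of deallocate().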

Would it make sense to add extra checks in this method to see whether each pointer is eligible to be freed?

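The only puzzling thing here, which may be a contributing factor, is that the object's destructor appears to be called twice (!?). So, in case it is relevant, here is the destructor: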

template <typename Dtype>
DenseCRFLayer<Dtype>::~DenseCRFLayer() {
  ClearPairwiseFunctions();
  DeAllocateAllData();
}

template <typename Dtype>
void DenseCRFLayer<Dtype>::ClearPairwiseFunctions() {
  for (size_t i = 0; i < pairwise_.size(); ++i) {
    delete pairwise_[i];
  }
  pairwise_.clear();
}

where pairwise_ has the following type:

std::vector<PairwisePotential*> pairwise_;

PairwisePotential is the following abstract class:

class PairwisePotential {
 public:
  virtual ~PairwisePotential();
  virtual void apply(float * out_values, const float * in_values, float * tmp, int value_size) const = 0;
};
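On the two ~DenseCRFLayer() frames in the stack trace: this most likely does not mean the destructor body ran twice. GCC and Clang follow the Itanium C++ ABI, which emits several destructor symbols for a class with a virtual destructor. The "deleting destructor" invoked by delete calls the "complete object destructor" and then operator delete, and both demangle to the same name. The two frames also have different addresses, which fits that reading. A small sketch with hypothetical class names shows that the body runs once per delete:

```cpp
#include <cassert>

static int g_destructor_runs = 0;  // counts executions of the destructor body

class BaseSketch {  // hypothetical base with a virtual destructor
 public:
  virtual ~BaseSketch() {}
};

class DerivedSketch : public BaseSketch {  // hypothetical derived class
 public:
  ~DerivedSketch() { ++g_destructor_runs; }
};

// Creates one object, deletes it through the base pointer, and reports
// how many times the derived destructor body executed.
int destroy_one_and_count() {
  g_destructor_runs = 0;
  BaseSketch* p = new DerivedSketch;
  delete p;  // a backtrace taken inside the body may still list two frames
  return g_destructor_runs;
}
```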

Finally, the unit test itself looks like this:

namespace caffe {

template <typename TypeParam>
class LayerFactoryTest : public MultiDeviceTest<TypeParam> {};

TYPED_TEST_CASE(LayerFactoryTest, TestDtypesAndDevices);

TYPED_TEST(LayerFactoryTest, TestCreateLayer) {
  typedef typename TypeParam::Dtype Dtype;
  typename LayerRegistry<Dtype>::CreatorRegistry& registry =
      LayerRegistry<Dtype>::Registry();
  shared_ptr<Layer<Dtype> > layer;
  for (typename LayerRegistry<Dtype>::CreatorRegistry::iterator iter =
       registry.begin(); iter != registry.end(); ++iter) {
    // Special case: PythonLayer is checked by pytest
    if (iter->first == "Python") { continue; }
    LayerParameter layer_param;
    // Data layers expect a DB
    if (iter->first == "Data") {
#ifdef USE_LEVELDB
      string tmp;
      MakeTempDir(&tmp);
      boost::scoped_ptr<db::DB> db(db::GetDB(DataParameter_DB_LEVELDB));
      db->Open(tmp, db::NEW);
      db->Close();
      layer_param.mutable_data_param()->set_source(tmp);
#else
      continue;
#endif  // USE_LEVELDB
    }
    layer_param.set_type(iter->first);
    layer = LayerRegistry<Dtype>::CreateLayer(layer_param);
    EXPECT_EQ(iter->first, layer->type());
  }
}

}  // namespace caffe

Note: I don't quite understand how this object is constructed. When I look at the hpp file, I see:

/**
 * @brief The DenseCRF layer performs mean-field inference under a
 *  fully-connected CRF model with Gaussian potentials.
 *
 */
template <typename Dtype>
class DenseCRFLayer : public Layer<Dtype> {
 public:
  explicit DenseCRFLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual ~DenseCRFLayer();

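So I take it the default constructor is not used. However, when I run the following grep from the parent directory, the output shows no evidence of a definition for the non-default constructor: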

$ grep -r 'DenseCRFLayer(' .
./include/caffe/layers/densecrf_layer.hpp:  explicit DenseCRFLayer(const LayerParameter& param)
./include/caffe/layers/densecrf_layer.hpp:  virtual ~DenseCRFLayer();
./src/caffe/layers/densecrf_layer.cpp:DenseCRFLayer<Dtype>::~DenseCRFLayer() {
$

I see where the constructor and destructor are declared, but I only see the destructor defined...
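One thing worth noting about the header excerpt: the constructor line ends with a member-initializer list and an empty body, {}, so that line is the definition as well as the declaration (it is defined inline in the class). That would explain why grep finds no separate definition in the .cpp file. It also means the constructor, as shown, initializes nothing beyond the base class. The sketch below uses hypothetical stand-ins (LayerParameterStub, LayerStub, DenseCRFLayerSketch) and assumes the four buffers are raw float* members, which the excerpts do not show. It initializes them to nullptr in the inline constructor so that destroying an object that was never set up is safe.

```cpp
#include <cassert>

struct LayerParameterStub {};  // hypothetical stand-in for caffe::LayerParameter

template <typename Dtype>
class LayerStub {  // hypothetical stand-in for caffe::Layer<Dtype>
 public:
  explicit LayerStub(const LayerParameterStub&) {}
  virtual ~LayerStub() {}
};

template <typename Dtype>
class DenseCRFLayerSketch : public LayerStub<Dtype> {
 public:
  // The braces at the end make this a definition, not just a declaration.
  explicit DenseCRFLayerSketch(const LayerParameterStub& param)
      : LayerStub<Dtype>(param),
        unary_(nullptr), current_(nullptr), next_(nullptr), tmp_(nullptr) {}

  virtual ~DenseCRFLayerSketch() {
    // delete[] on a null pointer is a no-op, so this is safe even if
    // no allocation ever happened.
    delete[] unary_;
    delete[] current_;
    delete[] next_;
    delete[] tmp_;
  }

  // Copying would lead to double deletion of the buffers, so disable it.
  DenseCRFLayerSketch(const DenseCRFLayerSketch&) = delete;
  DenseCRFLayerSketch& operator=(const DenseCRFLayerSketch&) = delete;

  bool AllBuffersNull() const {
    return !unary_ && !current_ && !next_ && !tmp_;
  }

 private:
  float* unary_;    // assumed member types; not shown in the question
  float* current_;
  float* next_;
  float* tmp_;
};

// Mirrors what the test appears to do: create a layer and destroy it
// without any setup or allocation step in between.
bool construct_and_destroy_without_setup() {
  LayerParameterStub param;
  DenseCRFLayerSketch<float>* layer = new DenseCRFLayerSketch<float>(param);
  bool ok = layer->AllBuffersNull();
  delete layer;
  return ok;
}
```

Whether the real class needs this depends on how its members are declared and on which code path calls AllocateAllData(), neither of which appears in the excerpts above.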

Tags: c++, pointers, memory-management
