Memory leak possibly caused by a recursive function



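I implemented a random forest in C++ and run it from MATLAB via mex. It runs smoothly until it reaches the function below, where it gets stuck and starts consuming memory until the computer freezes.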

void MyFunction(
  const IDataPointCollection& data,
  std::vector<std::vector<int> >& leafNodeIndices,
  ProgressStream* progress=0 ) const
{
  ProgressStream defaultProgressStream(std::cout, Interest);
  progress = (progress==0)?&defaultProgressStream:progress;
  leafNodeIndices.resize(TreeCount());
  tbb::parallel_for<int>(0,TreeCount(),[&](int t)
  {
    leafNodeIndices[t].resize(data.Count());
    (*progress)[Interest] << "\rApplying tree " << t << "...";
    trees_[t]->Apply(data, leafNodeIndices[t]);
  });
  (*progress)[Interest] << "STUCK HERE" << std::endl;
  return;
}

Following the trees_[t]->Apply() call in the code above, I was able to narrow the problem down to the recursive function below:

void ApplyNode(
  int nodeIndex,
  const IDataPointCollection& data,
  std::vector<unsigned int>& dataIndices,
  int i0,
  int i1,
  std::vector<int>& leafNodeIndices,
  std::vector<float>& responses_)
{
  std::cout<<"applying node"<<std::endl;
  assert(nodes_[nodeIndex].IsNull()==false);
  Node<F,S>& node = nodes_[nodeIndex];
  if (node.IsLeaf())
  {
    for (int i = i0; i < i1; i++)
      leafNodeIndices[dataIndices[i]] = nodeIndex;
    return;
  }
  else if (i0 == i1)   // No samples left
    return;
  else 
  {
    for (int i = i0; i < i1; i++)
      responses_[i] = node.Feature.GetResponse(data, dataIndices[i]);
    int ii = Partition(responses_, dataIndices, i0, i1, node.Threshold);
    // Recurse for child nodes.
    ApplyNode(nodeIndex * 2 + 1, data, dataIndices, i0, ii, leafNodeIndices, responses_);
    ApplyNode(nodeIndex * 2 + 2, data, dataIndices, ii, i1, leafNodeIndices, responses_);
    return;
  }
}

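Each call to the recursive function takes a different amount of time, depending on what node.Feature.GetResponse() computes. If I make the computation time the same for all recursive calls (by changing GetResponse()), the code runs smoothly.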

float AxisAlignedFeatureResponse::GetResponse(const IDataPointCollection& data, int index) const {
    double retArg;
    // retrieve DataManager object
    const DataManager& concreteData = (const DataManager&)(data);
    // retrieve data point at index
    DataPoint currDataPoint         = concreteData.getDataPoint(index);

    // get coordinates of data point
    Coordinate currCoordinates      = currDataPoint.getOrigPos();

    // get intensity image of the respective data point
    int imgIndex    = currDataPoint.getImageIndex();
    Image currImg   = concreteData.getImage(imgIndex);
    Image currFeatureImg = concreteData.getFeatureImage(imgIndex);
    // return respective feature
    int featureNumber = (int)(this->axis*(double)concreteData.getNumberOfFeatures());    
    if(featureNumber>=concreteData.getNumberOfFeatures()){
        cout<<"warning! trying to reach a feature that is not there!"<<endl;
        featureNumber=concreteData.getNumberOfFeatures()-1;
    }
    std::vector<Coordinate> feature = concreteData.getFeature(featureNumber);
    Coordinate tmp=currCoordinates+feature[0];
    if(feature[1].x == 0) {
        retArg = currCoordinates.x*feature[0].x+currCoordinates.y*feature[0].y+currCoordinates.z*feature[0].z;
        //retArg = 0;      //DOING THIS runs the code smoothly
    }
    else if(feature[1].x == 2) {
        retArg = currFeatureImg.getValue(feature[0]);
    } else {
        retArg = currImg.mean(tmp,feature[1]);
    }
    return (float)(retArg);
    //return (float) 0;
}

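This looks like a job for Valgrind. Unless your program terminates abruptly, I can't see any reason for a memory leak. Memory leaks are usually caused by heap variables that are never freed (a "new" with no matching "delete").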
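
To make the "new" without "delete" pattern concrete, here is a minimal sketch. The Buffer type and both functions are hypothetical and are not taken from the code in the question; the live counter exists only so that the leak can be observed.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type that counts live instances so a leak becomes visible.
struct Buffer {
    static int live;
    std::vector<float> values;
    explicit Buffer(std::size_t n) : values(n, 0.0f) { ++live; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Leaks: the object created with new is never deleted.
// n must be at least 1.
float leakyFirstValue(std::size_t n) {
    Buffer* b = new Buffer(n);
    return b->values[0];  // the pointer goes away, the heap object does not
}

// Does not leak: unique_ptr deletes the object when it goes out of scope.
// n must be at least 1.
float safeFirstValue(std::size_t n) {
    std::unique_ptr<Buffer> b(new Buffer(n));
    return b->values[0];
}
```

After a call to safeFirstValue the counter is back at its previous value, while every call to leakyFirstValue leaves one more Buffer alive. In a function that runs once per data point or per tree node, that kind of growth adds up quickly.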

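Valgrind will give you the culprit and its line number. You can then set a breakpoint there and step through with gdb to see what is actually happening. It might help to post the verbose output of running "valgrind -v your_program". Don't forget to compile with the -g option so that valgrind and gdb get full debugging data.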
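
If you want to see what such a report looks like before running it on the real code, a deliberately leaky recursive function is enough. This is only a sketch: leakPerCall is a hypothetical function, not part of the random forest code, and the file name demo.cpp below is likewise assumed.

```cpp
#include <cstddef>

// Hypothetical recursive function: allocates a scratch array on every call
// and never frees it, so each level of recursion leaks one block.
// Returns the number of blocks leaked.
std::size_t leakPerCall(int depth) {
    if (depth <= 0) return 0;
    float* scratch = new float[1024];  // never deleted: deliberate leak
    scratch[0] = static_cast<float>(depth);
    return 1 + leakPerCall(depth - 1);
}
```

Call leakPerCall(3) from a small main, build it with "g++ -g -O0 demo.cpp -o demo", and run "valgrind --leak-check=full ./demo". The leak summary should then attribute the lost blocks to the line containing new, with the recursive call chain in the stack trace.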
