How to correctly handle floating point with an OpenMP parallel for



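After several hours of debugging and trying small-scale examples to find out what I was doing wrong, I finally had one of my theories confirmed: floating-point rounding error. Floating-point arithmetic on a computer is not associative, so the order in which the partial sums of a reduction are combined can change the result.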
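To make the non-associativity concrete, here is a minimal sketch that does not use OpenMP at all. The function names sum_in_order and sum_in_two_chunks are hypothetical helpers written for this illustration; the second one only imitates what a two-thread reduction does (each half is summed separately, then the partial sums are combined).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum left to right, the way a serial loop accumulates.
double sum_in_order(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) {
        s += x;
    }
    return s;
}

// Imitate a two-thread reduction: sum each half on its own,
// then combine the two partial sums at the end.
double sum_in_two_chunks(const std::vector<double>& v) {
    assert(v.size() >= 2);  // need at least one element per chunk
    const std::size_t mid = v.size() / 2;
    double first = 0.0;
    double second = 0.0;
    for (std::size_t i = 0; i < mid; ++i) {
        first += v[i];
    }
    for (std::size_t i = mid; i < v.size(); ++i) {
        second += v[i];
    }
    return first + second;
}
```

With the input {1e20, 1.0, -1e20, 1.0} and IEEE 754 doubles, sum_in_order returns 1 while sum_in_two_chunks returns 0: in the chunked version each 1.0 is absorbed by a value of magnitude 1e20 before the large terms cancel. The same numbers, grouped differently, give different sums.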

How can I reduce the error?

Code:

double energy = 0;
double contrast = 0;
double homogeneity = 0;
double entropy = 0;
double correlation = 0;
double shade = 0;
double prominence = 0;
double glcmMean = 0;
double sigma = 0;
double squaredVarianceIntensity = 0;
double A = 0;
double B = 0;

for(int c = 0; c < normalizedGlcm.cols; c++){
    #pragma omp parallel for reduction(+:homogeneity,energy,contrast,entropy,glcmMean)
    for(int r = 0; r < normalizedGlcm.rows; r++){
        double pij = normalizedGlcm.at<double>(r,c,0);
        double intensity = (double)img.at<uchar>(col,row,0);
        if(pij != 0){
            homogeneity += pij/(1.0+((c-r)*(c-r)));
            energy += pij * pij;
            contrast += (c-r)*(c-r)*pij;
            entropy += -(log(pij)*pij); // pij will never be under 0
            glcmMean += pij * intensity;
        }
    }
}

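After this bit there are more loops and some other calculations that use the glcmMean variable. So far, glcmMean is the only variable for which I get errors. Example errors: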

serial      vs parallel
1.66905e+28 vs 1.55964e+30
4.09033e+28 vs 3.62704e+30
8.38877e+30 vs 3.35551e+31

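Based on the information in the comments, you can simply try the code below to accumulate glcmMean, so that the values being added together are of the same order of magnitude as in the initial code. This assumes that normalizedGlcm.cols and normalizedGlcm.rows are relatively close (e.g. not 2 and 2000).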

double energy = 0;
double contrast = 0;
double homogeneity = 0;
double entropy = 0;
double correlation = 0;
double shade = 0;
double prominence = 0;
double glcmMean = 0;
double sigma = 0;
double squaredVarianceIntensity = 0;
double A = 0;
double B = 0;
for(int c = 0; c < normalizedGlcm.cols; c++){
    // Per-column partial sums: each column is reduced on its own,
    // then added to the running totals after the parallel loop.
    double local_homogeneity = 0;
    double local_energy = 0;
    double local_contrast = 0;
    double local_entropy = 0;
    double local_glcmMean = 0;
    #pragma omp parallel for reduction(+:local_homogeneity,local_energy,local_contrast,local_entropy,local_glcmMean)
    for(int r = 0; r < normalizedGlcm.rows; r++){
        double pij = normalizedGlcm.at<double>(r,c,0);
        double intensity = (double)img.at<uchar>(col,row,0);
        if(pij != 0){
            local_homogeneity += pij/(1.0+((c-r)*(c-r)));
            local_energy += pij * pij;
            local_contrast += (c-r)*(c-r)*pij;
            local_entropy += -(log(pij)*pij); // pij will never be under 0
            local_glcmMean += pij * intensity;
        }
    }
    homogeneity += local_homogeneity;
    energy += local_energy;
    contrast += local_contrast;
    entropy += local_entropy;
    glcmMean += local_glcmMean;
}

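If the problem is due to floating-point saturation or imprecision, this should greatly improve the accuracy of the results, especially in the sequential version.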
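If per-column partial sums are still not accurate enough, a different technique worth trying is compensated (Kahan) summation. This is not what the code above does; it is only a serial sketch under the assumption of IEEE 754 doubles without extended-precision intermediates, and naive_sum, kahan_sum and make_test_values are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain accumulation, for comparison.
double naive_sum(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) {
        sum += x;
    }
    return sum;
}

// Kahan (compensated) summation: comp tracks the low-order bits
// that were lost when a small term was added to a large running sum.
// Note: flags such as -ffast-math allow the compiler to reassociate
// these operations and can remove the compensation entirely.
double kahan_sum(const std::vector<double>& v) {
    double sum = 0.0;
    double comp = 0.0;
    for (double x : v) {
        double y = x - comp;    // apply the correction from the previous step
        double t = sum + y;     // low-order bits of y may be lost here
        comp = (t - sum) - y;   // recover what was lost (with opposite sign)
        sum = t;
    }
    return sum;
}

// Illustrative input: one large term followed by n - 1 tiny ones.
std::vector<double> make_test_values(std::size_t n) {
    assert(n > 0);
    std::vector<double> v(n, 1e-16);
    v[0] = 1.0;
    return v;
}
```

For make_test_values(1000001) the exact sum is about 1 + 1e-10. The naive loop never moves away from 1.0 because each 1e-16 is below half the spacing of doubles near 1.0, whereas the compensated loop gets close to the exact value. In an OpenMP setting, one possible design is to let each thread keep its own sum and compensation term and combine the per-thread results at the end.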
