为球体的体积加速代码(嵌套while循环)



我正在努力加快计算球体体积的代码(请参阅下面的代码(。球体的这个体积是通过计算小体积段dv并将它们相加为体积vol.而产生的

事实上,在我将计算应用于其他具有对称属性的类似球体的对象之前,这段代码只是一次健全性检查,因此我应该能够通过在小体积上计算并乘以最终结果来提高代码的速度。

将while(phid<=(360.0/adstep((和while(thetad<=(180.0/adsstep((中的360和180分别替换为180和90,您可以将所需的计算量四分之一,这意味着您可以简单地将最终体积乘以4.0。

如果我将phi设置为并将theta保持在180,这将使计算减半。但它不喜欢我把θ设置为90。

输出:

Phi 360, Theta 180
Actual Volume       Calculated Volume   % Difference
4.18879020478639053 4.18878971565348923 0.00001167718922403
Phi 180, Theta 180
4.18879020478639053 4.18878971565618219 0.00001167712493440
Phi 180, Theta 90
4.18879020478639053 4.18586538829648180 0.06987363946500515

你可以在上面看到,前两个计算几乎相同(我认为差异是由于精度误差(,而最后一个计算给出了明显不同的结果。嵌套循环会导致问题吗?

如果有任何帮助,我将不胜感激,因为我在研究中没有发现任何东西(谷歌和堆栈溢出(来描述我遇到的问题。

#include <iostream>
#include <iomanip>
#include <cmath>
using namespace std;
int main()
{
double thetar, phir, maxr, vol, dv, vol2, arstep, adstep, rad, rad3, thetad, phid, ntheta, nphi;
cout << fixed << setprecision(17); // Set output precision to defined number of decimal places. Note Double has up to 15 decimal place accuracy
vol=0.0;            // Initialise volume and set at zero
adstep=0.1;     // Steps to rotate angles in degrees
arstep=(adstep/180.0)*M_PI;     // Angle steps in radians
phid=1.0;           // Phi in degrees starting at adstep
maxr = 1.0;         // Radius of the sphere
// Loop to calculate volume
while (phid<=(360.0/adstep))            // Loop over Phi divided by adstep. This scales the loop to the desired number of calculations.
{
    phir=((phid*adstep)/180.0)*M_PI;        // Phi in radians
    thetad=1.0;                             // Theta in degrees, reset to initial adstep value
    while (thetad<=(180.0/adstep))          // Loop over Theta divided by adstep. Like Phi loop, this scales the loop to the desired number of calculations
    {
        thetar=((thetad*adstep)/180.0)*M_PI;    // Convert theta degrees to radians
        dv = ((maxr*maxr*maxr) * sin(thetar) * arstep * arstep) / 3.0;      // Volume of current segment
        vol += dv;      // Summing all the dv value together to generate a global volume
        thetad+=1.0;    // Increase theta (degrees) by a single step
    }
    phid+=1.0;      // Increase phi (degrees) by a single step
}
vol = vol*1.0; // Volume compensated for any reduction in phi and theta
rad3 = (3.0*vol)/(4.0*M_PI);    // volume equivalent radius^3
rad = pow(rad3,(1.0/3.0));      // volume equivalent radius
vol2 = (4.0/3.0)*M_PI*(maxr*maxr*maxr);     // Calculated volume of a sphere given initial maxr
// Diagnostic output
cout << vol2 << " " << vol << " " << ((vol2-vol)/vol)*100.0 << endl;
}

编辑:将phid和thetad的起始值更正为1.0

编辑2:我只是想为未来的观众更新一下,使用Kahan求和算法(https://en.wikipedia.org/wiki/Kahan_summation_algorithm)实际上已经否定了我所有的精度误差,因为我把一个小数字和一个大数字相加。还有其他方法,但这是最简单的方法之一,可以完成我需要的工作。对于子孙后代来说,这是从维基百科页面上截取的主题为:的伪代码示例

function KahanSum(input)
 var sum = 0.0
 var c = 0.0                  // A running compensation for lost low-order bits.
 for i = 1 to input.length do
    var y = input[i] - c     // So far, so good: c is zero.
    var t = sum + y          // Alas, sum is big, y small, so low-order digits of y are lost.
    c = (t - sum) - y // (t - sum) recovers the high-order part of y; subtracting y recovers -(low part of y)
    sum = t           // Algebraically, c should always be zero. Beware overly-aggressive optimizing compilers!
    // Next time around, the lost low part will be added to y in a fresh attempt.
 return sum

就速度而言,我怀疑(在没有对其进行分析的情况下(在弧度和度数之间转换以及计算所有sin s时浪费了大量时间。AFAICT、thetar在外循环的每次迭代中都通过相同的值进行循环,因此在主循环之前预计算一次sin(thetar)可能会更有效,在你的内循环中做一个简单的查找。

至于数值稳定性,随着voldv越来越大,随着时间的推移,你会开始失去越来越多的精度。原则上,如果你能把所有的dv存储在一个数组中,然后用分治法而不是线性传递求和,你会得到更好的结果。然后我再次计算(仅(6 480 000的总迭代,所以我认为double累加器(保持15-17个有效基数-10位(实际上可以处理丢失6-7位的数字,而不会有太大麻烦。

很可能,您的问题是:在需要之前退出循环1迭代。您不应该比较浮点数是否相等。一种快速解决方法是添加一个小常数,例如

而(thetad<(180.0/adstep(+1e-8(

这不是一个非常彻底的分析,但可能会让您深入了解错误的来源。在您的代码中,您正在累积3240000个浮点数的值。随着vol值的增加,dvvol之间的比率也在增加——在加法中,你会失去越来越多的精度。

将多个值累积为单个值(称为归约和(时,减少精度损失的标准方法是在块中执行加法:例如,您可以将每8个值相加并存储到一个数组中,然后将该数组的每8个数值相加,等等,直到只剩下一个值。这可能会给你带来更好的结果。

此外,值得考虑的是,您在球面上进行线性步进,因此不会对球体进行均匀采样。这可能会影响你的最终结果。均匀采样球体的一种方法是在方位角phi从0到360度之间采取线性步长,并在极角theta的范围内采取-1到1的acos。有关更详细的解释,请参见有关球体点拾取的链接。

首先,我认为,您的函数中有几个错误。我认为phidthetad都应该初始化为01.0

其次,通过减少浮点乘法的次数,您可以获得相当大的收益。

在下面的代码中,我将main函数的内容移动到volume1,并创建了一个函数volume2,其中包含稍微优化的代码。

#include <iostream>
#include <iomanip>
#include <cmath>
#include <ctime>
using namespace std;
void volume1(int numSteps)
{
   double thetar, phir, maxr, vol, dv, vol2, arstep, adstep, rad, rad3, thetad, phid, ntheta, nphi;
   cout << fixed << setprecision(17); // Set output precision to defined number of decimal places. Note Double has up to 15 decimal place accuracy
   vol=0.0;            // Initialise volume and set at zero
   adstep=360.0/numSteps;     // Steps to rotate angles in degrees
   arstep=(adstep/180.0)*M_PI;     // Angle steps in radians
   phid=1.0;            // Phi in degrees starting at adstep
   maxr = 1.0;         // Radius of the sphere
   // Loop to calculate volume
   while (phid<=(360.0/adstep))            // Loop over Phi divided by adstep. This scales the loop to the desired number of calculations.
   {
      phir=((phid*adstep)/180.0)*M_PI;        // Phi in radians
      thetad=1.0;                              // Theta in degrees, reset to initial adstep value
      while (thetad<=(180.0/adstep))          // Loop over Theta divided by adstep. Like Phi loop, this scales the loop to the desired number of calculations
      {
         thetar=((thetad*adstep)/180.0)*M_PI;    // Convert theta degrees to radians
         dv = ((maxr*maxr*maxr) * sin(thetar) * arstep * arstep) / 3.0;      // Volume of current segment
         vol += dv;      // Summing all the dv value together to generate a global volume
         thetad+=1.0;    // Increase theta (degrees) by a single step
      }
      phid+=1.0;      // Increase phi (degrees) by a single step
   }
   vol = vol*1.0; // Volume compensated for any reduction in phi and theta
   rad3 = (3.0*vol)/(4.0*M_PI);    // volume equivalent radius^3
   rad = pow(rad3,(1.0/3.0));      // volume equivalent radius
   vol2 = (4.0/3.0)*M_PI*(maxr*maxr*maxr);     // Calculated volume of a sphere given initial maxr
   // Diagnostic output
   cout << vol2 << " " << vol << " " << ((vol2-vol)/vol)*100.0 << endl << endl;
}
void volume2(int numSteps)
{
   double thetar, maxr, vol, vol2, arstep, adstep, rad, rad3, thetad, phid, ntheta, nphi;
   cout << fixed << setprecision(17); // Set output precision to defined number of decimal places. Note Double has up to 15 decimal place accuracy
   vol=0.0;            // Initialise volume and set at zero
   adstep = 360.0/numSteps;
   arstep=(adstep/180.0)*M_PI;     // Angle steps in radians
   maxr = 1.0;         // Radius of the sphere
   double maxRCube = maxr*maxr*maxr;
   double arStepSquare = arstep*arstep;
   double multiplier = maxRCube*arStepSquare/3.0;
   // Loop to calculate volume
   int step = 1;
   for ( ; step <= numSteps; ++step )
   {
      int numInnerSteps = numSteps/2;
      thetad = adstep;                              // Theta in degrees, reset to initial adstep value
      for ( int innerStep = 1; innerStep <= numInnerSteps; ++innerStep )
      {
         thetar = innerStep*arstep;
         vol += multiplier * sin(thetar);          // Volume of current segment
      }
   }
   vol = vol*1.0; // Volume compensated for any reduction in phi and theta
   rad3 = (3.0*vol)/(4.0*M_PI);    // volume equivalent radius^3
   rad = pow(rad3,(1.0/3.0));      // volume equivalent radius
   vol2 = (4.0/3.0)*M_PI*(maxr*maxr*maxr);     // Calculated volume of a sphere given initial maxr
   // Diagnostic output
   cout << vol2 << " " << vol << " " << ((vol2-vol)/vol)*100.0 << endl << endl;
}
int main()
{
   int numSteps = 3600;
   clock_t start = clock();
   volume1(numSteps);
   clock_t end1 = clock();
   volume2(numSteps);
   clock_t end2 = clock();
   std::cout << "CPU time used: " << 1000.0 * (end1-start) / CLOCKS_PER_SEC << " msn";
   std::cout << "CPU time used: " << 1000.0 * (end2-end1) / CLOCKS_PER_SEC << " msn";
}

我得到的输出,使用g++4.7.3:

4.18879020478639053 4.1876255889293564 0.027810887858111534.18879020478639053 4.18878914146923176 0.00002538483372773使用的CPU时间:639.000000000000000000毫秒使用的CPU时间:359.000000000000000000毫秒

这会让你提高大约44%。

最新更新