使用标准 C 数学库精确计算朗伯 W 函数的主分支



Lambert W函数是f(w)=我们w的反函数。它是一个多值函数,在复数上具有无限多个分支,但在实数上只有两个分支,用 W0W-1表示。W0被视为主分支,具有输入 域 [-1/e, ∞] 和 W-1 的输入域为 (-1/e, 0)。相应的实现通常称为lambert_w0()lambert_wm1().

该函数的近亲首先由伦纳德·欧拉[1]在跟进约翰·海因里希·兰伯特[2]的工作时确定。欧拉研究了超越方程 x α-x β = (α - β) v x α+β 的解,并在此过程中考虑了一个简化的情况 ln x= v xα。在此过程中,他引入了一个辅助函数,其以下序列扩展在零附近:

y = 1 + (2 1/(1·2))u + (3 2/(1·2·3))u 2 + (4 3/(1·2·3·4))u3+ (54/((1·2... ·5))u4+ ...

在现代术语中,这个函数(未被高斯命名)表示W(-x)/x,lnx= v xα的解是 x=(-W(-αv)/-αv)(1/α)。

虽然Lambert W函数偶尔出现在文献中,例如[3]直到1990年代Robert Corless的开创性工作,它才被命名并被认为是一个重要的构建块,例如[4]。随后,通过持续的研究,朗伯W函数对数学和物理科学的适用性得到了扩展,[5]中给出了一些例子。

Lambert W 函数目前不是 ISO-C 标准数学库的一部分,这里似乎也没有任何立即添加它的计划。 *如何使用 ISO-C 标准数学库准确实现 Lambert W 函数的主分支 W0

忠实全面的实施可能过于雄心勃勃,但保持 4 ulp 误差界限(由英特尔数学库的 LA 配置文件选择)似乎是可以实现和可取的。可以假定支持 IEEE-754 (2008) 二进制浮点运算,并支持可通过fma()fmaf()标准库函数访问的融合乘加 (FMA) 运算。


[1] Leonard Euler, "De serie Lambertina plurimisque eius insignibus proprietatibus," (On Lambert's Series and its Many Unique properties)Acta Academiae Scientiarum Imperialis Petropolitanae pro Anno MDCCLXXIX, Tomus III, Pars II, (Proceedings of the ImperialAcademy of Sciences of the Year of the Year 1779, Volume 3, part 2, Jul. - Dec.), 圣彼得堡: 科学院 1783, 第 29-51 页(在慕尼黑巴伐利亚州立图书馆在线扫描)

[2] Johann Heinrich Lambert, "Observationes variae in mathesin puram" (各種觀察純數學)Acta Helveticae physico-mathematico-anatomico-botanico-medica, Vol. 3, Basel: J. R. Imhof 1758, pp. 128-168 (在生物多样性遗产图书馆在线扫描)

[3] F.N. Fritsch, R.E. Shafer, and W.P. Crowley, "Algorithm 443: Solution of the Transcendental Equationwex=x",Communications of the ACM, Vol. 16, No. 2, February 1973, pp. 123-124.

[4] R.M. Corless等人,"论朗伯特W函数",计算数学进展,第5卷,第1期,1996年12月,第329-359页

[5] Iordanis Kesisoglou,Garima Singh和Michael Nikolaou,"Lambert函数应该在工程数学工具箱中",计算机与化学工程,148,May 2021

从文献中可以清楚地看出,在实数上计算朗伯W函数的最常见方法是通过函数迭代。二阶牛顿方法是这些方案中最简单的:

wi+1= w i - (w i exp(wi) - x)/(exp (w i) + w iexp(wi)).

许多文献更喜欢高阶方法,例如Halley,Fritsch和Schroeder的方法。在探索性工作中,我发现当以有限精度浮点运算执行时,这些高阶迭代的数值性质并不像阶数所暗示的那样有利。举个特别的例子,三次牛顿迭代在准确性方面始终优于两次哈雷迭代。出于这个原因,我决定将牛顿迭代作为主要构建块,并使用exp()的自定义实现,这些实现只需要提供在 IEEE-754 术语中表示为正正态的结果即可恢复一些性能。

我进一步发现,对于大操作数,牛顿迭代只会缓慢收敛,特别是当起始近似不是很准确时。由于高迭代计数不利于良好的性能,我四处寻找替代方案,并在 Iacono 和 Boyd[*]的基于对数的迭代方案中发现了一个更好的候选者,该方案也具有二阶收敛性:

w i+1 = (w i/(1 + w i))* (1 + log (x/wi)

Lambert W 函数的许多实现似乎对输入域的不同部分使用不同的起始近似值。我更喜欢在整个输入域中使用单个起始近似,以便将来进行矢量化实现。

幸运的是,Iacono和Boyd也提供了一个通用的起始近似,该近似适用于W0的整个输入域,虽然没有完全实现其承诺,但表现非常好。我针对处理更窄输入域的单精度实现对此进行了微调,使用优化搜索启发式方法来实现最佳精度。我还采用了log()的自定义实现,这些实现只需要处理正正态的输入。

在起始近似和牛顿迭代中都必须小心,以避免中间计算中的溢出和下溢。这可以通过按 2 的适当幂进行缩放来轻松且廉价地实现。

虽然生成的迭代方案通常提供准确的结果,但会发生许多ulp的错误 对于接近零的参数和接近-1/e的参数≈ -0.367879。我通过使用泰勒级数展开的前几个项来解决前一个问题:x- x 2 + (3/2)x3。W 0 ≈ √(1+ex)-1 在 [-1/e,0] 上的事实表明使用了t=√(x+1/e) 的最小最大多项式近似p(t),结果证明它在-1/e附近工作得很好。我用雷梅兹算法生成了这个近似值。

IEEE-754binary32映射到float,IEEE-754binary64映射到double所达到的精度完全在指定的误差范围内。正半平面的最大误差小于 1.5 ulps,负半平面的最大误差低于 2.7 ulps。单精度代码经过详尽测试,而双精度代码则使用十亿个随机参数进行测试。


[*] Roberto Iacono和John P. Boyd。 "朗伯 W 函数主实值分支的新近似。"计算数学进展,第43卷,第6期,2017年12月,第1403-1436页。

Lambert W0函数的单精度实现如下所示:

float expf_scale_pos_normal (float a, int scale);
float logf_pos_normal (float a);
/*
Compute the principal branch of the Lambert W function, W_0. The maximum
error in the positive half-plane is 1.49874 ulps and the maximum error in 
the negative half-plane is 2.56002 ulps
*/
float lambert_w0f (float z) 
{
const float em1_fact_0 = 0.625529587f; // exp(-1)_factor_0
const float em1_fact_1 = 0.588108778f; // exp(-1)_factor_1
const float qe1 = 2.71828183f / 4.0f; //  0x1.5bf0a8p-01 // exp(1)/4
float e, w, num, den, rden, redz, y, r;
if (isnan (z) || (z == INFINITY) || (z == 0.0f)) return z + z;
if (fabsf (z) < 1.220703125e-4f) return fmaf (-z, z, z); // 0x1.0p-13
redz = fmaf (em1_fact_0, em1_fact_1, z); // z + exp(-1)
if (redz < 0.0625f) { // expansion at -(exp(-1))
r = sqrtf (redz);
w =             -1.23046875f;  // -0x1.3b0000p+0
w = fmaf (w, r,  2.17185670f); //  0x1.15ff66p+1
w = fmaf (w, r, -2.19554094f); // -0x1.19077cp+1
w = fmaf (w, r,  1.92107077f); //  0x1.ebcb4cp+0
w = fmaf (w, r, -1.81141856f); // -0x1.cfb920p+0
w = fmaf (w, r,  2.33162979f); //  0x1.2a72d8p+1
w = fmaf (w, r, -1.00000000f); // -0x1.000000p+0
} else {
/* Compute initial approximation. Based on: Roberto Iacono and John 
Philip Boyd, "New approximations to the principal real-valued branch
of the Lambert W function", Advances in Computational Mathematics, 
Vol. 43, No. 6, December 2017, pp. 1403-1436
*/
y = fmaf (2.0f, sqrtf (fmaf (qe1, z, 0.25f)), 1.0f);
y = logf_pos_normal (fmaf (1.15262585f, y, -0.15262585f) / 
fmaf (0.45906518f, logf_pos_normal (y), 1.0f));
w = fmaf (2.0390625f, y, -1.0f);
/* perform Newton iterations to refine approximation to full accuracy */
for (int i = 0; i < 3; i++) {
e = expf_scale_pos_normal (w, -3); // 0.125f * expf (w);
num = fmaf (w, e, -0.125f * z);
den = fmaf (w, e, e);
rden = 1.0f / den;
w = fmaf (-num, rden, w);
}
}
return w;
}
float uint32_as_float (uint32_t a)
{
float r;
memcpy (&r, &a, sizeof r);
return r;
}
uint32_t float_as_uint32 (float a)
{
uint32_t r;
memcpy (&r, &a, sizeof r);
return r;
}
/* exp(a) * 2**scale; positive normal results only! Maximum error 0.86565 ulp */
float expf_scale_pos_normal (float a, int scale)
{
const float flt_int_cvt = 12582912.0f; // 0x1.8p23 
float f, r, j, t;
uint32_t i;
/* exp(a) = 2**i * exp(f); i = rintf (a / log(2)) */
j = fmaf (1.442695f, a, flt_int_cvt); // // 0x1.715476p0 // log2(e)
t = j - flt_int_cvt; 
f = fmaf (t, -6.93145752e-1f, a); // -0x1.62e400p-1  // log_2_hi 
f = fmaf (t, -1.42860677e-6f, f); // -0x1.7f7d1cp-20 // log_2_lo 
i = float_as_uint32 (j);
/* approximate r = exp(f) on interval [-log(2)/2, +log(2)/2] */
r =             1.37805939e-3f;  // 0x1.694000p-10
r = fmaf (r, f, 8.37312452e-3f); // 0x1.125edcp-7
r = fmaf (r, f, 4.16695364e-2f); // 0x1.555b5ap-5
r = fmaf (r, f, 1.66664720e-1f); // 0x1.555450p-3
r = fmaf (r, f, 4.99999851e-1f); // 0x1.fffff6p-2
r = fmaf (r, f, 1.00000000e+0f); // 0x1.000000p+0
r = fmaf (r, f, 1.00000000e+0f); // 0x1.000000p+0
/* exp(a) = 2**(i+scale) * r; */
r = uint32_as_float (((i + scale) << 23) + float_as_uint32 (r));
return r;
}
/* compute natural logarithm of positive normals; maximum error: 0.85089 ulp */
float logf_pos_normal (float a)
{
const float ln2 = 0.693147182f; // 0x1.62e430p-1 // log(2)
const float two_to_m23 = 1.19209290e-7f; // 0x1.0p-23
float m, r, s, t, i, f;
int32_t e;
/* log(a) = log(m * 2**i) = i * log(2) + log(m) */
e = (float_as_uint32 (a) - float_as_uint32 (0.666666667f)) & 0xff800000;
m = uint32_as_float (float_as_uint32 (a) - e);
i = (float)e * two_to_m23;
/* log(m) = log1p(f) */
f = m - 1.0f;
s = f * f;
/* compute log1p(f) for f in [-1/3, 1/3] */
r =             -0.130310059f;  // -0x1.0ae000p-3
t =              0.140869141f;  //  0x1.208000p-3
r = fmaf (r, s, -0.121483363f); // -0x1.f1988ap-4
t = fmaf (t, s,  0.139814854f); //  0x1.1e5740p-3
r = fmaf (r, s, -0.166846141f); // -0x1.55b36ep-3
t = fmaf (t, s,  0.200120345f); //  0x1.99d8b2p-3
r = fmaf (r, s, -0.249996200f); // -0x1.fffe02p-3
r = fmaf (t, f, r);
r = fmaf (r, f,  0.333331972f); //  0x1.5554fap-2
r = fmaf (r, f, -0.500000000f); // -0x1.000000p-1
r = fmaf (r, s, f);
/* log(a) = i * log(2) + log(m) */
r = fmaf (i, ln2, r);
return r;
}

双精度实现在结构上等效于单精度实现,不同之处在于它使用了 Iacono-Boyd 迭代方案:

double exp_scale_pos_normal (double a, int scale);
double log_pos_normal (double a);
/* Compute the principal branch of the Lambert W function, W_0. Maximum error:
positive half-plane: 1.49210 ulp
negative half-plane: 2.67824 ulp
*/
double lambert_w0 (double z) 
{
const double em1_fact_0 = 0.57086272525975246; // 0x1.24481e7efdfcep-1 // exp(-1)_factor_0
const double em1_fact_1 = 0.64442715366299452; // 0x1.49f25b1b461b7p-1 // exp(-1)_factor_1
const double qe1 = 2.7182818284590452 * 0.25;  // 0x1.5bf0a8b145769p-1 // exp(1)/4
double e, r, t, w, y, num, den, rden, redz;
int i;

if (isnan (z) || (z == INFINITY) || (z == 0.0)) return z + z;
if (fabs (z) < 1.9073486328125e-6) return fma (fma (1.5, z, -1.) * z, z, z);
redz = fma (em1_fact_0, em1_fact_1, z); // z + exp(-1)
if (redz < 0.01025390625) { // expansion at -(exp(-1))
r = sqrt (redz);
w =            -7.8466654751155138;  // -0x1.f62fc463917ffp+2
w = fma (w, r, 10.0241581340373877); //  0x1.40c5e74773ef5p+3
w = fma (w, r, -8.1029379749359691); // -0x1.034b44947bba0p+3
w = fma (w, r,  5.8322883145113726); //  0x1.75443634ead5fp+2
w = fma (w, r, -4.1738796362609882); // -0x1.0b20d80dcb9acp+2
w = fma (w, r,  3.0668053943936471); //  0x1.888d14440efd0p+1
w = fma (w, r, -2.3535499689514934); // -0x1.2d41201913016p+1
w = fma (w, r,  1.9366310979331112); //  0x1.efc70e3e0a0eap+0
w = fma (w, r, -1.8121878855270763); // -0x1.cfeb8b968bd2cp+0
w = fma (w, r,  2.3316439815968506); //  0x1.2a734f5b6fd56p+1
w = fma (w, r, -1.0000000000000000); // -0x1.0000000000000p+0
return w;
}
/* Roberto Iacono and John Philip Boyd, "New approximations to the 
principal real-valued branch of the Lambert W function", Advances 
in Computational Mathematics, Vol. 43, No. 6, December 2017, 
pp. 1403-1436
*/
y = fma (2.0, sqrt (fma (qe1, z, 0.25)), 1.0);
y = log_pos_normal (fma (1.14956131, y, -0.14956131) / 
fma (0.4549574, log_pos_normal (y), 1.0));
w = fma (2.036, y, -1.0);
/* Use iteration scheme w = (w / (1 + w)) * (1 + log (z / w) from
Roberto Iacono and John Philip Boyd, "New approximations to the 
principal real-valued branch of the Lambert W function", Advances
in Computational Mathematics, Vol. 43, No. 6, December 2017, pp. 
1403-1436
*/
for (i = 0; i < 3; i++) {
t = w / (1.0 + w);
w = fma (log_pos_normal (z / w), t, t);
}
/* Fine tune approximation with a single Newton iteration */
e = exp_scale_pos_normal (w, -3); // 0.125 * exp (w)
num = fma (w, e, -0.125 *z);
den = fma (w, e, e);
rden = 1.0 / den;
w = fma (-num, rden, w);
return w;
}
int double2hiint (double a)
{
unsigned long long int t;
memcpy (&t, &a, sizeof t);
return (int)(t >> 32);
}
int double2loint (double a)
{
unsigned long long int t;
memcpy (&t, &a, sizeof t);
return (int)(unsigned int)t;
}
double hiloint2double (int hi, int lo) 
{
double r;
unsigned long long int t;
t = ((unsigned long long int)(unsigned int)hi << 32) | (unsigned int)lo;
memcpy (&r, &t, sizeof r);
return r;
}
/* exp(a) * 2**scale; pos. normal results only! Max. err. found: 0.89028 ulp */
double exp_scale_pos_normal (double a, int scale)
{
const double ln2_hi = 6.9314718055829871e-01; // 0x1.62e42fefa00000p-01
const double ln2_lo = 1.6465949582897082e-12; // 0x1.cf79abc9e3b3a0p-40
const double l2e = 1.4426950408889634; // 0x1.71547652b82fe0p+00 // log2(e)
const double cvt = 6755399441055744.0; // 0x1.80000000000000p+52 // 3*2**51
double f, r;
int i;
/* exp(a) = exp(i + f); i = rint (a / log(2)) */
r = fma (l2e, a, cvt);
i = double2loint (r);
r = r - cvt;
f = fma (r, -ln2_hi, a);
f = fma (r, -ln2_lo, f);
/* approximate r = exp(f) on interval [-log(2)/2,+log(2)/2] */
r =            2.5022018235176802e-8;  // 0x1.ade0000000000p-26
r = fma (r, f, 2.7630903497145818e-7); // 0x1.28af3fcbbf09bp-22
r = fma (r, f, 2.7557514543490574e-6); // 0x1.71dee623774fap-19
r = fma (r, f, 2.4801491039409158e-5); // 0x1.a01997c8b50d7p-16
r = fma (r, f, 1.9841269589068419e-4); // 0x1.a01a01475db8cp-13
r = fma (r, f, 1.3888888945916566e-3); // 0x1.6c16c1852b805p-10
r = fma (r, f, 8.3333333334557735e-3); // 0x1.11111111224c7p-7
r = fma (r, f, 4.1666666666519782e-2); // 0x1.55555555502a5p-5
r = fma (r, f, 1.6666666666666477e-1); // 0x1.5555555555511p-3
r = fma (r, f, 5.0000000000000122e-1); // 0x1.000000000000bp-1
r = fma (r, f, 1.0000000000000000e+0); // 0x1.0000000000000p+0
r = fma (r, f, 1.0000000000000000e+0); // 0x1.0000000000000p+0
/* exp(a) = 2**(i+scale) * r */
r = hiloint2double (double2hiint (r) + ((i + scale) << 20), 
double2loint (r));
return r;
}
/* compute natural logarithm of positive normals; max. err. found: 0.86902 ulp*/
double log_pos_normal (double a)
{
const double ln2_hi = 6.9314718055994529e-01; // 0x1.62e42fefa39efp-01
const double ln2_lo = 2.3190468138462996e-17; // 0x1.abc9e3b39803fp-56
double m, r, i, s, t, p, q;
int e;
/* log(a) = log(m * 2**i) = i * log(2) + log(m) */
e = (double2hiint (a) - double2hiint (0.70703125)) & 0xfff00000;
m = hiloint2double (double2hiint (a) - e, double2loint (a));
t = hiloint2double (0x41f00000, 0x80000000 ^ e);
i = t - (hiloint2double (0x41f00000, 0x80000000));
/* m now in [181/256, 362/256]. Compute q = (m-1) / (m+1) */
p = m + 1.0;
r = 1.0 / p;
q = fma (m, r, -r);
m = m - 1.0;
/* compute (2*atanh(q)/q-2*q) as p(q**2), q in [-75/437, 53/309] */
s = q * q;
r =            1.4794533702196025e-1;  // 0x1.2efdf700d7135p-3
r = fma (r, s, 1.5314187748152339e-1); // 0x1.39a272db730f7p-3
r = fma (r, s, 1.8183559141306990e-1); // 0x1.746637f2f191bp-3
r = fma (r, s, 2.2222198669309609e-1); // 0x1.c71c522a64577p-3
r = fma (r, s, 2.8571428741489319e-1); // 0x1.24924941c9a2fp-2
r = fma (r, s, 3.9999999999418523e-1); // 0x1.999999998006cp-2
r = fma (r, s, 6.6666666666667340e-1); // 0x1.5555555555592p-1
r = r * s;
/* log(a) = 2*atanh(q) + i*log(2) = ln2_lo*i + p(q**2)*q + 2q + ln2_hi * i.
Use K.C. Ng's trick to improve the accuracy of the computation, like so:
p(q**2)*q + 2q = p(q**2)*q + q*t - t + m, where t = m**2/2.
*/
t = m * m * 0.5;
r = fma (q, t, fma (q, r, ln2_lo * i)) - t + m;
r = fma (ln2_hi, i, r);
return r;
}

相关内容

  • 没有找到相关文章

最新更新