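Summary

I recently ran into a strange issue involving LTO and -ffast-math: my "pow" calls (from cmath) return different results depending on whether -flto is used.

Environment: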

$ g++ --version
g++ (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ ll /lib64/libc.so.6
lrwxrwxrwx 1 root root 12 Sep  3  2019 /lib64/libc.so.6 -> libc-2.17.so
$ ll /lib64/libm.so.6
lrwxrwxrwx 1 root root 12 Sep  3  2019 /lib64/libm.so.6 -> libm-2.17.so
$ cat /etc/redhat-release 
CentOS Linux release 7.5.1804 (Core)

Minimal example

Code

fixed.hxx

#include <cstdint>
double Power10f(const int16_t power);

fixed.cxx

#include "fixed.hxx"
#include <cmath>
double Power10f(const int16_t power)
{
    return pow(10.0, (double) power);
}

test.cxx

#include <iostream>
#include <cmath>
#include <cstdlib>  // atoi
#include <iomanip>
#include <cstdint>
#include "fixed.hxx"
int main(int argc, char** argv)
{
    if (argc >= 3) {
        // argv[1]: integer value, argv[2]: power of ten to scale it by
        int64_t value = (int64_t)atoi(argv[1]);
        int16_t power = (int16_t)atoi(argv[2]);
        double x = Power10f(power);
        std::cout.precision(17);
        std::cout << std::scientific << x << std::endl;
        std::cout << std::scientific << (double)value * x << std::endl;
        return 0;
    }
    return 1;
}

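Compile & Run

Compiling with -ffast-math gives different results depending on whether -flto is also used.

With -flto, the binary ends up calling __pow_finite and prints the "accurate" result: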

$ g++ -O3 -DNDEBUG -ffast-math -std=c++17 -flto  -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG   -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000000e+20
8.10000000000000000e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
400930:       0f bf ff                movswl %di,%edi
400933:       66 0f ef c9             pxor   %xmm1,%xmm1
400937:       f2 0f 10 05 99 00 00    movsd  0x99(%rip),%xmm0        # 4009d8 <_IO_stdin_used+0x8>
40093e:       00 
40093f:       f2 0f 2a cf             cvtsi2sd %edi,%xmm1
400943:       e9 d8 fd ff ff          jmpq   400720 <__pow_finite@plt>
400948:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
40094f:       00
...

Without -flto, the binary ends up calling __exp_finite (an optimization enabled by -ffast-math, if I guess correctly) and prints the "inaccurate" result:

$ g++ -O3 -DNDEBUG -ffast-math -std=c++17  -o fixed.cxx.o -c fixed.cxx
$ g++ -O3 -DNDEBUG   -o fdtest fixed.cxx.o test.cxx
$ ./fdtest 81 20
1.00000000000000786e+20
8.10000000000006396e+21
$ objdump -DC fdtest > fdtest.dump
$ cat fdtest.dump
...
0000000000400930 <Power10f(short)>:
400930:       0f bf ff                movswl %di,%edi
400933:       66 0f ef c0             pxor   %xmm0,%xmm0
400937:       f2 0f 2a c7             cvtsi2sd %edi,%xmm0
40093b:       f2 0f 59 05 95 00 00    mulsd  0x95(%rip),%xmm0        # 4009d8 <_IO_stdin_used+0x8>
400942:       00 
400943:       e9 88 fd ff ff          jmpq   4006d0 <__exp_finite@plt>
400948:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
40094f:       00
...
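The second disassembly (one multiply by a constant, then a tail call to __exp_finite) looks like the rewrite pow(10, p) = exp(p * ln 10). The sketch below does that rewrite by hand so the two paths can be compared without any special flags. The names Power10Direct, Power10ViaExp and RelativeDifference are made up for this illustration and are not part of the original code; whether the result matches GCC's output bit for bit depends on the libm in use.

```cpp
#include <cmath>

// Direct library call, equivalent to the question's Power10f.
double Power10Direct(int power)
{
    return std::pow(10.0, static_cast<double>(power));
}

// Hand-written version of the rewrite: 10^p = exp(p * ln 10).
// The product p * ln(10) is rounded to a double before exp sees it,
// and exp turns that absolute error into a relative error of the
// same size in the result.
double Power10ViaExp(int power)
{
    const double ln10 = std::log(10.0);
    return std::exp(static_cast<double>(power) * ln10);
}

// Relative difference between a and a nonzero reference value.
double RelativeDifference(double value, double reference)
{
    return std::fabs(value - reference) / std::fabs(reference);
}
```

For power = 20 the argument passed to exp is about 46.05, where a double carries an absolute rounding error on the order of 1e-15 to 1e-14. That is the same order of magnitude as the relative error in the 1.00000000000000786e+20 output above, so the two paths agree to roughly 14 significant digits rather than exactly.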

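Question

Is the behavior shown above expected, or is something wrong with my code that causes this unexpected behavior?

Update

The same result can also be observed on some other platforms (for example, Arch Linux with g++ 12.1 and glibc 2.35).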

man gcc:

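To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link. It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time. For example: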
gcc -c -O2 -flto foo.c
gcc -c -O2 -flto bar.c
gcc -o myprog -flto -O2 foo.o bar.o

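-ffast-math allows the compiler to be inconsistent for any reason. Even changing nominally unrelated code in the same function can easily make pow return a different result, because a different optimization strategy gets chosen. -flto changes a great deal about how and when optimization happens, so it leaves plenty of room for that kind of variation.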
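To put a number on how far apart two such results are, one option is to count the representable doubles between them (the distance in ULPs). This is a minimal sketch; UlpDistance is a hypothetical helper name, and it assumes both inputs are finite and positive, which holds for the values printed in the question.

```cpp
#include <cstdint>
#include <cstring>

// Number of representable doubles between a and b.
// Assumes both values are finite and positive, so their IEEE 754 bit
// patterns are ordered the same way as the values themselves.
std::int64_t UlpDistance(double a, double b)
{
    std::int64_t bitsA;
    std::int64_t bitsB;
    std::memcpy(&bitsA, &a, sizeof a);  // well-defined way to read the bits
    std::memcpy(&bitsB, &b, sizeof b);
    return bitsA > bitsB ? bitsA - bitsB : bitsB - bitsA;
}
```

Applied to the two outputs reported in the question, UlpDistance(1.0e20, 1.00000000000000786e+20) comes out to 48: small in relative terms, but far more than the last-bit difference one might expect from a library call.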

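If you care about numerical precision, numerical consistency, or numerics in general, do not use -ffast-math. The transformations it performs are usually available to you as a programmer, and if you apply them yourself you can rely on them being consistent.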
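As one illustration of doing the work yourself, here is a sketch of a power-of-ten helper that uses only IEEE multiplication and division, so its result does not depend on which libm routine the optimizer picks. Power10Manual is a hypothetical name, not something from the question, and the sketch assumes the file is compiled without -ffast-math.

```cpp
#include <cstdint>

// Hypothetical replacement for Power10f: repeated multiplication by 10.
// Every intermediate product up to 1e22 is exactly representable as a
// double, so results for 0 <= power <= 22 are exact. Larger magnitudes
// accumulate rounding error and eventually overflow to infinity.
double Power10Manual(std::int16_t power)
{
    const int magnitude = power < 0 ? -static_cast<int>(power)
                                    : static_cast<int>(power);
    double result = 1.0;
    for (int i = 0; i < magnitude; ++i) {
        result *= 10.0;
    }
    // Negative powers take one correctly rounded division at the end.
    return power < 0 ? 1.0 / result : result;
}
```

The trade-off is a loop instead of a single library call, and very large negative exponents flush to zero instead of producing subnormal values, so this only fits cases where the exponent range is known to be small.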
