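After reading "Why is processing a sorted array faster than processing an unsorted array?", I added an extra test to the primary loop. This extra test seems to make the program faster.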
    #include <algorithm>
    #include <cstdlib>
    #include <ctime>
    #include <iostream>

    int main()
    {
        // Generate data
        const unsigned arraySize = 32768;
        int data[arraySize];
        for (unsigned c = 0; c < arraySize; ++c)
            data[c] = std::rand() % 256;

        // Don't sort the array
        // std::sort(data, data + arraySize);

        // Test
        clock_t start = clock();
        long long sum = 0;
        for (unsigned i = 0; i < 100000; ++i)
        {
            // Primary loop
            for (unsigned c = 0; c < arraySize; ++c)
            {
                if (data[c] >= 128)
                    sum += data[c];
                // With this additional test, execution becomes faster
                if (data[c] < 128)
                    sum += data[c];
            }
        }

        double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;
        std::cout << elapsedTime << std::endl;
        std::cout << "sum = " << sum << std::endl;
    }
I get about 4.2 seconds with the additional test and 18 seconds without it. Shouldn't the extra test make the program slower rather than faster?
Because of that particular additional test, this code:
    for (unsigned i = 0; i < 100000; ++i)
    {
        // Primary loop
        for (unsigned c = 0; c < arraySize; ++c)
        {
            if (data[c] >= 128)
                sum += data[c];
            // With this additional test, execution becomes faster
            if (data[c] < 128)
                sum += data[c];
        }
    }
effectively becomes this:
    for (unsigned i = 0; i < 100000; ++i)
    {
        // Primary loop
        for (unsigned c = 0; c < arraySize; ++c)
        {
            // Exactly one of the two conditions is guaranteed to be true
            // in each iteration (in your code), so the result is the same
            // as if there were no condition at all.
            sum += data[c];
        }
    }
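That is why it becomes faster.
It is precisely because of the unusual additional test and the identical bodies that the compiler is able to optimize the code and remove the if conditions. When you have only one if, the compiler cannot do that.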
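The equivalence can be checked with a minimal sketch. The helper names `sumTwoBranches` and `sumNoBranch` are hypothetical and not part of the original post. Because the two conditions are complementary and the bodies are identical, both functions return the same value for any input. Whether a given compiler actually merges the branches depends on the compiler and the optimization level.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the loop body from the question, with two
// complementary tests and identical bodies.
long long sumTwoBranches(const std::vector<int>& data)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
    {
        if (data[c] >= 128)
            sum += data[c];
        if (data[c] < 128)
            sum += data[c];
    }
    return sum;
}

// Hypothetical helper: the same computation with the conditions removed.
long long sumNoBranch(const std::vector<int>& data)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
        sum += data[c];
    return sum;
}
```

Comparing the generated assembly of the two functions (for example with `-O2 -S`) is one way to see whether your compiler treats them identically.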
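Try writing this:
    sum -= data[c]; // the body is not identical anymore!
in one of the if branches. I believe the compiler will then be unable to optimize the code this way, and it should emit slower machine code.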
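As a sketch of that experiment, here is a hypothetical helper `sumMixedBodies` (the name, and the choice of putting the subtraction in the second branch, are assumptions). The two bodies now differ, so the loop can no longer be reduced to a single unconditional `sum += data[c]`; the result depends on which side of 128 each element falls.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: same complementary tests, but the bodies differ,
// so the conditions cannot simply be dropped.
long long sumMixedBodies(const std::vector<int>& data)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
    {
        if (data[c] >= 128)
            sum += data[c];
        if (data[c] < 128)
            sum -= data[c]; // the body is not identical anymore
    }
    return sum;
}
```

Timing this variant inside the original benchmark is the way to confirm whether it really runs slower with your compiler; the sketch only shows that the conditions now affect the result.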
Note that the outer loop can be omitted entirely, although this does not depend much on the extra test:
    // Primary loop
    for (unsigned c = 0; c < arraySize; ++c)
    {
        sum += 100000 * data[c];
    }
Or this:
    // Primary loop
    for (unsigned c = 0; c < arraySize; ++c)
    {
        sum += data[c];
    }
    sum = 100000 * sum; // multiply once!
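A small sketch of why dropping the outer loop is valid, using hypothetical helpers `sumRepeated` and `sumScaled` (not from the original post): every pass over the array adds the same amount, so repeating the pass `passes` times equals one pass multiplied by `passes`, as long as the result fits in `long long`. With the values from the question (32768 elements below 256, 100000 passes) the total stays below 10^12, so it fits comfortably.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the original shape, with an outer loop that
// repeats the same pass over the data.
long long sumRepeated(const std::vector<int>& data, unsigned passes)
{
    long long sum = 0;
    for (unsigned i = 0; i < passes; ++i)
        for (std::size_t c = 0; c < data.size(); ++c)
            sum += data[c];
    return sum;
}

// Hypothetical helper: a single pass, then multiply once.
long long sumScaled(const std::vector<int>& data, unsigned passes)
{
    long long sum = 0;
    for (std::size_t c = 0; c < data.size(); ++c)
        sum += data[c];
    return static_cast<long long>(passes) * sum;
}
```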