Timing a process in C with clock(), times(), clock_gettime(), and the rdtsc intrinsic returns confusing results



I am timing a piece of C code and comparing the output of times(), clock(), and clock_gettime(), while also using the rdtsc intrinsic to count elapsed clock cycles.

I noticed that clock() always gives me a value about 0.02 seconds larger than what times() returns (after summing user + system CPU time).

When I call clock() at program start, it already reports ~2k clocks (in CLOCKS_PER_SEC units) rather than 0, as shown in the output below.

The OS is Ubuntu 20.04 LTS and the processor speed is 3.50 GHz.

Sample output (the first two lines of each run are printed at program start, the last two at the end):

clock() returns 2593 clocks-per sec (0.00 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
clock() returns 8616 clocks-per sec (0.01 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
clock() returns 2448 clocks-per sec (0.00 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
clock() returns 8403 clocks-per sec (0.01 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
clock() returns 2541 clocks-per sec (0.00 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
clock() returns 5915366 clocks-per sec (5.92 secs)
times() yields: user CPU: 5.49; system CPU: 0.41

Adding rdtsc and clock_gettime() to this comparison:

clock() returns 2341 clocks-per sec (0.00 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
resolution:          0.000000001
clockTtime:     143071.541191700
clock() returns 11560076 clocks-per sec (11.56 secs)
times() yields: user CPU: 10.77; system CPU: 0.78
resolution:          0.000000001
clockTtime:     143083.561227466
RTDSC COUNTER: 41973477274 CPU cycles
clock() returns 2325 clocks-per sec (0.00 secs)
times() yields: user CPU: 0.00; system CPU: 0.00
resolution:          0.000000001
clockTtime:     143570.023404324
clock() returns 12039250 clocks-per sec (12.04 secs)
times() yields: user CPU: 11.00; system CPU: 1.03
resolution:          0.000000001
clockTtime:     143583.562080061
RTDSC COUNTER: 47277160370 CPU cycles

The point is: since clock() is measured before times() or clock_gettime(), I expected its value to be the lowest, yet the timings diverge by 0.02-0.1 seconds.

For one of the runs above, clock() returned 11.56 secs, times() returned 11.55 secs, clock_gettime() returned 12.020035766 secs, and rdtsc returned 41973477274 cycles, which for a machine with a 3.5 GHz processor equals 11.992420783 secs. rdtsc is measured first and last, so it should have the highest value, since it also includes the overhead of the clock(), times(), and clock_gettime() calls.

Relevant code:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <sys/time.h>
#include <sys/times.h>
#include <unistd.h>
#include <x86intrin.h>

// Three ways to get process times - times(), clock(), clock_gettime()
void timeAlgorithm(const char* msg) {
    struct tms t;
    struct timespec tp;
    static struct timespec res;
    clock_t clockTime;
    static long clockTicks = 0;
    (void)msg;  // label for the call site; not printed in the sample output
    // Fetch the clock-tick rate on the first call
    if (clockTicks == 0) {
        clockTicks = sysconf(_SC_CLK_TCK);
        if (clockTicks == -1) {
            perror("Error getting sysconf(_SC_CLK_TCK) value, program will now exit");
            exit(EXIT_FAILURE);
        }
    }
    clockTime = clock();
    if (clockTime == -1) {
        perror("Error getting process clock time using clock(), program will now exit");
        exit(EXIT_FAILURE);
    }
    printf("\t clock() returns %ld clocks-per sec (%.2f secs)\n",
           (long)clockTime, (double)clockTime / CLOCKS_PER_SEC);
    if (times(&t) == -1) {
        perror("The times() call failed, this program will now exit");
        exit(EXIT_FAILURE);
    }
    printf("\t times() yields: user CPU: %.2f; system CPU: %.2f\n",
           (double)t.tms_utime / clockTicks,
           (double)t.tms_stime / clockTicks);
    // Query the clock resolution only once
    if (!res.tv_sec && !res.tv_nsec) {
        if (clock_getres(CLOCK_MONOTONIC, &res) == -1) {
            perror("clock_getres() call failed, this program will now exit");
            exit(EXIT_FAILURE);
        }
    }
    if (clock_gettime(CLOCK_MONOTONIC, &tp) == -1) {
        perror("clock_gettime() call failed, this program will now exit");
        exit(EXIT_FAILURE);
    }
    printf("\tresolution: %10jd.%09ld\n",
           (intmax_t)res.tv_sec, res.tv_nsec);
    printf("\tclockTtime: %10jd.%09ld\n",
           (intmax_t)tp.tv_sec, tp.tv_nsec);
}

int main(int argc, char* argv[]) {
    printf("CLOCKS_PER_SEC=%ld\tsysconf(_SC_CLK_TCK)=%ld\n\n",
           (long)CLOCKS_PER_SEC, sysconf(_SC_CLK_TCK));
    uint64_t start = __rdtsc();
    timeAlgorithm("At program start");
    // 1 less than 2 gigs: 2^31 - 1 (a uint32_t tops out at 2^32 - 1 = 4 GiB;
    // not using the full 2 or 4 gigs right now so I can debug faster)
    uint32_t TWOGIGS = (uint32_t)(2147483648 / 1) - 1;
    // Set seed for random()
    srandom(time(NULL));
    uint32_t TWOGIGSOFLONGS = (uint32_t)(TWOGIGS / sizeof(long));
    long* unsortedData = malloc(sizeof(long) * TWOGIGSOFLONGS);
    if (unsortedData == NULL) {
        perror("malloc failed, this program will now exit");
        exit(EXIT_FAILURE);
    }
    for (uint32_t i = 0; i < TWOGIGSOFLONGS; ++i) {
        unsortedData[i] = random();
        //printf("%d\t%ld\n", i, unsortedData[i]);
    }
    timeAlgorithm("At program end");
    uint64_t end = __rdtsc();
    printf("RTDSC COUNTER: %lu CPU cycles\n", end - start);
    free(unsortedData);
    return EXIT_SUCCESS;
}

This discrepancy does not really matter in practice; I could just pick one of these functions and move on, but I am curious whether anyone knows why all these values differ in this strange way.

It would make sense if clock()'s value were lower than times()'s, which were lower than clock_gettime()'s, which were lower than __rdtsc()'s, since that is the order in which they are measured, but that is not the case, which is confusing.

In case anyone else comes across this post: I found this paper and used it to time my code: https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf
