Calculate the average of several "time" commands in Linux

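I use the "time" command to benchmark a program on Linux. The problem is that its output is not statistically meaningful, because it runs the program only once. Is there a tool or method for averaging several "time" runs, possibly combined with statistics such as the deviation?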

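Here is a script I wrote to do something similar to what you want. It runs the supplied command 10 times, logging the real, user CPU, and system CPU times to a file and echoing them after each command's output. It then uses awk to compute the average of each of the 3 columns in the file, but does not (yet) include the standard deviation.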

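#!/bin/bash
# Run the given command 10 times and average GNU time's real/user/sys figures.
rm -f /tmp/mtime.$$
for x in {1..10}
do
  # "$@" is quoted so that arguments containing spaces reach the command intact.
  /usr/bin/time -f "real %e user %U sys %S" -a -o /tmp/mtime.$$ "$@"
  tail -1 /tmp/mtime.$$
done
awk '{ et += $2; ut += $4; st += $6; count++ } END {  printf "Average:\nreal %.3f user %.3f sys %.3f\n", et/count, ut/count, st/count }' /tmp/mtime.$$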
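
The script above stops at the average. As a minimal sketch of how the missing standard deviation could be added, the function below reads a file in the same "real ... user ... sys ..." format and prints the mean and the sample standard deviation for each column. The function name summarize_times is made up for illustration, and the sum-of-squares formula is a simple choice for a handful of samples, not a numerically robust one.

```shell
#!/bin/bash
# summarize_times FILE  (hypothetical helper name)
# Reads lines of the form "real 0.30 user 0.01 sys 0.02" and prints, for each
# column, the mean and the sample standard deviation (n-1 denominator).
summarize_times() {
  awk '
    { for (i = 1; i <= 3; i++) { v = $(2 * i); sum[i] += v; sumsq[i] += v * v }
      n++ }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      split("real user sys", name, " ")
      for (i = 1; i <= 3; i++) {
        mean = sum[i] / n
        # Sample variance; zero for a single sample, clamped against rounding noise.
        var = (n > 1) ? (sumsq[i] - n * mean * mean) / (n - 1) : 0
        if (var < 0) var = 0
        printf "%s mean %.3f sd %.3f\n", name[i], mean, sqrt(var)
      }
    }' "$1"
}
```

In the script above, the final awk line could be replaced with a call such as summarize_times /tmp/mtime.$$.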

Use hyperfine.

For example:

hyperfine 'sleep 0.3'

will run the command sleep 0.3 several times and then print output like this:

hyperfine 'sleep 0.3'
Benchmark #1: sleep 0.3
  Time (mean ± σ):     306.7 ms ±   3.0 ms    [User: 2.8 ms, System: 3.5 ms]
  Range (min … max):   301.0 ms … 310.9 ms    10 runs
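
If the automatically chosen number of runs does not suit the program being measured, it can be set explicitly. The following is a minimal sketch, assuming hyperfine is installed; the function name bench and the file name results.json are made up for illustration, while --warmup, --runs and --export-json are hyperfine options.

```shell
#!/bin/bash
# bench CMD  (hypothetical helper): benchmark one command string with hyperfine.
bench() {
  # --warmup 3     : three untimed runs first, e.g. to warm up disk caches
  # --runs 20      : exactly 20 timed runs instead of an automatic count
  # --export-json  : save the measurements (results.json is an arbitrary name)
  hyperfine --warmup 3 --runs 20 --export-json results.json "$1"
}
```

It would be called as bench 'sleep 0.3'.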

perf stat does this for you with the -r (--repeat=<n>) option, which reports the average and the variance.

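For example, using a short loop in awk to simulate some work, short enough that CPU frequency ramp-up and other startup overhead might be a factor (see "Idiomatic way of performance evaluation?"), although my CPU seems to have ramped up to 3.9 GHz quite quickly, averaging 3.82 GHz: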

$ perf stat -r5 awk 'BEGIN{for(i=0;i<1000000;i++){}}'
 Performance counter stats for 'awk BEGIN{for(i=0;i<1000000;i++){}}' (5 runs):
             37.90 msec task-clock                #    0.968 CPUs utilized            ( +-  2.18% )
                 1      context-switches          #   31.662 /sec                     ( +-100.00% )
                 0      cpu-migrations            #    0.000 /sec                   
               181      page-faults               #    4.776 K/sec                    ( +-  0.39% )
       144,802,875      cycles                    #    3.821 GHz                      ( +-  0.23% )
       343,697,186      instructions              #    2.37  insn per cycle           ( +-  0.05% )
        93,854,279      branches                  #    2.476 G/sec                    ( +-  0.04% )
            29,245      branch-misses             #    0.03% of all branches          ( +- 12.79% )
           0.03917 +- 0.00182 seconds time elapsed  ( +-  4.63% )

(Scroll to the right to see the variance.)

If you have a single-threaded task and want to minimize context switches, you can use taskset -c3 perf stat ... to pin the task to a specific core (#3 in this case).

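By default, perf stat uses hardware performance counters to profile things like instructions, core clock cycles (not the same thing as time on modern CPUs), and branch misses. This has fairly low overhead, especially with the counters in "counting" mode, as opposed to perf record, which triggers interrupts to statistically sample hot spots for events.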

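You can use -e task-clock to use only that event, without any hardware performance counters. (Or, if your system is in a VM, or you have not changed the default /proc/sys/kernel/perf_event_paranoid, perf might not be able to ask the kernel to program any counters anyway.)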
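
The core pinning and the event selection described above can be combined in one invocation. This is a minimal sketch, assuming perf and taskset (from util-linux) are installed; the function name pinned_stat and its argument order are invented for illustration.

```shell
#!/bin/bash
# pinned_stat CORE RUNS CMD [ARGS...]  (hypothetical wrapper)
# Pins the measurement to one core, repeats it RUNS times, counts only task-clock.
pinned_stat() {
  local core=$1 runs=$2
  shift 2
  # taskset -c CORE restricts perf, and the child it launches, to that core.
  # -e task-clock is a software event, so no hardware counters are required.
  # "--" marks the end of perf's own options.
  taskset -c "$core" perf stat -r "$runs" -e task-clock -- "$@"
}
```

For the awk loop used earlier, a call would look like pinned_stat 3 5 awk 'BEGIN{for(i=0;i<1000000;i++){}}'.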

For more about perf, see:

https://www.brendangregg.com/perf.html
https://perf.wiki.kernel.org/index.php/Main_Page

For a program that prints output, it looks like this:

$ perf stat -r5 echo hello
hello
hello
hello
hello
hello
 Performance counter stats for 'echo hello' (5 runs):
              0.27 msec task-clock                #    0.302 CPUs utilized            ( +-  4.51% )
...
          0.000890 +- 0.000411 seconds time elapsed  ( +- 46.21% )

For a single run (the default, with no -r), perf stat shows the elapsed time along with the user and sys times. For some reason, though, -r does not average those.

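As the commenter above mentioned, it sounds like you may want to use a loop to run your program multiple times to get more data points. You can use the time command's -o option to write its results to a text file, like so: time -o output.txt myprog. Note that -o is an option of the standalone GNU time program (/usr/bin/time), not of the shell's built-in time keyword.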
