The cost of Thread.sleep(1) differs depending on how the busy loop is implemented



Suppose we call Thread.sleep(1) in a loop for n iterations (Java 11 here and below):

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(jvmArgsAppend = {"-Xms1g", "-Xmx1g"})
public class ThreadSleep1Benchmark {
    @Param({"5", "10", "50"})
    long delay;

    @Benchmark
    public int sleep() throws Exception {
        // Sleep for 1 ms on each of the `delay` iterations.
        for (int i = 0; i < delay; i++) {
            Thread.sleep(1);
        }
        return hashCode();
    }
}

The benchmark gives the following results:

Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep1Benchmark.sleep        5  avgt   50   6,552 ± 0,071  ms/op
ThreadSleep1Benchmark.sleep       10  avgt   50  13,343 ± 0,227  ms/op
ThreadSleep1Benchmark.sleep       50  avgt   50  68,059 ± 1,441  ms/op

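Here we see that sleep() takes more than n milliseconds, although intuitively we would expect about n, because the current thread sleeps for 1 ms on each iteration. This example demonstrates the cost of putting a thread to sleep and waking it up.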
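To see the per-call cost outside of JMH, a rough sketch like the one below can time a batch of Thread.sleep(1) calls with System.nanoTime() and divide by the number of calls. The class name SleepOverhead and the method averageSleepNanos are hypothetical names made up for this illustration, and the numbers it prints depend on the OS timer and scheduler, so treat it as a sanity check rather than a replacement for the benchmark.

```java
final class SleepOverhead {

    /** Average wall-clock nanoseconds taken by one Thread.sleep(1) call. */
    static long averageSleepNanos(int iterations) throws InterruptedException {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Thread.sleep(1);
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        averageSleepNanos(100); // crude warm-up, result ignored
        long avg = averageSleepNanos(1_000);
        System.out.printf("average Thread.sleep(1): %.3f ms%n", avg / 1e6);
    }
}
```

Dividing the scores in the table the same way (for example 6.552 ms / 5 iterations) gives roughly 1.3 ms per call on the machine used for these runs.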

Now let's modify the benchmark:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Fork(jvmArgsAppend = {"-Xms1g", "-Xmx1g"})
public class ThreadSleep2Benchmark {
    private final ExecutorService executor = Executors.newFixedThreadPool(1);
    volatile boolean flag;

    @Param({"5", "10", "50"})
    long delay;

    // Runs before every invocation of sleep(): raise the flag and start the timer thread.
    @Setup(Level.Invocation)
    public void setUp() {
        flag = true;
        startThread();
    }

    @TearDown(Level.Trial)
    public void tearDown() {
        executor.shutdown();
    }

    @Benchmark
    public int sleep() throws Exception {
        // Sleep in 1 ms steps until the background thread drops the flag.
        while (flag) {
            Thread.sleep(1);
        }
        return hashCode();
    }

    private void startThread() {
        executor.submit(() -> {
            try {
                // Drop the flag after `delay` milliseconds.
                Thread.sleep(delay);
                flag = false;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        });
    }
}

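Here we run a background thread that waits n milliseconds and then drops the flag while sleep() is iterating over the while (flag) loop. Since the flag is dropped after a delay of n milliseconds, we expect the while loop to iterate about n times.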
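One way to check the "about n iterations" expectation is to count the iterations directly. The sketch below is an assumption-laden stand-in for the benchmark, not part of it: FlagLoopCounter and run are hypothetical names, and a plain Thread replaces the executor. If each Thread.sleep(1) takes longer than 1 ms, as the first benchmark suggests, the loop may well finish in fewer than n iterations, and the total time would then be closer to the delay plus the tail of the last sleep than to n times the per-call cost. That is an inference, not something measured in this post.

```java
final class FlagLoopCounter {
    private static volatile boolean flag;

    /** Returns {iterations, elapsedNanos} for one run of the flag-controlled loop. */
    static long[] run(long delayMillis) throws InterruptedException {
        flag = true;
        Thread background = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                flag = false; // always drop the flag so the loop below terminates
            }
        });

        long start = System.nanoTime();
        background.start();
        long iterations = 0;
        while (flag) {
            Thread.sleep(1);
            iterations++;
        }
        long elapsed = System.nanoTime() - start;
        background.join();
        return new long[] {iterations, elapsed};
    }

    public static void main(String[] args) throws InterruptedException {
        for (long delay : new long[] {5, 10, 50}) {
            long[] r = run(delay);
            System.out.printf("delay=%d ms: %d iterations, %.3f ms elapsed%n",
                    delay, r[0], r[1] / 1e6);
        }
    }
}
```

Comparing the printed iteration count with the delay shows whether the loop really runs n times on a given machine.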

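We again see the cost of Thread.sleep(1), but for a delay of 5 and 10 the costs seem almost the same, and for a delay of 50 they are visibly lower. Note that the difference here is not linear: for 5 it is about 0.1 ms, for 10 about 1.2 ms, and for 50 about 13 ms.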

Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep2Benchmark.sleep        5  avgt   50   6,760 ± 0,070  ms/op
ThreadSleep2Benchmark.sleep       10  avgt   50  12,496 ± 0,050  ms/op
ThreadSleep2Benchmark.sleep       50  avgt   50  54,727 ± 0,599  ms/op

The results on Java 18 are similar:

Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep1Benchmark.sleep        5  avgt   50   6,609 ± 0,105  ms/op
ThreadSleep1Benchmark.sleep       10  avgt   50  13,233 ± 0,148  ms/op
ThreadSleep1Benchmark.sleep       50  avgt   50  66,017 ± 0,714  ms/op
ThreadSleep2Benchmark.sleep        5  avgt   50   6,740 ± 0,067  ms/op
ThreadSleep2Benchmark.sleep       10  avgt   50  12,400 ± 0,112  ms/op
ThreadSleep2Benchmark.sleep       50  avgt   50  53,836 ± 0,250  ms/op

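So my question is: is the reduced cost in ThreadSleep2Benchmark an achievement of the compiler (loop unrolling, etc.), or is it due to the way I iterate over the loop?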

UPD

On Linux I got the following results:

Java 11
Linux
Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep1Benchmark.sleep        5  avgt   50   5.597 ± 0.038  ms/op
ThreadSleep1Benchmark.sleep       10  avgt   50  11.263 ± 0.069  ms/op
ThreadSleep1Benchmark.sleep       50  avgt   50  56.079 ± 0.267  ms/op
Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep2Benchmark.sleep        5  avgt   50   5.600 ± 0.032  ms/op
ThreadSleep2Benchmark.sleep       10  avgt   50  10.558 ± 0.052  ms/op
ThreadSleep2Benchmark.sleep       50  avgt   50  50.625 ± 0.049  ms/op
Java 18
Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep1Benchmark.sleep        5  avgt   50   5.581 ± 0.041  ms/op
ThreadSleep1Benchmark.sleep       10  avgt   50  11.069 ± 0.067  ms/op
ThreadSleep1Benchmark.sleep       50  avgt   50  55.719 ± 0.602  ms/op
Benchmark                    (delay)  Mode  Cnt   Score   Error  Units
ThreadSleep2Benchmark.sleep        5  avgt   50   5.574 ± 0.035  ms/op
ThreadSleep2Benchmark.sleep       10  avgt   50  10.918 ± 0.035  ms/op
ThreadSleep2Benchmark.sleep       50  avgt   50  50.823 ± 0.055  ms/op

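If you want more control over pausing a Java thread, take a look at LockSupport.parkNanos. On Linux you get a resolution of about 50 us by default. For more information and how to tune it, see https://hazelcast.com/blog/locksupport-parknanos-under-the-hood-and-the-curious-case-of-parking/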
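As a minimal sketch of that suggestion, the helper below pauses the current thread with LockSupport.parkNanos and re-parks until a deadline, because parkNanos is allowed to return early (spuriously, or when the thread is interrupted). ParkNanosPause and pauseNanos are hypothetical names used only for illustration, and the accuracy actually achieved depends on the OS and its timer settings.

```java
import java.util.concurrent.locks.LockSupport;

final class ParkNanosPause {

    /**
     * Parks the current thread until the requested time has passed or the
     * thread is interrupted. Returns the nanoseconds actually spent.
     */
    static long pauseNanos(long nanos) {
        long start = System.nanoTime();
        long deadline = start + nanos;
        long remaining;
        while ((remaining = deadline - System.nanoTime()) > 0) {
            LockSupport.parkNanos(remaining);
            // parkNanos returns immediately while the interrupt flag is set,
            // so stop here instead of spinning; the flag is left for the caller.
            if (Thread.currentThread().isInterrupted()) {
                break;
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long requested = 200_000; // 200 us
        long actual = pauseNanos(requested);
        System.out.printf("requested %d ns, paused %d ns%n", requested, actual);
    }
}
```

Printing the requested and actual durations side by side gives a quick feel for the overshoot on a particular system.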
