tbb的private_server和误报线程清理器数据竞赛



我们在频繁但不一致的基础上得到假阳性threadsanizer (tsan)数据竞争警告。虽然众所周知,tsan可以给出假阳性警告,其中一些可能通过TSAN_OPTIONS环境变量被抑制,但我们遇到的一类特殊警告似乎与英特尔的线程构建块(tbb)使用tbb::detail::r1::rml::private_server有关,如果我们能够以某种方式对停止private_server有更多的控制,这些警告似乎是可以预防的。以下是在Google测试运行期间遇到的一个误报tsan数据竞争警告:

WARNING: ThreadSanitizer: data race (pid=5244)
Write of size 1 at 0x7ffda4d64fd8 by main thread:
#0 std::shared_lock<std::shared_mutex>::shared_lock(std::shared_mutex&, std::defer_lock_t) /usr/local/foo-deps/20220316/include/c++/9.4.0/shared_mutex:639 (FooTest+0x68d162)
#1 FooProxy::buildTranslationMapToOtherProxy(FooProxy*, std::vector<foo::StringOpInfo, std::allocator<foo::StringOpInfo> > const&) const /home/jenkins-slave/workspace/core-tsan-gcc/Foo/FooProxy.cpp:323 (FooTest+0x68d162)
#2 FooProxy_BuildTranslationMapToPartialOverlapProxy_Test::TestBody() /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:798 (FooTest+0x5c5284)
#3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62d798)
#4 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62d798)
#5 testing::Test::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4043 (FooTest+0x618586)
#6 testing::TestInfo::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4219 (FooTest+0x6187d4)
#7 testing::TestSuite::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4351 (FooTest+0x618959)
#8 testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6892 (FooTest+0x618e7e)
#9 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62de38)
#10 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62de38)
#11 testing::UnitTest::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6479 (FooTest+0x619440)
#12 RUN_ALL_TESTS() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gtest/gtest.h:11696 (FooTest+0x5b401a)
#13 main /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:974 (FooTest+0x5b401a)
Previous read of size 8 at 0x7ffda4d64fd8 by thread T18:
[failed to restore the stack]
Location is stack of main thread.
Location is global '<null>' at 0x000000000000 ([stack]+0x00000001efd8)
Thread T18 (tid=5264, running) created by main thread at:
#0 pthread_create ../../.././libsanitizer/tsan/tsan_interceptors.cc:964 (libtsan.so.0+0x2cd6b)
#1 tbb::detail::r1::rml::private_server::wake_some(int) <null> (FooTest+0x8828ce)
#2 tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter>(tbb::detail::d1::task*, tbb::detail::r1::external_waiter&) <null> (FooTest+0x88b1c2)
#3 tbb::detail::r1::task_arena_impl::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) <null> (FooTest+0x86e74c)
#4 Foo::getStringViews() const /home/jenkins-slave/workspace/core-tsan-gcc/Foo/Foo.cpp:1869 (FooTest+0x63612c)
#5 Foo_GetStringViews_Test::TestBody() /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:141 (FooTest+0x5c625c)
#6 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62d798)
#7 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62d798)
#8 testing::Test::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4043 (FooTest+0x618586)
#9 testing::TestInfo::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4219 (FooTest+0x6187d4)
#10 testing::TestSuite::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4351 (FooTest+0x618959)
#11 testing::internal::UnitTestImpl::RunAllTests() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6892 (FooTest+0x618e7e)
#12 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:3968 (FooTest+0x62de38)
#13 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:4004 (FooTest+0x62de38)
#14 testing::UnitTest::Run() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gmock-gtest-all.cc:6479 (FooTest+0x619440)
#15 RUN_ALL_TESTS() /home/jenkins-slave/workspace/core-tsan-gcc/ThirdParty/googletest/gtest/gtest.h:11696 (FooTest+0x5b401a)
#16 main /home/jenkins-slave/workspace/core-tsan-gcc/Tests/FooTest.cpp:974 (FooTest+0x5b401a)
SUMMARY: ThreadSanitizer: data race /usr/local/foo-deps/20220316/include/c++/9.4.0/shared_mutex:639 in std::shared_lock<std::shared_mutex>::shared_lock(std::shared_mutex&, std::defer_lock_t)

(一些名字已被更改为匿名。)事件摘要按时间顺序排列:

  1. 谷歌测试Foo.GetStringViews运行(线程T18帧#5)
    • 在此测试中,tbb::task_arenata实例调用ta.execute([&] { tbb::parallel_for(...); });
    • 这似乎运行tbb::detail::r1::rml::private_server::wake_some(int),它产生一个线程,在谷歌测试之间存活。
  2. 谷歌测试FooProxy.BuildTranslationMapToPartialOverlapProxy运行(主线程帧#2)
    • 此测试写入地址0x7ffda4d64fd8,该地址是由前一个测试读取的。

我们的TSAN_OPTIONS环境变量设置为

suppressions=/path/to/tsan.suppressions, history_size=7, second_deadlock_stack=1, halt_on_error=1

我们推测,误报的数据竞争警告是由于三个主要因素:

  • 两个独立的测试一个接一个同步运行,其中不可能出现数据竞争,但恰好读/写或写/写到相同的内存地址。
  • 一个线程的堆栈超过history_size=7的最大值,报告[failed to restore the stack]
  • 第一个线程生成一个tbb::detail::r1::rml::private_server,直到第二个测试。

这是因为来自第一个测试的tbb::detail::r1::rml::private_server仍然与第二个测试并发,这使tsan将其标记为数据竞争。

问题(s)

tbb::detail::r1::rml::private_server线程如何在每个测试的开始或结束时被杀死?

或者,如果这是不可能的,我们是否可以添加一些东西到我们的tsan.suppressions文件或TSAN_OPTIONS环境变量,专门抑制这个错误警告,而不隐藏可能发生的真实数据竞争?

为了在每次Google测试之后终止tbb::detail::r1::rml::private_server,我们覆盖了Test FixtureTearDown()方法:

void TearDown() override {
// Expected to kill tbb::detail::r1::rml::private_server after each test,
// which can otherwise trigger false positive tsan data race warnings.
auto handle = tbb::task_scheduler_handle::get();
tbb::finalize(handle, std::nothrow_t{});
}

在我们的TBB版本中,我们还必须使用#define TBB_PREVIEW_WAITING_FOR_WORKERS#include <tbb/global_control.h>

感谢Pavel Kumbrasev的建议。

您可以将Mach信号量替换为调度信号量来抑制警告。

参考下面的链接:https://developer.apple.com/documentation/dispatch/dispatch_semaphore

您还可以创建一个抑制文件来指定抑制运行时标志https://github.com/google/sanitizers/wiki/ThreadSanitizerSuppressions

如果这有帮助,您可以在编译时应用这些设置:

-fsanitize =线程-fsanitize-blacklist = sanitizer-thread-suppressions.txt

相关内容

  • 没有找到相关文章

最新更新