Debugging random SIGSEGV crashes



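We are seeing random JVM crashes while running a Kafka Connect application in distributed mode. The Connect application runs a custom connector with a custom task implementation, inside Docker containers that use Alpine Linux as the base image. The crashes are completely random, by which I mean:

  1. The error logs do not point to the same stack trace for every crash (see below)
  2. The crashes happen at random points in time on different machines
  3. Stressing the underlying VM (high CPU load, high memory load, high disk I/O load) has no effect on the crash frequency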

Crash on machine 1

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fc1e92df777, pid=48, tid=0x00007fc1d899eae8
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.6.0
# Distribution: Custom build (Tue Nov 21 11:22:36 GMT 2017)
# Problematic frame:
# V  [libjvm.so+0x4c4777]  JVM_FindSignal+0x52586
#
# Core dump written. Default location: /home/kafka/core or core.48

Crash on machine 2

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f7825f1da82, pid=48, tid=0x00007f78179bbae8
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.6.0
# Distribution: Custom build (Tue Nov 21 11:22:36 GMT 2017)
# Problematic frame:
# j  io.prometheus.jmx.shaded.io.prometheus.client.exporter.common.TextFormat.write004(Ljava/io/Writer;Ljava/util/Enumeration;)V+115
#
# Core dump written. Default location: /home/kafka/core or core.48

Crash on machine 3

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fe4463fc2bc, pid=48, tid=0x00007fe435e06ae8
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.6.0
# Distribution: Custom build (Tue Nov 21 11:22:36 GMT 2017)
# Problematic frame:
# C  [libjvm.so+0x27b2bc]
#
# Core dump written. Default location: /home/kafka/core or core.48

Crash on machine 4

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f130fcb1b93, pid=48, tid=0x00007f130d65a700
#
# JRE version: OpenJDK Runtime Environment (8.0_212-b03) (build 1.8.0_212-8u212-b03-2~deb9u1-b03)
# Java VM: OpenJDK 64-Bit Server VM (25.212-b03 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x760b93]
#
# Core dump written. Default location: /home/kafka/core or core.48

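The list goes on like this. A few other things worth mentioning:

  • No application logs
  • No core dump is written (we checked the location mentioned in the error file, but there is nothing there)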
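A missing core file despite the "Core dump written" line can come from the core size limit of the process or from the kernel's `core_pattern` setting, which may redirect dumps elsewhere. The following is a minimal sketch, assuming a Linux container where `/proc` is readable; the class name `CoreDumpCheck` and the helper `softCoreLimit` are hypothetical, and the limits printed are those of this Java process, which only match the Connect worker if it is launched the same way.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Kafka Connect or the JDK.
class CoreDumpCheck {

    private static final String ROW = "Max core file size";

    // Returns the soft limit column of the "Max core file size" row from text in
    // /proc/<pid>/limits format, or null if the row is missing.
    // A value of "0" means the kernel will not write a core file for the process.
    static String softCoreLimit(List<String> limitsLines) {
        for (String line : limitsLines) {
            if (line.startsWith(ROW)) {
                String[] cols = line.substring(ROW.length()).trim().split("\\s+");
                return !cols[0].isEmpty() ? cols[0] : null;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Linux procfs paths; both checks are skipped if unreadable.
        Path limits = Paths.get("/proc/self/limits");
        Path pattern = Paths.get("/proc/sys/kernel/core_pattern");
        if (Files.isReadable(limits)) {
            List<String> lines = Files.readAllLines(limits, StandardCharsets.UTF_8);
            System.out.println("soft core limit: " + softCoreLimit(lines));
        }
        if (Files.isReadable(pattern)) {
            String value = new String(Files.readAllBytes(pattern), StandardCharsets.UTF_8).trim();
            System.out.println("core_pattern: " + value);
        }
    }
}
```

If `core_pattern` starts with `|`, the kernel pipes the dump to a helper program instead of writing a file to the working directory, which would be one possible reason for finding nothing at the reported location.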

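Things we have tried so far, without effect:

  • Switching from the Alpine-based Docker image to a Debian one
  • Excluding the Prometheus agent
  • Updating OpenJDK from 8.0.151 to 8.0.212

Any hints on how to track down the problem would be greatly appreciated!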
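When collecting many such error files, it can help to classify each crash by the type letter of its problematic frame. The sketch below assumes the standard legend printed in HotSpot error files (J = compiled Java code, j = interpreted, V/v = VM code, C = native code); the class name `HsErrFrame` and its methods are hypothetical, and it only looks at the line that follows `# Problematic frame:`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper for hs_err headers like the ones quoted above.
class HsErrFrame {

    // Matches e.g. "# V  [libjvm.so+0x4c4777]  JVM_FindSignal+0x52586"
    private static final Pattern FRAME = Pattern.compile("^#\\s+([CjJVv])\\s+(.*)$");

    // Maps the frame type letter to the wording of the hs_err legend.
    static String describe(char type) {
        switch (type) {
            case 'C': return "native code";
            case 'j': return "interpreted Java code";
            case 'J': return "compiled Java code";
            case 'V':
            case 'v': return "VM code";
            default:  return "unknown";
        }
    }

    // Returns "<frame kind>: <frame text>" for the line after
    // "# Problematic frame:", or null if no such line can be parsed.
    static String summarize(String header) {
        boolean nextIsFrame = false;
        for (String line : header.split("\\R")) {
            if (line.startsWith("# Problematic frame:")) {
                nextIsFrame = true;
                continue;
            }
            if (nextIsFrame) {
                Matcher m = FRAME.matcher(line);
                if (!m.matches()) {
                    return null;
                }
                return describe(m.group(1).charAt(0)) + ": " + m.group(2).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "# Problematic frame:\n# C  [libjvm.so+0x27b2bc]\n#";
        System.out.println(summarize(header));
    }
}
```

Applied to the four headers above, this would group the crashes into two VM-code frames, one interpreted Java frame, and one native frame inside `libjvm.so`, which is consistent with the observation that no single stack trace repeats.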

Running the application on JRE 11 appears to have resolved the issue. The project is still built with Java 8, but running it on Java 11 has stopped the crashes.
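To confirm that a container really runs Java 8 bytecode on a Java 11 runtime, one can compare the class-file major version (52 for Java 8, 55 for Java 11) with the runtime's reported version. This is a minimal sketch; the class name `BuildVsRuntime` is hypothetical, and in practice the stream would be opened on one of the connector's own class files rather than on this class.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical check, not taken from the original post.
class BuildVsRuntime {

    // Reads the class-file header: magic (u4), minor version (u2), major version (u2).
    // Returns the major version, e.g. 52 for Java 8 or 55 for Java 11.
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort(); // minor version, not needed here
        return data.readUnsignedShort();
    }

    public static void main(String[] args) throws IOException {
        // The resource may be unavailable (for example in single-file source mode),
        // so the null case is skipped rather than treated as an error.
        try (InputStream in = BuildVsRuntime.class.getResourceAsStream("BuildVsRuntime.class")) {
            if (in != null) {
                System.out.println("class file major version: " + majorVersion(in));
            }
        }
        System.out.println("runtime: " + System.getProperty("java.version")
                + " (" + System.getProperty("java.vm.name") + ")");
    }
}
```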
