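I have deployed Kafka and ZooKeeper in Kubernetes. If I define a readiness probe for ZooKeeper, my Kafka readiness probe keeps failing. If I comment out or remove the ZooKeeper readiness probe and deploy again, the Kafka server starts without any problem (and the Kafka readiness probe does not fail).
This is the ZooKeeper readiness probe: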
readinessProbe:
  tcpSocket:
    port: 2181
  initialDelaySeconds: 20
  periodSeconds: 20
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
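For readers less familiar with tcpSocket probes: this kind of probe only checks that the port accepts a TCP connection. It does not send a request or check that the service behind the port is healthy. A minimal Python sketch of the same check follows. The function name `tcp_probe` and the local stand-in listener are illustrative assumptions, not part of Kubernetes or ZooKeeper.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # Open the connection and close it right away; no data is exchanged.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical stand-in listener on a free local port (in place of ZooKeeper's 2181).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    print("listener up:", tcp_probe("127.0.0.1", port))
    server.close()
    print("listener closed:", tcp_probe("127.0.0.1", port))
```

Because the connection is opened and closed without sending anything, a server may log it as a client that closed the socket before a session was established.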
My ZooKeeper logs are:
2018-06-18 11:27:24,863 [myid:0] - WARN [SendWorker:5135603447292250196:QuorumCnxManager$SendWorker@951] - Send worker leaving thread
2018-06-18 11:27:24,864 [myid:0] - INFO [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager$Listener@743] - Received connection request /10.186.58.164:57728
2018-06-18 11:27:24,864 [myid:0] - WARN [RecvWorker:1586112601866174465:QuorumCnxManager$RecvWorker@1025] - Connection broken for id 1586112601866174465, my id = 0, error =
java.io.IOException: Received packet with invalid packet: -66911279
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1012)
2018-06-18 11:27:24,865 [myid:0] - WARN [RecvWorker:1586112601866174465:QuorumCnxManager$RecvWorker@1028] - Interrupting SendWorker
2018-06-18 11:27:24,865 [myid:0] - WARN [SendWorker:1586112601866174465:QuorumCnxManager$SendWorker@941] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2099)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:429)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1094)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:74)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:929)
2018-06-18 11:27:24,868 [myid:0] - WARN [SendWorker:1586112601866174465:QuorumCnxManager$SendWorker@951] - Send worker leaving thread
2018-06-18 11:30:54,282 [myid:0] - INFO [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager$Listener@743] - Received connection request /10.186.58.164:47944
2018-06-18 11:31:39,342 [myid:0] - WARN [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager@461] - Exception reading or writing challenge: java.net.SocketException: Connection reset
2018-06-18 11:31:39,342 [myid:0] - INFO [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager$Listener@743] - Received connection request /10.186.58.164:47946
2018-06-18 11:31:39,342 [myid:0] - WARN [RecvWorker:5135603447292250196:QuorumCnxManager$RecvWorker@1025] - Connection broken for id 5135603447292250196, my id = 0, error =
java.io.IOException: Received packet with invalid packet: 1414541105
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1012)
2018-06-18 11:31:39,343 [myid:0] - WARN [RecvWorker:5135603447292250196:QuorumCnxManager$RecvWorker@1028] - Interrupting SendWorker
2018-06-18 11:31:39,343 [myid:0] - WARN [SendWorker:5135603447292250196:QuorumCnxManager$SendWorker@941] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2099)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:429)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1094)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:74)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:929)
2018-06-18 11:31:39,343 [myid:0] - WARN [SendWorker:5135603447292250196:QuorumCnxManager$SendWorker@951] - Send worker leaving thread
2018-06-18 11:31:44,433 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from /172.30.99.87:51010
2018-06-18 11:31:44,437 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from /172.30.99.87:51012
2018-06-18 11:31:44,439 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x0, likely client has closed socket
2018-06-18 11:31:44,440 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /172.30.99.87:51012 (no session established for client)
2018-06-18 11:31:44,452 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from /172.30.99.87:51014
2018-06-18 11:31:49,438 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x0, likely client has closed socket
2018-06-18 11:31:49,438 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /172.30.99.87:51010 (no session established for client)
2018-06-18 11:31:49,452 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x0, likely client has closed socket
2018-06-18 11:31:49,453 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /172.30.99.87:51014 (no session established for client)
2018-06-18 11:33:59,669 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from /172.30.99.87:51148
2018-06-18 11:33:59,700 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x0, likely client has closed socket
2018-06-18 11:33:59,700 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /172.30.99.87:51148 (no session established for client)
2018-06-18 11:33:59,713 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from /172.30.99.87:51150
2018-06-18 11:33:59,730 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x0, likely client has closed socket
2018-06-18 11:33:59,730 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /172.30.99.87:51150 (no session established for client)
2018-06-18 11:34:00,274 [myid:0] - INFO [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager$Listener@743] - Received connection request /10.186.58.164:48860
2018-06-18 11:34:00,275 [myid:0] - WARN [RecvWorker:4616370699239609664:QuorumCnxManager$RecvWorker@1025] - Connection broken for id 4616370699239609664, my id = 0, error =
java.io.IOException: Received packet with invalid packet: -1200847881
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1012)
2018-06-18 11:34:00,275 [myid:0] - WARN [RecvWorker:4616370699239609664:QuorumCnxManager$RecvWorker@1028] - Interrupting SendWorker
2018-06-18 11:34:00,275 [myid:0] - WARN [SendWorker:4616370699239609664:QuorumCnxManager$SendWorker@941] - Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2099)
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:429)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1094)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:74)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:929)
2018-06-18 11:34:00,276 [myid:0] - WARN [SendWorker:4616370699239609664:QuorumCnxManager$SendWorker@951] - Send worker leaving thread
2018-06-18 11:34:00,277 [myid:0] - INFO [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager$Listener@743] - Received connection request /10.186.58.164:48862
2018-06-18 11:34:00,285 [myid:0] - WARN [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager@461] - Exception reading or writing challenge: java.net.SocketException: Connection reset
2018-06-18 11:40:10,712 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from /172.30.99.87:51522
2018-06-18 11:40:10,713 [myid:0] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x0, likely client has closed socket
2018-06-18 11:40:10,713 [myid:0] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /172.30.99.87:51522 (no session established for client)
2018-06-18 11:40:10,782 [myid:0] - INFO [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager$Listener@743] - Received connection request /10.186.58.164:49556
2018-06-18 11:40:10,782 [myid:0] - WARN [kafka1-zookeeper-0.kafka1-zookeeper/172.30.99.87:3888:QuorumCnxManager@461] - Exception reading or writing challenge: java.net.SocketException: Connection reset
2018-06-18 16:07:03,456 [myid:0] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2018-06-18 16:07:03,459 [myid:0] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed
I had a similar problem. The following changes helped me overcome it.
# readinessProbe & livenessProbe
readinessProbe:
  tcpSocket:
    port: 9092
  timeoutSeconds: 5
  periodSeconds: 5
  initialDelaySeconds: 45
livenessProbe:
  exec:
    command:
      - sh
      - -c
      - "kafka-broker-api-versions.sh --bootstrap-server=localhost:9092"
  timeoutSeconds: 5
  periodSeconds: 5
  initialDelaySeconds: 60
You can adjust the following value to suit your requirements:
initialDelaySeconds
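To get a feel for how these timing values interact, here is a rough sketch that estimates the earliest moment a pod can be marked ready and the earliest moment a failing liveness probe can trigger a restart. The function names are hypothetical, and the arithmetic is only an approximate lower bound: it ignores probe timeouts and any scheduling jitter in the kubelet. It assumes the Kubernetes defaults of successThreshold 1 and failureThreshold 3 when they are not set.

```python
def earliest_ready_seconds(initial_delay: int, period: int,
                           success_threshold: int = 1) -> int:
    """Approximate lower bound: seconds from container start until marked ready."""
    # The first probe runs no earlier than initial_delay; each further
    # required success adds one probe period.
    return initial_delay + (success_threshold - 1) * period


def earliest_restart_seconds(initial_delay: int, period: int,
                             failure_threshold: int = 3) -> int:
    """Approximate lower bound: seconds until a failing liveness probe restarts the container."""
    # The container is restarted after failure_threshold consecutive failures.
    return initial_delay + (failure_threshold - 1) * period


if __name__ == "__main__":
    # Values taken from the probes above.
    print("readiness, earliest ready:", earliest_ready_seconds(45, 5))
    print("liveness, earliest restart:", earliest_restart_seconds(60, 5))
```

If the broker needs longer than the liveness estimate to start answering on port 9092, raising initialDelaySeconds (or failureThreshold) gives it more room before it is restarted.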
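You are tying your Kafka to ZooKeeper, which is not a good practice.
I use Kafka from the official Apache site, and I found that it includes a script that can be used for the readiness probe: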
readinessProbe:
  exec:
    command:
      - sh
      - -c
      - "/opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server=localhost:9092"