Kafka console consumer error "Offset commit failed on partition"



I am using kafka-console-consumer to probe a Kafka topic.

Intermittently, I get this error message, followed by two warnings:

[2018-05-01 18:14:38,888] ERROR [Consumer clientId=consumer-1, groupId=console-consumer-56648] Offset commit failed on partition my-topic-0 at offset 444: The coordinator is not aware of this member. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-05-01 18:14:38,888] WARN [Consumer clientId=consumer-1, groupId=console-consumer-56648] Asynchronous auto-commit of offsets {my-topic-0=OffsetAndMetadata{offset=444, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2018-05-01 18:14:38,888] WARN [Consumer clientId=consumer-1, groupId=console-consumer-56648] Synchronous auto-commit of offsets {my-topic-0=OffsetAndMetadata{offset=447, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)

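The warning log suggests:

This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.

So I need to either increase max.poll.interval.ms or decrease max.poll.records.

What are the implications of each approach, and which one is preferred in which situation?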

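Increasing max.poll.interval.ms says "it is OK to spend time processing a large batch of records", and you gain throughput if larger batches can be processed more efficiently than smaller ones.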

Decreasing max.poll.records says "fetch fewer records so that there is enough time to process them", and favors latency over throughput.
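To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a fixed per-record processing time (a hypothetical figure you would have to measure yourself) and uses what I believe are the consumer defaults, max.poll.interval.ms=300000 and max.poll.records=500; check the documentation for your client version. The function names are made up for illustration and are not part of any Kafka API.

```python
import math

# Assumed consumer defaults; verify against your client version's docs.
DEFAULT_MAX_POLL_INTERVAL_MS = 300_000
DEFAULT_MAX_POLL_RECORDS = 500


def worst_case_batch_ms(max_poll_records, per_record_ms):
    """Time to process a full batch if every record takes per_record_ms."""
    return max_poll_records * per_record_ms


def fits_in_interval(max_poll_records, per_record_ms, max_poll_interval_ms):
    """True if a full batch is processed before the poll interval expires."""
    return worst_case_batch_ms(max_poll_records, per_record_ms) < max_poll_interval_ms


def largest_safe_batch(per_record_ms, max_poll_interval_ms, headroom=0.5):
    """Largest max.poll.records whose worst case uses at most
    `headroom` (a fraction) of max.poll.interval.ms."""
    return max(1, math.floor(max_poll_interval_ms * headroom / per_record_ms))


# Hypothetical workload: each record takes about one second to handle.
per_record_ms = 1000

# Option 1: keep the batch size and see how long the interval would need to be.
needed_ms = worst_case_batch_ms(DEFAULT_MAX_POLL_RECORDS, per_record_ms)

# Option 2: keep the interval and shrink the batch, leaving 50% headroom.
safe_records = largest_safe_batch(per_record_ms, DEFAULT_MAX_POLL_INTERVAL_MS)

print(needed_ms, safe_records)
```

With the console consumer, the chosen values can be passed with --consumer-property key=value or through a properties file given to --consumer.config.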

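Also consider that both settings may already be well configured and that something else is causing a performance problem in the poll loop. I would explore that before changing the configuration, so that you do not mask a bigger problem.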
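One way to explore this, if you see the same symptom in a consumer whose poll loop you control, is to record the gap between successive poll() calls and compare it with max.poll.interval.ms. The following is only a minimal sketch: PollGapMonitor is a hypothetical helper, not part of any Kafka client, and the demo drives it with a fake clock instead of a real consumer.

```python
import time


class PollGapMonitor:
    """Tracks the elapsed time between successive poll() calls."""

    def __init__(self, max_poll_interval_ms, clock=time.monotonic):
        self.max_poll_interval_ms = max_poll_interval_ms
        self._clock = clock          # injectable so the demo can fake time
        self._last_poll = None
        self.gaps_ms = []

    def before_poll(self):
        """Call right before each poll(); records the gap since the last call."""
        now = self._clock()
        if self._last_poll is not None:
            self.gaps_ms.append((now - self._last_poll) * 1000.0)
        self._last_poll = now

    def worst_gap_ms(self):
        """Longest observed gap in milliseconds (0.0 if fewer than two polls)."""
        return max(self.gaps_ms, default=0.0)

    def exceeded(self):
        """True if any gap was longer than max.poll.interval.ms."""
        return self.worst_gap_ms() > self.max_poll_interval_ms


# Demo with a fake clock (seconds): the last "poll" arrives 398 s after the
# previous one, which is longer than a 300000 ms interval.
ticks = iter([0.0, 1.5, 2.0, 400.0])
monitor = PollGapMonitor(300_000, clock=lambda: next(ticks))
for _ in range(4):
    monitor.before_poll()   # in a real loop, consumer.poll(...) would follow

print(monitor.gaps_ms, monitor.exceeded())
```

If the worst gap stays far below the configured interval, the rebalance is probably caused by something other than slow processing, and tuning these two settings is unlikely to help.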
