在使用KafkaSink时,我的Flink应用程序有一个令人讨厌的行为。如果我的应用程序包含一个到Kafka的Sink (EXACTLY_ONCE交付),它需要很长时间才能启动,如果我删除Kafka Sink(留下其他的)或用print替换它,那么应用程序只需要几秒钟就可以启动。在任务管理器日志中,我看到成千上万的重复行,如下所示:
2023-04-05 14:01:25,828 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14788 with epoch 9
2023-04-05 14:01:25,828 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:25,829 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:25,932 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15843 with epoch 8
2023-04-05 14:01:25,932 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:25,933 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:26,035 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16840 with epoch 7
2023-04-05 14:01:26,035 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,036 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:26,139 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14813 with epoch 6
2023-04-05 14:01:26,139 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,140 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:26,244 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16858 with epoch 5
2023-04-05 14:01:26,244 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,245 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:26,348 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14832 with epoch 4
2023-04-05 14:01:26,348 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,349 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:26,451 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15886 with epoch 3
2023-04-05 14:01:26,452 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,453 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:26,555 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16891 with epoch 2
2023-04-05 14:01:26,555 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,556 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:26,659 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14860 with epoch 1
2023-04-05 14:01:26,660 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,660 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:26,766 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15917 with epoch 0
2023-04-05 14:01:26,767 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,767 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:26,870 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14713 with epoch 23
2023-04-05 14:01:26,870 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,871 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:26,974 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15752 with epoch 22
2023-04-05 14:01:26,974 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:26,975 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:27,077 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16751 with epoch 21
2023-04-05 14:01:27,077 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,077 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:27,180 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14724 with epoch 20
2023-04-05 14:01:27,180 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,181 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:27,284 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15768 with epoch 19
2023-04-05 14:01:27,284 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,285 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:27,387 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16765 with epoch 18
2023-04-05 14:01:27,387 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,388 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:27,492 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14737 with epoch 17
2023-04-05 14:01:27,492 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,493 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:27,596 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15785 with epoch 16
2023-04-05 14:01:27,596 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,599 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:27,702 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16782 with epoch 15
2023-04-05 14:01:27,702 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,703 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:27,815 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15798 with epoch 14
2023-04-05 14:01:27,815 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,816 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:27,919 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14759 with epoch 13
2023-04-05 14:01:27,919 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:27,920 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:28,022 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16803 with epoch 12
2023-04-05 14:01:28,023 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,024 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:28,126 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15820 with epoch 11
2023-04-05 14:01:28,126 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,127 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:28,230 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14782 with epoch 10
2023-04-05 14:01:28,230 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,231 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:28,333 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16825 with epoch 9
2023-04-05 14:01:28,333 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,334 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:28,436 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15844 with epoch 8
2023-04-05 14:01:28,436 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,437 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:28,542 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14806 with epoch 7
2023-04-05 14:01:28,543 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,544 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:28,646 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16851 with epoch 6
2023-04-05 14:01:28,646 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,647 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:28,749 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15869 with epoch 5
2023-04-05 14:01:28,749 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,751 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:28,853 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16868 with epoch 4
2023-04-05 14:01:28,853 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,854 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:28,956 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15887 with epoch 3
2023-04-05 14:01:28,956 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:28,957 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:29,060 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14846 with epoch 2
2023-04-05 14:01:29,060 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,061 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:29,163 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16900 with epoch 1
2023-04-05 14:01:29,163 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,163 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:29,268 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15918 with epoch 0
2023-04-05 14:01:29,268 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,269 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:29,374 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16748 with epoch 22
2023-04-05 14:01:29,374 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,375 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:29,478 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14721 with epoch 21
2023-04-05 14:01:29,478 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,479 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:29,582 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15763 with epoch 20
2023-04-05 14:01:29,582 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,582 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
2023-04-05 14:01:29,684 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 16761 with epoch 19
2023-04-05 14:01:29,685 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,685 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-1.mykafkacluster.amazonaws.com:9096 (id: 1 rack: null)
2023-04-05 14:01:29,788 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 14734 with epoch 18
2023-04-05 14:01:29,788 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,790 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-2.mykafkacluster.amazonaws.com:9096 (id: 2 rack: null)
2023-04-05 14:01:29,895 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] ProducerId set to 15779 with epoch 17
2023-04-05 14:01:29,895 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Invoking InitProducerId for the first time in order to acquire a producer ID
2023-04-05 14:01:29,896 INFO org.apache.kafka.clients.producer.internals.TransactionManager [] - [Producer clientId=cid, transactionalId=kafka-sink-0-1] Discovered transaction coordinator b-3.mykafkacluster.amazonaws.com:9096 (id: 3 rack: null)
重复20分钟,有时甚至更长…
我试着移除水槽,应用程序在几秒钟内启动,但我正在失去输出。我还尝试将Sink交付保证更改为AT_LEAST_ONCE,它似乎更快(~1分钟),但仍然有很多这些日志…
这是Sink配置(与认证相关的一些道具被隐藏并加载到appConfiguration
对象中)
KafkaSink deviceDaySink = KafkaSink.<DeviceDayTimeTuple>builder()
.setRecordSerializer(KafkaRecordSerializationSchema.builder()
.setTopic(appConfiguration.getString("device-day")
.setValueSerializationSchema(new SerializeJsonDeviceDayTime())
.build()
)
.setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
.setKafkaProducerConfig(appConfiguration.getKafkaProps())
.setProperty("transactional.id", "tid")
.setProperty("client.id", "cid")
.setProperty("transaction.timeout.ms", "30000")
.build();
编辑附加信息(Apr 12):现在我们每隔30秒设置一个检查点,这个主题之前已经被同一个应用程序写过了,但是如果我们重新创建这个主题,问题还是一样的。通过kafka的CLI脚本kafka-transactions.sh
,我们注意到flink应用程序在启动时打开了很多空事务:
kafka-sink-0-3394 3 18862 Empty
kafka-sink-0-4122 3 21106 Empty
kafka-sink-0-6148 3 21793 Empty
kafka-sink-0-4366 3 21189 Empty
kafka-sink-0-2584 3 18585 Empty
kafka-sink-0-2340 3 18501 Empty
kafka-sink-0-30 3 16458 Empty
kafka-sink-0-4816 3 21341 Empty
kafka-sink-0-308 3 16553 Empty
kafka-sink-0-2872 3 18682 Empty
kafka-sink-0-4654 3 21288 Empty
kafka-sink-0-3600 3 18931 Empty
kafka-sink-0-4898 3 21369 Empty
kafka-sink-0-146 3 16498 Empty
kafka-sink-0-5626 3 21616 Empty
kafka-sink-0-3844 3 21014 Empty
kafka-sink-0-4492 3 21231 Empty
kafka-sink-0-272 3 16541 Empty
kafka-sink-0-5220 3 21478 Empty
kafka-sink-0-5464 3 21561 Empty
kafka-sink-0-3682 3 18958 Empty
这些只是全部的一部分,并且transactionalId和producerID都与日志文件中的匹配。
问题在于Flink如何管理Kafka事务。对于常规生产者,当它想要使用事务时,它将用settransactionalId
初始化事务,Kafka将响应producentId
和epoch
(这是一个从0开始的迭代器)。
这是为了确保只有最新的生产者可以实际输出数据到Kafka。任何具有相同trasactionalId
但epoch
低于最新分配的生产者都将被隔离,并且不允许产生事务。这也意味着对于给定的生产者(由transactionalId
标识)将只有一个交易。
然而,Flink做的事情有点不同。对于每个检查点,它将创建一个新的transactionalId
和一个新的事务。该事务的名称将是<transactionalIdPrefix>-<subtask id>-<checkpoint id>
。因此,它将生成kafka-sink-0-1
,kafka-sink-0-2
等交易。
Kafka中有关事务的信息被持久化在主题中,并存储在当前Kafka事务leader缓存中。通常,事务主题的保留时间为1周。因此,使用单个生产者和单个实例的1分钟检查点,将有超过10,000个事务。当然,我们希望它们都处于Empty
或CompleteCommited
状态。
Flink在Kafka生产者启动时所做的是首先确保没有正在进行的事务。这是在org.apache.flink.connector.kafka.sink.TransactionAborter#abortTransactionOfSubtask
完成的。这段代码所做的是天真地从<transactionalIdPrefix>-<subtask id>-<current checkpoint id>
开始构建transactionalId
,并迭代直到Kafka返回epoch
等于0
,因为这意味着这是Kafka第一次看到这样的transactionalId
。它将执行initTransaction
,这使得Kafka用相同的transactionalId
刷新之前的事务,并使其保持在Empty
状态。
这对于常规操作没有问题,因为当前检查点id是最新的检查点。但是,如果您从上一个检查点开始或从新状态开始,则不会出现这种情况。在这种情况下,您的当前检查点id将是1
,因此它将从1
开始,并将上升到该集群中创建的最后一个事务。
每次Flink启动时,直到检查点id通过最后一个事务为止。
因为我知道Kafka中没有API允许你删除特定的transationalId
。中止它将只留下CompleteAborted
。你的选择是:
- 每当你想要开始新的Flink状态,你应该改变
transactionalIdPrefix
。这只会在你清理状态时起作用,因为如果你改变了transactionalIdPrefix
, Flink知道前一个是什么,并且无论如何都会扫描这些事务。 - 清除Kafka事务状态topic和force leader更改,以便它将从topic刷新缓存。
- 从国家和军队领导人的变化中删除特定的交易。
p。另外,如果你在共享Kafka上运行多个Flink本地实例,确保每个人都有唯一的transactionalIdPrefix
。
当检查点失败时,也可以进行扫描,因为它从上次成功的检查点开始扫描。
我们遇到了同样的问题,并注意到它来自不优雅的关闭。例如,当您使用intelllij在本地运行应用程序并按下停止按钮时,关闭方法不会被调用。不幸的是,只有在运行模式下才能正常关机,而不是在调试模式下....