I have a Postgres DB on AWS RDS and a Kafka Connect connector (Debezium Postgres) listening on a table. The connector's configuration:
{
  "name": "my-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.dbname": "my_db",
    "database.user": "my_user",
    "max.queue.size": "32000",
    "slot.name": "my_slot",
    "tasks.max": "1",
    "publication.name": "my_publication",
    "database.server.name": "postgres",
    "heartbeat.interval.ms": "1000",
    "database.port": "my_port",
    "include.schema.changes": "false",
    "plugin.name": "pgoutput",
    "table.whitelist": "public.my_table",
    "tombstones.on.delete": "false",
    "database.hostname": "my_host",
    "database.password": "my_password",
    "name": "my-connector",
    "max.batch.size": "10000",
    "database.whitelist": "my_db",
    "snapshot.mode": "never"
  },
  "tasks": [
    {
      "connector": "my-connector",
      "task": 0
    }
  ],
  "type": "source"
}
The table is not updated as often as other tables, which initially caused replication lag, as shown here:
SELECT slot_name,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as replicationSlotLag,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) as confirmedLag,
active
FROM pg_replication_slots;
slot_name | replicationslotlag | confirmedlag | active
-------------------------------+--------------------+--------------+--------
my_slot | 1664 MB | 1664 MB | t
The lag can grow so large that it could eventually exhaust all the disk space.
I added a heartbeat, and if I log in to the Kafka broker and start a console consumer like this: ./kafka-console-consumer.sh --bootstrap-server my.broker.address:9092 --topic __debezium-heartbeat.postgres --from-beginning --consumer.config=/etc/kafka/consumer.properties
it dumps all the heartbeat messages and then shows a new one every 1000 ms.
However, the slot still keeps growing. If I do something like insert a dummy record into the table, the slot drops back to a small lag, so that works.
I'd like the heartbeat to take care of this, though. I don't want to insert periodic messages, since that sounds like it would add complexity. Why isn't the heartbeat reducing the slot size?
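To be concrete, the "dummy record" trick that works is just any trivial write to the captured table, for example (the column names here are made up for illustration):

INSERT INTO public.my_table (id, note) VALUES (-1, 'noop');  -- hypothetical columns
DELETE FROM public.my_table WHERE id = -1;                   -- remove the dummy row again

Each such write produces an event the connector actually consumes, so it acknowledges the LSN and the slot shrinks.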
Please see https://debezium.io/documentation/reference/1.0/connectors/postgresql.html#wal-disk-space
You really do need to emit periodic messages, but help is on the way - https://issues.redhat.com/browse/DBZ-1815
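Until that lands, the usual workaround is a dedicated heartbeat table that the connector captures and that you touch on a schedule. A minimal sketch, assuming a table named debezium_heartbeat and some external scheduler (both are assumptions, not part of your setup):

-- Hypothetical heartbeat table; the name is only for illustration.
CREATE TABLE public.debezium_heartbeat (id INT PRIMARY KEY, ts TIMESTAMPTZ);
INSERT INTO public.debezium_heartbeat VALUES (1, now());

-- The connector must capture it: add it to the publication and to table.whitelist
-- (e.g. "public.my_table,public.debezium_heartbeat").
ALTER PUBLICATION my_publication ADD TABLE public.debezium_heartbeat;

-- Run this periodically (cron, pg_cron, an application timer, ...). Each update is a
-- change in a captured table, so the connector acknowledges the LSN, restart_lsn and
-- confirmed_flush_lsn advance, and Postgres can recycle the WAL behind them.
UPDATE public.debezium_heartbeat SET ts = now() WHERE id = 1;

As I understand it, DBZ-1815 adds an option (heartbeat.action.query) that lets the connector run a statement like that UPDATE itself on every heartbeat, so no external scheduler is needed.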