Debezium SQL Server Source Connector streams events only once a day



I am facing an issue where my Debezium SQL Server source connector does not stream CDC captures in real time; instead, it streams all events once a day, always at 19:07:29. Here is a sample record:

"source": {
    "version": "1.8.1.Final",
    "connector": "sqlserver",
    "name": "dataroom-rds",
    "ts_ms": 1655275673757,
    "snapshot": "false",
    "db": "dbDataRoom",
    "sequence": null,
    "schema": "dbo",
    "table": "tblFolder",
    "change_lsn": "0056e213:00000e55:000a",
    "commit_lsn": "0056e213:00000ecd:0014",
    "event_serial_no": 1
},
"op": "c",
"ts_ms": 1655284049349,
"transaction": null

You can see the difference between source.ts_ms and ts_ms. source.ts_ms is 15 June 2022 16:47:53.757, while ts_ms is 15 June 2022 19:07:29.349. The documentation mentions that this indicates lag, but does not provide more information.

https://debezium.io/documentation/reference/connectors/sqlserver#sqlserver-change-event-values

In the source object, ts_ms indicates the time that the change was committed in the database. By comparing the value of payload.source.ts_ms with the value of payload.ts_ms, you can determine the lag between the source database update and Debezium.
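For the sample record above, the lag is simply the difference between the two epoch-millisecond timestamps. As a quick T-SQL sanity check:

```sql
-- Lag between the database commit (source.ts_ms) and the time
-- Debezium emitted the event (ts_ms), using the sample record above
SELECT (1655284049349 - 1655275673757) / 1000.0 AS lag_seconds;
-- 8375.592 seconds, i.e. roughly 2 hours 20 minutes
```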

Below is the connector configuration:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mssql-connector-dbdataroom-raw
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka-connect-cluster
spec:
  class: io.debezium.connector.sqlserver.SqlServerConnector
  tasksMax: 1
  config:
    database.server.id: "1"
    database.server.name: "dataroom-rds"
    database.hostname: "${secrets:kafka/dataroom-rds-creds:host}"
    database.port: "1433"
    database.user: "${secrets:kafka/dataroom-rds-creds:username}"
    database.password: "${secrets:kafka/dataroom-rds-creds:password}"
    database.dbname: "${secrets:kafka/dataroom-rds-creds:dbname}"
    table.include.list: "dbo.tblFolder,dbo.tblDocument,dbo.tblAllowedSecurityGroupFolder,dbo.tblAllowedSecurityGroupFolderHistory"
    snapshot.isolation.mode: "read_committed"
    snapshot.lock.timeout.ms: "-1"
    poll.interval.ms: "2000"
    database.history.kafka.bootstrap.servers: KUSTOMIZE_REPLACEMENT
    database.history.kafka.topic: "schema-changes.dbdataroom"
    database.history.producer.security.protocol: "SASL_SSL"
    database.history.producer.sasl.mechanism: "SCRAM-SHA-512"
    database.history.producer.sasl.jaas.config: "org.apache.kafka.common.security.scram.ScramLoginModule required username=${secrets:kafka/kafka-connect-msk-secrets:msk_sasl_user} password=${secrets:kafka/kafka-connect-msk-secrets:msk_sasl_password} ;"
    database.history.consumer.security.protocol: "SASL_SSL"
    database.history.consumer.sasl.mechanism: "SCRAM-SHA-512"
    database.history.consumer.sasl.jaas.config: "org.apache.kafka.common.security.scram.ScramLoginModule required username=${secrets:kafka/kafka-connect-msk-secrets:msk_sasl_user} password=${secrets:kafka/kafka-connect-msk-secrets:msk_sasl_password} ;"
    transforms: "changeTopicCase"
    # This transform changes the case of table names to lowercase to match the topic.
    # Table names in SQL Server have uppercase characters.
    transforms.changeTopicCase.type: "com.github.jcustenborder.kafka.connect.transform.common.ChangeTopicCase"
    transforms.changeTopicCase.from: "UPPER_UNDERSCORE"
    transforms.changeTopicCase.to: "LOWER_UNDERSCORE"

Hassan, this is hard to troubleshoot without logs. Please check the following and post more logs:

  1. Check whether CDC is enabled on the tables.
  2. Query the CDC tables in SQL Server; there is a default cdc schema where the changes are captured:

     SELECT TOP 10 * FROM cdc.dbo_<table-name>_CT -- e.g. cdc.dbo_tblFolder_CT; add filters on time columns

  3. The only SMT is the case change, so no issue is expected there.
  4. Check the Kafka Connect logs. Since this is deployed to Kubernetes, log into the pod:

     curl localhost:8083/connectors
     curl localhost:8083/connectors/<connector-name-from-above>/tasks/0/status

  5. Are there any other hints in the logs?
  6. Are there any issues in SQL Server? Maybe sp_who2 or sp_whoisactive offer some clues.
  7. Are there network issues between Kubernetes and RDS SQL Server?
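For the first step, a minimal T-SQL sketch (reusing the database and table names from the question) to verify that CDC is enabled at both the database and the table level:

```sql
-- Is CDC enabled for the database?
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = 'dbDataRoom';

-- Is CDC tracking enabled for the captured tables?
SELECT name, is_tracked_by_cdc
FROM sys.tables
WHERE name IN ('tblFolder', 'tblDocument');

-- Show capture instance details (captured columns, start LSN, etc.)
EXEC sys.sp_cdc_help_change_data_capture;
```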

While the timing correlation may be a valid lead, these steps are how you would surface any problem and narrow down its scope.

The problem was the polling interval set for the CDC capture job in SQL Server.
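The capture job's polling interval can be inspected and reset with SQL Server's built-in CDC procedures. A sketch of how this is typically done (the 5-second value shown is SQL Server's default, not a value taken from the question):

```sql
-- Inspect the current CDC job settings; look at the pollinginterval column
EXEC sys.sp_cdc_help_jobs;

-- Set the capture job to poll the transaction log every 5 seconds
EXEC sys.sp_cdc_change_job
    @job_type        = N'capture',
    @pollinginterval = 5;

-- Restart the capture job so the new setting takes effect
EXEC sys.sp_cdc_stop_job  @job_type = N'capture';
EXEC sys.sp_cdc_start_job @job_type = N'capture';
```

If the polling interval had been set to a full day, that would explain events arriving in one daily batch at the same time.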
