Kafka Connect setup to send records from Aurora using AWS MSK



I have to send records from Aurora/MySQL to MSK, and from there to the Elasticsearch service.

Aurora --> Kafka Connect --> AWS MSK --> Kafka Connect --> Elasticsearch

A record in the Aurora table looks something like the one below.
I think records will arrive in AWS MSK in this format.

"o36347-5d17-136a-9749-Oe46464",0,"NEW_CASE","WRLDCHK","o36347-5d17-136a-9749-Oe46464","<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?><caseCreatedPayload><batchDetails/>","CASE",08-JUL-17 10.02.32.217000000 PM,"TIME","UTC","ON","0a348753-5d1e-17a2-9749-3345,MN4,","","0a348753-5d1e-17af-9749-FGFDGDFV","EOUHEORHOE","2454-5d17-138e-9749-setwr23424","","","",,"","",""

For Elasticsearch to consume these records I need a proper schema, so I think I have to use Schema Registry.

My questions

Question 1

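How should I use Schema Registry for the type of message shown above? Is Schema Registry required at all? Do I have to create a JSON structure for it, and if so, where do I keep it? I need more help to understand this.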

I have edited

vim /usr/local/confluent/etc/schema-registry/schema-registry.properties

and pointed it at ZooKeeper, but I did not understand what kafkastore.topic=_schema is or how to link it to a custom schema.

I even started it and got this error:

Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Topic _schemas not present in metadata after 60000 ms.

This is what I expected, because I did not do anything about the schema.

I do have the JDBC connector installed, and when I start it I get the error below:

Invalid value java.sql.SQLException: No suitable driver found for jdbc:mysql://123871289-eruyre.cluster-ceyey.us-east-1.rds.amazonaws.com:3306/trf?user=admin&password=Welcome123 for configuration Couldn't open connection to jdbc:mysql://123871289-eruyre.cluster-ceyey.us-east-1.rds.amazonaws.com:3306/trf?user=admin&password=Welcome123
Invalid value java.sql.SQLException: No suitable driver found for jdbc:mysql://123871289-eruyre.cluster-ceyey.us-east-1.rds.amazonaws.com:3306/trf?user=admin&password=Welcome123 for configuration Couldn't open connection to jdbc:mysql://123871289-eruyre.cluster-ceyey.us-east-1.rds.amazonaws.com:3306/trf?user=admin&password=Welcome123
You can also find the above list of errors at the endpoint `/{connectorType}/config/validate`

Question 2: Can I create two connectors on one EC2 instance (a JDBC one and an Elasticsearch one)? If yes, do I have to start them in separate CLIs?

Question 3: When I open /usr/local/confluent/etc/kafka-connect-jdbc/source-quickstart-sqlite.properties in vim, I see only property values like these:

name=test-source-sqlite-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://123871289-eruyre.cluster-ceyey.us-east-1.rds.amazonaws.com:3306/trf?user=admin&password=Welcome123
mode=incrementing
incrementing.column.name=id
topic.prefix=trf-aurora-fspaudit-

Where in the above properties file can I specify the schema name and table name?

Based on the answer, I am updating my configuration for Kafka Connect JDBC.

---------------Start JDBC Connect Elasticsearch-------------------------

wget http://packages.confluent.io/archive/5.2/confluent-5.2.0-2.11.tar.gz -P ~/Downloads/
tar -zxvf ~/Downloads/confluent-5.2.0-2.11.tar.gz -C ~/Downloads/
sudo mv ~/Downloads/confluent-5.2.0 /usr/local/confluent
wget https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.48.tar.gz
tar -xzf  mysql-connector-java-5.1.48.tar.gz
sudo mv mysql-connector-java-5.1.48 /usr/local/confluent/share/java/kafka-connect-jdbc

Then

vim /usr/local/confluent/etc/kafka-connect-jdbc/source-quickstart-sqlite.properties

Then I modified the properties below:

connection.url=jdbc:mysql://fdgfgdfgrter.us-east-1.rds.amazonaws.com:3306/trf
mode=incrementing
connection.user=admin
connection.password=Welcome123
table.whitelist=PANStatementInstanceLog
schema.pattern=dbo

Last, I modified

vim /usr/local/confluent/etc/kafka/connect-standalone.properties

Here I modified the properties below:

bootstrap.servers=b-3.205147-ertrtr.erer.c5.ertert.us-east-1.amazonaws.com:9092,b-6.ertert-riskaudit.ertet.c5.kafka.us-east-1.amazonaws.com:9092,b-1.ertert-riskaudit.ertert.c5.kafka.us-east-1.amazonaws.com:9092
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
plugin.path=/usr/local/confluent/share/java

When I list the topics, I do not see any topic for the table name.

Stack trace for the error message:

[2020-01-03 07:40:57,169] ERROR Failed to create job for /usr/local/confluent/etc/kafka-connect-jdbc/source-quickstart-sqlite.properties (org.apache.kafka.connect.cli.ConnectStandalone:108)
[2020-01-03 07:40:57,169] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:119)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 2 error(s):
Invalid value com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. for configuration Couldn't open connection to jdbc:mysql://****.us-east-1.rds.amazonaws.com:3306/trf
Invalid value com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. for configuration Couldn't open connection to jdbc:mysql://****.us-east-1.rds.amazonaws.com:3306/trf
You can also find the above list of errors at the endpoint `/{connectorType}/config/validate`
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:79)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:66)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:116)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 2 error(s):
Invalid value com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. for configuration Couldn't open connection to jdbc:mysql://****.us-east-1.rds.amazonaws.com:3306/trf
Invalid value com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. for configuration Couldn't open connection to jdbc:mysql://****.us-east-1.rds.amazonaws.com:3306/trf
You can also find the above list of errors at the endpoint `/{connectorType}/config/validate`
at org.apache.kafka.connect.runtime.AbstractHerder.maybeAddConfigErrors(AbstractHerder.java:423)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:188)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:113)
curl -X POST -H "Accept:application/json" -H "Content-Type:application/json" IPaddressOfKCnode:8083/connectors/ -d '{"name": "emp-connector", "config": { "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector", "tasks.max": "1", "connection.url": "jdbc:mysql://IPaddressOfLocalMachine:3306/test_db?user=root&password=pwd","table.whitelist": "emp","mode": "timestamp","topic.prefix": "mysql-" } }'

Is Schema Registry required?

No. You can enable schemas in JSON records. The JDBC source can create them for you based on the table information.

value.converter=org.apache.kafka...JsonConverter 
value.converter.schemas.enable=true
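
As a minimal sketch of the converter settings above, the script below writes the converter section of a Connect worker config and an illustrative record value. The file names, the scratch directory, and the columns `id` and `eventType` are assumptions made up for the example; with schemas.enable=true each JSON value is wrapped in an envelope with a `schema` part and a `payload` part, which is what lets a sink work without Schema Registry.

```shell
#!/bin/sh
set -eu

# Scratch directory for the example; a real worker reads its own config file.
CONF_DIR=$(mktemp -d)

# Converter section of a Connect worker config: JSON with the schema embedded.
cat > "$CONF_DIR/worker-json.properties" <<'EOF'
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
EOF

# Illustrative shape of one record value (column names are hypothetical).
cat > "$CONF_DIR/sample-record.json" <<'EOF'
{
  "schema": {
    "type": "struct",
    "optional": false,
    "fields": [
      {"field": "id", "type": "int64", "optional": false},
      {"field": "eventType", "type": "string", "optional": true}
    ]
  },
  "payload": {"id": 1, "eventType": "NEW_CASE"}
}
EOF

cat "$CONF_DIR/worker-json.properties"
```

The trade-off is message size: every record carries its own schema, which is simple to operate but more verbose than Avro with a registry.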

Mentioned ZooKeeper, but I did not understand what kafkastore.topic=_schema is

If you want to use Schema Registry, you should use kafkastore.bootstrap.servers with the Kafka address, not ZooKeeper. So remove kafkastore.connection.url.
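
A hedged sketch of what such a schema-registry.properties could look like is shown below. The broker host names are placeholders (not the asker's real MSK endpoints), the file is written to a scratch directory instead of /usr/local/confluent/etc/schema-registry/, and _schemas is used because that is the topic name in the error message from the question.

```shell
#!/bin/sh
set -eu

# Scratch directory so the example does not touch a real installation.
CONF_DIR=$(mktemp -d)

# Point the registry at the Kafka brokers directly; no ZooKeeper URL is set.
# The broker addresses below are placeholders for the MSK bootstrap brokers.
cat > "$CONF_DIR/schema-registry.properties" <<'EOF'
listeners=http://0.0.0.0:8081
kafkastore.bootstrap.servers=PLAINTEXT://b-1.example.kafka.us-east-1.amazonaws.com:9092,PLAINTEXT://b-2.example.kafka.us-east-1.amazonaws.com:9092
kafkastore.topic=_schemas
EOF

cat "$CONF_DIR/schema-registry.properties"
```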

Please read the docs for explanations of all the properties.

I did not do anything about the schema.

That doesn't matter. The schemas topic gets created when the Registry first starts.

Can I create two connectors on one EC2 instance?

Yes (ignoring available JVM heap space). Again, this is detailed in the Kafka Connect documentation.

Using standalone mode, you first pass the Connect worker configuration, then up to N connector properties files in one command.

Using distributed mode, you use the Kafka Connect REST API.

https://docs.confluent.io/current/connect/managing/configuring.html
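
The standalone case can be sketched roughly as follows: one worker process, with the worker config first and then one properties file per connector. Everything specific here is an assumption for illustration: the Elasticsearch endpoint, the topic name mysql-emp, the mysql-source.properties file name, and the type.name setting (which some connector versions require and later ones may not). The script only prints the command instead of starting a worker.

```shell
#!/bin/sh
set -eu

CONF_DIR=$(mktemp -d)

# Hypothetical Elasticsearch sink; endpoint and topic name are placeholders.
cat > "$CONF_DIR/es-sink.properties" <<'EOF'
name=es-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=mysql-emp
connection.url=https://my-es-domain.example.com:443
type.name=_doc
key.ignore=true
EOF

# Placeholder paths; both files are assumed to exist on the EC2 host.
WORKER=/usr/local/confluent/etc/kafka/connect-standalone.properties
JDBC_SOURCE=/usr/local/confluent/etc/kafka-connect-jdbc/mysql-source.properties

# Worker config first, then every connector file, all in a single command.
CMD="connect-standalone $WORKER $JDBC_SOURCE $CONF_DIR/es-sink.properties"

# Printed rather than executed, so the sketch is safe to run anywhere.
echo "would run: $CMD"
```

Both connectors then share one JVM, which is why the available heap is the practical limit mentioned above.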

When I open vim /usr/local/confluent/etc/kafka-connect-jdbc/source-quickstart-sqlite.properties

First of all, that file is for SQLite, not MySQL/Postgres. You don't need to use the quickstart files; they are only there for reference.

Again, all the properties are well documented.

https://docs.confluent.io/current/connect/kafka-connect-jdbc/index.html#connect-jdbc
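
For reference, a dedicated MySQL source file (instead of an edited SQLite quickstart) might look like the sketch below. The property names are the ones already used in the question; the connector name, the cluster host name, and the password value are placeholders, and the table is assumed to have an auto-incrementing id column.

```shell
#!/bin/sh
set -eu

CONF_DIR=$(mktemp -d)

# Hypothetical JDBC source for Aurora/MySQL. The database (trf) is selected in
# the JDBC URL; credentials are kept out of the URL in separate properties.
cat > "$CONF_DIR/mysql-source.properties" <<'EOF'
name=aurora-mysql-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://my-cluster.cluster-example.us-east-1.rds.amazonaws.com:3306/trf
connection.user=admin
connection.password=CHANGE_ME
mode=incrementing
incrementing.column.name=id
table.whitelist=PANStatementInstanceLog
topic.prefix=trf-aurora-fspaudit-
EOF

cat "$CONF_DIR/mysql-source.properties"
```

With these values the connector would write to a topic named topic.prefix plus the table name, that is trf-aurora-fspaudit-PANStatementInstanceLog.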

I do have the JDBC connector installed, and when I start it I get the error below

Here is more information about how you can debug that:

https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/


As mentioned, I would personally suggest using Debezium/CDC where possible.

Debezium connector for RDS Aurora

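I'm guessing you plan to use Avro to transfer the data, so don't forget to specify AvroConverter as the default converter when you start your Kafka Connect workers. If you use JSON instead, Schema Registry is not required.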
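
If Avro is the choice, the worker-level converter settings could be sketched like this. The registry URL http://localhost:8081 is an assumption (a registry running on the same host with its default listener), and the file is written to a scratch directory rather than into a real worker config.

```shell
#!/bin/sh
set -eu

CONF_DIR=$(mktemp -d)

# Converter section of a Connect worker config using Avro with Schema Registry.
# The registry URL is a placeholder for wherever the registry actually listens.
cat > "$CONF_DIR/worker-avro.properties" <<'EOF'
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
EOF

cat "$CONF_DIR/worker-avro.properties"
```

Whichever converter the source worker uses, the sink worker has to use the same one to read the topic back.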

1.1 kafkastore.topic=_schema

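Have you started your own Schema Registry? When you start Schema Registry you have to specify the "schemas" topic. Basically, Schema Registry uses this topic to store the schemas it registers, and in case of failure it can recover them from there.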

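1.2 jdbc connector installed and when i start i get below error

By default, the JDBC connector ships with drivers for SQLite and PostgreSQL only. If you want it to work with a MySQL database, you also have to add the MySQL driver to the classpath.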
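
One way to check whether the driver is where the connector can see it is sketched below. The helper name has_mysql_driver is made up for this example, and the default directory is the kafka-connect-jdbc folder from the question; set PLUGIN_DIR to check another location.

```shell
#!/bin/sh
set -eu

# Succeeds if a MySQL Connector/J JAR exists anywhere under the given directory.
has_mysql_driver() {
  [ -n "$(find "$1" -name 'mysql-connector-java-*.jar' 2>/dev/null | head -n 1)" ]
}

# Directory of the JDBC connector plugin; override PLUGIN_DIR for other layouts.
PLUGIN_DIR=${PLUGIN_DIR:-/usr/local/confluent/share/java/kafka-connect-jdbc}

if has_mysql_driver "$PLUGIN_DIR"; then
  echo "MySQL driver found under $PLUGIN_DIR"
else
  echo "No MySQL driver under $PLUGIN_DIR; copy the Connector/J JAR there and restart the worker"
fi
```

This only addresses the "No suitable driver found" error; a "Communications link failure" means the driver loaded but could not reach the database host.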

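2. It depends on how you deploy your Kafka Connect workers. If you go for distributed mode (recommended), you don't really need separate CLIs. You can deploy your connectors through the Kafka Connect REST API.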

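3. There is another property named table.whitelist in which you can specify your schemas and tables. For example: table.whitelist=Users,Products,Transactions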
