READING THE CONTENTS OF A FILE WITH A KAFKA PRODUCER - THE FILESOURCE CONNECTOR



How can you read the contents of a file with a Kafka producer? The typical solution you find (piping the file into the console producer with |) looks hacky.
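
For completeness, that pipe approach looks something like this (the exact flag depends on your Kafka version; older releases use --broker-list instead of --bootstrap-server):

cat /tmp/test.txt | bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic connect-test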

I recently found a solution cleaner than piping the file's contents into the producer shell: the FileSource connector.

According to the linked docs, the FileSource connector is designed precisely for the "read file data into a producer" use case, e.g. inspecting the contents of a log file and raising an alert whenever [ERROR] or [FATAL] is encountered.
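
As a minimal sketch of that alerting use case (assuming the lines end up on the connect-test topic configured below), the consumer side can be as simple as:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test | grep -E '\[ERROR\]|\[FATAL\]'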

The full command is (assuming we are in Kafka's root folder):

bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties

There are two properties files to configure:

  • config/connect-standalone.properties
  • config/connect-file-source.properties

The first one defines how the standalone Connect worker runs and connects to Kafka. It looks like this:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# These are defaults. This file just demonstrates how to override some settings.
bootstrap.servers=localhost:9092
# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=false
value.converter.schemas.enable=false
# The internal converter used for offsets and config data is configurable and must be specified, but most users will
# always want to use the built-in default. Offset and config data is never visible outside of Kafka Connect in this format.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
#plugin.path=

Pretty straightforward. Only two things to note:

  • bootstrap.servers=localhost:9092: the Kafka bootstrap server
  • (internal.)key/value.converter.schemas.enable=false: these must be set to false so that each line of the file is parsed as a plain string
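
To illustrate the difference (my own example, not from the original config): with schemas.enable=true the JsonConverter wraps every line in a schema envelope, while with false only the raw payload is written to the topic:

# schemas.enable=true
{"schema":{"type":"string","optional":false},"payload":"a line from the file"}

# schemas.enable=false
"a line from the file"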

The second file is even simpler:

name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/test.txt
topic=connect-test

  • file: the file to read
  • topic: the topic the lines are published to, for a consumer to listen on
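
To check that it works end to end, append a line to the file and watch the topic with the console consumer shipped with Kafka (topic name taken from the config above):

echo "hello connect" >> /tmp/test.txt

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning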

That is enough if you then want to consume the content with Storm.

If, instead of reading a file, you want to write content from Kafka into a file, use the FileSink connector. I haven't used it myself, but I suppose it works the same way, just on the consumer side. Its configuration file is config/connect-file-sink.properties.
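
For reference, a sketch of that sink config, mirroring the source config above (note that the sink takes topics, plural, and file is now the output file; the exact defaults may differ between Kafka versions):

name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=/tmp/test.sink.txt
topics=connect-test

It is started the same way:

bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-sink.properties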
