I am trying to connect to Kafka from Spark Streaming for a small POC, using the code below.
This is how I start Kafka:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This is my Spark Streaming code, which receives messages and prints them to the console:
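import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ReadingFromKafkaSource extends App {
  Logger.getLogger("org").setLevel(Level.ERROR)
  val conf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("test")
  // 20-second batch interval
  val streamingContext = new StreamingContext(conf, Seconds(20))
  val lines = KafkaUtils.createStream(streamingContext, "localhost:9092", "spark-streaming-configuration-group", Map("test" -> 1))
  lines.print()
  streamingContext.start()
  streamingContext.awaitTermination()
}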
I get the following error output:
14:45:26.002 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326000
14:45:26.204 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326200
14:45:26.405 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326400
14:45:26.601 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326600
14:45:26.801 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326800
14:45:27.000 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327000
14:45:27.201 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327200
14:45:27.244 [Executor task launch worker for task 99] DEBUG org.apache.zookeeper.ZooKeeper - Closing session: 0x0
14:45:27.244 [Executor task launch worker for task 99] DEBUG org.apache.zookeeper.ClientCnxn - Closing client for session: 0x0
14:45:27.401 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327400
14:45:27.600 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327600
14:45:27.742 [Executor task launch worker for task 99-SendThread(localhost:9092)] DEBUG org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x0 : Client session timed out, have not heard from server in 3005ms for sessionid 0x0
14:45:27.801 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327800
14:45:27.844 [Executor task launch worker for task 99] DEBUG org.apache.zookeeper.ClientCnxn - Disconnecting client for session: 0x0
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.zookeeper.ZooKeeper - Session: 0x0 closed
14:45:27.844 [Executor task launch worker for task 99-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Stopping receiver with message: Error starting receiver 0: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Called receiver onStop
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Deregistering receiver 0
14:45:27.845 [dispatcher-event-loop-1] ERROR org.apache.spark.streaming.scheduler.ReceiverTracker - Deregistered receiver for stream 0: Error starting receiver 0 - org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:156)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:130)
at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:191)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:139)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:156)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:109)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:607)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:597)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2173)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2173)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Kafka itself is working fine, but Spark Streaming reports a problem connecting to the ZooKeeper service.
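You passed the Kafka broker's port, but you should pass ZooKeeper's port instead (as you can see in the documentation), which is 2181 by default. Try using localhost:2181 instead of localhost:9092. That should resolve the problem (assuming both Kafka and ZooKeeper are running).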
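For illustration, here is a minimal sketch of the corrected call. It assumes the receiver-based connector (the org.apache.spark.streaming.kafka package that appears in the stack trace) and a ZooKeeper instance listening on localhost:2181. The object name ReadingFromKafkaSourceFixed is made up for this example.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical name; same job as in the question with only the address changed.
object ReadingFromKafkaSourceFixed extends App {
  val conf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("test")
  val streamingContext = new StreamingContext(conf, Seconds(20))

  // The second argument of createStream is the ZooKeeper quorum (host:port),
  // not the Kafka broker list. 2181 is assumed to be the ZooKeeper client port.
  val zkQuorum = "localhost:2181"
  val lines = KafkaUtils.createStream(
    streamingContext,
    zkQuorum,
    "spark-streaming-configuration-group",
    Map("test" -> 1)) // topic name -> number of receiver threads

  lines.print() // each element is a (key, message) pair
  streamingContext.start()
  streamingContext.awaitTermination()
}
```

The broker address localhost:9092 is still what the console producer uses. Only this receiver-based consumer goes through ZooKeeper.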
I had the same error. I resolved it by changing the port number from 9092 to 2181 (taken from the clientPort=2181 property in zoo.cfg).
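If you would rather not hard-code the port, something along these lines could read it from the ZooKeeper config instead. This is only a sketch: the object ZkQuorumFromConfig and its zkQuorum helper are hypothetical names, the host defaults to localhost, and falling back to 2181 when clientPort is missing is an assumption made for this example.

```scala
import java.io.StringReader
import java.util.Properties

// Hypothetical helper, not part of Spark or Kafka.
object ZkQuorumFromConfig {
  // Builds a "host:port" string from the text of a ZooKeeper properties file
  // (for example zoo.cfg or config/zookeeper.properties).
  def zkQuorum(configText: String, host: String = "localhost"): String = {
    val props = new Properties()
    props.load(new StringReader(configText))
    // Assumption for this sketch: use 2181 if clientPort is not set.
    val port = props.getProperty("clientPort", "2181").trim
    s"$host:$port"
  }
}
```

The returned string can then be passed as the second argument of KafkaUtils.createStream in place of the literal address.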