Spark Streaming + Kafka to HDFS



When I try to consume messages from a Kafka topic using Spark Streaming, I get the following error:

scala> val kafkaStream = KafkaUtils.createStream(ssc, "<ipaddress>:2181","spark-streaming-consumer-group", Map("test1" -> 5))

Error:

`missing or invalid dependency detected while loading class file 'KafkaUtils.class'.
Could not access term kafka in package <root>,
because it (or its dependencies) are missing. Check your build definition for
missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
A full rebuild may help if 'KafkaUtils.class' was compiled against an incompatible version of <root>.`

Scala version: 2.11.8, Spark version: 2.1.0.2.6.0.3-8

I have tried all sorts of spark-streaming-kafka libraries, but nothing worked.

I am running the code from the spark-shell:

./spark-shell --jars /data/home/local/504/spark-streaming-kafka_2.10-1.5.1.jar, /data/home/local/504/spark-streaming_2.10-1.5.1.jar

Code:

import org.apache.spark.SparkConf
val conf = new SparkConf().setMaster("local[*]").setAppName("KafkaReceiver")
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.Seconds
val ssc = new StreamingContext(conf, Seconds(10))
import org.apache.spark.streaming.kafka.KafkaUtils
val kafkaStream = KafkaUtils.createStream(ssc, "<ipaddress>:2181","spark-streaming-consumer-group", Map("test1" -> 5))

Any suggestions on this issue?

Since you are using Scala 2.11 and Spark 2.1.0, you should use jars built for those versions, not the _2.10-1.5.1 ones:

  • spark-streaming-kafka-0-10_2.11-2.1.0.jar
  • spark-streaming_2.11-2.1.0.jar

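These are the right jars if you are using Kafka 0.10+; otherwise, pick the Kafka integration artifact that matches your broker version.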
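For the spark-shell session in the question, the matching artifact can also be resolved at launch instead of pointing at jar files by hand. This is only a minimal sketch: it assumes the machine can reach Maven Central, and that the stock 2.1.0 artifact works with the vendor build 2.1.0.2.6.0.3-8, which has not been verified here. The variable names are just for illustration.

```shell
# Assumed versions, taken from the question (Scala 2.11.x, Spark 2.1.0)
SCALA_BINARY=2.11
SPARK_VERSION=2.1.0

# --packages takes Maven coordinates and also resolves transitive
# dependencies such as kafka-clients; spark-streaming itself is already
# on the spark-shell classpath, so it does not need to be listed.
./spark-shell --packages "org.apache.spark:spark-streaming-kafka-0-10_${SCALA_BINARY}:${SPARK_VERSION}"
```

If you stay with --jars, note that it expects a single comma-separated list with no spaces between the entries.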

A simple program looks like this:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

// sc is the SparkContext that spark-shell provides
val streamingContext = new StreamingContext(sc, Seconds(5))

// Parameters for Kafka
val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "servers",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "test-consumer-group",
  "auto.offset.reset" -> "earliest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val topics = "topics,separated,by,comma".split(",")

// Create the DStream
val stream = KafkaUtils.createDirectStream[String, String](
  streamingContext,
  PreferConsistent,
  Subscribe[String, String](topics, kafkaParams)
)

// stream.print()
// value() is already a String because of the StringDeserializer
stream.map(_.value()).print()

// Nothing is consumed until the streaming context is started
streamingContext.start()

Hope this helps!
