Streaming Spark exception in thread "main"



I am trying to stream Twitter data with Spark and Scala using sbt. Everything was going fine, but I have a problem:

Here is my build.sbt:

import Assembly._
import AssemblyPlugin._

name := "TwitterSparkStreaming"
version := "0.1"
scalaVersion := "2.12.3"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "1.5.2",
  "org.apache.spark" % "spark-sql_2.11" % "1.5.2",
  "org.apache.spark" % "spark-streaming_2.11" % "1.5.2",
  "org.apache.spark" % "spark-streaming-twitter_2.11" % "1.6.3",
  "joda-time" %% "joda-time" % "2.9.1",
  "org.twitter4j" % "twitter4j-core" % "3.0.3",
  "org.twitter4j" % "twitter4j-stream" % "3.0.3",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.5.2",
  "edu.stanford.nlp" % "stanford-corenlp" % "3.5.2" classifier "models"
)

resolvers += "Akka Repository" at "http://repo.akka.io./releases/"

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}

Here is the class that uses org.apache.spark.Logging:

import org.apache.log4j.{Logger, Level}
import org.apache.spark.Logging

object LogUtils extends Logging {
  def setStreamingLogLevels(): Unit = {
    val log4jInitialized = Logger.getRootLogger.getAllAppenders.hasMoreElements
    if (!log4jInitialized) {
      logInfo("Setting log level to [WARN] for streaming example." +
        " To override add a custom log4j.properties to the classpath.")
      Logger.getRootLogger.setLevel(Level.WARN)
    }
  }
}

Here is the error I get:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.Logging.$init$(Lorg/apache/spark/Logging;)V
at LogUtils$.<init>(LogUtils.scala:4)
at LogUtils$.<clinit>(LogUtils.scala)
at TwitterStreaming$.main(TwitterStreaming.scala:30)
at TwitterStreaming.main(TwitterStreaming.scala)

How can I fix it?

Note: I tried changing the org.apache.spark dependencies from version 2.2.0 to 1.5.2, but the problem stays the same.

I'm not sure why this code fails, but there is a better way to set the log level in Spark.

Please refer to the link https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#setLogLevel-java.lang.String-

Spark exposes this method on the SparkContext, so you can call:

sparkContext.setLogLevel("WARN")
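
For context, here is a minimal sketch of how that could look in a streaming application; the object name, master, app name, and batch interval below are illustrative assumptions, not taken from the question:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TwitterStreamingSketch {
  def main(args: Array[String]): Unit = {
    // Illustrative configuration; master and app name are assumptions.
    val conf = new SparkConf().setMaster("local[2]").setAppName("TwitterSparkStreaming")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Set the log level directly on the underlying SparkContext,
    // instead of using a custom LogUtils object that extends org.apache.spark.Logging.
    ssc.sparkContext.setLogLevel("WARN")

    // ... create the Twitter DStream and define the processing here ...

    ssc.start()
    ssc.awaitTermination()
  }
}

This avoids depending on org.apache.spark.Logging, which is not part of Spark's public API, while setLogLevel is available on SparkContext.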
