I am trying to get started with Spark. I have Hadoop (3.3.1) and Spark (3.2.2) installed. I have set SPARK_HOME, PATH, HADOOP_HOME, and LD_LIBRARY_PATH to their respective paths. I am also running JDK 17 (echo-ing the variable and running java -version in the terminal both work fine).
However, I still get the following error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/10/25 17:17:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x1f508f09) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x1f508f09
at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:110)
at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:348)
at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:287)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:191)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:460)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:106)
... 55 elided
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 17.0.1)
Type in expressions to have them evaluated.
Type :help for more information.
Is there any way to solve this?
Java 17 isn't supported - Spark runs on Java 8/11 (source: https://spark.apache.org/docs/latest/).
Install Java 11 and point Spark to it.
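A minimal sketch of pointing the environment at Java 11 before launching spark-shell (the JDK path below is an assumption for a Debian/Ubuntu-style layout; substitute wherever your Java 11 actually lives):

```shell
# Assumed install path for OpenJDK 11 - adjust to your system.
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
echo "$JAVA_HOME"
# Verify before launching spark-shell:
#   java -version   # should report 11.x
```

Spark's launcher scripts use JAVA_HOME when it is set, so this makes spark-shell start on the Java 11 JVM instead of 17.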
The warning unable to load native-hadoop library for platform
is very common and does not mean anything is wrong.
Open a terminal and add the following command to ~/.bashrc.
Make sure the native directory is appended after lib, exactly as shown below:
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Then run source ~/.bashrc in the terminal.
Try this; it may help you.
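Put together, a minimal sketch of the ~/.bashrc addition (this assumes HADOOP_HOME is already set to your Hadoop 3.3.1 install directory):

```shell
# Append to ~/.bashrc, then reload with: source ~/.bashrc
# Assumes HADOOP_HOME already points at your Hadoop install.
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
echo "$HADOOP_OPTS"   # sanity-check the resulting option string
# Afterwards you can check which native libraries Hadoop finds with:
#   hadoop checknative -a
```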
If you install openjdk@11 via brew on a Mac, it prints some caveats telling you what to do. I kept hitting the same problem until I followed these steps:
For the system Java wrappers to find this JDK,
symlink it with sudo ln -sfn /usr/local/opt/openjdk@11/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk-11.jdk
If you need to have openjdk@11 first in your PATH, run:
echo 'export PATH="/usr/local/opt/openjdk@11/bin:$PATH"' >> ~/.zshrc
For compilers to find openjdk@11, you may need to set:
export CPPFLAGS="-I/usr/local/opt/openjdk@11/include"
I ran into the same problem recently, and in my case it was the Java version. You need to use Java 11/17 for Spark versions > 3.2.0 - refer to the documentation here.
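If staying on Java 17 is a hard requirement, a commonly reported stopgap for Spark 3.2.x is to open the JDK-internal package named in the error via a JVM flag. This is an unsupported workaround, not the documented route; Spark 3.3+ adds the required module options itself:

```shell
# SPARK_SUBMIT_OPTS is read by spark-shell/spark-submit for the driver JVM
# in client mode. --add-exports opens java.base/sun.nio.ch to the unnamed
# module, which is exactly what the IllegalAccessError complains about.
export SPARK_SUBMIT_OPTS="--add-exports=java.base/sun.nio.ch=ALL-UNNAMED"
echo "$SPARK_SUBMIT_OPTS"
# then launch spark-shell as usual
```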