Spark: Unable to load native-hadoop library for your platform



I'm trying to get started with Spark. I have Hadoop (3.3.1) and Spark (3.2.2) installed, and I've set SPARK_HOME, PATH, HADOOP_HOME and LD_LIBRARY_PATH to their respective paths. I'm also running JDK 17 (echo and -version work fine in the terminal).

However, I still get the following error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/10/25 17:17:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x1f508f09) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x1f508f09
at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:110)
at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:348)
at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:287)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:191)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:460)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:106)
... 55 elided
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 17.0.1)
Type in expressions to have them evaluated.
Type :help for more information.

Is there any way to fix this?

Java 17 isn't supported - Spark runs on Java 8/11 (source: https://spark.apache.org/docs/latest/).

Install Java 11 and point Spark to it.
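A minimal sketch of what "point Spark to it" can look like, assuming Java 11 lives under /usr/lib/jvm/java-11-openjdk-amd64 (a typical Linux package path, adjust it for your install):

# Option 1: export JAVA_HOME in the shell before launching spark-shell
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"

# Option 2: set it once for Spark in $SPARK_HOME/conf/spark-env.sh
echo 'export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64' >> "$SPARK_HOME/conf/spark-env.sh"

# Verify which Java Spark will pick up
"$JAVA_HOME/bin/java" -version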

The warning unable to load native-hadoop library for your platform is very common and does not mean anything is wrong.
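If you want to confirm that (or check whether the native libraries can be loaded at all), Hadoop ships a diagnostic command; this assumes $HADOOP_HOME/bin is on your PATH:

# Reports which native components (hadoop, zlib, snappy, ...) were found
hadoop checknative -a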

Open a terminal and open your ~/.bashrc file in an editor.

Make sure the native folder is appended after lib, as shown below:

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Then run source ~/.bashrc in the terminal.

Give this a try, it might help.
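Putting that answer together, the lines added to ~/.bashrc typically look like this; LD_LIBRARY_PATH is the companion setting the question already mentions, and $HADOOP_HOME is assumed to point at your Hadoop install:

# Native Hadoop libraries
export LD_LIBRARY_PATH="$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH"
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Then reload with source ~/.bashrc and start spark-shell again.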

If you install openjdk@11 via brew on a Mac, it prints a few caveats telling you exactly what to do. I kept running into the same problem until I followed these steps:

For the system Java wrappers to find this JDK, symlink it with:

sudo ln -sfn /usr/local/opt/openjdk@11/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk-11.jdk

If you need openjdk@11 to come first in your PATH, run:

echo 'export PATH="/usr/local/opt/openjdk@11/bin:$PATH"' >> ~/.zshrc

For compilers to find openjdk@11, you may need to set:

export CPPFLAGS="-I/usr/local/opt/openjdk@11/include"
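Once those caveats are applied, a short sketch of making Spark use that JDK (macOS's java_home helper resolves the symlinked JDK; the version argument assumes the openjdk@11 install above):

export JAVA_HOME="$(/usr/libexec/java_home -v 11)"
export PATH="$JAVA_HOME/bin:$PATH"
java -version    # should now report OpenJDK 11
spark-shell      # picks up JAVA_HOME from the environment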

I recently ran into the same problem, and it was also the Java version. Java 17 is only supported on Spark versions newer than 3.2.0; refer to the documentation.
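Before changing anything, it can help to confirm which combination you are actually running; the commands below are standard:

java -version             # the Java runtime on PATH
echo $JAVA_HOME           # the JDK Spark prefers when this is set
spark-submit --version    # prints the Spark version and the Scala/JVM it runs on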