py4j.protocol.Py4JJavaError: An error occurred while calling o49.csv



I am new to PySpark and am running it on my local machine. I am trying to write a PySpark DataFrame out to CSV files, so I wrote the following code:

dataframe.write.mode('append').csv(outputPath)
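
For context, a minimal self-contained reproduction of that write step looks roughly like this; the app name, sample rows, and output path below are placeholders rather than my real data:

    from pyspark.sql import SparkSession

    # Local SparkSession (placeholder app name).
    spark = SparkSession.builder.appName("csv-write-repro").getOrCreate()

    # Placeholder rows standing in for the real DataFrame.
    dataframe = spark.createDataFrame(
        [("u1", "2021-01-01"), ("u2", "2021-01-02")],
        ["userId", "visitDate"],
    )

    outputPath = "output/sessions"  # placeholder path

    # Append the DataFrame as CSV files under outputPath;
    # this is the call that fails with the error below.
    dataframe.write.mode("append").csv(outputPath)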

But I get the following error:

Traceback (most recent call last):
File "D:PycharmProjectspythonProjectorgsparkweblogSparkWebLogsAnalysis.py", line 71, in <module>
weblog_sessionIds.write.mode('append').csv(outputPath)
File "C:spark-3.1.2-bin-hadoop3.2pythonpysparksqlreadwriter.py", line 1372, in csv
self._jwrite.csv(path)
File "C:spark-3.1.2-bin-hadoop3.2pythonlibpy4j-0.10.9-src.zippy4jjava_gateway.py", line 1304, in __call__
File "C:spark-3.1.2-bin-hadoop3.2pythonpysparksqlutils.py", line 111, in deco
return f(*a, **kw)
File "C:spark-3.1.2-bin-hadoop3.2pythonlibpy4j-0.10.9-src.zippy4jprotocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o49.csv.
: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Ljava/lang/String;I)V
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode(NativeIO.java:560)
at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:534)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:587)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:559)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:586)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:559)
at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:705)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:354)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:178)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:188)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:108)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:106)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:131)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)

Can you suggest how to fix this error?

Latest update:

I resolved this by removing the hadoop.dll file from the winutils folder and using a lower version of Spark.
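
For anyone hitting the same UnsatisfiedLinkError, a related thing worth checking is whether HADOOP_HOME points at a winutils build that matches the Hadoop version bundled with Spark. A rough sketch of that setup, where C:\hadoop is only an example location for the winutils folder, not my actual path:

    import os
    from pyspark.sql import SparkSession

    # Example location of the winutils folder; adjust to where
    # winutils.exe (built for the matching Hadoop version) actually lives.
    os.environ["HADOOP_HOME"] = r"C:\hadoop"
    os.environ["PATH"] = os.environ["HADOOP_HOME"] + r"\bin;" + os.environ["PATH"]

    # Create the session only after the environment is set.
    spark = SparkSession.builder.master("local[*]").appName("csv-write").getOrCreate()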