write.df failing in Spark



I am trying to write a SparkDataFrame using SparkR:

write.df(spark_df, "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/", "csv")

but I get the following error:

InsertIntoHadoopFsRelationCommand: Aborting job.
java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000112/part-r-00112-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet; isDirectory=false; length=331279; replication=1; blocksize=33554432; modification_time=1475566611000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/part-r-00112-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)

In addition, I get the following warning:

WARN FileUtil: Failed to delete file or dir [/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000110/.part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc]: it still exists.

Thanks in advance for your valuable inputs.

Solved this by using the root user. Initially Spark was trying to write as root, but while deleting the temporary files it was running as the login user; changing the login user to root resolved the issue.
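
One quick way to confirm this kind of ownership mismatch before switching users is sketched below. This is illustrative only: the path comes from the question, and the chown step assumes you have sudo rights and that running the whole job as a single user is acceptable in your environment.

# Check who owns the output directory and its _temporary subfolder
ls -ld "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2"
ls -ld "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary"

# Compare with the user the Spark job actually runs as
whoami

# If they differ, align ownership before re-running the job (assumes sudo access)
sudo chown -R "$(whoami)" "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2"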

The checksum files are not being deleted properly. Can you try renaming the checksum (.crc) files and re-executing?

cd "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary/0/task_201610040736_0200_m_000110/"
mv .part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc .part-r-00110-c4c5f30e-343d-4b02-a0f2-e9e5582047e5.snappy.parquet.crc_backup
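
If several task directories are affected, renaming every leftover .crc file under _temporary in one pass may save time. A minimal sketch, assuming the same output path as above; the files are renamed rather than deleted so they can be restored if needed:

# Rename all leftover checksum files to *_backup (the sh -c wrapper handles spaces in paths)
find "/mypartition/enablers/Prod Data/data2/tempdata/tempdata_l2/_temporary" -name '*.crc' -exec sh -c 'mv "$1" "$1_backup"' _ {} \;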
