java - Why does Spark application fail with "IOException: (null) entry in command string: null chmod 0644"?

Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original source: http://stackoverflow.com/questions/48010634/


Why does Spark application fail with "IOException: (null) entry in command string: null chmod 0644"?

java, apache-spark, apache-spark-sql

Asked by John Humanyun

I'm trying to write the dataset results into a single CSV file in Java, using the code below:

dataset.write().mode(SaveMode.Overwrite).option("header", true).csv("C:\\tmp\\csvs");

But the write times out and the file is never written.

It throws org.apache.spark.SparkException: Job aborted.

Error:


org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 16, localhost): java.io.IOException: (null) entry in command string: null chmod 0644 C:\tmp333333testSpark\_temporary\_temporary\attempt_201712282255_0013_m_000000_0\part-r-00000-229fd1b6-ffb9-4ba1-9dc9-89dfdbd0be43.csv
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:770)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
    at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:132)
    at org.apache.spark.sql.execution.datasources.csv.CsvOutputWriter.<init>(CSVRelation.scala:200)
    at org.apache.spark.sql.execution.datasources.csv.CSVOutputWriterFactory.newInstance(CSVRelation.scala:170)
    at org.apache.spark.sql.execution.datasources.BaseWriterContainer.newOutputWriter(WriterContainer.scala:131)
    at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:247)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$$anonfun$apply$mcV$sp.apply(InsertIntoHadoopFsRelationCommand.scala:143)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$$anonfun$apply$mcV$sp.apply(InsertIntoHadoopFsRelationCommand.scala:143)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Answered by Leo C

You might want to narrow down to fixing the following exception:


java.io.IOException: (null) entry in command string: null chmod 0644

Try setting HADOOP_HOME to the directory that contains bin\winutils.exe, as reported in this SO question. If that doesn't help, there is a workaround reported at another SO link.
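If changing the system environment variable is inconvenient, the same thing can be done programmatically: Hadoop's Shell utility checks the hadoop.home.dir JVM system property before falling back to the HADOOP_HOME environment variable. A minimal sketch, assuming winutils.exe has been unpacked to C:\hadoop\bin (an assumed location, adjust it to your machine):

// Must run before the first Spark/Hadoop call, since Hadoop's Shell class
// resolves winutils.exe in a static initializer.
// "C:\\hadoop" is an assumed install location for the Hadoop binaries.
System.setProperty("hadoop.home.dir", "C:\\hadoop");

Once winutils.exe can be resolved, RawLocalFileSystem.setPermission no longer builds the broken "null chmod 0644" command seen in the stack trace.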

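For reference, here is a self-contained sketch of the asker's write with the Windows path escaped correctly in the Java string literal. The coalesce(1) call is an addition beyond the original snippet: Spark writes one part file per partition, so collapsing to a single partition is the usual way to get one CSV file (still placed inside the output directory):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class CsvWriteExample {
    public static void main(String[] args) {
        // Point Hadoop at winutils.exe before anything touches the local
        // filesystem; C:\hadoop is an assumed location (see above).
        System.setProperty("hadoop.home.dir", "C:\\hadoop");

        SparkSession spark = SparkSession.builder()
                .appName("csv-write-example")
                .master("local[*]")
                .getOrCreate();

        // Stand-in for the asker's dataset.
        Dataset<Row> dataset = spark.range(10).toDF("id");

        // coalesce(1) shrinks the output to a single partition, so the
        // directory C:\tmp\csvs ends up holding one part-*.csv file.
        // Note the escaped backslashes in the Java string literal.
        dataset.coalesce(1)
               .write()
               .mode(SaveMode.Overwrite)
               .option("header", true)
               .csv("C:\\tmp\\csvs");

        spark.stop();
    }
}

Even with coalesce(1), the path passed to csv(...) names a directory; the single CSV is the part-*.csv file written inside it.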