java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

问题描述:

在使用IDEA本地开发spark的过程中,运行程序出现以下错误:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
	at org.apache.hadoop.util.Shell.(Shell.java:386)
	at org.apache.hadoop.util.StringUtils.(StringUtils.java:79)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:116)
	at org.apache.hadoop.security.Groups.(Groups.java:93)
	at org.apache.hadoop.security.Groups.(Groups.java:73)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2467)
	at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2467)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2467)
	at org.apache.spark.SparkContext.(SparkContext.scala:292)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2493)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:933)
	at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:924)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:924)
	at com.gewdata.SMSModeling1$.main(SMSModeling1.scala:37)
	at com.gewdata.SMSModeling1.main(SMSModeling1.scala)

解决办法:

解决这个问题其实不需要在windows安装hadoop

  1. 下载winutils.exe,它是在Windows系统上需要的hadoop调试环境工具,百度一下有很多;

  2. 我是将winutils.exe放在本地C:\hadoop\bin\winutils.exe;

  3. 可以在代码中直接指定hadoop的目录:

System.setProperty("hadoop.home.dir","C:\\hadoop")// 指定hadoop目录

或者在系统中将hadoop的环境变量配置上去即可:
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries._第1张图片

在使用spark进行文本分类的时候,利用朴素贝叶斯建模,保存模型出现以下问题:

java.io.IOException: (null) entry in command string: null chmod 0644 D:\sourceCode\Spark\model\metadata\_temporary\0\_temporary\attempt_20181012104232_0034_m_000000_0\part-00000
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:859)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:842)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:661)
	at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:501)
	at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:482)
	at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:498)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:467)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:801)
	at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123)
	at org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil.initWriter(SparkHadoopWriter.scala:224)
	at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:118)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:79)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
18/10/12 10:42:33 WARN TaskSetManager: Lost task 0.0 in stage 5.0 (TID 5, localhost, executor driver): java.io.IOException: (null) entry in command string: null chmod 0644 D:\sourceCode\Spark\model\metadata\_temporary\0\_temporary\attempt_20181012104232_0034_m_000000_0\part-00000
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:859)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:842)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:661)
	at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:501)
	at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:482)
	at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:498)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:467)
	at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:801)
	at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123)
	at org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil.initWriter(SparkHadoopWriter.scala:224)
	at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:118)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:79)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

一直以为是权限的问题,结果直到我解决了上面的问题,这个问题也就不存在啦

你可能感兴趣的:(java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.)