java.io.IOException: Expecting a line not the end of stream
at org.apache.hadoop.fs.DF.parseExecResult(DF.java:109)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:179)
at org.apache.hadoop.util.Shell.run(Shell.java:134)
at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
报该异常是因为运行df命令有中文造成的,
可以将cygwin环境设置成英文环境即可
命令:
export "LANG=en.UTF-8"
生成的索引可以通过命令查看
命令:./nutch readdb ../crawled/crawldb/ -stats