Setting Up a Hadoop Development Environment with Eclipse on Windows 7

I. Steps:

1. Install the JDK (omitted).

2. Install Eclipse (omitted).

If you run into problems, see http://blog.csdn.net/crazytaliban/article/details/68958000

3. Install hadoop-2.7.3 on Windows 7.

Simply extract hadoop-2.7.3.tar.gz; I extracted it to d:\hadoop-2.7.3.
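If you like, you can also point a HADOOP_HOME environment variable at this directory now; step 11 (error three) assumes Hadoop can find its Windows binaries under %HADOOP_HOME%\bin. A minimal sketch, assuming the d:\hadoop-2.7.3 path above (run it in a command prompt, then restart Eclipse so it picks up the new variable):

setx HADOOP_HOME "D:\hadoop-2.7.3"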

4. Install the Hadoop plugin for Eclipse.

Download hadoop-eclipse-plugin-2.7.3.jar from the Internet (or build it yourself) and copy it into the D:\eclipse\plugins directory.

5. Start Eclipse and open Windows->Preferences. In the dialog that appears, set the Hadoop installation directory. Note: this is the directory on Windows 7, i.e. the directory you extracted to in step 3. Click OK when done.

[Figure 1]

6. Open Windows->Show View->Other…; the following dialog appears.

[Figure 2]

Select Map/Reduce Locations and click OK; the Map/Reduce Locations view is added, as shown below:

[Figure 3]

7. Click the small elephant icon on the right to create a New Hadoop Location…; the following dialog appears:

[Figure 4]

Set the master node of the Hadoop cluster. Here I entered the master node's hostname, "Master". For this to work, you need to add a line to the hosts file under C:\Windows\System32\drivers\etc:

192.168.1.200   Master

as shown below. Alternatively, enter the IP address directly in the Host field.

[Figure 5]

The Port under DFS Master must match the value configured in core-site.xml:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
    <description>NameNode URI</description>
</property>
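To confirm that the Host and Port you entered actually reach the NameNode, you can run a minimal connectivity check from Java. This is just a sketch: HdfsCheck is a hypothetical helper class, and the Master:9000 URI is the value assumed from the configuration above; the Hadoop client classes come from the jars already on the project's classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of the tutorial's project.
public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Must match fs.defaultFS in core-site.xml (assumed here: Master:9000).
        conf.set("fs.defaultFS", "hdfs://Master:9000");
        FileSystem fs = FileSystem.get(conf);
        // If the host/port are wrong, this call throws a ConnectException
        // instead of printing true/false.
        System.out.println("HDFS root reachable: " + fs.exists(new Path("/")));
        fs.close();
    }
}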

After the settings are done, click Finish; a new entry appears in Map/Reduce Locations.

[Figure 6]

8. Open Windows->Perspective->Open Perspective->Other…; the following dialog appears.

[Figure 7]

Select Map/Reduce; DFS Locations is added to the Project Explorer, as shown below:

[Figure 8]


9. Create a new project

Open File->New->Project; the following dialog appears:

[Figure 9]

Select Map/Reduce Project and click Next; the following dialog appears:

[Figure 10]

Enter a project name. If the Hadoop location on Windows 7 has changed, click "Configure Hadoop install directory" to update it.

Then click Next; the following dialog appears:

[Figure 11]

Click "Finish" to complete the creation.

[Figure 12]

Right-click the src subdirectory of the project and choose New->Class from the context menu; the following dialog appears:

[Figure 13]

Fill in the class name to create the Java class, then click "Finish". The class is added, as shown below:

[Figure 14]

Open the newly added class WordCount.java and add the following code:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: splits each input line into tokens and emits a (word, 1) pair per token.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

 

10. Compile and run.

Open Run->Run Configurations; the following dialog appears:

[Figure 15]

Click "Java Application", then click the add button in the upper-left corner; the following dialog appears:

[Figure 16]

In the dialog, edit the Name, click the "Search" button to set the "Main class", and click "Apply". Then, on the Arguments tab on the right, enter the following:

hdfs://Master:9000/input/wc.txt

hdfs://Master:9000/output1

Note: make sure that the file /input/wc.txt exists on HDFS and that the /output1 directory does not. The commands after the figure show one way to prepare both.

As shown below:

[Figure 17]
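One way to prepare these paths is from a shell on a cluster node. This is a sketch: it assumes the Hadoop binaries are on that node's PATH and that wc.txt is a local text file you want to count.

hdfs dfs -mkdir -p /input        # create the input directory on HDFS
hdfs dfs -put wc.txt /input/     # upload the sample text file
hdfs dfs -rm -r -f /output1      # remove any leftover output directory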

Then click "Run" to run the program.

11. Errors that may occur after clicking "Run".

Error one (does not prevent the code from compiling and running):

[Figure 18]

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

 

Cause:

log4j is not configured, so no log output can be produced.

Solution:

Create a file named log4j.properties under the src folder of the Java (MapReduce) project, with the following content:

# Configure logging for testing: optionally with log file
#log4j.rootLogger=debug,appender
log4j.rootLogger=info,appender
#log4j.rootLogger=error,appender
# Output to the console
log4j.appender.appender=org.apache.log4j.ConsoleAppender
# Use TTCCLayout
log4j.appender.appender.layout=org.apache.log4j.TTCCLayout

[Figure 19]

Error two:

[Figure 20]

Exception in thread "main" java.net.ConnectException: Call From cyril-PC/192.168.1.106 to Master:8020 failed on connection exception: java.net.ConnectException: Connection refused: no further information; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
      at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
      at org.apache.hadoop.ipc.Client.call(Client.java:1479)
      at org.apache.hadoop.ipc.Client.call(Client.java:1412)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
      at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
      at java.lang.reflect.Method.invoke(Unknown Source)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
      at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
      at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
      at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
      at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
      at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
      at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
      at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)
      at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Unknown Source)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
      at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
      at WordCount.main(WordCount.java:61)
Caused by: java.net.ConnectException: Connection refused: no further information
      at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
      at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
      at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
      at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
      at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
      at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
      at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
      at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
      at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
      at org.apache.hadoop.ipc.Client.call(Client.java:1451)
      ... 28 more

Possible causes:

1. The file system port is not the default 8020; check core-site.xml:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
    <description>NameNode URI</description>
</property>

2. The Hadoop cluster is not running.

Solutions:

1. Make sure the Hadoop cluster is up and running correctly (a quick way to verify is shown below);

2. When running the job, include the port in the input arguments; otherwise the default 8020 is used. For example:

hdfs://Master:9000/input/test.txt

hdfs://Master:9000/output1
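To check cause 2, you can log in to the master node and confirm that the NameNode process is up and listening on the configured port. A sketch, assuming a Linux master with the JDK's jps on the PATH:

jps                        # should list a NameNode process
netstat -an | grep 9000    # the port from core-site.xml should show LISTEN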

Error three:

[Figure 21]

[main] ERROR org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path

java.io.IOException: Could not locate executable D:\hadoop-2.7.3\bin\winutils.exe in the Hadoop binaries.

      at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:379)
      at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:394)
      at org.apache.hadoop.util.Shell.<clinit>(Shell.java:387)
      at org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:440)
      at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:486)
      at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
      at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
      at WordCount.main(WordCount.java:47)

[main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[main] INFO org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
[main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
Exception in thread "main" java.io.IOException: (null) entry in command string: null chmod 0700 D:\tmp\hadoop-cyril\mapred\staging\Cyril2058634212\.staging

      at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:769)
      at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
      at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
      at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkOneDirWithMode(RawLocalFileSystem.java:491)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:531)
      at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:509)
      at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:305)
      at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:133)
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:144)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Unknown Source)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
      at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
      at WordCount.main(WordCount.java:61)

Possible cause:

"ERROR org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path" — this log message states clearly that winutils.exe is missing.

Solution:

Copy winutils.exe into the %HADOOP_HOME%/bin directory; the file can be downloaded from the Internet. Then recompile and run the Java code, after which error four appears.
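If you would rather not depend on the HADOOP_HOME environment variable, a commonly used alternative is to set the hadoop.home.dir system property from code. This is a sketch, assuming the D:\hadoop-2.7.3 directory from step 3 with winutils.exe already in its bin subfolder; the property must be set before the first Hadoop class is loaded, so place it at the very top of main():

// Assumption: Hadoop unpacked to D:\hadoop-2.7.3, winutils.exe in its bin folder.
System.setProperty("hadoop.home.dir", "D:\\hadoop-2.7.3");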

 

Error four:

[Figure 22]

[main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[main] INFO org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
[main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
[main] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
[main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
[main] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
[main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local1072922152_0001
[main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Cleaning up the staging area file:/tmp/hadoop-Cyril/mapred/staging/Cyril1072922152/.staging/job_local1072922152_0001
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z

      at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
      at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:609)
      at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
      at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:187)
      at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
      at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
      at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
      at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
      at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
      at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
      at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
      at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:125)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
      at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Unknown Source)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
      at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
      at WordCount.main(WordCount.java:61)

Possible cause:

1. "[main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable" — this log message indicates that the Hadoop native library (hadoop.dll) is missing.

Solution:

Download hadoop.dll from the Internet and copy it into the C:\Windows\System32 folder. Make sure the DLL matches your Hadoop version and the bitness of your JVM (e.g. 64-bit).


12. Packaging the program

Right-click the project and open "Export…". In the dialog that appears, select "JAR file", as shown below:

[Figure 23]

Click "Next"; the following dialog appears.

[Figure 24]

Check the options shown in the figure, click "Browse" to set the output path of the JAR file, then click "Next"; the following dialog appears:

[Figure 25]

Click "Next"; the following dialog appears:

[Figure 26]

Click the "Browse" button to set the Main class, as shown below:

[Figure 27]

Click "OK".

[Figure 28]

Click "Finish".

Upload the generated "WordCount.jar" file to the cluster's /srv/ftp directory, then run the program with the following command:

hadoop jar /srv/ftp/WordCount.jar /input /output
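To verify the result, you can list the output directory and print the reducer output. A sketch; part-r-00000 is the conventional name of the first reduce task's output file:

hdfs dfs -ls /output                  # should contain _SUCCESS and part-r-00000
hdfs dfs -cat /output/part-r-00000    # one "word<TAB>count" pair per line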






