Writing MapReduce programs on Windows 8 with a Hadoop environment built by Maven

1. Create a standard Java project with Maven
D:\workspace\java>mvn archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=siat.hadoop -DartifactId=TestHadoop -DpackageName=siat.hadoop -Dversion=1.0-SNAPSHOT -DinteractiveMode=false
 
  
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] >>> maven-archetype-plugin:2.2:generate (default-cli) > generate-sources
@ standalone-pom >>>
[INFO]
[INFO] <<< maven-archetype-plugin:2.2:generate (default-cli) < generate-sources
@ standalone-pom <<<
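A note on the parameters: once -DgroupId=siat.hadoop -DartifactId=TestHadoop -DpackageName=siat.hadoop are given, Maven creates the project directory and the package automatically; nothing has to be created by hand.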
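To make the effect of these parameters concrete, here is a small sketch of how the artifactId and package name turn into the location of the generated source file. It assumes the default maven-archetype-quickstart layout, which places an App.java under src/main/java; the class and method names below are made up for illustration.

```java
// Sketch only: shows how -DartifactId and -DpackageName map onto the generated layout.
// ArchetypeLayout and mainSourcePath are hypothetical names, not part of Maven.
final class ArchetypeLayout {

    /** Builds the relative path of a class that lives under src/main/java of the new project. */
    static String mainSourcePath(String artifactId, String packageName, String className) {
        // Maven turns each dot-separated package segment into one directory level.
        String packageDir = packageName.replace('.', '/');
        return artifactId + "/src/main/java/" + packageDir + "/" + className + ".java";
    }

    public static void main(String[] args) {
        // Values taken from the mvn archetype:generate command above.
        System.out.println(mainSourcePath("TestHadoop", "siat.hadoop", "App"));
    }
}
```

With the values from the command above, the path is TestHadoop/src/main/java/siat/hadoop/App.java, relative to D:\workspace\java.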

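If the command above runs cleanly, skip the rest of this step and go straight to step 2. It normally finishes quickly. If an exception occurs instead, answer the prompts as shown below (note: in this case it is better to rerun the command above, otherwise you may run into problems later that are hard to solve).
Choose a number or apply filter (format: [groupId:]artifactId, case sensitive contains): 493: 1 (note: if a hint such as "enter or space is ..." follows, just press Enter or Space)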
Choose am.ik.archetype:spring-boot-blank-archetype version:
1: 0.9.0
2: 0.9.1
3: 0.9.2
Choose a number: 3: 3 (enter one of the version numbers)
Downloading: https://repo.maven.apache.org/maven2/am/ik/archetype/spring-boot-bl
ank-archetype/0.9.2/spring-boot-blank-archetype-0.9.2.jar
Downloaded: https://repo.maven.apache.org/maven2/am/ik/archetype/spring-boot-bla
nk-archetype/0.9.2/spring-boot-blank-archetype-0.9.2.jar (6 KB at 3.5 KB/sec)
Downloading: https://repo.maven.apache.org/maven2/am/ik/archetype/spring-boot-bl
ank-archetype/0.9.2/spring-boot-blank-archetype-0.9.2.pom
Downloaded: https://repo.maven.apache.org/maven2/am/ik/archetype/spring-boot-bla
nk-archetype/0.9.2/spring-boot-blank-archetype-0.9.2.pom (3 KB at 5.7 KB/sec)
[INFO] Using property: groupId = org.conan.myhadoop.mr
Define value for property 'artifactId': : myHadoop (enter whatever name you need)
Define value for property 'version':  1.0-SNAPSHOT: : 1.0-SNAPSHOT (enter the same value as shown)
[INFO] Using property: package = org.conan.myhadoop.mr
Confirm properties configuration:
groupId: org.conan.myhadoop.mr
artifactId: myHadoop
version: 1.0-SNAPSHOT
package: org.conan.myhadoop.mr
 Y: : Y (enter Y if the values are correct)
[INFO] -------------------------------------------------------------------------
---
[INFO] Using following parameters for creating project from Archetype: spring-bo
ot-blank-archetype:0.9.2
[INFO] -------------------------------------------------------------------------
---
[INFO] Parameter: groupId, Value: org.conan.myhadoop.mr
[INFO] Parameter: artifactId, Value: myHadoop
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] Parameter: package, Value: org.conan.myhadoop.mr
[INFO] Parameter: packageInPathFormat, Value: org/conan/myhadoop/mr
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] Parameter: package, Value: org.conan.myhadoop.mr
[INFO] Parameter: groupId, Value: org.conan.myhadoop.mr
[INFO] Parameter: artifactId, Value: myHadoop
[WARNING] The directory D:\workspace\java\myHadoop already exists.
[INFO] project created from Archetype in dir: D:\workspace\java\myHadoop
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12:05 min
[INFO] Finished at: 2014-10-23T14:23:02+08:00
[INFO] Final Memory: 10M/26M
[INFO] ------------------------------------------------------------------------
D:\workspace\java>

2. Enter the project directory and run mvn clean install (you can also run mvn clean first and then mvn install)

~ D:\workspace\java>cd TestHadoop
~ D:\workspace\java\TestHadoop>mvn clean install
...
 
    
[INFO] Installing D:\workspace\java\TestHadoop\target\TestHadoop-1.0-SNAPSHOT.jar to C:\Users\michael\.m2\repository\siat\hadoop\TestHadoop\1.0-SNAPSHOT\TestHadoop-1.0-SNAPSHOT.jar
[INFO] Installing D:\workspace\java\TestHadoop\pom.xml to C:\Users\michael\.m2\repository\siat\hadoop\TestHadoop\1.0-SNAPSHOT\TestHadoop-1.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO]  BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 25.122 s
[INFO] Finished at: 2014-10-23T15:50:55+08:00
[INFO] Final Memory: 11M/28M
[INFO] ------------------------------------------------------------------------
D:\workspace\java\TestHadoop>

For reference, the remaining output of the archetype:generate command from step 1 looks like this:

[INFO]
[INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom --
-
[INFO] Generating project in Batch mode
[INFO] No archetype defined. Using maven-archetype-quickstart (org.apache.maven.
archetypes:maven-archetype-quickstart:1.0)
Downloading: https://repo.maven.apache.org/maven2/org/apache/maven/archetypes/ma
ven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.jar
Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/archetypes/mav
en-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.jar (5 KB at 2.6 KB/s
ec)
Downloading: https://repo.maven.apache.org/maven2/org/apache/maven/archetypes/ma
ven-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.pom
Downloaded: https://repo.maven.apache.org/maven2/org/apache/maven/archetypes/mav
en-archetype-quickstart/1.0/maven-archetype-quickstart-1.0.pom (703 B at 1.5 KB/
sec)
[INFO] -------------------------------------------------------------------------
---
[INFO] Using following parameters for creating project from Old (1.x) Archetype:
 maven-archetype-quickstart:1.0
[INFO] -------------------------------------------------------------------------
---
[INFO] Parameter: groupId, Value: siat.hadoop
[INFO] Parameter: packageName, Value: siat.hadoop
[INFO] Parameter: package, Value: siat.hadoop
[INFO] Parameter: artifactId, Value: TestHadoop
[INFO] Parameter: basedir, Value: D:\workspace\java
[INFO] Parameter: version, Value: 1.0-SNAPSHOT
[INFO] project created from Old (1.x) Archetype in dir: D:\workspace\java\TestHa
doop
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11.390 s
[INFO] Finished at: 2014-10-23T15:45:02+08:00
[INFO] Final Memory: 11M/27M
[INFO] ------------------------------------------------------------------------
D:\workspace\java>



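The following error only appears when step 1 did not run cleanly. I tried many of the fixes suggested online and none of them worked. If you hit this problem, start again from step 1; a clean run of step 1 is what keeps this step from failing.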
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project myHadoop: Fatal error compiling: invalid target release: 1.8 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e swit
ch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please rea
d the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionE
xception
D:\workspace\java\myHadoop>
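
The compiler message in this log is javac's "invalid target release: 1.8" error. javac usually reports it when the JDK that Maven runs on is older than the target release configured for the compiler plugin. As a rough way to check for that, the sketch below compares the running JVM's java.specification.version with a target release. The class and method names are made up for illustration, and the check is only a necessary condition, not a full diagnosis.

```java
// Sketch only: TargetReleaseCheck is a hypothetical helper, not part of Maven or the JDK.
final class TargetReleaseCheck {

    /** Converts a version string such as "1.7", "1.8" or "11" to its feature number (7, 8, 11). */
    static int featureVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        // Releases up to Java 8 are written as "1.x"; later ones start with the feature number.
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    /** True when the running JDK is at least as new as the requested target release. */
    static boolean isAtLeast(String runningSpecVersion, String targetRelease) {
        return featureVersion(runningSpecVersion) >= featureVersion(targetRelease);
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        System.out.println("running JDK: " + running
                + ", new enough for target 1.8: " + isAtLeast(running, "1.8"));
    }
}
```

Run it with the same JDK that Maven uses (the one JAVA_HOME points to). If it reports that the JDK is too old for 1.8, the generated project cannot be compiled with that JDK as it stands.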



3. Import the project into Eclipse

File --> Import --> Maven --> Existing Maven Projects (do not import through General; Eclipse may then fail to recognize it as a Maven project)

4. Add the Hadoop dependency

I use hadoop-1.2.1 here. Edit pom.xml (the file under D:\workspace\java\TestHadoop; Eclipse picks up the change automatically):

 

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>siat.hadoop</groupId>
  <artifactId>TestHadoop</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>TestHadoop</name>
  <url>http://maven.apache.org</url>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>1.2.1</version>
    </dependency>

    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

5. Download the dependencies

~ mvn clean install                 (same as in step 2)
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ TestHadoop ---
[INFO] Building jar: D:\workspace\java\TestHadoop\target\TestHadoop-1.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ TestHadoop ---
[INFO] Installing D:\workspace\java\TestHadoop\target\TestHadoop-1.0-SNAPSHOT.jar to C:\Users\michael\.m2\repository\siat\hadoop\TestHadoop\1.0-SNAPSHOT\TestHadoop-1.0-SNAPSHOT.jar
[INFO] Installing D:\workspace\java\TestHadoop\pom.xml to C:\Users\michael\.m2\repository\siat\hadoop\TestHadoop\1.0-SNAPSHOT\TestHadoop-1.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.181 s
[INFO] Finished at: 2014-10-23T16:25:31+08:00
[INFO] Final Memory: 14M/35M
[INFO] ------------------------------------------------------------------------
D:\workspace\java\TestHadoop>

Refresh the project in Eclipse: Maven Dependencies now lists hadoop-core-1.2.1.jar and the other dependency JARs. The project's dependencies are loaded onto the library path automatically.
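
Besides looking at the Maven Dependencies node, a quick way to confirm that the dependency is really visible to your code is to try loading one of its classes. The sketch below does that with Class.forName; ClasspathCheck is a made-up helper name, and org.apache.hadoop.conf.Configuration is used on the assumption that hadoop-core 1.2.1 is the JAR on the classpath.

```java
// Sketch only: ClasspathCheck is a hypothetical helper for verifying that a dependency is loadable.
final class ClasspathCheck {

    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("hadoop-core visible: "
                + isOnClasspath("org.apache.hadoop.conf.Configuration"));
    }
}
```

Run it as a Java application from inside the TestHadoop project in Eclipse so that it uses the project's Maven classpath.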

6. Download the Hadoop configuration files from the Hadoop cluster (I did not set up a full cluster; I only deployed a single Hadoop master server)

  • core-site.xml
  • hdfs-site.xml
  • mapred-site.xml

Save them under the src/main/resources/hadoop directory.
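
To check what the downloaded files actually contain, the following sketch reads the name/value pairs out of a Hadoop site file using only the JDK's XML parser. SiteXmlReader is a made-up name, and the sample value hdfs://master:9000 (in particular the port) is an assumption for illustration, not taken from the author's cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch only: reads <property><name>/<value> pairs from a Hadoop *-site.xml file.
final class SiteXmlReader {

    /** Parses a site file from a stream and returns its properties in document order. */
    static Map<String, String> parse(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            NodeList properties = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                result.put(name, value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable Hadoop site file", e);
        }
    }

    /** Convenience overload for XML that is already in memory. */
    static Map<String, String> parse(String xml) {
        return parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        // Assumed sample content; a real core-site.xml comes from the master server.
        String sample = "<configuration><property><name>fs.default.name</name>"
                + "<value>hdfs://master:9000</value></property></configuration>";
        System.out.println(parse(sample));
    }
}
```

Since src/main/resources is copied to the root of the classpath, the real file should be reachable with SiteXmlReader.class.getResourceAsStream("/hadoop/core-site.xml"), which can be passed to parse(InputStream).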

7. Configure the local hosts file and add an entry for the master host name

 c:/Windows/System32/drivers/etc/hosts 
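172.21.5.235 master (the main server of the cluster, i.e. the NameNode)
Here I set up the Hadoop environment on a laptop and use it as the server, with all 6 processes started. The laptop and the desktop are connected to the same router, i.e. they are on the same LAN.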
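
After editing the hosts file, it is worth confirming that the name master resolves from Java, since that is what the Hadoop client relies on. A minimal sketch using java.net.InetAddress follows; HostCheck is a made-up name.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;

// Sketch only: checks whether a host name resolves from the JVM's point of view.
final class HostCheck {

    /** Returns the IP address the name resolves to, or empty if it cannot be resolved. */
    static Optional<String> resolve(String host) {
        try {
            return Optional.of(InetAddress.getByName(host).getHostAddress());
        } catch (UnknownHostException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // "master" is the name added to the hosts file above.
        System.out.println("master -> " + resolve("master").orElse("unresolved"));
    }
}
```

With the hosts entry from this step in place, the program should print the address configured there; "unresolved" suggests the hosts file was not saved or is not being read.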

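8. Develop the MapReduce program

Write a simple MapReduce program that implements word count.

Create a new Java file: WordCount.java

...

... (on Windows the Hadoop JAR has to be recompiled; see the site referenced below for details)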
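
The WordCount.java listing itself is elided above. The following is not that listing: it is a plain-Java sketch, with no Hadoop classes, of the logic a word count job implements. The map step emits a (word, 1) pair per token and the reduce step sums the values per word. All names here are made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Sketch only: mirrors the map and reduce phases of word count without using the Hadoop API.
final class WordCountSketch {

    /** Map phase: emit a (word, 1) pair for every whitespace-separated token in one line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    /** Reduce phase: sum the values emitted for each distinct word. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    /** Runs map over every line, then reduce over all emitted pairs. */
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String line : lines) {
            allPairs.addAll(map(line));
        }
        return reduce(allPairs);
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("hello hadoop", "hello maven")));
    }
}
```

In a real job the framework runs the map step in parallel over input splits and groups the pairs by key before the reduce step; here both steps run in a single process.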

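9. Summary

With this setup we can develop on win7/win8: Maven builds the Hadoop dependency environment, the MapReduce program is written in Eclipse, and it is then run as a Java application. The Hadoop application automatically packages our MR program into a JAR, uploads it to the remote Hadoop environment to run, and the returned logs are printed in the Eclipse console.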


Reference: http://blog.fens.me/hadoop-maven-eclipse/


