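Integrating Hadoop with Spring Boot

This post walks through the configuration needed to integrate Hadoop with Spring Boot. It assumes a Hadoop cluster (Hadoop-2.8.5) is already installed on Linux.

I. Integrating HDFS

1. Main application.properties settings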

#hdfs
hdfs.url=hdfs://192.168.2.5:9000
hdfs.username=root
hdfs.replication=2
hdfs.blocksize=67108864
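
As one possible way to consume these settings, the sketch below maps the custom hdfs.* entries onto the standard Hadoop client keys fs.defaultFS, dfs.replication, and dfs.blocksize. The class HdfsSettings and its method names are hypothetical and not part of the original project; the sketch uses only the JDK, so it runs without the Hadoop jars on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper: maps the custom hdfs.* properties to Hadoop client keys. */
final class HdfsSettings {

    /** Parses text written in application.properties format. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("Unreadable properties", e);
        }
        return props;
    }

    /** Translates hdfs.* entries into standard Hadoop configuration keys. */
    static Map<String, String> toHadoopKeys(Properties props) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("fs.defaultFS", require(props, "hdfs.url"));
        // Parsing the numbers makes a malformed value fail fast at startup.
        conf.put("dfs.replication",
                String.valueOf(Integer.parseInt(require(props, "hdfs.replication"))));
        conf.put("dfs.blocksize",
                String.valueOf(Long.parseLong(require(props, "hdfs.blocksize"))));
        return conf;
    }

    /** Returns the trimmed value of a mandatory property. */
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "#hdfs",
                "hdfs.url=hdfs://192.168.2.5:9000",
                "hdfs.username=root",
                "hdfs.replication=2",
                "hdfs.blocksize=67108864");
        Map<String, String> conf = toHadoopKeys(parse(text));
        conf.forEach((k, v) -> System.out.println(k + " = " + v));
        // 67108864 bytes is 64 MiB.
        long mib = Long.parseLong(conf.get("dfs.blocksize")) / (1024 * 1024);
        System.out.println("block size in MiB: " + mib);
    }
}
```

In a real client, each resulting entry could be passed to Configuration.set(key, value). The hdfs.username value is not a Hadoop configuration key; it would typically be supplied as the user argument of FileSystem.get(URI, Configuration, String).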

2. Main pom.xml settings


	
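<dependencies>
	<dependency>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-web</artifactId>
		<exclusions>
			<exclusion>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-starter-logging</artifactId>
			</exclusion>
		</exclusions>
	</dependency>
	<dependency>
		<groupId>org.apache.hadoop</groupId>
		<artifactId>hadoop-common</artifactId>
		<version>2.8.5</version>
	</dependency>
	<dependency>
		<groupId>org.apache.hadoop</groupId>
		<artifactId>hadoop-hdfs</artifactId>
		<version>2.8.5</version>
	</dependency>
	<dependency>
		<groupId>org.apache.hadoop</groupId>
		<artifactId>hadoop-client</artifactId>
		<version>2.8.5</version>
	</dependency>
	<dependency>
		<groupId>com.alibaba</groupId>
		<artifactId>fastjson</artifactId>
		<version>1.2.44</version>
	</dependency>
	<dependency>
		<groupId>io.springfox</groupId>
		<artifactId>springfox-swagger2</artifactId>
		<version>2.9.2</version>
	</dependency>
	<dependency>
		<groupId>io.springfox</groupId>
		<artifactId>springfox-swagger-ui</artifactId>
		<version>2.9.2</version>
	</dependency>
	<dependency>
		<groupId>com.google.guava</groupId>
		<artifactId>guava</artifactId>
		<version>27.0.1-jre</version>
	</dependency>
</dependencies>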

Then complete the rest of the configuration and code.

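3. Setting environment variables on Windows

When the program is started on Windows, requests to HDFS fail with the following error:

java.io.FileNotFoundException: java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.

Using Hadoop from Java on Windows requires winutils.exe, and the error indicates that neither HADOOP_HOME nor hadoop.home.dir is configured. Download the binaries from https://github.com/steveloughran/winutils. The build closest to this cluster's hadoop-2.8.5 is hadoop-2.8.3, so download hadoop-2.8.3 to a local disk. The error can then be resolved either by setting a Windows environment variable or by setting hadoop.home.dir in the program.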

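(1) Set the Windows environment variable

Set the Windows environment variable HADOOP_HOME=D:\soft\hadoop\winutils\hadoop-2.8.3

(2) Set hadoop.home.dir in the program

Add the following code before the program initializes the Hadoop Configuration. It sets a Java system property rather than an environment variable.

// Set hadoop.home.dir on Windows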
System.setProperty("hadoop.home.dir","D:\\soft\\hadoop\\winutils\\hadoop-2.8.3");
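
The two options can be combined so that the hard-coded path is only a fallback. The following is a minimal sketch under that assumption: the class HadoopHomeSetup and its methods are hypothetical, and the lookup order (system property first, then HADOOP_HOME) mirrors the order in which Hadoop itself checks the two settings.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical startup helper that sets hadoop.home.dir on Windows only. */
final class HadoopHomeSetup {

    /** Picks the Hadoop home: system property first, then HADOOP_HOME, then the fallback. */
    static String resolve(String sysProp, String envVar, String fallback) {
        if (sysProp != null && !sysProp.isEmpty()) {
            return sysProp;
        }
        if (envVar != null && !envVar.isEmpty()) {
            return envVar;
        }
        return fallback;
    }

    /** Returns true when the given os.name value denotes Windows. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Call this before the first Hadoop Configuration or FileSystem is created. */
    static void apply(String fallback) {
        if (!isWindows(System.getProperty("os.name"))) {
            return; // winutils.exe is only needed on Windows
        }
        String home = resolve(System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"), fallback);
        System.setProperty("hadoop.home.dir", home);
        // Hadoop looks for winutils.exe in the bin directory under the home directory.
        Path winutils = Paths.get(home, "bin", "winutils.exe");
        if (!Files.exists(winutils)) {
            System.err.println("Warning: winutils.exe not found at " + winutils);
        }
    }

    public static void main(String[] args) {
        apply("D:\\soft\\hadoop\\winutils\\hadoop-2.8.3");
        System.out.println("hadoop.home.dir = " + System.getProperty("hadoop.home.dir"));
    }
}
```

With this arrangement the same build can run on Linux unchanged, and a HADOOP_HOME already set on a developer machine takes precedence over the fallback path.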

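4. Host mapping on Windows

The hosts file on the local machine needs entries for the hosts of the Hadoop cluster, much like the hosts configuration on the cluster nodes themselves, so that the client can resolve the cluster's hostnames. Edit C:\Windows\System32\drivers\etc\hosts and add the mapping, shown here as the last line of the file: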

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#	127.0.0.1       localhost
#	::1             localhost
192.168.2.5 hadoop.master
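
To confirm that an edited hosts file contains the expected mapping, a small parser along the following lines may help. This is only a sketch: the class HostsCheck is a hypothetical name, and it reads hosts-format text directly rather than asking the operating system to resolve anything.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that reads hosts-file text into a hostname-to-IP map. */
final class HostsCheck {

    /** Parses hosts-format text, ignoring comments and blank lines. */
    static Map<String, String> parse(String text) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String raw : text.split("\\R")) {
            // Everything after '#' is a comment.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            // First column is the IP address, the remaining columns are host names.
            String[] parts = line.split("\\s+");
            for (int i = 1; i < parts.length; i++) {
                map.putIfAbsent(parts[i].toLowerCase(Locale.ROOT), parts[0]);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "# localhost name resolution is handled within DNS itself.",
                "#\t127.0.0.1       localhost",
                "#\t::1             localhost",
                "192.168.2.5 hadoop.master");
        Map<String, String> hosts = parse(sample);
        System.out.println("hadoop.master -> " + hosts.get("hadoop.master"));
    }
}
```

To check what the operating system actually resolves, InetAddress.getByName("hadoop.master") from java.net can be called from the client machine instead.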

 
