Ubuntu 16.04 + Hadoop 2.7.3 pseudo-distributed setup

I. Configure the Java environment variables on Ubuntu

I downloaded the latest JDK at the time, jdk1.8.0_151, from the official site: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

(My machine runs 64-bit Linux, so I chose jdk-8u151-linux-x64.tar.gz.)

1. sudo vim /etc/profile

# My Java root directory is /java
export JAVA_HOME=/java/jdk1.8.0_151
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
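After saving, run `source /etc/profile` and `java -version` to confirm the JDK is picked up. As an extra sanity check before pointing JAVA_HOME at a directory, a small helper like the following can verify the directory actually looks like a JDK (a sketch; `check_jdk` is a hypothetical name, and the layout test assumes an Oracle JDK 8 unpack with a bundled jre/):

```shell
# check_jdk DIR: verify a directory looks like a usable JDK before
# pointing JAVA_HOME at it.
check_jdk() {
    if [ -x "$1/bin/java" ] && [ -d "$1/jre/lib" ]; then
        echo "ok: $1 looks like a JDK"
    else
        echo "missing: $1/bin/java or $1/jre/lib"
        return 1
    fi
}
```

For example, `check_jdk /java/jdk1.8.0_151` should print the "ok" line if the unpack succeeded.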

II. Install openssh-server and set up passwordless login
1. Install openssh-server on Ubuntu

sudo apt-get install openssh-server

2. Start the SSH server

sudo /etc/init.d/ssh start

You should see output like this:

[ ok ] Starting ssh (via systemctl): ssh.service.

Check whether the SSH server is running:

 ps -ef|grep ssh

If you see something like the following:

root       1073      1  0 13:05 ?        00:00:00 /usr/sbin/sshd -D
root       6799   2245  0 14:02 pts/19   00:00:00 grep --color=auto ssh

then the SSH server started successfully.

3. Set up passwordless SSH login

Run the following command and press Enter at every prompt until the RSA key pair has been generated:

ssh-keygen -t rsa

Append the public key to authorized_keys:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Test whether passwordless login works:

ssh localhost

If you see a login banner like the following, it worked:

Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.10.0-28-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

packages can be updated.
updates are security updates.

Last login: Sat Oct 28 15:05:51 2017 from 127.0.0.1
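If `ssh localhost` still asks for a password, overly permissive permissions on `~/.ssh` are the usual cause: sshd silently ignores keys when the directory or `authorized_keys` is group- or world-readable. A small sketch that tightens them (`fix_ssh_perms` is a hypothetical helper name):

```shell
# fix_ssh_perms DIR: tighten permissions on an .ssh directory; sshd
# ignores authorized_keys files it considers unsafe.
fix_ssh_perms() {
    chmod 700 "$1"
    chmod 600 "$1/authorized_keys"
    # print the resulting octal modes for confirmation (GNU stat)
    stat -c '%a %n' "$1" "$1/authorized_keys"
}
```

Run it as `fix_ssh_perms ~/.ssh`, then retry `ssh localhost`.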

Disable the firewall:

sudo ufw disable

III. Install Hadoop in standalone and pseudo-distributed modes

1. Download hadoop-2.7.3.tar.gz and extract it to /usr/local (standalone setup):

Download site: http://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/?C=S;O=A

Switch to /usr/local and rename hadoop-2.7.3 to hadoop:

cd /usr/local

sudo mv hadoop-2.7.3 hadoop

Change the permissions on /usr/local/hadoop:

sudo chmod -R 777 /usr/local/hadoop
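The download-and-unpack steps above can be collected into one helper. This is only a sketch (`install_hadoop` is a hypothetical name, and it assumes the tarball keeps the standard `hadoop-2.7.3/` top-level directory):

```shell
# install_hadoop TARBALL PREFIX: unpack the downloaded tarball into
# PREFIX and rename it to "hadoop", mirroring the manual steps above.
install_hadoop() {
    tar -xzf "$1" -C "$2" &&
    mv "$2/hadoop-2.7.3" "$2/hadoop" &&
    chmod -R 777 "$2/hadoop" &&
    echo "installed to $2/hadoop"
}
```

For example: `install_hadoop ~/Downloads/hadoop-2.7.3.tar.gz /usr/local` (run with sudo when the prefix is root-owned).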

2. Configure ~/.bashrc:

sudo vim ~/.bashrc

Append the following to the end of the file, then save:

#HADOOP VARIABLES START
export HADOOP_INSTALL=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
export PATH=$PATH:$HADOOP_INSTALL/sbin
export PATH=$PATH:$HADOOP_INSTALL/bin
#HADOOP VARIABLES END

Run the following command to make the new variables take effect:

source ~/.bashrc
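Before moving on, it can help to confirm that the variables actually resolve to a real Hadoop tree. A sketch (the `check_hadoop_env` helper is hypothetical):

```shell
# check_hadoop_env: confirm HADOOP_INSTALL is set and points at a
# directory with the expected bin/, sbin/ and lib/native subtrees.
check_hadoop_env() {
    [ -n "$HADOOP_INSTALL" ] || { echo "HADOOP_INSTALL is not set"; return 1; }
    for sub in bin sbin lib/native; do
        [ -d "$HADOOP_INSTALL/$sub" ] || echo "warning: $HADOOP_INSTALL/$sub missing"
    done
    echo "HADOOP_INSTALL=$HADOOP_INSTALL"
}
```

Run `check_hadoop_env` in a fresh terminal (or after `source ~/.bashrc`).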

Hadoop configuration (pseudo-distributed setup)

1. Configure hadoop-env.sh:

sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Add the following to the file:

# The java implementation to use.
export JAVA_HOME=/java/jdk1.8.0_151
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib/native"

Configure yarn-env.sh:

sudo vim /usr/local/hadoop/etc/hadoop/yarn-env.sh

Add the following at the end of the file:

export JAVA_HOME=/java/jdk1.8.0_151

Configure core-site.xml. First create the directory /usr/local/hadoop/tmp, then add the following to core-site.xml:

sudo mkdir /usr/local/hadoop/tmp

sudo vim /usr/local/hadoop/etc/hadoop/core-site.xml

Add the following between the <configuration> tags:

  
	
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://localhost:9000</value>
	</property>

	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>

	<property>
		<name>hadoop.tmp.dir</name>
		<value>/usr/local/hadoop/tmp</value>
		<description>A base for other temporary directories.</description>
	</property>

	<property>
		<name>dfs.name.dir</name>
		<value>/home/hdfs/name</value>
	</property>

	<property>
		<name>dfs.permissions</name>
		<value>false</value>
		<description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.</description>
	</property>







Configure hdfs-site.xml:

sudo vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Add the following between the <configuration> tags:


	
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>

	<property>
		<name>dfs.data.dir</name>
		<value>/usr/local/hadoop/data</value>
	</property>

Configure mapred-site.xml (it does not exist by default; copy it from the bundled template first):

sudo cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

sudo vim /usr/local/hadoop/etc/hadoop/mapred-site.xml

Add the following between the <configuration> tags:

  

        
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>

	<property>
		<name>mapred.job.tracker</name>
		<value>localhost:9001</value>
	</property>

Configure yarn-site.xml:

sudo vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

Add the following between the <configuration> tags:


	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>

	<property>
		<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	</property>

	<property>
		<name>yarn.resourcemanager.scheduler.address</name>
		<value>localhost:8030</value>
	</property>

	<property>
		<name>yarn.resourcemanager.address</name>
		<value>localhost:8032</value>
	</property>

	<property>
		<name>yarn.resourcemanager.resource-tracker.address</name>
		<value>localhost:8031</value>
	</property>


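A typo in any of the four edited XML files will make the daemons fail at startup with a parse error, so it is worth validating them first. A sketch that checks each file with Python's standard-library XML parser (the `check_configs` helper is hypothetical):

```shell
# check_configs DIR: parse each edited *-site.xml under DIR to catch
# unclosed tags and other XML typos before starting the daemons.
check_configs() {
    for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
        if python3 -c "import xml.etree.ElementTree as ET; ET.parse('$1/$f')" 2>/dev/null; then
            echo "$f ok"
        else
            echo "$f BROKEN"
        fi
    done
}
```

Run it as `check_configs /usr/local/hadoop/etc/hadoop` and fix any file reported BROKEN.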

Reboot the system:

sudo reboot

Test whether Hadoop is installed and configured correctly.

Verify the standalone installation:

hadoop version

If Hadoop version information appears, standalone mode works.
For example:

Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /usr/local/bigdata/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.ja

Start HDFS in pseudo-distributed mode.

Format the NameNode:

hdfs namenode -format  

If "...has been successfully formatted" appears in the output, the format succeeded.
Start HDFS:

start-dfs.sh

List the running Java processes:

jps

If the following six entries appear (Jps is the jps tool itself), the startup succeeded:

 ResourceManager
 Jps
 DataNode
 SecondaryNameNode
 NameNode
 NodeManager
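Checking the `jps` listing by eye is error-prone, so the check can be scripted. A sketch (the `check_daemons` helper is hypothetical) that reads `jps` output on stdin and reports any of the five Hadoop daemons that are not running:

```shell
# check_daemons: read jps output on stdin and report missing daemons.
# Uses grep -w so "SecondaryNameNode" does not satisfy the "NameNode" check.
check_daemons() {
    input=$(cat)
    ok=1
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        echo "$input" | grep -qw "$d" || { echo "missing: $d"; ok=0; }
    done
    [ "$ok" -eq 1 ] && echo "all daemons up"
}
```

Usage: `jps | check_daemons`.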

Open http://localhost:50070/ in a browser to check the HDFS web UI.

Open http://localhost:8088/ to verify that the pseudo-distributed setup (the YARN ResourceManager) is working.

Stop all the daemons:

stop-all.sh

Run the wordcount example

Start all the daemons:

start-all.sh

List the files and directories at the root of HDFS:

hdfs dfs -ls /

If this is the first time HDFS has been used, nothing is listed.
Create a directory /input in HDFS and upload /usr/local/hadoop/README.txt into it:

hdfs dfs -mkdir /input
hdfs dfs -put /usr/local/hadoop/README.txt /input

Run wordcount with the following command, writing the results to /output:


hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output

After a successful run, the /output directory contains two files: _SUCCESS, an empty marker file, and part-r-00000, which holds the results. View the results with:

hadoop fs -cat /output/part-r-00000
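To see what the wordcount job computes without a cluster, the same counting can be approximated in plain shell on a local file. This is only a sketch (`local_wordcount` is a hypothetical helper, and the real MapReduce example tokenizes slightly differently): it emits one "word<TAB>count" line per distinct token, sorted by word, much like part-r-00000.

```shell
# local_wordcount FILE: split FILE on spaces/tabs/newlines, count each
# distinct token, and print "word<TAB>count" lines sorted by word.
local_wordcount() {
    tr -s ' \t' '\n' < "$1" |       # one token per line
        grep -v '^$' |              # drop empty lines
        sort | uniq -c |            # count duplicates
        awk '{printf "%s\t%s\n", $2, $1}'   # reorder to word<TAB>count
}
```

For example, `local_wordcount /usr/local/hadoop/README.txt` gives a rough local preview of what the job writes to /output.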
