Hadoop Installation
0. Deployment plan:
Edit /etc/hosts and add the following host entries:
# hadoop nodes
192.168.75.128 master
192.168.75.130 slave1
192.168.75.131 slave2
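To confirm the names resolve on each machine (a quick sanity check, not strictly required):

ping -c 1 slave1
ping -c 1 slave2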
1. Create a hadoop user and add it to sudoers:

sudo adduser hadoop
sudo vim /etc/sudoers    ### the file may need to be made writable first: chmod +w sudoers

Add the following line:

hadoop ALL=(ALL:ALL) ALL
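To verify the new account has sudo rights (a quick check, assuming the entry above was saved):

su - hadoop
sudo whoami    # should print: root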
2. Install Java and set environment variables:
Download the JDK, extract it to /usr/local, and add the following to /etc/profile:

# set jdk classpath
export JAVA_HOME=/usr/local/jdk1.8.0_111
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
Run source /etc/profile to make the environment variables take effect.
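To confirm the JDK is on the PATH (the version string should match the JDK installed above):

java -version    # expect: java version "1.8.0_111"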
3. Install OpenSSH and generate keys.

sudo apt-get install openssh-server
Switch to the hadoop user and run:

ssh-keygen -t rsa
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
Copy the generated authorized_keys file into the .ssh directory on slave1 and slave2 (the same applies to any additional slaves):
scp .ssh/authorized_keys hadoop@slave1:~/.ssh
scp .ssh/authorized_keys hadoop@slave2:~/.ssh
PS: it is convenient to set the shell prompt to display the machine's IP, so you can always tell which node a terminal is on.
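One way to do this (an illustrative snippet for the hadoop user's ~/.bashrc, not part of the original setup; hostname -I lists the host's addresses on most Linux systems):

export PS1='[\u@$(hostname -I | cut -d" " -f1) \W]\$ '    # show first IP and current dir in the prompt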
Test with ssh slave1 to make sure passwordless login works.
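If you are still prompted for a password, OpenSSH's permission requirements are the usual cause; on each node run:

chmod 700 ~/.ssh    # sshd rejects keys if .ssh is group/world writable
chmod 600 ~/.ssh/authorized_keys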
4. Download and configure Hadoop (the steps below are needed on all three machines; alternatively, configure one machine and copy the result to the others with scp, as shown at the end of this step):
Download with wget, for example:
wget http://apache.fayea.com/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
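Then unpack it into the hadoop user's home directory so that the path matches the HADOOP_HOME set below:

tar -zxf hadoop-2.7.3.tar.gz -C /home/hadoop/    # yields /home/hadoop/hadoop-2.7.3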
Configure the Hadoop environment variables:

# set hadoop classpath
export HADOOP_HOME=/home/hadoop/hadoop-2.7.3/
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_PREFIX=$HADOOP_HOME
export CLASSPATH=$CLASSPATH:.:$HADOOP_HOME/bin
Configure ./etc/hadoop/core-site.xml:
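A minimal sketch of this file, assuming the NameNode runs on master at the default HDFS port 9000 and using a tmp directory under the install tree (both values are assumptions to adapt as needed):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>  <!-- assumed NameNode address -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-2.7.3/tmp</value>  <!-- assumed scratch dir -->
  </property>
</configuration>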
Configure ./etc/hadoop/hdfs-site.xml:
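A minimal sketch, assuming a replication factor of 2 (one copy per slave); the namenode directory matches the one reported by the format step in section 5, while the datanode path is an assumed parallel location:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>  <!-- assumed: one replica per slave -->
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoop-2.7.3/dfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoop-2.7.3/dfs/datanode</value>  <!-- assumed -->
  </property>
</configuration>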
Configure ./etc/hadoop/mapred-site.xml:
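A minimal sketch; on Hadoop 2.x the essential setting is running MapReduce on YARN. In the 2.7.3 tarball this file must first be created, e.g. by copying mapred-site.xml.template:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>  <!-- run MapReduce jobs on YARN -->
  </property>
</configuration>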
Configure ./etc/hadoop/yarn-site.xml:
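A minimal sketch, assuming the ResourceManager runs on master; mapreduce_shuffle is the auxiliary service MapReduce jobs require:

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>  <!-- assumed: ResourceManager on the master node -->
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>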
Configure the ./etc/hadoop/slaves file:
[hadoop@master hadoop]$ cat slaves
localhost
slave1
slave2
Configure the environment scripts hadoop-env.sh, mapred-env.sh, and yarn-env.sh, adding JAVA_HOME:

# The java implementation to use.
export JAVA_HOME=/usr/local/jdk1.8.0_111
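If you configured only one machine, the whole installation can now be pushed to the slaves (assuming the same /home/hadoop layout on every node):

scp -r /home/hadoop/hadoop-2.7.3 hadoop@slave1:/home/hadoop/
scp -r /home/hadoop/hadoop-2.7.3 hadoop@slave2:/home/hadoop/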
5. Start Hadoop (run these steps on the master node).
A. Format the filesystem:
/home/hadoop/hadoop-2.7.3/bin/hdfs namenode -format
If the output contains:
Storage directory /home/hadoop/hadoop-2.7.3/dfs/namenode has been successfully formatted.
the initialization succeeded.
B. Start the Hadoop cluster:
/home/hadoop/hadoop-2.7.3/sbin/start-all.sh
Check the running daemons with jps:
### master
[hadoop@master hadoop]$ jps
3584 NameNode
4147 NodeManager
4036 ResourceManager
3865 SecondaryNameNode
3721 DataNode

### on slave1 and slave2, jps shows only:
DataNode
NodeManager
View HDFS in a browser:
http://192.168.75.128:50070
View MapReduce jobs (the YARN ResourceManager UI) in a browser:
http://192.168.75.128:8088
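As an end-to-end check, you can run the example job shipped with the distribution (the jar below is at its standard 2.7.3 location):

/home/hadoop/hadoop-2.7.3/bin/hadoop jar /home/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10    # estimate pi with 2 map tasks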
C. Stop Hadoop:
/home/hadoop/hadoop-2.7.3/sbin/stop-all.sh