Hadoop 2.x Basic Environment Setup

I. Environment

  • OS: ubuntu-14.04.1-x64 (download and installation omitted)

  • JDK: jdk1.8.0_60 (download and installation omitted)

  • Hadoop: hadoop-2.7.1, download: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz

II. Installing SSH

1. Install and start SSH

apt-get install ssh
apt-get install openssh-server
service ssh start

2. Set up passwordless login: generate a key pair and authorize the public key

ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3. Test that passwordless login works

ssh localhost
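If ssh localhost still prompts for a password, the usual culprit is overly permissive modes on ~/.ssh: sshd silently ignores the key file when the directory or authorized_keys is group- or world-accessible. A quick fix (the mkdir/touch lines just ensure both exist before tightening the modes):

```shell
# sshd refuses to use keys if ~/.ssh or authorized_keys is too open
mkdir -p ~/.ssh && touch ~/.ssh/authorized_keys
chmod 700 ~/.ssh                  # owner-only access to the key directory
chmod 600 ~/.ssh/authorized_keys  # owner-only read/write on the key file
```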

III. Installing Hadoop

1. Unpack Hadoop

tar -zxvf /home/adrianlynn/Downloads/software/hadoop-2.7.1.tar.gz -C /opt
mv /opt/hadoop-2.7.1 /opt/hadoop

2. Configure /etc/profile (optional, but convenient later); append the following, then run source /etc/profile (or log in again) for it to take effect

export HADOOP_HOME=/opt/hadoop

export PATH=${HADOOP_HOME}/bin:$PATH

3. Configure $HADOOP_HOME/etc/hadoop/hadoop-env.sh; append the following

export JAVA_HOME=/opt/jdk1.8.0_60

export PATH=$PATH:/opt/hadoop/bin

4. Configure $HADOOP_HOME/etc/hadoop/yarn-env.sh; append the following

export JAVA_HOME=/opt/jdk1.8.0_60

5. Apply the configuration:

source $HADOOP_HOME/etc/hadoop/hadoop-env.sh
source $HADOOP_HOME/etc/hadoop/yarn-env.sh
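The reason for source here (rather than executing the scripts) is that source evaluates the file in the current shell, so its export lines persist after it returns; running the file as a program would set the variables only in a child shell. A tiny throwaway demo (/tmp/demo-env.sh and DEMO_HOME are illustrative names, not part of the Hadoop setup):

```shell
# "source" runs the file in the current shell, so exports survive;
# executing it (./demo-env.sh) would set them only in a subshell.
echo 'export DEMO_HOME=/opt/demo' > /tmp/demo-env.sh
source /tmp/demo-env.sh
echo "$DEMO_HOME"   # prints /opt/demo
```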

6. Run hadoop version to verify the setup

root@Demon:~# hadoop version
Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar

7. Run the wordcount example locally: create an input directory under $HADOOP_HOME, copy a few files into it, then run the job (example jar download: http://www.java2s.com/Code/Jar/h/Downloadhadoopexamples120jar.htm; Hadoop 2.7.1 also ships its own examples jar under share/hadoop/mapreduce). The results can be viewed in the output directory.

cd $HADOOP_HOME
mkdir input
cp -R $HADOOP_HOME/etc/hadoop/* $HADOOP_HOME/input
hadoop jar hadoop-examples-1.2.0.jar wordcount input output
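What the wordcount job computes can be sketched with plain shell tools: tokenizing plays the role of the map step, sort is the shuffle, and uniq -c is the reduce. (/tmp/wc_demo.txt is a throwaway sample, not part of the Hadoop run.)

```shell
# A miniature wordcount: tokenize (map), sort (shuffle), count (reduce).
printf 'hello world\nhello hadoop\n' > /tmp/wc_demo.txt
tr -s ' ' '\n' < /tmp/wc_demo.txt | sort | uniq -c | sort -rn
# the most frequent word ("2 hello") comes out first
```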

IV. Setting Up a Pseudo-Distributed Cluster

1. Create the HDFS directories

mkdir -p $HADOOP_HOME/hdfs/name
mkdir -p $HADOOP_HOME/hdfs/data

2. Configure $HADOOP_HOME/etc/hadoop/core-site.xml, replacing Demon with your own hostname (likewise in the files below)

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Demon:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/opt/hadoop/tmp</value>
  </property>
</configuration>

3. Configure $HADOOP_HOME/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>Demon:50020</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>Demon:50075</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop/hdfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Demon:50090</value>
  </property>
  <property>
    <!-- single-node cluster: one datanode, so one replica -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

4. Configure $HADOOP_HOME/etc/hadoop/mapred-site.xml; it does not exist by default and can be created by copying mapred-site.xml.template in the same directory

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

5. Configure $HADOOP_HOME/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

6. Format the NameNode (from $HADOOP_HOME): ./bin/hdfs namenode -format. Output like the following indicates success

15/09/13 11:19:55 INFO common.Storage: Storage directory /opt/hadoop/hdfs/name has been successfully formatted.
15/09/13 11:19:55 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/09/13 11:19:55 INFO util.ExitUtil: Exiting with status 0
15/09/13 11:19:55 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Demon.Lucifer/127.0.1.1
************************************************************/

7. Start all services (from $HADOOP_HOME): ./sbin/start-all.sh. As the script itself notes, it is deprecated in favor of start-dfs.sh and start-yarn.sh, but it still works

root@Demon:/opt/hadoop# ./sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [Demon]
Demon: starting namenode, logging to /opt/hadoop/logs/hadoop-root-namenode-Demon.out
localhost: starting datanode, logging to /opt/hadoop/logs/hadoop-root-datanode-Demon.out
Starting secondary namenodes [Demon]
Demon: starting secondarynamenode, logging to /opt/hadoop/logs/hadoop-root-secondarynamenode-Demon.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/logs/yarn-root-resourcemanager-Demon.out
localhost: starting nodemanager, logging to /opt/hadoop/logs/yarn-root-nodemanager-Demon.out

8. Check the running Java processes with jps; all five daemons should be up

root@Demon:/opt/hadoop# jps
27056 Jps
26805 ResourceManager
26614 SecondaryNameNode
26950 NodeManager
26247 NameNode
26395 DataNode

9. Check NameNode status at http://demon:50070/ and YARN status at http://demon:8088/

10. Run the wordcount example on the cluster: create an input directory in HDFS, copy a few files into it, then run the job; the results land in /output

cd $HADOOP_HOME
hdfs dfs -mkdir /input
hdfs dfs -copyFromLocal /opt/hadoop/etc/hadoop/* /input
hadoop jar hadoop-examples-1.2.0.jar wordcount /input /output
hdfs dfs -cat /output/*
