Lu Chunli's work notes: who says programmers can't have a touch of style?
Hadoop environments mainly come in three flavors: standalone, pseudo-distributed, and fully distributed (cluster). These notes cover setting up a cluster.
1. Installation Notes
OS: CentOS-6.5-x86_64
Disable the firewall and SELinux:
service iptables status
service iptables stop
chkconfig iptables off
vi /etc/sysconfig/selinux
Set SELINUX=disabled
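For reference, the current SELinux mode can be checked, and the running system switched to permissive without a reboot (the disabled setting above only takes effect after rebooting):
# print the current SELinux mode (Enforcing / Permissive / Disabled)
getenforce
# temporarily relax enforcement on the running system
setenforce 0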
Three virtual machines were created in VMware.
Configure a static IP address on each:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
NM_CONTROLLED="yes"
NAME="System eth0"
BOOTPROTO=none
ONBOOT="yes"
TYPE="Ethernet"
UUID="d0bfa44e-951f-4b4c-b002-4b41aff8ddfc"
IPADDR=192.168.137.117 (change to the IP you want)
PREFIX=24
GATEWAY=192.168.137.1 (default gateway)
DNS1=114.114.115.115 (a free public DNS server)
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
HWADDR=00:0C:29:60:17:AF
LAST_CONNECT=1435483515
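To apply the new static IP without rebooting, the network service can be restarted (standard on CentOS 6):
service network restart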
Change the hostname:
vi /etc/sysconfig/network
The three hostnames are:
nnode
dnode1
dnode2
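For illustration, /etc/sysconfig/network on nnode would contain lines like the following (dnode1 and dnode2 use their own names); the hostname command changes the running session's hostname without a reboot:
NETWORKING=yes
HOSTNAME=nnode
# apply to the current session (run as root)
hostname nnode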
Map IPs to hostnames:
vi /etc/hosts
192.168.137.117 nnode nnode
192.168.137.118 dnode1 dnode1
192.168.137.119 dnode2 dnode2
Raise the Linux limits on the maximum number of processes and the maximum number of open files:
vim /etc/security/limits.conf
# add the following lines
* soft nofile 4100
* hard nofile 4100
* means the rule applies to all users
nproc sets the maximum number of processes
nofile sets the maximum number of open files
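Since the note above also mentions the process limit (nproc), lines like the following would raise it as well; the value 4100 simply mirrors the nofile setting and is only an example:
* soft nproc 4100
* hard nproc 4100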
2. JDK Installation and Configuration
jdk-7u75-linux-x64.tar.gz
Extract the tarball (the archive unpacks to a directory named jdk1.7.0_75, so extract it to /usr/local and rename it to match the JAVA_HOME used below):
tar -xzf jdk-7u75-linux-x64.tar.gz -C /usr/local/
mv /usr/local/jdk1.7.0_75 /usr/local/jdk1.7
Configure environment variables:
vi /etc/profile
Add:
export JAVA_HOME=/usr/local/jdk1.7
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
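For the new variables to take effect in the current shell, reload the profile (or log in again):
source /etc/profile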
Verify:
java -version
3. Passwordless SSH Login
Perform these steps as the hadoop user (created in section 4 below):
su - hadoop
Generate a key pair on each of the three machines:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
By default two files are generated in ~/.ssh:
id_dsa: the private key
id_dsa.pub: the public key
chmod 700 ~/.ssh (must be 700)
chmod 600 ~/.ssh/authorized_keys (preferably 600)
Distribute the authentication file:
Copy the authorized_keys files from dnode1 and dnode2 to /home/hadoop on nnode, and append their contents to nnode's authorized_keys:
cat authorized_keys >> /home/hadoop/.ssh/authorized_keys
Finally, distribute the resulting authorized_keys back to the other two machines, dnode1 and dnode2. A sketch of the full sequence is shown below.
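A sketch of the whole exchange, assuming each node first seeds authorized_keys with its own public key; the temporary file names authorized_keys_dnode1 and authorized_keys_dnode2 are only placeholders:
# on every node: seed authorized_keys with the node's own public key
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
# on dnode1 (and likewise dnode2): send the file to nnode under a temporary name
scp ~/.ssh/authorized_keys hadoop@nnode:/home/hadoop/authorized_keys_dnode1
# on nnode: merge everything into nnode's authorized_keys
cat /home/hadoop/authorized_keys_dnode1 /home/hadoop/authorized_keys_dnode2 >> ~/.ssh/authorized_keys
# on nnode: push the merged file back to the other two machines
scp ~/.ssh/authorized_keys hadoop@dnode1:~/.ssh/
scp ~/.ssh/authorized_keys hadoop@dnode2:~/.ssh/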
Note:
If login still prompts for a password after the steps above, check the output in /var/log/secure to see whether it is a permissions problem.
4. Create the hadoop User
For security reasons, Hadoop is run as an ordinary user rather than as the root administrator.
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop
Enter the password: hadoop
5. Configure ZooKeeper
zookeeper-3.4.6.tar.gz
Extract the tarball (the archive unpacks to zookeeper-3.4.6; rename it to match the /usr/local/zookeeper3.4.6 path used below):
tar -xzf zookeeper-3.4.6.tar.gz -C /usr/local/
mv /usr/local/zookeeper-3.4.6 /usr/local/zookeeper3.4.6
Add ZooKeeper environment variables:
[hadoop@nnode ~]$ vim .bash_profile
export ZOOKEEPER_HOME=/usr/local/zookeeper3.4.6
export PATH=$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf:$PATH
Edit the configuration file:
Copy zoo_sample.cfg to zoo.cfg (under $ZOOKEEPER_HOME/conf) and edit it:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/usr/local/zookeeper3.4.6/data
# the directory where transaction files are stored.
dataLogDir=/usr/local/zookeeper3.4.6/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
maxClientCnxns=100
# In cluster mode every machine must know which machines make up the ensemble.
# Configure one line per machine in the form server.id=host:port:port,
# where id is the Server ID identifying the machine within the ensemble (1~255).
server.1=nnode:2888:3888
server.2=dnode1:2888:3888
server.3=dnode2:2888:3888
For server.1, server.2 and server.3, create a text file named myid under the dataDir directory on each machine, containing 1, 2 or 3 respectively (matching the number after "server.").
[hadoop@nnode data]$ pwd
/usr/local/zookeeper3.4.6/data
[hadoop@nnode data]$ ll
total 12
-rw-rw-r-- 1 hadoop hadoop    2 May 13 11:19 myid
drwxrwxr-x 2 hadoop hadoop 4096 Jul 18 21:07 version-2
-rw-rw-r-- 1 hadoop hadoop    5 Jul 18 17:27 zookeeper_server.pid
[hadoop@nnode data]$ cat myid
1
[hadoop@nnode data]$
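The myid files can simply be created with echo, one value per node, for example:
# on nnode
echo 1 > /usr/local/zookeeper3.4.6/data/myid
# on dnode1
echo 2 > /usr/local/zookeeper3.4.6/data/myid
# on dnode2
echo 3 > /usr/local/zookeeper3.4.6/data/myid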
6. Hadoop Installation and Configuration
hadoop-2.6.0.tar.gz
Extract the tarball (the archive unpacks to hadoop-2.6.0; rename it to match the /usr/local/hadoop2.6.0 path used below):
tar -xzf hadoop-2.6.0.tar.gz -C /usr/local/
mv /usr/local/hadoop-2.6.0 /usr/local/hadoop2.6.0
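If the tarballs were extracted as root, ownership of the install directories presumably needs to be handed over to the hadoop user before continuing, for example:
chown -R hadoop:hadoop /usr/local/jdk1.7 /usr/local/zookeeper3.4.6 /usr/local/hadoop2.6.0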
Edit the configuration files (under /usr/local/hadoop2.6.0/etc/hadoop/):
a.) hadoop-env.sh
# environment variables used by Hadoop at runtime
export JAVA_HOME=/usr/local/jdk1.7
export HADOOP_HOME=/usr/local/hadoop2.6.0
b.) mapred-env.sh
# environment variables used by MapReduce at runtime
export JAVA_HOME=/usr/local/jdk1.7
c.) yarn-env.sh
# environment variables used by YARN at runtime
export JAVA_HOME=/usr/local/jdk1.7
d.) core-site.xml
<configuration>
    <!-- version of this configuration file -->
    <property>
        <name>hadoop.common.configuration.version</name>
        <value>0.23.0</value>
    </property>
    <!-- The default HDFS path. When several HDFS clusters are in use at the same time
         and the user does not specify a cluster name, this is the one that is used.
         The value comes from the nameservice configured in hdfs-site.xml. -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cluster</value>
    </property>
    <!-- Common base directory where the NameNode, DataNode, JournalNode and other daemons
         store data. Each of them can also be given its own directory separately. -->
    <!-- The default is /tmp/hadoop-${user.name}; /tmp is easily wiped on Linux,
         so it is better to point this at a directory of your own. -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop2.6.0/tmp</value>
    </property>
    <!-- config zookeeper for ha -->
    <!-- Address and port of the ZooKeeper ensemble. Note that the number of nodes
         must be odd and no fewer than three. -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>nnode:2181,dnode1:2181,dnode2:2181</value>
    </property>
</configuration>
e.) hdfs-site.xml
<configuration>
    <!-- version of this configuration file -->
    <property>
        <name>hadoop.hdfs.configuration.version</name>
        <value>1</value>
    </property>
    <!-- Number of block replicas kept by the DataNodes. There are two DataNodes here,
         so 2; the default is 3. -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Where on the local filesystem the NameNode stores the name table (fsimage). -->
    <!-- The HDFS metadata ultimately lives on the Linux host running the NameNode. -->
    <!-- If this is a comma-separated list of directories, the name table is replicated
         to all of them for redundancy. -->
    <!-- Default: file://${hadoop.tmp.dir}/dfs/name -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/local/hadoop2.6.0/data/dfs/name</value>
    </property>
    <!-- Defaults to ${dfs.namenode.name.dir}; where on the local filesystem the NameNode
         stores the transaction (edits) files. -->
    <!-- If this is a comma-separated list of directories, the transaction (edits) files
         are replicated to all of them for redundancy. -->
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>/usr/local/hadoop2.6.0/data/dfs/edits</value>
    </property>
    <!-- Where on the local filesystem a DataNode stores its blocks. -->
    <!-- If this is a comma-separated list, data is stored in all of the listed directories,
         usually spread across different devices; directories that do not exist are ignored. -->
    <!-- Default: file://${hadoop.tmp.dir}/dfs/data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/local/hadoop2.6.0/data/dfs/data</value>
    </property>
    <!-- Default true; enables WebHDFS on the NameNodes and DataNodes (accessed via port 50070). -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <!-- Whether HDFS permission checking is enabled; not familiar with it yet, so disabled for now. -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- The following is for HA only -->
    <!-- Comma-separated list of nameservices. -->
    <!-- There is only one cluster, one namespace and one nameservice here;
         the cluster name is user-defined. -->
    <property>
        <name>dfs.nameservices</name>
        <value>cluster</value>
    </property>
    <!-- dfs.ha.namenodes.EXAMPLENAMESERVICE -->
    <!-- EXAMPLENAMESERVICE is a placeholder for the nameservice (cluster above); the values
         are logical names for the NameNodes in that nameservice and can be chosen freely,
         as long as they do not repeat. -->
    <property>
        <name>dfs.ha.namenodes.cluster</name>
        <value>nn1,nn2</value>
    </property>
    <!-- for rpc connection -->
    <!-- RPC address that serves client requests; with HA/Federation (multiple NameNodes)
         it is qualified by the custom nameservice and NameNode IDs. -->
    <!-- Hadoop is built on RPC: the NameNode and friends are the RPC servers, while classes
         such as FileSystem are the RPC clients. -->
    <property>
        <name>dfs.namenode.rpc-address.cluster.nn1</name>
        <value>nnode:8020</value>
    </property>
    <!-- The other NameNode -->
    <property>
        <name>dfs.namenode.rpc-address.cluster.nn2</name>
        <value>dnode1:8020</value>
    </property>
    <!-- for http connection -->
    <!-- Default 0.0.0.0:50070; the address the NameNode web UI listens on. Once the NameNode
         is up, its status can be viewed at this address. -->
    <property>
        <name>dfs.namenode.http-address.cluster.nn1</name>
        <value>nnode:50070</value>
    </property>
    <!-- Same as above -->
    <property>
        <name>dfs.namenode.http-address.cluster.nn2</name>
        <value>dnode1:50070</value>
    </property>
    <!-- for connection with namenodes -->
    <!-- RPC address used for HDFS service communication; if set, the BackupNode, DataNodes
         and other services connect to this address. -->
    <!-- With HA/Federation (multiple NameNodes), use the nameservice.namenode form. -->
    <!-- If this parameter is unset, dfs.namenode.rpc-address is used as the default. -->
    <property>
        <name>dfs.namenode.servicerpc-address.cluster.nn1</name>
        <value>nnode:53310</value>
    </property>
    <!-- Same as above -->
    <property>
        <name>dfs.namenode.servicerpc-address.cluster.nn2</name>
        <value>dnode1:53310</value>
    </property>
    <!-- namenode.shared.edits -->
    <!-- In HA, the JournalNode quorum used as the edits directory shared between the NameNodes. -->
    <!-- The active NameNode writes to it and the standby reads from it, keeping the
         namespaces in sync. -->
    <!-- It does not need to live under dfs.namenode.edits.dir; in a non-HA cluster it is left empty. -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://nnode:8485;dnode1:8485;dnode2:8485/cluster</value>
    </property>
    <!-- journalnode.edits.dir -->
    <!-- the path where the JournalNode daemon will store its local state -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/local/hadoop2.6.0/data/ha/journal</value>
    </property>
    <!-- failover proxy -->
    <!-- Implementation class that carries out failover for the cluster when a NameNode fails -->
    <property>
        <name>dfs.client.failover.proxy.provider.cluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- automatic-failover -->
    <!-- Whether automatic failover is enabled for the cluster, i.e. whether to switch to the
         other NameNode automatically when the active one fails -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- When a NameNode switch is needed, fence the old active node over SSH -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- Location of the private key used for the SSH connection when fencing over SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_dsa</value>
    </property>
</configuration>
f.) mapred-site.xml
<configuration>
    <!-- The runtime framework for executing MapReduce jobs.
         Can be one of local, classic or yarn. -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- The host and port that the MapReduce job tracker runs at. -->
    <property>
        <name>mapreduce.jobtracker.address</name>
        <value>nnode:9001</value>
    </property>
    <!-- The job tracker http server address and port the server will listen on,
         default 0.0.0.0:50030. If the port is 0 then the server will start on a free port. -->
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>nnode:50030</value>
    </property>
    <!-- The task tracker http server address and port, default 0.0.0.0:50060.
         If the port is 0 then the server will start on a free port. -->
    <property>
        <name>mapreduce.tasktracker.http.address</name>
        <value>nnode:50060</value>
    </property>
    <!-- MapReduce JobHistory Server IPC host:port -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>nnode:10020</value>
    </property>
    <!-- MapReduce JobHistory Server Web UI host:port -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>nnode:19888</value>
    </property>
    <!-- The directory where MapReduce stores control files.
         Default ${hadoop.tmp.dir}/mapred/system. -->
    <property>
        <name>mapreduce.jobtracker.system.dir</name>
        <value>/usr/local/hadoop2.6.0/data/mapred/system</value>
    </property>
    <!-- The root of the staging area for users' job files,
         default ${hadoop.tmp.dir}/mapred/staging -->
    <property>
        <name>mapreduce.jobtracker.staging.root.dir</name>
        <value>/usr/local/hadoop2.6.0/data/mapred/staging</value>
    </property>
    <!-- A shared directory for temporary files.
         Default ${hadoop.tmp.dir}/mapred/temp -->
    <property>
        <name>mapreduce.cluster.temp.dir</name>
        <value>/usr/local/hadoop2.6.0/data/mapred/tmp</value>
    </property>
    <!-- The local directory where MapReduce stores intermediate data files.
         Default ${hadoop.tmp.dir}/mapred/local -->
    <property>
        <name>mapreduce.cluster.local.dir</name>
        <value>/usr/local/hadoop2.6.0/data/mapred/local</value>
    </property>
</configuration>
g.) yarn-site.xml
<configuration>
    <!-- The hostname of the RM. Still a single point of failure, which is a risk. -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>nnode</value>
    </property>
    <!-- The hostname of the NM.
    <property>
        <name>yarn.nodemanager.hostname</name>
        <value>nnode</value>
    </property>
    -->
    <!-- Auxiliary service that runs on the NodeManager. It must be set to mapreduce_shuffle
         for MapReduce jobs to run. -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- This handler is already the default value -->
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <!-- Note: to run MapReduce programs, each NodeManager must load the shuffle server at
         startup. The shuffle server is really a Jetty/Netty server through which Reduce tasks
         remotely copy the intermediate results produced by Map tasks on each NodeManager.
         The two settings above both exist to specify this shuffle server. -->
</configuration>
h.) slaves
# List of all DataNode nodes, one hostname per line
[hadoop@nnode ~]$ cd /usr/local/hadoop2.6.0/etc/hadoop/
[hadoop@nnode hadoop]$ cat slaves
dnode1
dnode2
[hadoop@nnode hadoop]$
i.) Distribute the configured ZooKeeper and Hadoop installation directories to the other two machines (the Hadoop copy is shown here; a ZooKeeper sketch follows):
scp -r hadoop2.6.0 hadoop@dnode1:/usr/local/
scp -r hadoop2.6.0 hadoop@dnode2:/usr/local/
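The ZooKeeper directory can be copied the same way; remember that the myid file under data/ must then be changed to 2 on dnode1 and 3 on dnode2:
scp -r zookeeper3.4.6 hadoop@dnode1:/usr/local/
scp -r zookeeper3.4.6 hadoop@dnode2:/usr/local/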
7. Start the Hadoop Cluster
Start the ZooKeeper ensemble
Run zkServer.sh start on nnode, dnode1 and dnode2 in turn:
[hadoop@nnode ~]$ zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@nnode ~]$
# check ZooKeeper's status
[hadoop@nnode ~]$ zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper3.4.6/bin/../conf/zoo.cfg
Mode: follower
[hadoop@nnode ~]$
# once all three nodes have been started, verify with zkCli.sh
[hadoop@nnode ~]$ zkCli.sh
Connecting to localhost:2181
2015-07-19 10:39:40,897 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2015-07-19 10:39:40,903 [myid:] - INFO [main:Environment@100] - Client environment:host.name=nnode
2015-07-19 10:39:40,903 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.7.0_75
2015-07-19 10:39:40,906 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2015-07-19 10:39:40,906 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/local/jdk1.7/jre
# (middle of output omitted)
2015-07-19 10:39:40,907 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2015-07-19 10:39:40,907 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
2015-07-19 10:39:40,907 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
2015-07-19 10:39:40,908 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
2015-07-19 10:39:40,908 [myid:] - INFO [main:Environment@100] - Client environment:os.version=2.6.32-431.el6.x86_64
2015-07-19 10:39:40,908 [myid:] - INFO [main:Environment@100] - Client environment:user.name=hadoop
2015-07-19 10:39:40,908 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/hadoop
2015-07-19 10:39:40,908 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/home/hadoop
2015-07-19 10:39:40,910 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@75aa57fb
Welcome to ZooKeeper!
JLine support is enabled
[zk: localhost:2181(CONNECTING) 0] ls /     # list the znodes in the ensemble
# The startup order here was nnode > dnode1 > dnode2. While the ensemble is coming up, each node
# tries to connect to the others; nodes started first obviously cannot reach nodes that are not
# up yet, so exceptions may appear early in the logs. Once a Leader has been elected the ensemble
# stabilizes; similar messages on the other nodes are also normal.
Format the ZooKeeper ensemble; this creates the znodes that HA needs in ZooKeeper:
[hadoop@nnode ~]$ hdfs zkfc -formatZK
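If the formatting succeeded, a /hadoop-ha znode containing the nameservice should now exist; it can be checked from zkCli.sh, roughly like this:
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha
[cluster]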
Start the JournalNode cluster
# run hadoop-daemon.sh start journalnode on each of the three nodes
[hadoop@nnode ~]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop2.6.0/logs/hadoop-hadoop-journalnode-nnode.out
[hadoop@nnode ~]$ jps
12425 Jps
12385 JournalNode
11637 QuorumPeerMain
# note the command here is hadoop-daemon.sh, not hadoop-daemons.sh (no trailing s)
Format the cluster's NameNode
hdfs namenode -format
# the hdfs command works from any directory here because Hadoop's bin directory has been added to PATH
Start the NameNode
[hadoop@nnode ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop2.6.0/logs/hadoop-hadoop-namenode-nnode.out
[hadoop@nnode ~]$ tail -n50 /usr/local/hadoop2.6.0/logs/hadoop-hadoop-namenode-nnode.log
2015-07-19 10:58:54,762 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2015-07-19 10:58:54,764 INFO org.apache.hadoop.util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
2015-07-19 10:58:54,764 INFO org.apache.hadoop.util.GSet: capacity = 2^21 = 2097152 entries
2015-07-19 10:58:54,772 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2015-07-19 10:58:54,772 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication = 2
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication = 512
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication = 1
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams = 2
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer = false
2015-07-19 10:58:54,773 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2015-07-19 10:58:54,778 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
2015-07-19 10:58:54,778 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup = supergroup
2015-07-19 10:58:54,778 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = false
2015-07-19 10:58:54,778 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Determined nameservice ID: cluster
2015-07-19 10:58:54,778 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: true
2015-07-19 10:58:54,783 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2015-07-19 10:58:55,113 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap
2015-07-19 10:58:55,114 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2015-07-19 10:58:55,114 INFO org.apache.hadoop.util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
2015-07-19 10:58:55,114 INFO org.apache.hadoop.util.GSet: capacity = 2^20 = 1048576 entries
2015-07-19 10:58:55,123 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2015-07-19 10:58:55,139 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks
2015-07-19 10:58:55,139 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2015-07-19 10:58:55,139 INFO org.apache.hadoop.util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
2015-07-19 10:58:55,139 INFO org.apache.hadoop.util.GSet: capacity = 2^18 = 262144 entries
2015-07-19 10:58:55,144 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2015-07-19 10:58:55,144 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2015-07-19 10:58:55,144 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
2015-07-19 10:58:55,145 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled
2015-07-19 10:58:55,146 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2015-07-19 10:58:55,151 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2015-07-19 10:58:55,151 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2015-07-19 10:58:55,152 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
2015-07-19 10:58:55,152 INFO org.apache.hadoop.util.GSet: capacity = 2^15 = 32768 entries
2015-07-19 10:58:55,159 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: ACLs enabled? false
2015-07-19 10:58:55,159 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: XAttrs enabled? true
2015-07-19 10:58:55,163 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
2015-07-19 10:58:55,187 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /usr/local/hadoop2.6.0/data/dfs/name/in_use.lock acquired by nodename 12629@nnode
2015-07-19 10:58:59,551 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 176 INodes.
2015-07-19 10:58:59,626 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf: Loaded FSImage in 0 seconds.
2015-07-19 10:58:59,626 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded image for txid 8261 from /usr/local/hadoop2.6.0/data/dfs/name/current/fsimage_0000000000000008261
2015-07-19 10:58:59,631 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@3f1084df expecting start txid #8262
2015-07-19 10:58:59,632 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Start loading edits file http://dnode1:8480/getJournal?jid=cluster&segmentTxId=8262&storageInfo=-60%3A339614018%3A0%3ACID-61347f43-0510-4b07-ad99-956472c0e49f, http://nnode:8480/getJournal?jid=cluster&segmentTxId=8262&storageInfo=-60%3A339614018%3A0%3ACID-61347f43-0510-4b07-ad99-956472c0e49f
2015-07-19 10:58:59,635 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://dnode1:8480/getJournal?jid=cluster&segmentTxId=8262&storageInfo=-60%3A339614018%3A0%3ACID-61347f43-0510-4b07-ad99-956472c0e49f, http://nnode:8480/getJournal?jid=cluster&segmentTxId=8262&storageInfo=-60%3A339614018%3A0%3ACID-61347f43-0510-4b07-ad99-956472c0e49f' to transaction ID 8262
2015-07-19 10:58:59,635 INFO org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Fast-forwarding stream 'http://dnode1:8480/getJournal?jid=cluster&segmentTxId=8262&storageInfo=-60%3A339614018%3A0%3ACID-61347f43-0510-4b07-ad99-956472c0e49f' to transaction ID 8262
Synchronize the NameNode metadata
# run the following on dnode1
hdfs namenode -bootstrapStandby
# note: the dash before bootstrapStandby must be an ASCII hyphen, otherwise the command keeps failing
Start the other NameNode
[hadoop@dnode1 ~]$ hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop2.6.0/logs/hadoop-hadoop-namenode-dnode1.out
[hadoop@dnode1 ~]$
Start all DataNodes
[hadoop@nnode ~]$ hadoop-daemons.sh start datanode
dnode1: starting datanode, logging to /usr/local/hadoop2.6.0/logs/hadoop-hadoop-datanode-dnode1.out
dnode2: starting datanode, logging to /usr/local/hadoop2.6.0/logs/hadoop-hadoop-datanode-dnode2.out
[hadoop@nnode ~]$
# the command above is hadoop-daemons.sh (this one does have the trailing s)
# processes on dnode1
[hadoop@dnode1 ~]$ jps
12431 JournalNode
12534 NameNode
11636 QuorumPeerMain
12651 DataNode
12737 Jps
[hadoop@dnode1 ~]$
# processes on dnode2
[hadoop@dnode2 ~]$ jps
12477 Jps
12400 DataNode
11566 QuorumPeerMain
12286 JournalNode
[hadoop@dnode2 ~]$
Start YARN
[hadoop@nnode ~]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop2.6.0/logs/yarn-hadoop-resourcemanager-nnode.out
dnode1: starting nodemanager, logging to /usr/local/hadoop2.6.0/logs/yarn-hadoop-nodemanager-dnode1.out
dnode2: starting nodemanager, logging to /usr/local/hadoop2.6.0/logs/yarn-hadoop-nodemanager-dnode2.out
[hadoop@nnode ~]$
Start the ZooKeeperFailoverController
# the NameNodes are configured on nnode and dnode1, so run this on those two nodes
[hadoop@nnode ~]$ hadoop-daemon.sh start zkfc
starting zkfc, logging to /usr/local/hadoop2.6.0/logs/hadoop-hadoop-zkfc-nnode.out
[hadoop@nnode ~]$
[hadoop@nnode ~]$ jps
12629 NameNode
12385 JournalNode
13571 Jps
11637 QuorumPeerMain
13218 ResourceManager
13502 DFSZKFailoverController
Verify that HDFS works
[hadoop@nnode ~]$ hdfs dfs -ls -R /user/hadoop
-rw-r--r--   2 hadoop hadoop     2297 2015-06-29 14:44 /user/hadoop/20130913152700.txt.gz
-rw-r--r--   2 hadoop hadoop      211 2015-06-29 14:45 /user/hadoop/20130913160307.txt.gz
-rw-r--r--   2 hadoop hadoop 93046447 2015-07-18 18:01 /user/hadoop/apache-hive-1.2.0-bin.tar.gz
-rw-r--r--   2 hadoop hadoop  4139112 2015-06-28 22:54 /user/hadoop/httpInterceptor_192.168.1.101_1_20130913160307.txt
-rw-r--r--   2 hadoop hadoop      240 2015-05-30 20:54 /user/hadoop/lucl.gz
-rw-r--r--   2 hadoop hadoop       63 2015-05-27 23:55 /user/hadoop/lucl.txt
-rw-r--r--   2 hadoop hadoop  9994248 2015-06-29 14:12 /user/hadoop/scalog.txt
-rw-r--r--   2 hadoop hadoop  2664495 2015-06-28 20:54 /user/hadoop/scalog.txt.gz
-rw-r--r--   3 hadoop hadoop 28026803 2015-06-24 21:16 /user/hadoop/test.txt.gz
-rw-r--r--   2 hadoop hadoop    28551 2015-05-27 23:54 /user/hadoop/zookeeper.out
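A quick write/read round trip also confirms that the active NameNode and the DataNodes are working; the target path hosts.test below is just an example:
hdfs dfs -put /etc/hosts /user/hadoop/hosts.test
hdfs dfs -cat /user/hadoop/hosts.test
hdfs dfs -rm /user/hadoop/hosts.test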
Verify that YARN works
# the following address opens normally in a browser
http://nnode:8088/cluster
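Submitting one of the bundled example jobs is a more thorough check; the pi example below ships with the hadoop-2.6.0 distribution (arguments: 2 maps, 10 samples each):
hadoop jar /usr/local/hadoop2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 2 10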
Verify HA automatic failover
http://nnode:50070
http://dnode1:50070
# At this point one NameNode is active and the other is standby. After killing the process of
# the active NameNode, the standby NameNode becomes active.
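The NameNode states can also be checked from the command line, and the failover exercised by killing the active NameNode's process (the pid placeholder comes from jps); nn1 and nn2 are the IDs configured in hdfs-site.xml:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# on whichever node is currently active
kill -9 <NameNode pid from jps>
hdfs haadmin -getServiceState nn2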