Hadoop Deployment Standard Document and Standard Directory Layout
Version: v1.1
Installing the Base System
1. Operating system: CentOS 6.3 x86_64.
2. Partitioning: a 200 MB /boot partition, a 50 GB root partition, a swap partition between 32 GB and 64 GB, and the remaining space (or the separate data disks) allocated to /data.
3. Package groups: base system and development environment.
4. Installation from optical media: alternatively, mount the install DVD and run yum --disablerepo=\* --enablerepo=c6-media groupinstall base 'Development tools' (a sketch for preparing the media repository follows).
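A minimal sketch, not part of the original procedure, for preparing the c6-media repository before the groupinstall above; it assumes the stock CentOS-Media.repo shipped with CentOS 6, whose baseurl includes file:///media/cdrom/.
mkdir -p /media/cdrom
# Mount the installation DVD (or an ISO with -o loop,ro) where the c6-media repository expects it.
mount -o ro /dev/cdrom /media/cdrom
yum --disablerepo=\* --enablerepo=c6-media groupinstall base 'Development tools'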
Datanode Hardware Initialization
1. RAID setup: the system disks are configured as RAID 1; each data disk is configured as an independent single-disk RAID 0.
2. Partitioning and formatting: partition and format every data disk with the following script (a verification sketch follows it).
#!/bin/bash
# Create a single GPT partition on each data disk (sdb..sdi) and format it as ext4.
for i in sdb sdc sdd sde sdf sdg sdh sdi
do
parted /dev/$i mklabel gpt
parted /dev/$i mkpart 1 ext4 0M 2000G
mkfs.ext4 /dev/${i}1 &
done
# Wait for the background mkfs jobs to finish before mounting the filesystems.
wait
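A quick check, not part of the original procedure, that every data disk received the expected partition and filesystem before /etc/fstab is edited in the next step; the reported type should match what mkfs created above and what is written into fstab below.
for i in sdb sdc sdd sde sdf sdg sdh sdi
do
blkid /dev/${i}1
done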
3. Edit /etc/fstab
[root@namenode002 ~]# cat /etc/fstab
tmpfs       /dev/shm      tmpfs   defaults         0 0
devpts      /dev/pts      devpts  gid=5,mode=620   0 0
sysfs       /sys          sysfs   defaults         0 0
proc        /proc         proc    defaults         0 0
/dev/sda1   /boot         ext3    defaults         1 1
/dev/sda2   /             ext3    defaults         1 1
/dev/sdb1 /data/disk01 ext3 noatime,nodiratime 1 1
/dev/sdc1 /data/disk02 ext3 noatime,nodiratime 1 1
/dev/sdd1 /data/disk03 ext3 noatime,nodiratime 1 1
/dev/sde1 /data/disk04 ext3 noatime,nodiratime 1 1
/dev/sdf1 /data/disk05 ext3 noatime,nodiratime 1 1
/dev/sdg1 /data/disk06 ext3 noatime,nodiratime 1 1
/dev/sdh1 /data/disk07 ext3 noatime,nodiratime 1 1
/dev/sdi1 /data/disk08 ext3 noatime,nodiratime 1 1
[root@namenode002 ~]# mkdir -p /data/disk0{1,2,3,4,5,6,7,8}
[root@namenode002 ~]# mount -a
[root@namenode002 ~]# cat /etc/mtab
/dev/sda2 / ext3 rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
tmpfs /dev/shm tmpfs rw 0 0
/dev/sda1 /boot ext3 rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
/dev/sdb1 /data/disk01 ext3 rw,noatime,nodiratime 0 0
/dev/sdc1 /data/disk02 ext3 rw,noatime,nodiratime 0 0
/dev/sdd1 /data/disk03 ext3 rw,noatime,nodiratime 0 0
/dev/sde1 /data/disk04 ext3 rw,noatime,nodiratime 0 0
/dev/sdf1 /data/disk05 ext3 rw,noatime,nodiratime 0 0
/dev/sdg1 /data/disk06 ext3 rw,noatime,nodiratime 0 0
/dev/sdh1 /data/disk07 ext3 rw,noatime,nodiratime 0 0
/dev/sdi1 /data/disk08 ext3 rw,noatime,nodiratime 0 0
Setting Up the System Environment
Perform the following steps on every server:
1. Configure the network settings: IP address and hostname (a minimal per-host sketch follows).
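A minimal sketch of the CentOS 6 network and hostname configuration, using datanode01 (192.168.12.236) from the host list below as an example; the interface name eth0, the netmask, and the gateway are assumptions and must be adjusted per machine.
cat > /etc/sysconfig/network-scripts/ifcfg-eth0 <<'EOF'
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.12.236
NETMASK=255.255.255.0
GATEWAY=192.168.12.1
EOF
# Make the hostname persistent and apply it to the running system.
sed -i 's/^HOSTNAME=.*/HOSTNAME=datanode01/' /etc/sysconfig/network
hostname datanode01
service network restart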
2. Disable SELinux (the change takes effect after the reboot in step 8) and the iptables service (stopped in step 5): sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
3. Add every machine in the Hadoop cluster to /etc/hosts:
[root@namenode01 ~]# cat /etc/hosts
192.168.12.235  namenode01
192.168.12.236  datanode01
192.168.12.237  datanode02
192.168.12.238  datanode03
192.168.12.239  namenode02
192.168.12.240  jobtracker01
4. Edit yum.conf so downloaded packages are kept in the local cache:
sed -i 's/keepcache=0/keepcache=1/' /etc/yum.conf
5. Adjust service runlevels (stop iptables and disable it at boot):
/etc/init.d/iptables stop;chkconfig iptables off
6. Set up passwordless SSH trust (generate the key pair first; a sketch follows the copy loop), then distribute the .ssh directory to every node:
for i in datanode004 datanode005 datanode006 datanode007; do scp -rp .ssh/ root@$i:/root/; done
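A minimal sketch, assuming the key pair has not been generated yet, of creating the root key and authorizing it locally before the .ssh directory is copied out as above; the empty passphrase is an assumption made for unattended logins.
ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa
# Authorize the key locally; the copied .ssh/ directory then carries both the key and the authorization to every node.
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys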
7. Distribute the DNS configuration and synchronize the clocks via NTP on every node:
[root@namenode001 ~]# for i in `cat /etc/hosts|awk '{print $2}'|grep -v localhost`; do scp /etc/resolv.conf root@$i:/etc/;ssh root@$i ntpdate dns.elong.cn;ssh root@$i hwclock -w;ssh $i chkconfig ntpd on; scp /etc/ntp.conf root@$i:/etc/;ssh $i /etc/init.d/ntpd restart; done
8. Reboot all nodes in a batch:
for i in `cat /etc/hosts|awk '{print $2}'|grep -v localhost|tac`;do ssh root@$i shutdown -r now;done
9. Copy the software installation packages to every node:
for i in datanode004 datanode005 datanode006 datanode007; do scp jdk-6u37-linux-x64.bin mysql-connector-java-5.1.22.tar.gz cloudera-cdh-4-0.noarch.rpm root@$i:/root/; done
10. Prepare the mount environment (a hypothetical NFS mount sketch follows):
mkdir -p /opt/hadoop;yum -y install nfs-utils
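The intent of /opt/hadoop here appears to be a shared software directory; a minimal sketch of mounting it over NFS, where the exporting server (namenode01) and the export path are assumptions to be replaced with the actual values.
# Assumed: namenode01 exports /opt/hadoop to the cluster.
mount -t nfs namenode01:/opt/hadoop /opt/hadoop
# For a persistent mount, an fstab entry such as the following could be added:
# namenode01:/opt/hadoop  /opt/hadoop  nfs  defaults  0 0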
Installing the JDK
Perform the following steps on every server:
1. Install jdk-6u37-linux-x64.bin:
chmod +x jdk-6u37-linux-x64.bin ;./jdk-6u37-linux-x64.bin
mv jdk1.6.0_37/ /usr/local/jdk
2. Install the CDH repository:
rpm -ivh http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.noarch.rpm
Configuring the JDK Environment Variables
1. Add the JDK environment variables to /etc/profile (a sketch for distributing them to all nodes follows the verification below):
[root@namenode01 ~]# cat /etc/profile
# for java env
export JAVA_HOME=/usr/local/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
[root@namenode01 ~]# java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
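A minimal sketch, in the same style as the loops above, for pushing the Java settings in /etc/profile to the remaining nodes and verifying them; it assumes /etc/profile has already been edited on namenode01 as shown.
for i in `cat /etc/hosts|awk '{print $2}'|grep -v localhost|grep -v namenode01`
do
scp /etc/profile root@$i:/etc/profile
ssh root@$i "source /etc/profile; java -version"
done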
Installing Hadoop
1. Install the namenode (on namenode01):
yum install hadoop-hdfs-namenode
chkconfig hadoop-hdfs-namenode on
2. Install the secondary namenode (on namenode02):
yum install hadoop-hdfs-secondarynamenode
chkconfig hadoop-hdfs-secondarynamenode on
3. Install the jobtracker (on jobtracker01):
yum install hadoop-0.20-mapreduce-jobtracker
chkconfig hadoop-0.20-mapreduce-jobtracker on
4. Install the datanode and tasktracker (on every datanode; these two roles are never separated):
yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode
chkconfig hadoop-0.20-mapreduce-tasktracker on;chkconfig hadoop-hdfs-datanode on
5. Install the Hadoop client:
yum install hadoop-client
6. Add the following Java environment to line 2 of each of the Hadoop init scripts installed above so the services start correctly at boot (a sed sketch follows the snippet):
[root@namenode01 ~]# vi /etc/init.d/hadoop-hdfs-namenode
#!/bin/bash
export JAVA_HOME=/usr/local/jdk
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
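A minimal sketch, assuming GNU sed and the init script names used above, for injecting those three lines into every installed Hadoop init script in one pass; the lines are inserted after line 1 in reverse order so the final order matches the snippet.
for f in /etc/init.d/hadoop-hdfs-* /etc/init.d/hadoop-0.20-mapreduce-*
do
[ -f "$f" ] || continue
sed -i '1a export PATH=$JAVA_HOME/bin:$PATH' $f
sed -i '1a export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar' $f
sed -i '1a export JAVA_HOME=/usr/local/jdk' $f
done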
Configuring Hadoop
1. Configuration files:
[root@namenode01 ~]# rm -rf /etc/hadoop/conf; cp -rp /etc/hadoop/conf.empty /etc/hadoop/conf
[root@namenode01 ~]# cat /etc/hadoop/conf/masters
namenode02
[root@namenode01 ~]# cat /etc/hadoop/conf/slaves
datanode01
datanode02
datanode03
[root@namenode01 ~]# cat /etc/hadoop/conf/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://namenode001:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/</value>
</property>
</configuration>
[root@namenode01 ~]# cat /etc/hadoop/conf/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/data/hdfs/dfs/name</value>
<description>Directory where the NameNode stores the namespace metadata</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data/hdfs/dfs/data</value>
<description>Directories where the DataNode stores HDFS data blocks</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- for secondary namenode -->
<property>
<name>dfs.namenode.http-address</name>
<value>namenode01:50070</value>
<description>The address and the base port on which the dfs NameNode Web UI will listen. If the port is 0, the server will start on a free port.</description>
</property>
</configuration>
[root@namenode01 ~]# cat /etc/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>jobtracker001:9001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/tmp/hadoop/mapred/local</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/mapred/system</value>
</property>
</configuration>
2. On the namenode, create the local directories and format HDFS. (Note: the hadoop fs commands below need a formatted, running NameNode; if it is not up yet, run the format and start the service first, then create /mapred/system.)
[root@namenode002 ~]# mkdir -p /data/disk0{1,2,3,4,5,6,7,8}/hdfs ; mkdir -p /data/disk0{1,2,3,4,5,6,7,8}/mapred/local ; chown -R hdfs /data/disk0{1,2,3,4,5,6,7,8}/hdfs ; chown -R mapred /data/disk0{1,2,3,4,5,6,7,8}/mapred/local ;
[root@namenode002 ~]# chown -R mapred /usr/lib/hadoop-0.20-mapreduce/bin/../logs/
[root@namenode002 ~]# su - hdfs
-bash-4.1$ hadoop fs -mkdir /mapred/system
-bash-4.1$ hadoop fs -chown -R mapred /mapred/system
-bash-4.1$ hdfs namenode -format
12/11/21 19:50:30 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = namenode002/10.35.66.11
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.0.0-cdh4.1.2
STARTUP_MSG:
classpath = /etc/hadoop/conf:/usr/lib/hadoop/lib/guava-11.0.2.jar:/usr/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/junit-4.8.2.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop/lib/zookeeper-3.4.3-cdh4.1.2.jar:/usr/lib/hadoop/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop/lib/asm-3.2.jar:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/lib/commons-io-2.1.jar:/usr/lib/hadoop/lib/commons-digester-1.8.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop/lib/kfs-0.3.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/lib/servlet-api-2.5.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/jline-0.9.94.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/jersey-server-1.8.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/lib/jersey-json-1.8.jar:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/lib/jersey-core-1.8.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop/lib/stax-api-1.0.1.jar:/usr/lib/hadoop/lib/jsp-api-2.1.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop/lib/paranamer-2.3.jar:/usr/lib/hadoop/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/jettison-1.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop/lib/commons-math-2.1.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/lib/commons-net-3.1.jar:/usr/lib/hadoop/.//hadoop-common.jar:/usr/lib/hadoop/.//hadoop-auth.jar:/usr/lib/hadoop/.//hadoop-annotations-2.0.0-cdh4.1.2.jar:/usr/lib/hadoop/.//hadoop-annotations.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.1.2.jar:/usr/lib/hadoop/.//hadoop-auth-2.0.0-cdh4.1.2.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.1.2-tests.jar:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/usr/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/usr/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-hdfs/lib/zookeeper-3.4.3-cdh4.1.2.jar:/usr/lib/hadoop-hdfs/lib/asm-3.2.jar:/usr/lib/hadoop-hdfs/lib/commons-io-2.1.jar:/usr/lib/hadoop-hdfs/lib/commons-lang-2.5.jar:/usr/lib/hadoop-hdfs/lib/commons-cli-1.2.jar:/usr/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar:/usr/lib/hadoop-hdfs/lib/jline-0.9.94.jar:/usr/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/usr/lib/hadoop-hdfs/lib/jersey-server-1.8.jar:/usr/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/usr/lib/hadoop-hdfs/lib/jersey-core-1.8.jar:/usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/usr/lib/hadoop-hdfs/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/usr/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar:/usr
/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.1.2.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.1.2-tests.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs.jar:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-0.20-mapreduce/./:/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20-mapreduce/lib/junit-4.8.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-0.20-mapreduce/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.1.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-server-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.7.1.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.17.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.2-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.2-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples-2.0.0-mr1-cd
h4.1.2.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.2-tools.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools-2.0.0-mr1-cdh4.1.2.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant-2.0.0-mr1-cdh4.1.2.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.2-core.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test-2.0.0-mr1-cdh4.1.2.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core-2.0.0-mr1-cdh4.1.2.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.2-examples.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools.jar
STARTUP_MSG:   build = file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.1.2/src/hadoop-common-project/hadoop-common -r f0b53c81cbf56f5955e403b49fcd27afd5f082de; compiled by 'jenkins' on Thu Nov 1 17:33:23 PDT 2012
************************************************************/
12/11/21 19:50:31 WARN common.Util: Path /data/disk01/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk02/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk03/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk04/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk05/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk06/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk07/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk08/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk01/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk02/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk03/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk04/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk05/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk06/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk07/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
12/11/21 19:50:31 WARN common.Util: Path /data/disk08/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
Formatting using clusterid: CID-0e567455-ca4e-40f0-aac0-c65640212c5e
12/11/21 19:50:31 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
12/11/21 19:50:31 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
12/11/21 19:50:31 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
12/11/21 19:50:31 INFO blockmanagement.BlockManager: defaultReplication         = 3
12/11/21 19:50:31 INFO blockmanagement.BlockManager: maxReplication             = 512
12/11/21 19:50:31 INFO blockmanagement.BlockManager: minReplication             = 1
12/11/21 19:50:31 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
12/11/21 19:50:31 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
12/11/21 19:50:31 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
12/11/21 19:50:31 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
12/11/21 19:50:31 INFO namenode.FSNamesystem: fsOwner             = hdfs (auth:SIMPLE)
12/11/21 19:50:31 INFO namenode.FSNamesystem: supergroup          = supergroup
12/11/21 19:50:31 INFO namenode.FSNamesystem: isPermissionEnabled = true
12/11/21 19:50:31 INFO namenode.FSNamesystem: HA Enabled: false
12/11/21 19:50:32 INFO namenode.FSNamesystem: Append Enabled: true
12/11/21 19:50:32 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/11/21 19:50:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
12/11/21 19:50:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
12/11/21 19:50:32 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk01/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk02/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk03/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk04/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk05/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk06/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk07/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.NNStorage: Storage directory /data/disk08/hdfs/name has been successfully formatted.
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk05/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk01/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk07/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk03/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk06/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk04/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk02/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Saving image file /data/disk08/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.FSImage: Image file of size 119 saved in 0 seconds.
12/11/21 19:50:32 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
12/11/21 19:50:32 INFO util.ExitUtil: Exiting with status 0
12/11/21 19:50:32 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at namenode002/10.35.66.11
************************************************************/
[root@namenode002 ~]# /etc/init.d/hadoop-hdfs-namenode start
Starting Hadoop namenode:
[ OK ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-namenode002.out
[root@namenode002 ~]# jps
2976 Jps
2874 NameNode
[root@namenode002 ~]# /etc/init.d/hadoop-0.20-mapreduce-jobtracker start
Starting Hadoop jobtracker daemon (hadoop-jobtracker): starting jobtracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-namenode002.out
[root@namenode002 ~]# jps
[ OK ]
3097 JobTracker
3135 Jps
2874 NameNode
3. On each datanode, execute:
[root@datanode01 ~]# mkdir /usr/lib/hadoop/logs
[root@datanode01 ~]# chown -R hdfs /usr/lib/hadoop/logs
[root@datanode01 ~]# ln -s /usr/lib/hadoop/libexec/ /usr/lib/hadoop-hdfs/libexec
4. Copy the full configuration to every datanode (then start the datanode services; a sketch follows the copy loop):
for i in `cat /etc/hosts|awk '{print $2}'|grep -v localhost|grep -v namenode001`;do ssh root@$i rm -rf /etc/hadoop/conf;scp -r /etc/hadoop/conf root@$i:/etc/hadoop/; done
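A minimal sketch for bringing up the DataNode and TaskTracker services on the data nodes once the configuration has been distributed; it assumes the services are not started yet and uses the init script names from the installation steps above.
for i in datanode01 datanode02 datanode03
do
ssh root@$i "/etc/init.d/hadoop-hdfs-datanode start"
ssh root@$i "/etc/init.d/hadoop-0.20-mapreduce-tasktracker start"
done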
Checking the Status
1. Host list:
192.168.12.235  namenode01
192.168.12.236  datanode01
192.168.12.237  datanode02
192.168.12.238  datanode03
192.168.12.239  namenode02
192.168.12.240  jobtracker01
2. Command-line checks (an HDFS admin report sketch follows the jps output):
[root@namenode01 ~]# jps
1427 NameNode
1577 Jps
[root@namenode02 ~]# jps
1444 SecondaryNameNode
1586 Jps
[root@jobtracker01 ~]# jps
1494 JobTracker
1596 Jps
[root@datanode01 ~]# jps
1428 DataNode
1547 TaskTracker
1692 Jps
[root@datanode02 ~]# jps
1703 Jps
1557 TaskTracker
1438 DataNode
[root@datanode03 ~]# jps
1542 TaskTracker
1423 DataNode
1688 Jps
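Beyond jps, a cluster-wide view can be pulled from the NameNode; a minimal sketch run as the hdfs user (a suggested check, not part of the original procedure):
su - hdfs -c "hdfs dfsadmin -report"
su - hdfs -c "hadoop fs -ls /"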
3. Web UI checks:
Visit http://192.168.12.235:50070/dfshealth.jsp
NameNode 'namenode01:9000' (active)
Started:       Fri Nov 16 07:48:23 CST 2012
Version:       2.0.0-cdh4.1.2, f0b53c81cbf56f5955e403b49fcd27afd5f082de
Compiled:      Thu Nov 1 17:33:23 PDT 2012 by jenkins from Unknown
Upgrades:      There are no upgrades in progress.
Cluster ID:    CID-1ba166bf-7d58-49d5-a3f7-a91fd5b569b1
Block Pool ID: BP-1977465138-192.168.12.235-1352944279653
Cluster Summary
Security is OFF
9 files and directories, 2 blocks = 11 total.
Heap Memory used 21.89 MB is 49% of Commited Heap Memory 44.62 MB. Max Heap Memory is 888.94 MB.
Non Heap Memory used 33.39 MB is 95% of Commited Non Heap Memory 34.81 MB. Max Non Heap Memory is 130 MB.
Configured Capacity               : 74.42 GB
DFS Used                          : 132 KB
Non DFS Used                      : 7.67 GB
DFS Remaining                     : 66.75 GB
DFS Used%                         : 0 %
DFS Remaining%                    : 89.7 %
Block Pool Used                   : 132 KB
Block Pool Used%                  : 0 %
DataNodes usages (Min/Median/Max/stdev) : 0 % / 0 % / 0 % / 0 %
Live Nodes                        : 3 (Decommissioned: 0)
Dead Nodes                        : 0 (Decommissioned: 0)
Decommissioning Nodes             : 0
Number of Under-Replicated Blocks : 0
NameNode Journal Status:
Current transaction ID: 319
Journal Manager: FileJournalManager(root=/data/hdfs/dfs/name)    State: EditLogFileOutputStream(/data/hdfs/dfs/name/current/edits_inprogress_0000000000000000319)
NameNode Storage:
Storage Directory: /data/hdfs/dfs/name    Type: IMAGE_AND_EDITS    State: Active
Visit http://192.168.12.239:50090/status.jsp
SecondaryNameNode
Version:  2.0.0-cdh4.1.2, f0b53c81cbf56f5955e403b49fcd27afd5f082de
Compiled: Thu Nov 1 17:33:23 PDT 2012 by jenkins from Unknown
SecondaryNameNode Status
Name Node Address    : namenode01/192.168.12.235:9000
Start Time           : Fri Nov 16 07:43:46 CST 2012
Last Checkpoint Time : --
Checkpoint Period    : 3600 seconds
Checkpoint Size      : 39.06 KB (= 40000 bytes)
Checkpoint Dirs      : [file:///tmp/hadoop.tmp.dir/dfs/namesecondary]
Checkpoint Edits Dirs: [file:///tmp/hadoop.tmp.dir/dfs/namesecondary]
Visit http://192.168.12.240:50030/jobtracker.jsp
jobtracker01 Hadoop Map/Reduce Administration
State: RUNNING
Started: Fri Nov 16 07:48:18 CST 2012
Version: 2.0.0-mr1-cdh4.1.2, Unknown
Compiled: Thu Nov 1 18:05:52 PDT 2012 by jenkins from Unknown
Identifier: 201211160748
Cluster Summary (Heap Size is 27.88 MB/888.94 MB)
Running Map Tasks: 0    Running Reduce Tasks: 0    Total Submissions: 0    Nodes: 3
Occupied Map Slots: 0   Occupied Reduce Slots: 0   Reserved Map Slots: 0   Reserved Reduce Slots: 0
Map Task Capacity: 6    Reduce Task Capacity: 6    Avg. Tasks/Node: 4.00   Blacklisted Nodes: 0    Excluded Nodes: 0
Scheduling Information
Queue Name: default    State: running    Scheduling Information: N/A
Running Jobs: none
Retired Jobs: none
Visit http://192.168.12.236:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=namenode01:9000
Contents of directory /
Name    Type  Size  Replication  Block Size  Modification Time  Permission  Owner  Group
mapred  dir
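As a final smoke test (a suggested check, not part of the original procedure), a sample job from the examples jar shipped with the MapReduce package (its path appears in the classpath listing above) can be submitted; the job writes temporary data under the submitting user's HDFS home directory, which may need to exist first.
su - hdfs -c "hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 2 100"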
Directory Layout
Namenode directories:
file:///data/disk01/hdfs/name,file:///data/disk02/hdfs/name,file:///data/disk03/hdfs/name,file:///data/disk04/hdfs/name,file:///data/disk05/hdfs/name,file:///data/disk06/hdfs/name,file:///data/disk07/hdfs/name,file:///data/disk08/hdfs/name
Secondary namenode directories:
file:///data/disk01/hdfs/namesecondary,file:///data/disk02/hdfs/namesecondary,file:///data/disk03/hdfs/namesecondary,file:///data/disk04/hdfs/namesecondary,file:///data/disk05/hdfs/namesecondary,file:///data/disk06/hdfs/namesecondary,file:///data/disk07/hdfs/namesecondary,file:///data/disk08/hdfs/namesecondary
HDFS data directories:
file:///data/disk01/hdfs/data,file:///data/disk02/hdfs/data,file:///data/disk03/hdfs/data,file:///data/disk04/hdfs/data,file:///data/disk05/hdfs/data,file:///data/disk06/hdfs/data,file:///data/disk07/hdfs/data,file:///data/disk08/hdfs/data
MapReduce local directories:
/data/disk01/mapred/local,/data/disk02/mapred/local,/data/disk03/mapred/local,/data/disk04/mapred/local,/data/disk05/mapred/local,/data/disk06/mapred/local,/data/disk07/mapred/local,/data/disk08/mapred/local
Operations Standards
1. Procedure for newly racked servers
a) New servers are racked following the Systems Department's standard racking procedure.
b) The operating system must be installed strictly according to the installation manual above.
c) The Hadoop environment is deployed by the Hadoop operations team; installing it on your own is prohibited.
d) The datanode and tasktracker roles on a server must not be separated.
e) Developers must run jobs through the designated login server; logging in to cluster servers to run jobs directly is prohibited.
2. Permission management
a) Developers who need to use the Hadoop cluster must submit an application form; once it has been approved by the department head and the relevant owners, the Hadoop operations team creates the account.
b) The user is then told how to log in and how to use the cluster.
c) Every account gets its own storage space under /user/<USERID> on HDFS, where data may be stored freely.
d) The Hive warehouse path is /data/hive/warehouse/; when a database is created for a given user, that user is granted read/write permission on it.
e) To have permissions changed, contact the Hadoop operations team. (A provisioning sketch follows.)
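A minimal sketch of the HDFS side of opening an account as described in item 2; <USERID> is the placeholder from item c, the commands are run as the hdfs superuser, and /user must be created first if it does not exist yet.
su - hdfs -c "hadoop fs -mkdir /user/<USERID>"
su - hdfs -c "hadoop fs -chown <USERID> /user/<USERID>"
su - hdfs -c "hadoop fs -chmod 700 /user/<USERID>"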