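There are four Linux servers: one is the master and the others are slaves. The servers run CentOS 6.5, with JDK 1.6 and Hadoop 1.0.4. Before using this in a real environment, adjust it to match your actual setup.
On a freshly installed system, configure the IP address first. The shell script is as follows: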
#!/bin/bash
read -p "input ip:" ip
echo 'the default hostname is master'
sed -i '$aIPADDR='$ip /etc/sysconfig/network-scripts/ifcfg-eth0
sed -i '/BOOTPROTO/cBOOTPROTO="none"' /etc/sysconfig/network-scripts/ifcfg-eth0
sed -i '/IPV6INIT/cIPV6INIT="no"' /etc/sysconfig/network-scripts/ifcfg-eth0
sed -i '/NM_CONTROLLED/cNM_CONTROLLED="no"' /etc/sysconfig/network-scripts/ifcfg-eth0
service network restart
chkconfig network on
service iptables stop
chkconfig iptables off
setenforce 0
hostname master
sed -i '/HOSTNAME/cHOSTNAME=master' /etc/sysconfig/network
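One thing to watch in the script above is that `sed -i '$aIPADDR=...'` appends a new IPADDR line on every run, so running it twice leaves duplicates. A minimal sketch of a replace-or-append helper follows; the name `set_kv` is hypothetical, and the demo works on a scratch file rather than the real ifcfg-eth0. It assumes keys and values contain no `|` or `&` characters.

```shell
#!/bin/bash
# set_kv FILE KEY VALUE: replace the line starting with KEY=, or append one
# if no such line exists. (Hypothetical helper, not part of the original script.)
set_kv() {
    local file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Demo on a scratch file instead of /etc/sysconfig/network-scripts/ifcfg-eth0.
cfg=$(mktemp)
printf 'DEVICE=eth0\nBOOTPROTO=dhcp\n' > "$cfg"
set_kv "$cfg" BOOTPROTO none
set_kv "$cfg" IPADDR 192.168.2.254
set_kv "$cfg" IPADDR 192.168.2.254   # second run does not add a duplicate
cat "$cfg"
```

With a helper like this the IP script can be re-run safely after a typo in the address.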
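The cluster IP configuration is shown below; adjust it for your own environment. The script first checks connectivity. Installation continues only if every host is reachable; otherwise it stops. When the check passes, the following steps modify /etc/hosts and the sshd configuration and then restart the ssh service. Change the password to suit your environment.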
masterip=192.168.2.254
slave1ip=192.168.2.11
slave2ip=192.168.2.2
slave3ip=192.168.2.3
if [ -e ip.txt ];then
rm -rf ip.txt
fi
touch ip.txt
echo $masterip >>ip.txt
echo $slave1ip >>ip.txt
echo $slave2ip >>ip.txt
echo $slave3ip >>ip.txt
NEWPASS=123456
NETWORK=TRUE
echo "before you install,please make sure network is ok!!!"
echo "now test the network"
for ip in $(cat ip.txt)
do
ping $ip -c 5 &>/dev/null
if [ $? == 0 ] ;then
echo "${ip} is ok"
else
echo "${ip} can't be connected"
NETWORK=FALSE
fi
done
echo $NETWORK
if [ $NETWORK != FALSE ];then
.........
fi
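The connectivity check above can also be written as a function that reports exactly which hosts failed. The sketch below is one possible shape, assuming a hypothetical helper name `unreachable_hosts`; the probe command is passed in as an argument so the logic can be exercised without a network, and it defaults to a single ping with a 2-second timeout.

```shell
#!/bin/bash
# unreachable_hosts FILE [PROBE...]: print every host listed in FILE for
# which the probe command fails. PROBE receives the host as its last argument.
# (Hypothetical helper; the default probe is one ping with a 2 s timeout.)
unreachable_hosts() {
    local file=$1; shift
    local probe=("$@")
    if [ ${#probe[@]} -eq 0 ]; then
        probe=(ping -c 1 -W 2)
    fi
    local host
    while read -r host; do
        # Skip blank lines in the host list.
        if [ -z "$host" ]; then
            continue
        fi
        "${probe[@]}" "$host" </dev/null &>/dev/null || echo "$host"
    done < "$file"
}

# Typical use in the install script (not run here):
#   bad=$(unreachable_hosts ip.txt)
#   if [ -n "$bad" ]; then echo "unreachable: $bad"; exit 1; fi
```

Collecting the failed hosts in one variable replaces the NETWORK=TRUE/FALSE flag and makes the error message more specific.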
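For ease of explanation, the author has split the steps up here.
As root, set up key-based login so that the master's root user can ssh into the other hosts without a password. This lets root manage the other hosts conveniently.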
PASS=123456
yum -y install tcl --nogpgcheck
yum -y install expect --nogpgcheck
expect <<EOF
spawn ssh-keygen
expect "Enter file in which to save the key (/root/.ssh/id_rsa):"
send "\r"
expect "Enter passphrase (empty for no passphrase):"
send "\r"
expect "Enter same passphrase again:"
send "\r"
expect eof
EOF
for ip in $(cat ip.txt)
do
expect <<EOF
spawn ssh-copy-id root@${ip}
expect "(yes/no)?"
send "yes\r"
expect "password:"
send "${PASS}\r"
expect eof
EOF
done
expect <<EOF
spawn ssh-copy-id hadoop@master
expect "(yes/no)?"
send "yes\r"
expect "password:"
send "${PASS}\r"
expect eof
EOF
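The key generation step does not strictly need expect: ssh-keygen accepts the passphrase and output file on the command line through its `-N` and `-f` options. The sketch below assumes a hypothetical helper name `ensure_keypair` and skips generation when a key already exists, so re-running the installer does not overwrite it. ssh-copy-id still prompts for a password, so expect remains useful for that step.

```shell
#!/bin/bash
# ensure_keypair KEYFILE: create an RSA key pair with an empty passphrase
# unless KEYFILE already exists. No prompts are shown, so expect is not needed.
# (Hypothetical helper, not part of the original script.)
ensure_keypair() {
    local keyfile=$1
    if [ -f "$keyfile" ]; then
        echo "key already present: $keyfile"
        return 0
    fi
    mkdir -p "$(dirname "$keyfile")"
    chmod 700 "$(dirname "$keyfile")"
    ssh-keygen -q -t rsa -N '' -f "$keyfile"
}

# Typical call (not run here):
#   ensure_keypair ~/.ssh/id_rsa
```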
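Once root can log in to the other hosts without a password, add the hadoop user on every host and update /etc/hosts. On the master, also modify the sshd configuration file and restart the service.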
for ip in $(cat ip.txt)
do
ssh root@$ip "useradd hadoop"
ssh root@$ip "echo '123456' | passwd --stdin hadoop"
ssh root@$ip "echo $masterip' master' >>/etc/hosts"
ssh root@$ip "echo $slave1ip' slave1' >>/etc/hosts"
ssh root@$ip "echo $slave2ip' slave2' >>/etc/hosts"
ssh root@$ip "echo $slave3ip' slave3' >>/etc/hosts"
done
cp ip.txt /home/hadoop
cp hadoopsshconf.sh /home/hadoop
chown hadoop:hadoop /home/hadoop/ip.txt
chown hadoop:hadoop /home/hadoop/hadoopsshconf.sh
ssh hadoop@localhost "sh hadoopsshconf.sh"
sed -i '/#RSAAuthentication yes/cRSAAuthentication yes' /etc/ssh/sshd_config
sed -i '/#PubkeyAuthentication yes/cPubkeyAuthentication yes' /etc/ssh/sshd_config
sed -i '/#AuthorizedKeysFile/cAuthorizedKeysFile .ssh/authorized_keys' /etc/ssh/sshd_config
service sshd restart
chkconfig sshd on
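The `echo ... >>/etc/hosts` lines above append the same four entries every time the script runs. A minimal sketch of an append-if-absent helper is shown below; the name `add_host_entry` is hypothetical, and the demo writes to a scratch file rather than /etc/hosts.

```shell
#!/bin/bash
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless NAME already
# appears as a whole host name on some line. (Hypothetical helper.)
add_host_entry() {
    local file=$1 ip=$2 name=$3
    if grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0
    fi
    echo "${ip} ${name}" >> "$file"
}

# Demo on a scratch file instead of /etc/hosts.
hosts=$(mktemp)
echo '127.0.0.1 localhost' > "$hosts"
add_host_entry "$hosts" 192.168.2.254 master
add_host_entry "$hosts" 192.168.2.11 slave1
add_host_entry "$hosts" 192.168.2.254 master   # already present, no change
cat "$hosts"
```

To use it on the remote hosts, the function definition would have to be sent along with the ssh command or copied over first.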
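Passwordless login for root alone is not enough: the local hadoop user must also be able to log in to the hadoop user on the other hosts without a password. The script above has already copied hadoopsshconf.sh into the hadoop user's home directory, so it only needs to be run as the hadoop user through a remote ssh command. hadoopsshconf.sh is as follows: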
#!/bin/bash
PASS=123456
expect <<EOF
spawn ssh-keygen
expect "Enter file in which to save the key"
send "\r"
expect "Enter passphrase (empty for no passphrase):"
send "\r"
expect "Enter same passphrase again:"
send "\r"
expect eof
EOF
cat ~/.ssh/id_rsa.pub >>~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
for ip in $(cat ip.txt)
do
expect <<EOF
spawn ssh-copy-id hadoop@${ip}
expect "(yes/no)?"
send "yes\r"
expect "password:"
send "${PASS}\r"
expect eof
EOF
done
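Key-based login can fail silently when the permissions on ~/.ssh are too open, because sshd with StrictModes enabled rejects key files and directories that other users can write to. The sketch below, with the hypothetical helper name `fix_ssh_perms`, sets modes that satisfy that check; the script above only sets the mode of authorized_keys.

```shell
#!/bin/bash
# fix_ssh_perms DIR: make DIR private (700) and its authorized_keys file
# readable and writable by the owner only (600).
# (Hypothetical helper, not part of the original script.)
fix_ssh_perms() {
    local dir=$1
    mkdir -p "$dir"
    touch "$dir/authorized_keys"
    chmod 700 "$dir"
    chmod 600 "$dir/authorized_keys"
}

# Typical call as the hadoop user (not run here):
#   fix_ssh_perms ~/.ssh
```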
The JDK 1.6 installation script. It first removes any JDK already installed on the system to avoid version conflicts.
#!/bin/bash
rm -rf tmp.txt
rpm -qa | grep java > tmp.txt
line=$(cat tmp.txt)
for i in $line
do
rpm -e $i --nodeps
done
mkdir -p /usr/java
cp ./jdk-6u32-linux-x64.bin /usr/java/
cd /usr/java/
./jdk-6u32-linux-x64.bin
rm -rf jdk-6u32-linux-x64.bin
echo 'export JAVA_HOME=/usr/java/jdk1.6.0_32' >>/etc/profile
echo 'export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib' >>/etc/profile
echo 'export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin' >>/etc/profile
source /etc/profile
java -version
cd -
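The script ends with `java -version`, which only prints a version and does not stop the installation if the JDK is missing. A small guard like the one below could be called before the Hadoop step; the function name `java_home_ok` is an assumption made for illustration.

```shell
#!/bin/bash
# java_home_ok DIR: succeed only if DIR is non-empty and contains an
# executable bin/java. (Hypothetical helper.)
java_home_ok() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Typical use after the JDK installer has run (not run here):
#   java_home_ok /usr/java/jdk1.6.0_32 || { echo "JDK not installed" >&2; exit 1; }
```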
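Next, install Hadoop 1.0.4 under /usr (in /usr/hadoop). The script updates /etc/profile, edits the Hadoop configuration files (hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, masters and slaves), and copies the installation to the other machines.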
mkdir -p /usr/hadoop
tar -zxf hadoop-1.0.4.tar.gz -C /usr/hadoop
mv /usr/hadoop/hadoop-1.0.4 /usr/hadoop/hadoop1
chown -R hadoop:hadoop /usr/hadoop
echo 'export HADOOP_HOME=/usr/hadoop/hadoop1' >> /etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/bin' >>/etc/profile
source /etc/profile
echo 'export JAVA_HOME=/usr/java/jdk1.6.0_32' >> /usr/hadoop/hadoop1/conf/hadoop-env.sh
echo 'export HADOOP_PID_DIR=/usr/hadoop/hadoop1/pids' >>/usr/hadoop/hadoop1/conf/hadoop-env.sh
# Each property is inserted after line 6 (<configuration>) in reverse order,
# so the file ends up with <property>, <name>, <value>, </property>.
# core-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/core-site.xml
sed -i '6a\\t<value>hdfs://master:9000</value>' /usr/hadoop/hadoop1/conf/core-site.xml
sed -i '6a\\t<name>fs.default.name</name>' /usr/hadoop/hadoop1/conf/core-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/core-site.xml
# hdfs-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<value>false</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<name>dfs.permissions</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<value>/usr/hadoop/hadoop1/tmp/</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<name>hadoop.tmp.dir</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<value>/usr/hadoop/hadoop1/data/</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<name>dfs.data.dir</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<value>/usr/hadoop/hadoop1/namenode/</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<name>dfs.name.dir</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<value>3</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<name>dfs.replication</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
# mapred-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<value>2</value>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<name>mapred.tasktracker.reduce.tasks.maximum</name>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<value>2</value>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<name>mapred.tasktracker.map.tasks.maximum</name>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<value>master:9001</value>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<name>mapred.job.tracker</name>' /usr/hadoop/hadoop1/conf/mapred-site.xml
sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
# masters file
echo 'master' >> /usr/hadoop/hadoop1/conf/masters
sed -i '1d' /usr/hadoop/hadoop1/conf/slaves
echo 'slave1' >> /usr/hadoop/hadoop1/conf/slaves
echo 'slave2' >> /usr/hadoop/hadoop1/conf/slaves
echo 'slave3' >> /usr/hadoop/hadoop1/conf/slaves
mkdir -p /usr/hadoop/hadoop1/data/
mkdir -p /usr/hadoop/hadoop1/tmp/
chown -R hadoop:hadoop /usr/hadoop/
chmod -R 755 /usr/hadoop/hadoop1/data/
chmod -R 755 /usr/hadoop/hadoop1/tmp/
for i in $(seq 3)
do
ssh slave$i "mkdir -p /usr/hadoop"
done
scp -r /usr/hadoop/hadoop1 root@slave1:/usr/hadoop
scp -r /usr/hadoop/hadoop1 root@slave2:/usr/hadoop
scp -r /usr/hadoop/hadoop1 root@slave3:/usr/hadoop
for i in $(seq 3)
do
ssh slave$i "chown -R hadoop:hadoop /usr/hadoop"
ssh slave$i "chmod -R 755 /usr/hadoop/hadoop1/data/"
ssh slave$i "chmod -R 755 /usr/hadoop/hadoop1/tmp/"
done
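Editing the site files with sed depends on `<configuration>` being on line 6 of each shipped file. An alternative is to write each file in full from a list of name/value pairs. The sketch below assumes a hypothetical helper name `write_site_xml`; it overwrites the target file and does not escape XML special characters, which is adequate for the plain values used in this article.

```shell
#!/bin/bash
# write_site_xml FILE NAME VALUE [NAME VALUE ...]
# Overwrite FILE with a <configuration> element holding one <property>
# per NAME/VALUE pair. (Hypothetical helper; values are not XML-escaped.)
write_site_xml() {
    local file=$1; shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ $# -ge 2 ]; do
            printf '\t<property>\n\t\t<name>%s</name>\n\t\t<value>%s</value>\n\t</property>\n' "$1" "$2"
            shift 2
        done
        echo '</configuration>'
    } > "$file"
}

# Demo in a scratch directory; in the real script the targets would be the
# files under /usr/hadoop/hadoop1/conf.
conf=$(mktemp -d)
write_site_xml "$conf/core-site.xml" fs.default.name hdfs://master:9000
write_site_xml "$conf/mapred-site.xml" \
    mapred.job.tracker master:9001 \
    mapred.tasktracker.map.tasks.maximum 2 \
    mapred.tasktracker.reduce.tasks.maximum 2
cat "$conf/core-site.xml"
```

Because the whole file is regenerated, running the installer twice yields the same configuration instead of duplicated property blocks.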
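This completes the automated installation of the Hadoop cluster. What remains is to format the NameNode, start the cluster, and verify it.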
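For the remaining steps, Hadoop 1.x provides `hadoop namenode -format` and `start-all.sh`, and the JDK's `jps` lists the running Java daemons. The sketch below shows one way to turn the jps listing into a pass/fail check; the helper name `missing_daemons` is hypothetical, and the cluster commands are left as comments because they need a real installation.

```shell
#!/bin/bash
# missing_daemons JPS_OUTPUT NAME...: print each expected daemon name that
# does not appear as a process name (second column) in the given jps output.
# (Hypothetical helper, not part of the original scripts.)
missing_daemons() {
    local jps_out=$1; shift
    local name
    for name in "$@"; do
        echo "$jps_out" | awk '{print $2}' | grep -qx "$name" || echo "$name"
    done
}

# On the master, as the hadoop user (needs a real cluster, so not run here):
#   hadoop namenode -format
#   start-all.sh
#   missing_daemons "$(jps)" NameNode SecondaryNameNode JobTracker
# On each slave:
#   missing_daemons "$(jps)" DataNode TaskTracker

# Demo with canned jps output in which SecondaryNameNode is absent.
sample='2101 NameNode
2250 JobTracker
2399 Jps'
missing_daemons "$sample" NameNode SecondaryNameNode JobTracker
# prints: SecondaryNameNode
```

An empty result means every expected daemon was found.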