This post walks through setting up a fully distributed Hadoop cluster. (It assumes Ubuntu is already installed, the virtual machines can ping each other, and they all have internet access.)
Prerequisites:
jdk-7u51-linux-x64.tar
hadoop-2.5.2.tar.gz
Step 1:
1. Set a password for root: sudo passwd root
2. Add a new user: sudo adduser hadoop
3. Switch to root:
su root
Then:
Give sudoers write permission: chmod u+w /etc/sudoers
Edit the sudoers file: nano /etc/sudoers (vi works too; I was too lazy to install anything extra, so I just used nano)
Below root ALL=(ALL) ALL, add hadoop ALL=(ALL) NOPASSWD:ALL
Remove the write permission again: chmod u-w /etc/sudoers
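After the edit, that part of /etc/sudoers should read:
root    ALL=(ALL) ALL
hadoop  ALL=(ALL) NOPASSWD:ALL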
4. First: sudo nano /etc/hostname and change it to that node's hostname.
sudo nano /etc/hosts, comment out the 127.0.1.1 line,
and add the IP and hostname of every node in the cluster.
My cluster looks like this:
ip hostname
192.168.218.130 master
192.168.218.131 slaver1
192.168.218.132 slaver2
(I recommend configuring an odd number of nodes; it makes setting up ZooKeeper easier later.)
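With those entries, /etc/hosts on each node ends up looking something like this ("ubuntu" below stands in for whatever hostname the installer originally generated):
127.0.0.1       localhost
#127.0.1.1      ubuntu
192.168.218.130 master
192.168.218.131 slaver1
192.168.218.132 slaver2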
Restart networking: sudo /etc/init.d/networking restart
Do all of the above on every node!
Then log out and log back in as hadoop.
Step 2:
Install the JDK and Hadoop.
Installing the JDK:
sudo mkdir -p /usr/lib/jvm
sudo tar -xzvf /home/hadoop/jdk-7u51-linux-x64.tar -C /usr/lib/jvm/
(note the -C; without it tar won't extract into /usr/lib/jvm)
tar -xzvf /home/hadoop/hadoop-2.5.2.tar.gz
Step 3:
Install SSH (the most headache-inducing part):
sudo apt-get install openssh-server
If the install fails:
sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak    # make a backup first
sudo gedit /etc/apt/sources.list
Replace the contents with the following:
deb http://ubuntu.uestc.edu.cn/ubuntu/ precise main restricted universe multiverse
deb http://ubuntu.uestc.edu.cn/ubuntu/ precise-backports main restricted universe multiverse
deb http://ubuntu.uestc.edu.cn/ubuntu/ precise-proposed main restricted universe multiverse
deb http://ubuntu.uestc.edu.cn/ubuntu/ precise-security main restricted universe multiverse
deb http://ubuntu.uestc.edu.cn/ubuntu/ precise-updates main restricted universe multiverse
deb-src http://ubuntu.uestc.edu.cn/ubuntu/ precise main restricted universe multiverse
deb-src http://ubuntu.uestc.edu.cn/ubuntu/ precise-backports main restricted universe multiverse
deb-src http://ubuntu.uestc.edu.cn/ubuntu/ precise-proposed main restricted universe multiverse
deb-src http://ubuntu.uestc.edu.cn/ubuntu/ precise-security main restricted universe multiverse
deb-src http://ubuntu.uestc.edu.cn/ubuntu/ precise-updates main restricted universe multiverse
After replacing it, run sudo apt-get update.
(A problem I hit: my Ubuntu machine produced this error:
Reading package lists... Error!
E: Encountered a section with no Package: header
E: Problem with MergeList /var/lib/apt/lists/ftp.sjtu.edu.cn_ubuntu_dists_precise-security_restricted_binary-i386_Packages
E: The package lists or status file could not be parsed or opened.
I have no idea what caused it, but Google said it can be fixed as follows, so I'm recording it here:
sudo rm /var/lib/apt/lists/* -vf
sudo apt-get update)
Then continue: sudo apt-get install ssh
After that:
ssh-keygen -t rsa
(the .ssh directory is at /home/hadoop/.ssh)
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Two side notes:
1) Fix the permissions on "authorized_keys" (as the hadoop user):
chmod 600 ~/.ssh/authorized_keys
2) Adjust the SSH configuration.
Log in as root and set the following in "/etc/ssh/sshd_config":
RSAAuthentication yes    # enable RSA authentication
PubkeyAuthentication yes    # enable public/private key authentication
AuthorizedKeysFile .ssh/authorized_keys    # path to the public key file (the same file generated above)
Remember to restart the SSH service afterwards, or the settings won't take effect:
sudo service ssh restart    (on Ubuntu the service is named ssh; "service sshd restart" won't be found)
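A quick sanity check at this point: ssh into the local machine; it should log you in without asking for a password.
ssh localhost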
Go into master's .ssh directory (as hadoop@master):
scp authorized_keys hadoop@slaver1:~/.ssh/authorized_keys_from_master
scp authorized_keys hadoop@slaver2:~/.ssh/authorized_keys_from_master
Go into slaver1's .ssh directory (as hadoop@slaver1):
scp authorized_keys hadoop@master:~/.ssh/authorized_keys_from_slaver1
scp authorized_keys hadoop@slaver2:~/.ssh/authorized_keys_from_slaver1
Go into slaver2's .ssh directory (as hadoop@slaver2):
scp authorized_keys hadoop@master:~/.ssh/authorized_keys_from_slaver2
scp authorized_keys hadoop@slaver1:~/.ssh/authorized_keys_from_slaver2
Then back on master,
in /home/hadoop/.ssh:
cat authorized_keys_from_slaver1 >> authorized_keys
cat authorized_keys_from_slaver2 >> authorized_keys
On slaver1,
in /home/hadoop/.ssh:
cat authorized_keys_from_master >> authorized_keys
cat authorized_keys_from_slaver2 >> authorized_keys
On slaver2,
in /home/hadoop/.ssh:
cat authorized_keys_from_master >> authorized_keys
cat authorized_keys_from_slaver1 >> authorized_keys
To put Step 3 plainly, all of this just copies every VM's public key into every other VM's ~/.ssh/authorized_keys.
The SSH service must be running on every node before ssh'ing between the VMs will work.
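As an aside, if your openssh-client includes ssh-copy-id (Ubuntu's does), the whole scp-and-cat exchange above can be replaced by running these three commands on each node; the tool appends your public key to the remote authorized_keys for you:
ssh-copy-id hadoop@master
ssh-copy-id hadoop@slaver1
ssh-copy-id hadoop@slaver2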
Step 4:
Set the Java environment variables (adjust the paths to match your own install directory):
hadoop@master:~$ sudo nano /etc/profile
and add:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_51
export JRE_HOME=/usr/lib/jvm/jdk1.7.0_51/jre
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
hadoop@master:~$ sudo nano /etc/environment
and add:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_51
export JRE_HOME=/usr/lib/jvm/jdk1.7.0_51/jre
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
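/etc/profile is only read at login, so source it (or log out and back in) for the variables to take effect in the current shell:
source /etc/profile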
Copy the JDK to the slaves (note the -r, since it's a directory; /usr/lib/jvm must already exist and be writable on the slaves, otherwise copy into the home directory first and sudo mv it into place):
scp -r /usr/lib/jvm/jdk1.7.0_51 hadoop@slaver1:/usr/lib/jvm/
scp -r /usr/lib/jvm/jdk1.7.0_51 hadoop@slaver2:/usr/lib/jvm/
Then set the same environment variables on each slave and you're done.
Check with java -version or javac.
(sudo ufw disable) turns off the firewall; you'll be using this command a lot once you start the cluster.
Step 5:
Hadoop.
hadoop@master:~/hadoop-2.5.2$ sudo mkdir hdfs
hadoop@master:~/hadoop-2.5.2$ sudo mkdir hdfs/name
hadoop@master:~/hadoop-2.5.2$ sudo mkdir hdfs/data
hadoop@master:~/hadoop-2.5.2$ sudo mkdir tmp
Change the ownership so that everything is handled as the hadoop user:
hadoop@master:~/hadoop-2.5.2$ sudo chown -R hadoop:hadoop hdfs
hadoop@master:~/hadoop-2.5.2$ sudo chown -R hadoop:hadoop tmp
hadoop@master:~/hadoop-2.5.2/etc/hadoop$ nano hadoop-env.sh
Set the JAVA_HOME inside it.
hadoop@master:~/hadoop-2.5.2/etc/hadoop$ nano yarn-env.sh
Set the JAVA_HOME inside it as well.
hadoop@master:~/hadoop-2.5.2/etc/hadoop$ nano slaves
(this file lists all the slave nodes)
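For this cluster it contains just the two slave hostnames, one per line:
slaver1
slaver2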
hadoop@master:~/hadoop-2.5.2$ nano etc/hadoop/core-site.xml
The contents:
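Something along these lines works as a minimal setup (hdfs://master:9000 is the conventional default port choice, an assumption on my part; the tmp directory is the one created above):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hadoop-2.5.2/tmp</value>
  </property>
</configuration>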
Edit mapred-site.xml (you first need to copy mapred-site.xml.template and name the copy mapred-site.xml, e.g. cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml):
hadoop@master:~/hadoop-2.5.2$ nano etc/hadoop/mapred-site.xml
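A minimal sketch; for MapReduce on YARN this single property is the essential part:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>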
hadoop@master:~/hadoop-2.5.2/etc/hadoop$ nano yarn-site.xml
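A minimal sketch of yarn-site.xml, assuming the ResourceManager runs on master (mapreduce_shuffle is the standard aux-service for MapReduce on YARN):
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>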
hadoop@master:~/hadoop-2.5.2/etc/hadoop$ nano hdfs-site.xml
The contents:
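A sketch matching the hdfs/name and hdfs/data directories created above; a replication factor of 2 fits the two datanodes but is my assumption:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/hadoop-2.5.2/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hadoop-2.5.2/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>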
Copy the configured Hadoop directory to the slaves:
scp -r /home/hadoop/hadoop-2.5.2 hadoop@slaver1:/home/hadoop
scp -r /home/hadoop/hadoop-2.5.2 hadoop@slaver2:/home/hadoop
hadoop@master:~/hadoop-2.5.2$ bin/hdfs namenode -format (or bin/hadoop namenode -format)
hadoop@master:~/hadoop-2.5.2$ sbin/start-all.sh
Browsing /home/hadoop/hadoop-2.5.2/bin and /home/hadoop/hadoop-2.5.2/sbin shows all the commands you can run.
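To check that the cluster actually came up, run jps (it ships with the JDK) on each node; with this layout master should show NameNode, SecondaryNameNode and ResourceManager, and each slave should show DataNode and NodeManager:
jps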
PS: I set this up a while ago and suddenly felt like writing it up, so I dashed this post off in about two hours; please point out anything wrong or missing! This is basically the exact configuration I used, and the cluster starts up fine.