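Building a three-host Hadoop cluster:
Materials: a laptop, virtual machines, a CentOS image, and the CDH installation packages
After installing Linux, for ease of demonstration we name the three hosts hadoop0, hadoop1, and hadoop2.
Hostname-to-IP mapping: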
Hostname | IP
hadoop0 | 192.168.1.100
hadoop1 | 192.168.1.101
hadoop2 | 192.168.1.102
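Taking hadoop0 as the example, change the hostname:
You have two options:
① Temporary: takes effect immediately but reverts to the old value after a reboot (I don't recommend it):
[root@localhost ~]# hostname
localhost.localdomain
[root@localhost ~]# hostname hadoop0
[root@localhost ~]# hostname
hadoop0
② Edit the configuration file (permanent):
[root@localhost ~]# vi /etc/sysconfig/network
NETWORKING=yes # enable networking
NETWORKING_IPV6=no # disable IPv6
HOSTNAME=hadoop0 # hostname
[root@localhost ~]# setup
[root@hadoop0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0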
DEVICE=eth0
BOOTPROTO=none
HWADDR=00:0c:29:17:e1:ba
IPV6INIT=yes
NM_CONTROLLED=yes
ONBOOT=yes
TYPE=Ethernet
UUID="525eaf08-af4a-411b-986e-990ddff828e4"
IPADDR=192.168.1.100
NETMASK=255.255.255.0
USERCTL=no
Disable the firewall:
① Temporary: takes effect immediately
[root@hadoop0 ~]# service iptables stop
[root@hadoop0 ~]# service iptables status
iptables: Firewall is not running.
② Permanent: takes effect after a reboot:
[root@hadoop0 ~]# chkconfig iptables off
[root@hadoop0 ~]# chkconfig iptables --list
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off
Disable SELinux:
① Temporary:
[root@hadoop0 ~]# setenforce 0
[root@hadoop0 ~]# getenforce
Permissive
② Permanent:
Edit /etc/selinux/config,
change SELINUX=enforcing to SELINUX=disabled,
then reboot the machine.
[root@hadoop0 ~]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
#SELINUX=enforcing
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
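If you would rather script the edit than open vi on every node, here is a minimal sketch. The function name set_selinux_mode is made up for this example, and it assumes the file already contains an uncommented SELINUX= line, as the stock config above does. It rewrites only that line and leaves SELINUXTYPE= and the comments alone.

```shell
# set_selinux_mode FILE MODE   (hypothetical helper, not part of any package)
# Rewrites the uncommented SELINUX= line in FILE to SELINUX=MODE.
# "^SELINUX=" cannot match "SELINUXTYPE=" or the commented-out lines.
set_selinux_mode() {
    local file="$1" mode="$2"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "invalid SELinux mode: $mode" >&2; return 1 ;;
    esac
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$file" > "$file.tmp" \
        && cat "$file.tmp" > "$file" \
        && rm -f "$file.tmp"
}
```

Usage would look like: set_selinux_mode /etc/selinux/config disabled, followed by a reboot as described above. Back up the file first.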
[root@hadoop0 ~]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.100 hadoop0
192.168.1.101 hadoop1
192.168.1.102 hadoop2
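The same three entries have to exist on every host, and a stale or duplicated line is easy to leave behind when editing by hand. The sketch below is one way to make the edit repeatable; ensure_hosts_entry is a hypothetical helper name, and it assumes one hostname-to-IP mapping per cluster node as in the listing above.

```shell
# ensure_hosts_entry FILE IP NAME   (hypothetical helper)
# Removes any non-comment line that already lists NAME as a hostname,
# then appends a fresh "IP NAME" line, so repeated runs leave one entry.
ensure_hosts_entry() {
    local file="$1" ip="$2" name="$3"
    awk -v name="$name" '
        /^[ \t]*#/ { print; next }
        {
            for (i = 2; i <= NF; i++) if ($i == name) next
            print
        }
    ' "$file" > "$file.tmp"
    printf '%s %s\n' "$ip" "$name" >> "$file.tmp"
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}
```

For example, ensure_hosts_entry /etc/hosts 192.168.1.101 hadoop1 can be run on each node for each of the three hosts.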
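The CDH installation includes a step that installs the JDK, so this part can be skipped.
If you would rather install it yourself, read on. First check whether your Linux is 32-bit or 64-bit: installing a 32-bit JDK on 64-bit Linux sometimes requires extra dependency packages, so to spare yourself the pain, Jie's advice is to just download the 64-bit JDK!
[root@hadoop0 ~]# getconf LONG_BIT
64
[root@hadoop0 ~]# ./jdk-6u45-linux-x64.bin
Wait until it prints Done; the installation is then complete.
Edit the configuration file:
Experience says to back up any system file before editing it, in case you mess it up. I won't repeat this rule below!
[root@hadoop0 usr]# vi /etc/profile
Append at the end:
export JAVA_HOME=/usr/jdk1.6.0_45/
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Apply it:
[root@hadoop0 usr]# source /etc/profile
Verify that it took effect:
[root@hadoop0 usr]# java -version
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)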
[root@hadoop0 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
d4:76:ba:61:18:a0:bf:61:78:b3:65:a1:b4:a8:3a:47 root@hadoop0
The key's randomart image is:
+--[ RSA 2048]----+
| . |
| . . . |
| . . + o . |
| = + = o |
| o O S + |
| E. o B . o |
| .. o . |
|... |
|.o |
+-----------------+
[root@hadoop0 ~]#
[root@hadoop0 ~]# cd /root/.ssh/
[root@hadoop0 .ssh]# ll
total 8
-rw------- 1 root root 1675 Oct 27 14:57 id_rsa
-rw-r--r-- 1 root root 394 Oct 27 14:57 id_rsa.pub
Append the public key to authorized_keys:
[root@hadoop0 .ssh]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@hadoop0 .ssh]# scp ~/.ssh/authorized_keys root@hadoop1:~/.ssh/
root@hadoop1's password:
authorized_keys 100% 394 0.4KB/s 00:00
Verify passwordless access:
[root@hadoop0 .ssh]# ssh hadoop2
Last login: Tue Oct 27 14:49:38 2015 from 192.168.1.103
[root@hadoop2 ~]#
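The key has to reach every other node, not just one. As an alternative to copying authorized_keys with scp, the ssh-copy-id tool shipped with the OpenSSH client appends the public key on the remote side. The sketch below only prints the commands so they can be reviewed first; print_key_copy_commands is a hypothetical helper name, and the root user and id_rsa.pub path are taken from the session above.

```shell
# print_key_copy_commands HOST...   (hypothetical helper)
# Prints, without running anything, one ssh-copy-id command per host.
print_key_copy_commands() {
    local host
    for host in "$@"; do
        printf 'ssh-copy-id -i ~/.ssh/id_rsa.pub root@%s\n' "$host"
    done
}
```

Run print_key_copy_commands hadoop1 hadoop2, then execute the printed lines one at a time; each will ask for that host's root password once.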
Mount the DVD image under the html directory:
[root@hadoop0 html]# mount -o loop CentOS-6.6-x86_64-bin-DVD1.iso /var/www/html/centosIso
[root@hadoop0 html]# pwd
/var/www/html
[root@hadoop0 html]# cp -r centosIso/ linuxiso
Edit the configuration file:
[root@hadoop0 yum.repos.d]# pwd
/etc/yum.repos.d
[root@hadoop0 yum.repos.d]# ll
total 24
-rw-r--r--. 1 root root 1991 Oct 23 2014 CentOS-Base.repo
-rw-r--r--. 1 root root 647 Oct 23 2014 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 289 Oct 23 2014 CentOS-fasttrack.repo
-rw-r--r--. 1 root root 630 Oct 23 2014 CentOS-Media.repo
-rw-r--r--. 1 root root 5394 Oct 23 2014 CentOS-Vault.repo
CentOS-Base.repo is the configuration file for the yum network repositories; CentOS-Media.repo is the configuration file for the local yum repository.
Edit CentOS-Media.repo:
[root@hadoop0 yum.repos.d]# vi CentOS-Media.repo
[c6-media]
name=CentOS-$releasever - Media
baseurl=http://192.168.1.100/linuxiso/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
The other hosts use the same configuration file to read the packages served by the HTTP service on hadoop0.
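Since every host needs an identical repo file, generating it is less error-prone than retyping it. This is a minimal sketch that reproduces the c6-media section shown above; write_media_repo is a hypothetical helper name, and the server address is passed in rather than hard-coded.

```shell
# write_media_repo FILE SERVER_IP   (hypothetical helper)
# Writes the c6-media repo definition to FILE, pointing at SERVER_IP.
# \$releasever is escaped so yum, not the shell, expands it.
write_media_repo() {
    local file="$1" server="$2"
    cat > "$file" <<EOF
[c6-media]
name=CentOS-\$releasever - Media
baseurl=http://${server}/linuxiso/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
EOF
}
```

For example: write_media_repo /etc/yum.repos.d/CentOS-Media.repo 192.168.1.100 on each node, after backing up the existing file.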
[root@hadoop0 yum.repos.d]# mv CentOS-Base.repo CentOS-Base.repo.bak
Without this rename, yum install reports network connection errors.
[root@hadoop0 yum.repos.d]# yum clean all
[root@hadoop0 yum.repos.d]# yum makecache
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
c6-media | 4.0 kB 00:00
c6-media/group_gz | 216 kB 00:00
c6-media/filelists_db | 6.0 MB 00:00
c6-media/primary_db | 4.5 MB 00:00
c6-media/other_db | 2.8 MB 00:00
Metadata Cache Created
Verify:
[root@hadoop0 yum.repos.d]# yum list
[root@hadoop0 yum.repos.d]# yum -y install samba
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Install Process
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package samba.x86_64 0:3.6.23-12.el6 will be installed
--> Finished Dependency Resolution
……………… (part of the log omitted here)
………………
Installed:
samba.x86_64 0:3.6.23-12.el6
Complete!
[root@hadoop0 yum.repos.d]#
[root@hadoop0 ~]# chkconfig httpd on
[root@hadoop0 cloudera]# service httpd start
Starting httpd: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.1.100 for ServerName
[ OK ]
[root@hadoop0 cloudera]# service httpd status
httpd (pid 2510) is running...
[root@hadoop0 cloudera]#
Create a symlink to the Cloudera installation files:
[root@hadoop0 html]# ln -s /var/cloudera/ cloudera
First check whether ntp is installed:
[root@hadoop0 ~]# rpm -q ntp
ntp-4.2.6p5-1.el6.centos.x86_64
Enable it at boot:
[root@hadoop0 ~]# chkconfig ntpd on
Start it:
[root@hadoop0 ~]# service ntpd start
[root@hadoop0 ~]# ntpstat
unsynchronised
time server re-starting
polling server every 8 s
Set the time and write it to the BIOS (hardware clock):
[root@hadoop0 ~]# date -s 11/03/2015
Tue Nov 3 00:00:00 PST 2015
[root@hadoop0 ~]# date -s 10:50:00
Tue Nov 3 10:50:00 PST 2015
[root@hadoop0 ~]# clock -w
Edit the server-side configuration:
[root@hadoop0 ~]# vi /etc/ntp.conf
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1
# Hosts on local network are less restricted.
restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
server 127.127.1.0
fudge 127.127.1.0 stratum 1
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
[root@hadoop0 ~]# service ntpd restart
[root@hadoop0 ~]# ntpstat
synchronised to local net at stratum 2
time correct to within 7948 ms
polling server every 64 s
Edit the client-side configuration (this can be skipped if you sync on a schedule with crontab):
[root@hadoop1 ~]# vi /etc/ntp.conf
restrict 127.0.0.1
restrict -6 ::1
server 192.168.1.100
Manual sync:
[root@hadoop1 ~]# ntpdate 192.168.1.100
3 Nov 13:46:52 ntpdate[16993]: step time server 192.168.1.100 offset 57626.250506 sec
[root@hadoop1 ~]# date
Tue Nov 3 13:47:23 PST 2015
On the client, set up a cron job that syncs every 5 minutes:
[root@hadoop1 ~]# crontab -e
*/5 * * * * /usr/sbin/ntpdate 192.168.1.100 >/root/ntpdate.log 2>&1
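crontab -e is interactive, which gets tedious across several clients. One non-interactive approach is to filter the current crontab through a small function that adds the line only when it is missing. ensure_line below is a hypothetical helper written for this example; it assumes the entry contains no backslashes, since awk -v would interpret them.

```shell
# ensure_line LINE   (hypothetical helper)
# Copies stdin to stdout unchanged and appends LINE at the end
# only if no identical line was already present.
ensure_line() {
    awk -v line="$1" '
        { print }
        $0 == line { found = 1 }
        END { if (!found) print line }
    '
}
```

A possible invocation, with the entry stored in a variable named entry: crontab -l 2>/dev/null | ensure_line "$entry" | crontab - . Running it twice leaves a single copy of the job.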
Common errors:
① the NTP socket is in use, exiting
[root@hadoop1 ~]# /usr/sbin/ntpdate 192.168.1.100
3 Nov 15:23:37 ntpdate[20372]: the NTP socket is in use, exiting
[root@hadoop1 ~]# service ntpd stop
Shutting down ntpd: [ OK ]
[root@hadoop1 init.d]# /usr/sbin/ntpdate 192.168.1.100
3 Nov 15:27:50 ntpdate[20474]: adjust time server 192.168.1.100 offset -0.000149 sec
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Install Process
Loading mirror speeds from cached hostfile
cloudera-manager | 951 B 00:00
cloudera-manager/primary | 4.1 kB 00:00
cloudera-manager
………………
Installed:
createrepo.noarch 0:0.9.9-22.el6
Dependency Installed:
deltarpm.x86_64 0:3.5-0.5.20090913git.el6 python-deltarpm.x86_64 0:3.5-0.5.20090913git.el6
Complete!
[root@hadoop0 html]#
[root@hadoop0 html]# yum list | grep cloudera
cloudera-manager-agent.x86_64 5.3.8-1.cm538.p0.271.el6 cloudera-manager
cloudera-manager-daemons.x86_64 5.3.8-1.cm538.p0.271.el6 cloudera-manager
cloudera-manager-server.x86_64 5.3.8-1.cm538.p0.271.el6 cloudera-manager
cloudera-manager-server-db-2.x86_64 5.3.8-1.cm538.p0.271.el6 cloudera-manager
enterprise-debuginfo.x86_64 5.3.8-1.cm538.p0.271.el6 cloudera-manager
jdk.x86_64 2000:1.6.0_31-fcs cloudera-manager
oracle-j2sdk1.7.x86_64 1.7.0+update67-1 cloudera-manager
Configure the yum repository:
Install createrepo:
[root@hadoop0 cm5.3.0]# ll
total 516
-rwxr-xr-x 1 root root 514295 Oct 27 21:16 cloudera-manager-installer.bin
drwxr-xr-x 2 root root 4096 Oct 27 23:06 repodata
-rw-r--r-- 1 root root 1690 Oct 27 21:16 RPM-GPG-KEY-cloudera
drwxr-xr-x 4 root root 4096 Oct 27 21:16 RPMS
[root@hadoop0 cloudera]# yum -y install createrepo
In each installation-file directory (here cm5.3.0), run createrepo . to create the repodata directory:
[root@hadoop0 cm5.3.0]# createrepo .
Spawning worker 0 with 7 pkgs
Workers Finished
Gathering worker results
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete
Put the CM repository file cloudera-manager.repo into /etc/yum.repos.d:
[root@hadoop0 yum.repos.d]# more cloudera-manager.repo
[cloudera-manager]
name = Cloudera Manager, Version 5.3.0
baseurl = http://192.168.1.100/cloudera/cm5.3.0
gpgkey = http://192.168.1.100/cloudera/cm5.3.0/RPM-GPG-KEY-cloudera
gpgcheck = 1
With the default configuration, parcels are downloaded to this local directory:
[root@hadoop0 parcel-repo]# pwd
/opt/cloudera/parcel-repo
[root@hadoop0 parcel-repo]# ll
total 1464928
-rw-r--r-- 1 root root 1500028710 Nov 1 09:05 CDH-5.3.8-1.cdh5.3.8.p0.5-el6.parcel
-rw-r--r-- 1 root root 41 Nov 1 09:05 CDH-5.3.8-1.cdh5.3.8.p0.5-el6.parcel.sha
-rw-r--r-- 1 root root 42655 Nov 1 09:05 manifest.json
drwxr-xr-x 2 root root 4096 Nov 1 19:19 repodata
[root@hadoop0 cloudera]# ll
total 512
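-rw-r--r-- 1 root root 514295 Oct 27 19:37 cloudera-manager-installer.bin
drwxr-xr-x 3 root root 4096 Oct 27 19:37 cm5.3.0
drwxr-xr-x 2 root root 4096 Oct 27 19:37 parcel
[root@hadoop0 cloudera]# chmod 755 cloudera-manager-installer.bin
Run the installer and just keep clicking Next as prompted!
[root@hadoop0 cloudera]# ./cloudera-manager-installer.bin --skip_repo_package=1
At this point, CM is installed.
The rest of the installation happens in the CM web console: follow the prompts to configure the cluster and add the services you need, as shown in the figure below:
If the heartbeat check at the very end of the installation keeps timing out, the cause is almost always one of the following:
The hosts file is misconfigured: check carefully that every hostname in hosts matches its IP. (During my install I overlooked the letter case of one hostname and stared at it for a long time before finding it.)
The firewall or SELinux is still enabled: check that both are off.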
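The letter-case mistake is hard to spot by eye, so a quick scan of the hosts file can help. This sketch assumes, as in this walkthrough, that all cluster hostnames are meant to be lowercase; find_mixed_case_hosts is a hypothetical helper name.

```shell
# find_mixed_case_hosts FILE   (hypothetical helper)
# Prints "IP NAME" for every non-comment entry in FILE whose hostname
# contains an uppercase letter. Prints nothing if all names are lowercase.
find_mixed_case_hosts() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }
        { for (i = 2; i <= NF; i++) if ($i ~ /[A-Z]/) print $1, $i }
    ' "$1"
}
```

Running find_mixed_case_hosts /etc/hosts on each node should produce no output; any line it prints is worth comparing against the output of the hostname command on that machine.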
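If all of the above is in place, your installation should go smoothly. I think two things matter most. First, be careful, really careful: follow the steps and you will rarely hit a big problem. Second, when something does go wrong, read the logs: with the logs, some experience, and a bit of searching, you can almost always resolve it in the end.
Once the cluster is installed, add other services such as HBase, HDFS, Hive, and Impala as needed by clicking Add Service in Cloudera Manager.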
[root@hadoop0 .ssh]# service cloudera-scm-server stop
Stopping cloudera-scm-server: [ OK ]
[root@hadoop0 .ssh]# service cloudera-scm-server-db stop
waiting for server to shut down.... done
server stopped
[root@hadoop0 .ssh]#
[root@hadoop0 .ssh]# yum remove cloudera-manager-server
[root@hadoop0 .ssh]# yum remove cloudera-manager-server-db
[root@localhost cmf]# service cloudera-scm-agent hard_stop_confirmed
Stopping cloudera-scm-agent: [ OK ]
supervisord is already stopped
[root@hadoop0 .ssh]# yum remove 'cloudera-manager-*' hadoop hue-common 'bigtop-*'
[root@hadoop0 .ssh]# rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera*
[root@hadoop0 .ssh]# rm /tmp/.scm_prepare_node.lock
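During installation the following warning appears:
Cloudera recommends setting /proc/sys/vm/swappiness to 0. The current setting is 60. Use the sysctl command to change the setting at runtime and edit /etc/sysctl.conf to keep it after a reboot. You can continue with the installation, but you may run into problems, with Cloudera Manager reporting your hosts as unhealthy because they are swapping. The following hosts are affected: hadoop[0-2]
Fix:
[root@hadoop0 ~]# sysctl -q vm.swappiness
vm.swappiness = 60
As a rough rule of thumb, this means the system starts using the swap partition once about 100-60=40% of memory is in use. Memory is far faster than disk, so swapping increases system I/O and causes large numbers of pages to be swapped in and out, which seriously hurts performance. At the operating-system level we therefore want to stay in physical memory as much as possible and lower this parameter.
[root@hadoop0 etc]# vi /etc/sysctl.conf
# append this line at the end
vm.swappiness = 10
Reload the file with sysctl -p (or reboot) so the new value takes effect, then check it:
[root@hadoop0 etc]# sysctl -q vm.swappiness
vm.swappiness = 10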
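The warning lists all three hosts, so the same edit is needed on each of them. Below is a minimal sketch of a scripted version; set_sysctl_value is a hypothetical helper name. It replaces an existing uncommented setting for the key, or appends one if there is none, so running it twice does not create duplicates.

```shell
# set_sysctl_value FILE KEY VALUE   (hypothetical helper)
# Ensures FILE contains exactly one uncommented "KEY = VALUE" line.
# Commented-out lines are kept as they are.
set_sysctl_value() {
    local file="$1" key="$2" value="$3"
    awk -v key="$key" -v value="$value" '
        BEGIN { FS = "=" }
        {
            k = $1
            gsub(/[ \t]/, "", k)          # strip whitespace around the key
            if (k == key) {
                if (!done) print key " = " value
                done = 1
                next
            }
            print
        }
        END { if (!done) print key " = " value }
    ' "$file" > "$file.tmp"
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}
```

For example: set_sysctl_value /etc/sysctl.conf vm.swappiness 10, followed by sysctl -p to load the file into the running kernel.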
In /etc/profile, add the line ulimit -n 65535 to raise the maximum number of open files to 65535.
Then run source /etc/profile to apply it.
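To confirm the limit actually applies to a new shell, a small check like the following can be run after logging in again. nofile_limit_ok is a hypothetical helper name; it only reads the current soft limit and changes nothing.

```shell
# nofile_limit_ok MIN   (hypothetical helper)
# Succeeds if the current shell's open-file limit is at least MIN
# (or is reported as "unlimited"); fails otherwise.
nofile_limit_ok() {
    local min="$1" cur
    cur=$(ulimit -n)
    [ "$cur" = "unlimited" ] || [ "$cur" -ge "$min" ]
}
```

For example: nofile_limit_ok 65535 && echo "limit ok" || echo "limit too low: $(ulimit -n)".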