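Tags: Cloudera-Manager CDH Hadoop deployment cluster
Abstract: Managing and deploying a Hadoop cluster requires tooling, and Cloudera Manager is one such tool. This article records in detail the steps for deploying a CDH cluster online.
The emergence of big data technologies led by Apache Hadoop has given small and medium-sized companies the tools to store and process big data as well.
Two Hadoop distributions are currently in common use: the Apache release and the Cloudera release.
Apache Hadoop: has many maintainers and a fast release cadence, but is comparatively less stable.
Cloudera Hadoop (CDH): Cloudera's distribution, built on top of Apache Hadoop; it improves component compatibility and interfaces, simplifies installation and configuration, and adds Cloudera-specific compatibility features.
Big data platform CDH cluster: cdh-5.7.0 rpm install, detailed procedure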
Part 1 install cdh server
1.1 Prepare installation resources
CentOS Linux release 7.1.1503 (Core) cm-5.7.0
cloudera-manager-installer.bin
adduser deploy
During the CentOS 7.1 installation, configure the network and set a static IP
vim /etc/sysconfig/network-scripts/ifcfg-eth0
Set BOOTPROTO to static and specify the IP address:
DEVICE="eth0"
BOOTPROTO="static"
IPADDR=192.168.1.110
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
DNS1=8.8.8.8
DNS2=8.8.4.4
GATEWAY=192.168.1.1
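1.2 Network configuration (all nodes)
Change the hostname to cdh-server7
On older RedHat releases, edit /etc/sysconfig/network and change the HOSTNAME line to HOSTNAME=NEWNAME, where NEWNAME is the hostname you want to set. On CentOS 7 the hostname is stored in /etc/hostname and can be set with hostnamectl set-hostname NEWNAME.
On Debian distributions the hostname configuration file is /etc/hostname
Map IP addresses to hostnames
[root@cdh-server7 ~]# vi /etc/hosts # map IP addresses to hostnames: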
192.168.181.190 node190
192.168.181.198 node198
192.168.181.196 node196
Restart the network service for the change to take effect
[root@cdh-server7 ~]# service network restart
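Since the same /etc/hosts entries have to be added on every node, the edit can be scripted so that rerunning it does not create duplicate lines. The following is a minimal sketch, not part of the original procedure; the function name add_host_entry is hypothetical, and the hosts file is passed as a parameter so it can be tried on a scratch copy first.

```shell
#!/usr/bin/env bash
# Append "IP NAME" to a hosts file only if NAME is not already mapped there.
# add_host_entry is a hypothetical helper name, not a system command.
add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    # Look for NAME as a whole whitespace-delimited field on a non-comment line.
    if grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Example (run as root on each node):
#   add_host_entry /etc/hosts 192.168.181.190 node190
#   add_host_entry /etc/hosts 192.168.181.198 node198
#   add_host_entry /etc/hosts 192.168.181.196 node196
```

The hostname is used as a regular expression here, which is good enough for simple names such as node190 but would need escaping for names containing special characters.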
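Disable SELinux
Check the SELinux status
[root@cdh-server7 ~]# getenforce
If SELinux is not disabled, disable it as follows
vi /etc/selinux/config
Set SELINUX=disabled. The change takes effect after a reboot; you can reboot once the remaining settings are done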
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted
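Instead of editing the file by hand on every node, the SELINUX line can be rewritten with sed. This is a minimal sketch assuming GNU sed (as shipped with CentOS); the function name disable_selinux_in is hypothetical, and the config path is a parameter so the change can be rehearsed on a copy.

```shell
#!/usr/bin/env bash
# Set SELINUX=disabled in an SELinux config file.
# disable_selinux_in is a hypothetical helper name.
disable_selinux_in() {
    local cfg="$1"
    # Only the uncommented SELINUX= line is rewritten; comment lines and
    # SELINUXTYPE= do not match this pattern and are left untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}

# Example (run as root):
#   disable_selinux_in /etc/selinux/config
```

As in the manual procedure, the new value only applies after a reboot; running setenforce 0 switches the running system to permissive mode in the meantime.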
[root@cdh-server7 ~]# ping www.baidu.com
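After completing the steps above, reboot the host
reboot
After the reboot, check the points above again to make sure the environment is configured correctly.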
1.3 Uninstall OpenJDK (all nodes)
Note: if OpenJDK is not installed there is nothing to uninstall; a default CentOS 7 install does not include it
[root@cdh-server7 deploy]# rpm -qa | grep java
[root@cdh-server7 deploy]# rpm -qa | grep jdk
# If any java or jdk packages are listed, erase them, for example:
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
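The package names above come from one particular machine, so on another host the list has to be rebuilt from rpm -qa. A small filter can do this. The sketch below is an assumption-laden illustration: list_bundled_java is a hypothetical name, and the pattern only covers packages named like the examples above (java-VERSION-openjdk and java-VERSION-gcj).

```shell
#!/usr/bin/env bash
# Read package names on stdin (as printed by `rpm -qa`) and print those that
# look like bundled Java runtimes: OpenJDK builds and the GCJ java package.
# list_bundled_java is a hypothetical helper name.
list_bundled_java() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep -E '^java-[0-9.]+-(openjdk|gcj)' || true
}

# Example: review the list first, then erase (destructive, run as root):
#   rpm -qa | list_bundled_java
#   rpm -qa | list_bundled_java | xargs -r rpm -e --nodeps
```

The Oracle JDK package installed later (oracle-j2sdk1.7) does not match the pattern, so rerunning the filter after the Cloudera Manager install should not select it.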
1.4 Uninstall the default MySQL (MariaDB) on CentOS 7
[root@cdh-server7 deploy]# rpm -qa | grep mariadb
[root@cdh-server7 deploy]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64
1.5 Install Cloudera Manager
Download the repo file https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo
Copy cloudera-manager.repo into the /etc/yum.repos.d/ directory on all nodes
[root@node196 ]# cd /home/deploy/cdh
[root@node196 cdh]# wget https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/cloudera-manager.repo
[root@cdh-server7 cdh]# mv cloudera-manager.repo /etc/yum.repos.d/
Verify that the repo file has taken effect
[root@cdh-server7 cdh]# yum list | grep cloudera
cloudera-manager-agent.x86_64 5.7.0-1.cm560.p0.54.el7 cloudera-manager
cloudera-manager-daemons.x86_64 5.7.0-1.cm560.p0.54.el7 cloudera-manager
cloudera-manager-server.x86_64 5.7.0-1.cm560.p0.54.el7 cloudera-manager
cloudera-manager-server-db-2.x86_64 5.7.0-1.cm560.p0.54.el7 cloudera-manager
enterprise-debuginfo.x86_64 5.7.0-1.cm560.p0.54.el7 cloudera-manager
oracle-j2sdk1.7.x86_64 1.7.0+update67-1 cloudera-manager
If the version listed is not the one you intend to install, run the commands below and try again
yum clean all
yum list | grep cloudera
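Checking the version by eye works for one node; across several nodes it can help to reduce the yum listing to the distinct Cloudera Manager versions it offers. The following sketch assumes the three-column layout shown above (name, version, repo) on a single line per package; cm_versions is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `yum list` output on stdin and print the distinct versions offered
# for packages whose name starts with cloudera-manager-.
# cm_versions is a hypothetical helper name.
cm_versions() {
    awk '$1 ~ /^cloudera-manager-/ { print $2 }' | sort -u
}

# Example:
#   yum list | grep cloudera | cm_versions
```

If more than one line is printed, the repo mixes versions and yum clean all followed by another listing is worth trying before installing.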
Upload the following rpm packages to /home/deploy/cdh/cloudera-rpms on cdh-server7 (any directory will do)
cd /home/deploy/cdh/cloudera-rpms
cloudera-manager-agent-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-daemons-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-server-5.7.0-1.cm560.p0.54.el7.x86_64.rpm ## not needed on agents
cloudera-manager-server-db-2-5.7.0-1.cm560.p0.54.el7.x86_64.rpm ## not needed on agents
enterprise-debuginfo-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
Note: the rpm packages can be downloaded from https://archive.cloudera.com/cm5/redhat/7/x86_64/cm/5/RPMS/x86_64/
Change to the rpms directory and run
[root@cdh-server7 cdh]# cd /home/deploy/cdh/cloudera-rpms/
[root@cdh-server7 cloudera-rpms]# yum -y install *.rpm
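1.6 Copy the parcel files to the target directory
Download the parcel files from http://archive.cloudera.com/cdh5/parcels/5.7.0/
Copy the 3 downloaded parcel files into /opt/cloudera/parcel-repo (create the directory if it does not exist). Note that the .sha1 file is saved under the name .sha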
[root@cdh-server7 cdh]# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel
[root@cdh-server7 cdh]# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel.sha
[root@cdh-server7 cdh]# cp manifest.json /opt/cloudera/parcel-repo/manifest.json
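A parcel is a large download, so it is worth confirming that the copied file matches its checksum before Cloudera Manager tries to distribute it. The sketch below assumes the .sha file holds the SHA-1 hex digest as its first field and that sha1sum (coreutils) is available; verify_parcel is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Compare a parcel's SHA-1 digest with the digest stored next to it in
# "<parcel>.sha". Returns 0 on match, 1 on mismatch, 2 if a file is unreadable.
# verify_parcel is a hypothetical helper name.
verify_parcel() {
    local parcel="$1" expected actual
    [ -r "$parcel" ] && [ -r "${parcel}.sha" ] || return 2
    # Assumption: the digest is the first whitespace-delimited field of line 1.
    expected=$(awk 'NR==1 { print $1 }' "${parcel}.sha")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}

# Example:
#   verify_parcel /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el7.parcel \
#       && echo "parcel checksum OK"
```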
1.7 Configure the Java environment variables
Set JAVA_HOME
[root@cdh-server7 cdh]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
[root@cdh-server7 cdh]# source /etc/profile
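A mistyped JAVA_HOME only shows up later, when a service fails to start, so a quick check right after sourcing the profile can save time. This is a minimal sketch; check_java_home is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed only if the given directory contains an executable bin/java.
# check_java_home is a hypothetical helper name.
check_java_home() {
    local home="${1%/}"   # tolerate a trailing slash, as in the export above
    [ -n "$home" ] && [ -x "${home}/bin/java" ]
}

# Example:
#   check_java_home "$JAVA_HOME" || echo "JAVA_HOME does not point at a JDK" >&2
```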
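Disable the firewall
[root@cdh-server7 deploy]# systemctl stop firewalld.service # CentOS 7: stop the firewall
[root@cdh-server7 deploy]# systemctl disable firewalld.service # keep it from starting again at boot
After completing the steps above, reboot the host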
reboot
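1.8 Install CM (master node only)
Run the following two steps on the master node only:
- Go to the directory and make the .bin file executable
[root@cdh-server7 cdh]# chmod a+x ./cloudera-manager-installer.bin
- Install CM (this step may not be necessary)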
[root@cdh-server7 cdh]# ./cloudera-manager-installer.bin
Start the server side
[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-server-db start
[root@cdh-server7 init.d]# ./cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
[root@cdh-server7 init.d]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
Note:
After a machine reboot, the default startup order causes errors
Start cloudera-scm-server-db first and then cloudera-scm-server, in that order
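The ordering requirement can be captured in a small wrapper that runs the start commands one after another and stops at the first failure, so the server is never started when its database did not come up. This is a sketch only; start_in_order is a hypothetical helper name, and each argument is assumed to be a simple command without quoting needs.

```shell
#!/usr/bin/env bash
# Run each command string in turn and stop at the first failure, so a later
# service is never started when an earlier one did not come up.
# start_in_order is a hypothetical helper name.
start_in_order() {
    local cmd
    for cmd in "$@"; do
        echo "starting: $cmd"
        # Intentionally unquoted so the string splits into command and arguments.
        $cmd || { echo "failed: $cmd" >&2; return 1; }
    done
}

# Example for the CM server node (run as root):
#   start_in_order "/etc/init.d/cloudera-scm-server-db start" \
#                  "/etc/init.d/cloudera-scm-server start"
```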
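1.9 Verify in a browser (master node)
After CM is installed successfully, open http://ip:7180 in a browser and log in with admin as both the username and the password to reach the web management UI.
Verify by opening it in a browser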
http://192.168.181.190:7180/
If the page does not open, wait about 2 minutes; the service needs some time to start.
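Rather than waiting a fixed two minutes, a script can poll the port until it accepts connections. The sketch below relies on bash's /dev/tcp pseudo-device, so it needs bash rather than a plain POSIX sh; wait_for_port is a hypothetical helper name and the 120-second default is an assumption based on the two-minute wait mentioned above.

```shell
#!/usr/bin/env bash
# Poll HOST:PORT once per second until a TCP connection succeeds or
# TIMEOUT seconds have passed. Returns 0 when the port is open, 1 on timeout.
# wait_for_port is a hypothetical helper name.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-120}" waited=0
    # The connection is opened in a subshell so the descriptor closes right away.
    until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Example:
#   wait_for_port 192.168.181.190 7180 180 && echo "CM web UI port is open"
```

An open port only shows that the server is listening; the login page may still need a few more seconds before it responds.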
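Choose the edition to deploy; the free edition is sufficient here.
If you are unsure how to configure it, refer to "The most reliable installation guide": http://www.jianshu.com/p/57179e03795f
When installing the services, choose the default embedded database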
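Part 2 Install agent
These steps are similar to the server's, but I cannot be sure they are exactly right.
Agents can run on separate machines, and the master node may act as master only; that is up to you
Configure and start the agent (all nodes)
Keep most of the agent installation steps the same as the server's to avoid problems after startup
2.1 Network configuration
Map IP addresses to hostnames (the names in parentheses below are the actual hostnames of this cluster; /etc/hosts itself takes whitespace-separated names, not parentheses)
[root@cdh-agent1 ~]# vi /etc/hosts # map IP addresses to hostnames: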
192.168.181.190 cdh-server7(node190)
192.168.181.198 cdh-agent1(node198)
192.168.181.196 cdh-agent2(node196)
Restart the network service for the change to take effect
[root@cdh-server7 ~]# service network restart
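Disable SELinux
Check the SELinux status
[root@cdh-server7 ~]# getenforce
If SELinux is not disabled, disable it as follows
vi /etc/selinux/config
Set SELINUX=disabled. The change takes effect after a reboot; you can reboot once the remaining settings are done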
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted
[root@cdh-server7 ~]# ping www.baidu.com
2.2 Uninstall OpenJDK (all nodes)
Note: if OpenJDK is not installed there is nothing to uninstall; a default CentOS 7 install does not include it
[root@cdh-server7 deploy]# rpm -qa | grep java
[root@cdh-server7 deploy]# rpm -qa | grep jdk
# If any java or jdk packages are listed, erase them, for example:
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
[root@cdh-server7 deploy]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
2.3 Uninstall the default MySQL (MariaDB) on CentOS 7
[root@cdh-server7 deploy]# rpm -qa | grep mariadb
[root@cdh-server7 deploy]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64
2.4 cloudera-manager.repo
Upload cloudera-manager.repo to cdh-agent1
[root@cdh-agent1 cdh]# cp cloudera-manager.repo /etc/yum.repos.d/
Disable transparent_hugepage
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
vi /etc/rc.local and append the following two statements to the end of the file so the setting survives a reboot
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
chmod +x /etc/rc.local
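To confirm that the setting actually took effect (for example after a reboot), the active mode can be read back. The kernel marks it with square brackets, as in "always madvise [never]". The following sketch extracts that value; thp_active is a hypothetical helper name, and the file path is a parameter that defaults to the enabled file used above.

```shell
#!/usr/bin/env bash
# Print the active transparent hugepage mode, i.e. the word the kernel
# shows in square brackets, e.g. "always madvise [never]" -> "never".
# thp_active is a hypothetical helper name.
thp_active() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Example:
#   [ "$(thp_active)" = "never" ] || echo "transparent hugepages still enabled" >&2
#   thp_active /sys/kernel/mm/transparent_hugepage/defrag
```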
Adjust swappiness
echo 10 > /proc/sys/vm/swappiness
# vi /etc/sysctl.conf
vm.swappiness = 10
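When this is applied to many agents, editing /etc/sysctl.conf by hand risks leaving two conflicting vm.swappiness lines. A small helper can replace an existing line or append a new one. This is a sketch assuming GNU sed; set_sysctl_kv is a hypothetical helper name, and the key is treated as a regular expression, which is fine for names such as vm.swappiness.

```shell
#!/usr/bin/env bash
# Set "KEY = VALUE" in a sysctl-style file: rewrite the line if KEY is
# already present, otherwise append it.
# set_sysctl_kv is a hypothetical helper name.
set_sysctl_kv() {
    local file="$1" key="$2" value="$3"
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (run as root), followed by a reload of the file:
#   set_sysctl_kv /etc/sysctl.conf vm.swappiness 10
#   sysctl -p
```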
2.5 ~/cdh/cloudera-rpms
Upload the following rpm packages to /home/deploy/cdh/cloudera-rpms on cdh-agent1
cloudera-manager-agent-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
cloudera-manager-daemons-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
enterprise-debuginfo-5.7.0-1.cm560.p0.54.el7.x86_64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[root@cdh-agent1 init.d]# cd /home/deploy/cdh/cloudera-rpms/
[root@cdh-agent1 cloudera-rpms]# yum -y install *.rpm
Set JAVA_HOME
[root@cdh-server7 cdh]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
[root@cdh-server7 cdh]# source /etc/profile
Disable the firewall
[root@cdh-server7 deploy]# systemctl stop firewalld.service # CentOS 7: stop the firewall
[root@cdh-server7 deploy]# systemctl disable firewalld.service # keep it from starting again at boot
After completing the steps above, reboot the host
reboot
[root@cdh-agent1 init.d]# vi /etc/cloudera-scm-agent/config.ini
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# Hostname of the CM server.
#server_host=localhost
server_host=cdh-server7(node190)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
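An agent that cannot resolve the name in server_host will never register with the server, so it is worth reading the value back and checking that it resolves. The sketch below is illustrative only; cm_server_host is a hypothetical helper name, and it assumes the key is written as server_host=VALUE on an uncommented line, as in the excerpt above.

```shell
#!/usr/bin/env bash
# Print the value of the first uncommented server_host= line in the agent's
# config.ini (the path is a parameter with the default used above).
# cm_server_host is a hypothetical helper name.
cm_server_host() {
    sed -n 's/^server_host=[[:space:]]*//p' "${1:-/etc/cloudera-scm-agent/config.ini}" | head -n 1
}

# Example: check that the configured name resolves via /etc/hosts or DNS.
#   getent hosts "$(cm_server_host)" || echo "server_host does not resolve" >&2
```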
[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-agent start
Starting cloudera-scm-agent: [ OK ]
[root@cdh-server deploy]# tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log
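Note:
If installing the YARN NodeManager fails, delete the /yarn and /var/lib/hadoop-yarn directories and then add the role again
The most reliable CDH installation guide: http://www.jianshu.com/p/57179e03795f
Part 3 Restarting and recovering our cluster
3.1 Make sure firewalld is stopped
systemctl start firewalld.service # start firewalld
systemctl stop firewalld.service # stop firewalld
systemctl disable firewalld.service # do not start firewalld at boot
Note: make sure firewalld is stopped before proceeding
[root@node19x flag]$ vim /etc/rc.local (on CentOS 7, /etc/rc.local is a symlink to /etc/rc.d/rc.local)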
#!/bin/bash
# THIS FILE IS ADDED FOR COMPATIBILITY PURPOSES
#
# It is highly advisable to create own systemd services or udev rules
# to run scripts during boot instead of using this file.
#
# In contrast to previous versions due to parallel execution during boot
# this script will NOT be run after all other services.
#
# Please note that you must run 'chmod +x /etc/rc.d/rc.local' to ensure
# that this script will be executed during boot.

touch /var/lock/subsys/local
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
service ntpd start
service elasticsearch start
3.2 Start the server side and CM
Only on the server node
[root@cdh-server7 cdh]# cd /etc/init.d/
[root@cdh-server7 init.d]# ./cloudera-scm-server-db start
[root@cdh-server7 init.d]# ./cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
[root@cdh-server7 init.d]# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
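// Wait until the log shows that port 7180 has started, then visit: http://node190:7180/cmf/home
Note:
After a machine reboot, the default startup order causes errors
Start cloudera-scm-server-db first and then cloudera-scm-server, in that order
The agents below normally start automatically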
[root@node190 init.d]# ./cloudera-scm-agent start
cloudera-scm-agent is already running
node190:./cloudera-scm-agent start
node19x:./cloudera-scm-agent start
node19x:./cloudera-scm-agent start
...
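3.3 Start the services from the CM page
Restart the service monitor from the CM page
Restart the host monitor from the CM page
Start each service from the CM page (e.g. ZK, Flume, YARN, HDFS, Hive, Sqoop, Spark, etc.)
3.4 Start ES on every node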
[deploy@node190 init.d]# ll
total 44
-rwxr-xr-x 1 root root 8671 Apr 2 04:52 cloudera-scm-agent
lrwxrwxrwx. 1 root root 58 Apr 18 16:55 elasticsearch -> /home/deploy/elasticsearch-1.7.1/bin/service/elasticsearch
-rw-r--r--. 1 root root 13948 Sep 16 2015 functions
-rwxr-xr-x. 1 root root 2989 Sep 16 2015 netconsole
-rwxr-xr-x. 1 root root 6630 Sep 16 2015 network
-rw-r--r--. 1 root root 1160 Apr 1 00:45 README
deploy
[deploy@node190 init.d]# ./elasticsearch start
[deploy@node19x init.d]# ./elasticsearch start
[deploy@node19x init.d]# ./elasticsearch start
...
http://node190:9200/_plugin/bigdesk/#cluster
Wait for data synchronization to finish, which is usually quick; the Status changes from RED to green
http://node190:9200/_plugin/head/
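The same status that the plugins display can be read from Elasticsearch's _cluster/health endpoint, which is handy when no browser is at hand. The sketch below pulls the status field out of the JSON response with sed rather than a JSON parser, which is an assumption that holds for the flat response this endpoint returns; es_status is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read a _cluster/health JSON document on stdin and print the value of its
# "status" field (red, yellow or green).
# es_status is a hypothetical helper name.
es_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example: poll until the cluster reports green.
#   until [ "$(curl -s http://node190:9200/_cluster/health | es_status)" = "green" ]; do
#       sleep 5
#   done
```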
3.5 Start Kibana
[deploy@node196 ~]#
cd /home/deploy/kibana-4.1.1-linux-x64
./bin/kibana > kibana.log 2>&1 & # run as the deploy user