CDH Simplified Offline Installation
I. Setting up the virtual machines
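Prepare a computer with 32 GB of memory and install VMware Workstation. Download address: http://download3.vmware.com/software/wkst/file/VMware-Workstation-Full-14.1.2-8497320.x86_64.bundle. Pick the build that matches your host operating system; I used VMware-Workstation-Full-14.1.2-8497320.x86_64.bundle. After installing VMware, download the operating system image CentOS-7-x86_64-DVD-1804.iso (this is the version I chose; you can choose a different one) and create a new virtual machine. Creating the virtual machine itself is not covered here.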
After the steps above, there are three virtual machines:
master: 16 GB memory, 150 GB disk
slave1: 6 GB memory, 150 GB disk
slave2: 6 GB memory, 150 GB disk
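II. Configuring the virtual machines
1. Change the hostname on every machine to make management easier (run the matching command on each machine).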
hostnamectl set-hostname master
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
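2. Configure a static IP
First, select the NAT network connection mode.
Then click Edit to open the VMware Virtual Network Editor, select VMnet8, and uncheck "Use local DHCP service to distribute IP addresses to VMs".
Next, go to /etc/sysconfig/network-scripts, look at the existing configuration files, and edit the one for your network interface. Its name looks like ifcfg-eth0 (on CentOS 7 under VMware it is often ifcfg-ens33).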
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=$'\751\605\615\747\675\656 1'
UUID=2bfdf6df-9fd6-44e3-ade7-5a397cf8d2e4
ONBOOT=yes
IPADDR=172.16.247.135
GATEWAY=172.16.247.2
NETMASK=255.255.255.0
PREFIX=24
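The key settings above are BOOTPROTO=static, which selects static addressing, and IPADDR=172.16.247.135, which is the static IP address (the original post highlighted the edited lines in red).
Finally, save and exit, then run the following.
Restart the network: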
service network restart
Check the IP address:
ifconfig
Test connectivity:
ping www.baidu.com
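Since each of the three machines needs the same static settings with a different address, the changed lines can be generated rather than typed by hand. The following is only a sketch: the function name render_static_ifcfg is hypothetical, and it prints just the static-address lines (it does not write the ifcfg file or include the other keys shown above).

```shell
# Print the static-address lines of an ifcfg file.
# Arguments: IP address, gateway, and optionally the prefix length (default 24).
# render_static_ifcfg is a hypothetical helper name, not part of CentOS.
render_static_ifcfg() {
  local ip="$1"
  local gateway="$2"
  local prefix="${3:-24}"
  cat <<EOF
BOOTPROTO=static
ONBOOT=yes
IPADDR=${ip}
GATEWAY=${gateway}
PREFIX=${prefix}
EOF
}

# Example: preview the lines for slave1, using the addresses from this guide.
# render_static_ifcfg 172.16.247.132 172.16.247.2
```

The output can be reviewed first and then merged into the interface file by hand before restarting the network.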
3. Edit the hosts file and add the IP addresses
vi /etc/hosts
Add the following entries, using the IP addresses and hostnames of your own three machines:
172.16.247.135 master
172.16.247.132 slave1
172.16.247.136 slave2
Then copy this file to each of the other nodes:
scp /etc/hosts root@slave1:/etc/hosts
scp /etc/hosts root@slave2:/etc/hosts
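If the hosts file is edited more than once, duplicate entries are easy to introduce. A possible way to keep the edit idempotent is sketched below; add_host_entry is a hypothetical helper, and the hostname is used as a regular expression, so it assumes simple names such as master or slave1.

```shell
# Append "IP hostname" to a hosts-style file only if the hostname is not listed yet.
# add_host_entry is a hypothetical helper name.
add_host_entry() {
  local file="$1"
  local ip="$2"
  local name="$3"
  # Look for the hostname as a whole whitespace-delimited field on any line.
  if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# Usage (as root), with the addresses from this guide:
# add_host_entry /etc/hosts 172.16.247.135 master
# add_host_entry /etc/hosts 172.16.247.132 slave1
# add_host_entry /etc/hosts 172.16.247.136 slave2
```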
4. Configure passwordless SSH login
There are two main steps. First, generate a key pair on every node:
ssh-keygen -t rsa
Then, on every node, copy the public key to all nodes:
ssh-copy-id root@master
ssh-copy-id root@slave1
ssh-copy-id root@slave2
Alternatively, append the public key to the authorized_keys file by hand and set the access permissions on authorized_keys: https://blog.csdn.net/johnzhc/article/details/81119030
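The three ssh-copy-id calls can also be written as a loop over the node names. This is a sketch only: run and copy_key_to_nodes are hypothetical helpers, and by default the commands are printed rather than executed (set DRY_RUN=0 to execute them).

```shell
# Print a command (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
# run and copy_key_to_nodes are hypothetical helper names.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Copy the local public key to root on every node given as an argument.
copy_key_to_nodes() {
  local node
  for node in "$@"; do
    run ssh-copy-id "root@${node}"
  done
}

# Preview:  copy_key_to_nodes master slave1 slave2
# Execute:  DRY_RUN=0 copy_key_to_nodes master slave1 slave2
```

Printing first makes it easy to check the target list before any password prompts appear.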
5. Disable SELinux and the firewall
vi /etc/selinux/config
SELINUX=disabled
[hadoop@master network-scripts]$ cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Stop the firewall, disable it at boot, and check its status:
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld
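The SELinux edit above can be scripted with sed instead of vi. Note that a change to /etc/selinux/config only takes effect after a reboot; setenforce 0 switches the running system to permissive mode in the meantime. The helper name disable_selinux_in is hypothetical, and the sketch assumes GNU sed (as shipped with CentOS).

```shell
# Set SELINUX=disabled in the given file.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is left alone.
# disable_selinux_in is a hypothetical helper name.
disable_selinux_in() {
  local file="$1"
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$file"
}

# On the real system (as root) one might run:
#   disable_selinux_in /etc/selinux/config
#   setenforce 0    # permissive mode for the running system, until reboot
#   getenforce      # show the current mode
```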
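6. Install NTP time synchronization
yum install -y ntp    # install the ntp service (all nodes)
vi /etc/ntp.conf    # edit the ntp configuration file (all nodes)
On the master node, edit ntp.conf as shown below (in the original post, lines to change were marked in red and lines to comment out in blue):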
# Note: Monitoring will not be disabled with the limited restriction flag.
#disable monitor
restrict default nomodify
restrict default nomodify notrap
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
# Most active time servers for China: http://www.pool.ntp.org/zone/cn
server 0.cn.pool.ntp.org
server 0.asia.pool.ntp.org
server 3.asia.pool.ntp.org
# allow update time by the upper server
# Allow the upstream time servers to adjust the local clock
restrict 0.cn.pool.ntp.org nomodify notrap noquery
restrict 0.asia.pool.ntp.org nomodify notrap noquery
restrict 3.asia.pool.ntp.org nomodify notrap noquery
# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
# When the external time servers are unavailable, serve time from the local clock
server 127.127.1.0
fudge 127.127.1.0 stratum 10
On the slave nodes, edit ntp.conf as shown below (in the original post, lines to change were marked in red and lines to comment out in purple):
# with symmetric key cryptography.
keys /etc/ntp/keys
# Specify the key identifiers which are trusted.
#trustedkey 4 8 42
# Specify the key identifier to use with the ntpdc utility.
#requestkey 8
# Specify the key identifier to use with the ntpq utility.
#controlkey 8
# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
# Disable the monitoring facility to prevent amplification attacks using ntpdc
# monlist command when default restrict does not include the noquery flag. See
# CVE-2013-5211 for more details.
# Note: Monitoring will not be disabled with the limited restriction flag.
#disable monitor
# "master" is the hostname of your master node
server master prefer
restrict default nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
Enable the ntp service at boot
# Disable the chronyd service
systemctl disable chronyd.service
# Start ntpd automatically at boot
systemctl enable ntpd.service
Once everything above is configured, run:
systemctl start ntpd    # start the ntp service
ntpstat    # check the ntp status
synchronised to NTP server (172.16.247.135) at stratum 3
time correct to within 350 ms
polling server every 1024 s    # output like this means the clock is synchronised
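Synchronisation usually takes a few minutes after ntpd starts, so the first ntpstat call may still report an unsynchronised clock. A small retry loop, sketched below, can wait for it. wait_until is a hypothetical helper; the usage line relies on ntpstat returning exit status 0 only when the clock is synchronised, which is my understanding of its documented behaviour.

```shell
# Retry a command until it succeeds or the attempts run out.
# Arguments: number of attempts, delay in seconds, then the command to run.
# wait_until is a hypothetical helper name.
wait_until() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: check every 10 seconds, up to 30 times.
# wait_until 30 10 ntpstat && echo "clock synchronised"
```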
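7. Remove the JDK bundled with CentOS
rpm -qa | grep jdk    # list the JDK packages that ship with the system
yum -y remove xxjdk    # remove every JDK package found (replace xxjdk with the names listed above)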
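The list-then-remove pair can be combined into one pipeline so that no package name has to be copied by hand. This is a sketch: filter_jdk_packages is a hypothetical helper, and xargs -r (skip the command when the input is empty) is a GNU extension available on CentOS.

```shell
# Read package names on stdin and keep only those containing "jdk"
# (case-insensitive). Succeeds even when nothing matches.
# filter_jdk_packages is a hypothetical helper name.
filter_jdk_packages() {
  grep -i 'jdk' || true
}

# Review the list first:
#   rpm -qa | filter_jdk_packages
# Then remove the packages (as root):
#   rpm -qa | filter_jdk_packages | xargs -r yum -y remove
```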
8. Download CM, CDH, the JDK, and the MySQL Java driver
Cloudera Manager download address: http://archive.cloudera.com/cm5/cm/5/cloudera-manager-centos7-cm5.15.0_x86_64.tar.gz
CDH parcel address: http://archive.cloudera.com/cdh5/parcels/latest/. Because our operating system is CentOS 7, download the following el7 files:
CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel
CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel.sha1
manifest.json
The JDK can be downloaded from the official site: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Version downloaded: jdk-8u172-linux-x64.rpm
MySQL Java driver download address: https://dev.mysql.com/downloads/connector/j/5.1.html
Version downloaded: mysql-connector-java-5.1.46.tar.gz
9. Install CM, the JDK, and the MySQL driver
Distribute the downloaded packages to each node and install them.
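1. Copy three packages to the /tmp directory on the management node (master): cloudera-manager-agent-5.15.0-1.cm5150.p0.62.el7.x86_64.rpm (master also runs an agent), cloudera-manager-daemons-5.15.0-1.cm5150.p0.62.el7.x86_64.rpm, and cloudera-manager-server-5.15.0-1.cm5150.p0.62.el7.x86_64.rpm. Any other directory works too, but files left in /tmp are cleaned up automatically, so the packages do not keep taking up disk space.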
2. Copy cloudera-manager-agent-5.15.0-1.cm5150.p0.62.el7.x86_64.rpm and cloudera-manager-daemons-5.15.0-1.cm5150.p0.62.el7.x86_64.rpm to the /tmp directory on every slave node (slave1 and slave2).
3. Copy jdk-8u172-linux-x64.rpm to the /tmp directory on all nodes. The copy command looks like this:
scp cloudera-manager-agent-5.15.0-1.cm5150.p0.62.el7.x86_64.rpm root@master:/tmp
4. Then install all of the packages from the directory that holds them (all nodes)
yum localinstall *.rpm
5. Set the JAVA_HOME variable (all nodes)
echo "JAVA_HOME=/usr/java/latest/" >> /etc/environment
6. Install the MySQL driver
Extract mysql-connector-java-5.1.46.tar.gz into mysql-connector-java-5.1.46, then copy mysql-connector-java-5.1.46-bin.jar from the extracted directory to /usr/share/java under the new name mysql-connector-java.jar.
tar -zxf mysql-connector-java-5.1.46.tar.gz
sudo mkdir -p /usr/share/java
cd mysql-connector-java-5.1.46
sudo cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
10. Install the database
1. Install MariaDB (MySQL) on the master node. This guide installs MariaDB; to install MySQL instead, see https://blog.csdn.net/johnzhc/article/details/81119030.
sudo yum install mariadb-server    # install MariaDB
sudo systemctl enable mariadb    # start MariaDB at boot
sudo systemctl start mariadb    # start MariaDB now
sudo /usr/bin/mysql_secure_installation    # secure and configure MariaDB
See also the documentation: https://www.cloudera.com/documentation/cdh/5-1-x/CDH5-Installation-Guide/CDH5-Installation-Guide.html
2. Create databases and users for CDH
mysql -u root -p
Enter the password to log in to MySQL, then create the databases and grant privileges on them.
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
Finally, exit the database client and initialize the database with a command of the following form.
exit
/usr/share/cmf/schema/scm_prepare_database.sh (databaseType) (databaseName) (databaseUser) (databasePassword)
For example, for the scm database: /usr/share/cmf/schema/scm_prepare_database.sh mysql scm scm scm
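The CREATE DATABASE and GRANT statements in this step all follow one pattern, with the user name and password equal to the database name. A sketch that generates them from a list of names is shown below; gen_cdh_sql is a hypothetical helper, and it only prints SQL, so the output can be reviewed before it is piped to the mysql client.

```shell
# Print the CREATE DATABASE / GRANT pair for each database name given.
# As in this guide, the user name and password equal the database name.
# gen_cdh_sql is a hypothetical helper name.
gen_cdh_sql() {
  local db
  for db in "$@"; do
    printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$db"
    # %% in the format string prints a literal % (any host).
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$db" "$db"
  done
}

# Review:  gen_cdh_sql scm amon rman hue hive sentry oozie
# Apply:   gen_cdh_sql scm amon rman hue hive sentry oozie | mysql -u root -p
```

Passwords identical to the user names are convenient on a test cluster; anything beyond that would call for different values.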
11. Install CDH
1. Create the parcel-repo directory on the master node
mkdir -p /opt/cloudera/parcel-repo
chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
2. Copy the CDH parcel files to the /opt/cloudera/parcel-repo directory.
CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel
CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel.sha1
manifest.json
Then rename CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel.sha1 to CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel.sha
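A parcel of this size can be damaged during download or copying, so it may be worth comparing its SHA-1 digest with the checksum file before starting the server. The sketch below assumes the checksum file holds the hexadecimal digest as its first field; verify_parcel is a hypothetical helper.

```shell
# Compare a parcel's SHA-1 digest with the digest stored in its checksum file.
# Returns 0 when they match, non-zero otherwise.
# verify_parcel is a hypothetical helper name.
verify_parcel() {
  local parcel="$1"
  local sha_file="$2"
  local actual expected
  actual="$(sha1sum "$parcel" | awk '{print $1}')"
  # Assumption: the digest is the first field on the first line of the file.
  expected="$(awk '{print $1; exit}' "$sha_file")"
  [ "$actual" = "$expected" ]
}

# cd /opt/cloudera/parcel-repo
# verify_parcel CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel \
#   CDH-5.15.0-1.cdh5.15.0.p0.21-el7.parcel.sha && echo "checksum OK"
```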
3. Edit /etc/cloudera-scm-agent/config.ini on slave1 and slave2
Set server_host to the hostname of the management node (master in this example)
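The same one-line change has to be made on each slave, so it can be done with sed rather than an editor. This is a sketch assuming GNU sed and a config.ini that already contains a line beginning with server_host= (the packaged default, as far as I know, is server_host=localhost); set_server_host is a hypothetical helper.

```shell
# Point the agent at the Cloudera Manager server by rewriting server_host=.
# Arguments: path to config.ini, hostname of the management node.
# set_server_host is a hypothetical helper name.
set_server_host() {
  local file="$1"
  local host="$2"
  sed -i "s/^server_host=.*/server_host=${host}/" "$file"
}

# On slave1 and slave2 (as root):
# set_server_host /etc/cloudera-scm-agent/config.ini master
# grep '^server_host=' /etc/cloudera-scm-agent/config.ini
```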
4. Start cloudera-scm-server and cloudera-scm-agent
On the master node, start the agent and the server with the following commands
systemctl start cloudera-scm-agent
systemctl start cloudera-scm-server
On slave1 and slave2, run the following
systemctl start cloudera-scm-agent
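5. Open http://master:7180 and log in (the default username and password are both admin), then start adding the cluster. One possible order for adding services is: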
hdfs-> yarn-> hive-> impala-> zookeeper-> hbase-> oozie-> hue->sqoop->kafka->spark
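12. Summary
The main pitfalls encountered during installation:
1. Configuring the static IP
2. The NTP time synchronization service
3. Permission problems on the directories used by the services being installed. The fix is mainly to check the logs and change the permissions or owner of the corresponding files:
chmod 777 xxx
chown root XXX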
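4. Kafka installation
On the installation page, click Hosts
Click Parcels
Click Distribute for KAFKA, then activate it
Then add the Kafka service and set the following parameters during configuration:
Kafka MirrorMaker:
Destination Broker List slave1:9092
Source Broker List slave1:9092
Topic Whitelist slave1:9092
Kafka Broker:
Advertised Host slave1
Java Heap Size of Broker 256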