目前Hadoop比较流行的主要有2个版本,Apache和Cloudera版本。
Cloudera Manager 是用于管理cdh集群的端到端应用程序,统一管理和安装。CDH除了可以通过cm安装也可以通过yum,tar,rpm安装。主要由如下几部分组成:
服务端/Server:
Cloudera Manager 的核心。主要用于管理 web server 和应用逻辑。它用于安装软件,配置,开始和停止服务,以及管理服务运行的集群。
代理/agent:
安装在每台主机上。它负责启动和停止进程,部署配置,触发安装和监控主机。
数据库/Database:
存储配置和监控信息。通常可以在一个或多个数据库服务器上运行的多个逻辑数据库。例如,所述的 Cloudera 管理器服务和监视,后台程序使用不同的逻辑数据库。
Cloudera Repository:由cloudera manager 提供的软件分发库。
客户端/Clients:
提供了一个与 Server 交互的接口。
主机名 | IP | CM管理软件 |
---|---|---|
nn01 | 192.168.18.110 | Cloudera Manager Server&Agent ,MariaDB |
dn01 | 192.168.18.111 | Cloudera Manager Agent |
dn02 | 192.168.18.112 | Cloudera Manager Agent |
dn03 | 192.168.18.113 | Cloudera Manager Agent |
编辑/etc/hostname,修改主机名,并使用命令hostname使其立刻生效。编辑文件/etc/hosts,增加如下内容。
192.168.18.110 nn01.yunlu.cn nn01
192.168.18.111 dn01.yunlu.cn dn01
192.168.18.112 dn02.yunlu.cn dn02
192.168.18.113 dn03.yunlu.cn dn03
# systemctl stop firewalld.service && systemctl disable firewalld.service
# sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
# setenforce 0
chrony既可作时间服务器服务端,也可作客户端。chrony性能比ntp要好很多,且chrony配置简单、管理方便。 但是此次我们采用定时任务同步网络时间的方法。
# echo "$((RANDOM%60)) $((RANDOM%24)) * * * /usr/sbin/ntpdate time1.aliyun.com" >> /var/spool/cron/root
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo "vm.swappiness = 10" >> /etc/sysctl.conf
# sysctl -p
# ssh-keygen -t rsa
# ssh-copy-id [nn01,dn01-dn03]
总共四次拷贝,每次拷贝按提示输入
yes
和相应节点的密码。
# wget https://archive.cloudera.com/cm6/6.2.0/redhat7/yum/cloudera-manager.repo -P /etc/yum.repos.d/
# rpm --import https://archive.cloudera.com/cm6/6.2.0/redhat7/yum/RPM-GPG-KEY-cloudera
使用在线安装会比较慢,建议先把需要的rpm下载下来,进行离线安装或者建私有仓库,涉及下面三个软件包:
cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm
cloudera-manager-server-6.2.0-968826.el7.x86_64.rpm
cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm
# rpm -ivh jdk-8u202-linux-x64.rpm
建议离线安装,把rpm包下载到服务器上面,传到其他节点一份,再本地安装,速度会快很多。
# yum localinstall cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-server-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-daemons-6.2.0-968826.el7.x86_64.rpm -y
# yum localinstall cloudera-manager-agent-6.2.0-968826.el7.x86_64.rpm -y
此次安装 mysql 是按照官网教程安装的,链接地址:
https://www.cloudera.com/documentation/enterprise/6/6.0/topics/cm_ig_mysql.html#cmig_topic_5_5
# wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
# rpm -ivh mysql-community-release-el7-5.noarch.rpm
# yum install mysql-server -y
# systemctl start mysqld
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
以上配置的含义,请参考本节开头的文档。
# /usr/bin/mysql_secure_installation
# systemctl enable mysqld
用于各节点连接数据库。
# wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
# tar xf mysql-connector-java-5.1.46.tar.gz
# mkdir -p /usr/share/java/
# cd mysql-connector-java-5.1.46
# cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
Service | Database | User |
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Hive Metastore Server | hive | hive |
Hue | hue | hue |
Oozie | oozie | oozie |
cdh.sql
文件中CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
# mysql -uroot -p < ./cdh.sql
# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
接着,输入scm数据库密码
CM安装成功之后,接下来我们就可以通过CM安装CDH的方式构建企业大数据平台。所以首先需要把CDH的parcels包下载到CM主服务器上。同样的,我们为了加速我们的安装,我们可以把需要下载的软件包提前下载下来,也可以创建CDH私有仓库。
# cd /opt/cloudera/parcel-repo
# wget https://archive.cloudera.com/cdh6/6.2.0/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel
# wget https://archive.cloudera.com/cdh6/6.2.0/parcels/manifest.json
# sha1sum CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel | awk '{ print $1 }' > CDH-6.2.0-1.cdh6.2.0.p0.967373-el7.parcel.sha
# chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/*
# systemctl start cloudera-scm-server
如果启动中有什么问题,可以查看日志。
# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
在最后显示的日志中,有显示启动监听的端口。
Started ServerConnector@da518cb{SSL,[ssl, http/1.1]}{0.0.0.0:7183}
Started ServerConnector@a77165b{HTTP/1.1,[http/1.1]}{0.0.0.0:7180}