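This article walks through a fully offline deployment of a CDH6 cluster on CentOS 7.4. Deploying CDH6 differs considerably from deploying CDH5.
Note: all operations in this article are performed as the root user.
First, the files required to install the CDH6 cluster must be downloaded in advance on a machine with Internet access.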
CM6 RPMs: https://archive.cloudera.com/cm6/6.3.0/redhat7/yum/RPMS/x86_64/
Download all RPM files under this link and save them to the cloudera-repos directory.
ASC file: https://archive.cloudera.com/cm6/6.3.0/allkeys.asc
Also download this asc file and save it to the same cloudera-repos directory:
[root@node01 upload]# tree cloudera-repos/
cloudera-repos/
├── allkeys.asc
├── cloudera-manager-agent-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-daemons-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-db-2-6.3.0-1281944.el7.x86_64.rpm
├── enterprise-debuginfo-6.3.0-1281944.el7.x86_64.rpm
└── oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
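Before building the repository it can help to confirm that nothing was lost while copying the files into the offline network. The following is a minimal sketch, not a Cloudera tool: check_repo_files is a hypothetical helper, and the list of patterns is an assumption based on the packages this article installs later.

```shell
#!/bin/sh
# check_repo_files DIR
# Report any file this article relies on that is missing from DIR.
# (Hypothetical helper; the pattern list is an assumption.)
check_repo_files() {
    dir=$1
    missing=0
    for pattern in \
        'allkeys.asc' \
        'cloudera-manager-agent-*.rpm' \
        'cloudera-manager-daemons-*.rpm' \
        'cloudera-manager-server-[0-9]*.rpm' \
        'oracle-j2sdk1.8-*.rpm'
    do
        # $pattern is left unquoted on purpose so the shell expands the glob;
        # ls fails when nothing matches.
        if ! ls "$dir"/$pattern >/dev/null 2>&1; then
            echo "missing: $pattern"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   check_repo_files /data6/upload/cloudera-repos && echo "all files present"
```

The pattern for the server package requires a digit after "server-" so that the cloudera-manager-server-db-2 package does not satisfy it by accident.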
A MySQL JDBC driver of version 5.1.26 or later is required; this article uses mysql-connector-java-5.1.47.tar.gz.
Note: do not try to host the CM YUM repository over FTP!
First install httpd and createrepo:
yum -y install httpd createrepo
Start the httpd service and enable it at boot:
systemctl start httpd
systemctl enable httpd
Then change into the cloudera-repos directory prepared earlier, which holds the Cloudera Manager RPM packages:
cd /data6/upload/cloudera-repos/
Generate the RPM metadata:
createrepo .
cd ..
chmod -R 777 cloudera-repos
Then move the cloudera-repos directory into httpd's html directory:
mv cloudera-repos /var/www/html/
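Make sure the RPM packages can be listed in a browser (for example at http://node01/cloudera-repos/).
Next, create the repo file for CM6 (this must be done on every node):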
cd /etc/yum.repos.d
vim cloudera-manager.repo
Add the following content:
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=http://node01/cloudera-repos/
gpgcheck=0
enabled=1
Save and exit, then run yum clean all && yum makecache:
[root@master02 ~]# yum clean all && yum makecache
Loaded plugins: fastestmirror, langpacks
Cleaning repos: ChinaUnicom-Packages cloudera-manager
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
Loaded plugins: fastestmirror, langpacks
ChinaUnicom-Packages | 3.6 kB 00:00:00
cloudera-manager | 2.9 kB 00:00:00
(1/7): ChinaUnicom-Packages/group_gz | 156 kB 00:00:00
(2/7): ChinaUnicom-Packages/filelists_db | 3.1 MB 00:00:00
(3/7): ChinaUnicom-Packages/primary_db | 3.1 MB 00:00:00
(4/7): ChinaUnicom-Packages/other_db | 1.2 MB 00:00:00
(5/7): cloudera-manager/filelists_db | 118 kB 00:00:00
(6/7): cloudera-manager/other_db | 1.0 kB 00:00:00
(7/7): cloudera-manager/primary_db | 8.6 kB 00:00:00
Determining fastest mirrors
Metadata Cache Created
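Because the repo file has to exist on every node, editing it by hand in vim gets tedious. A minimal sketch of writing it non-interactively is shown below; write_cm_repo is a hypothetical helper name, and the base URL is passed in as an argument so the same function works if the web server is not called node01.

```shell
#!/bin/sh
# write_cm_repo FILE BASEURL
# Write the Cloudera Manager repo definition used in this article to FILE.
# (Hypothetical helper, shown as a sketch.)
write_cm_repo() {
    cat > "$1" <<EOF
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=$2
gpgcheck=0
enabled=1
EOF
}

# Example (run as root on each node):
#   write_cm_repo /etc/yum.repos.d/cloudera-manager.repo http://node01/cloudera-repos/
#   yum clean all && yum makecache
```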
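This step only needs to be performed on the CM Server node.
Run the following commands:
# Install JDK 8 (oracle-j2sdk1.8 is the Oracle JDK, not OpenJDK)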
yum install oracle-j2sdk1.8
# Install Cloudera Manager (only on the server node)
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
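This pulls in many dependency packages, so it is worth setting up a YUM repository for the OS packages on the LAN; alternatively, install the dependency RPMs manually.
After Cloudera Manager Server is installed, change into the local parcel repository directory: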
cd /opt/cloudera/parcel-repo
Move the CDH parcel files downloaded in the first part into this directory, then rename the sha1 file to .sha:
mv /data6/upload/parcels/* /opt/cloudera/parcel-repo/
mv CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha1 CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
Then run the following command to change the file owner:
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
The final contents of /opt/cloudera/parcel-repo are as follows:
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
└── manifest.json
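A parcel that was truncated during transfer will not be accepted, so it is worth comparing its SHA-1 with the .sha file before starting the server. The sketch below assumes the .sha file holds the hex digest as its first field, which is how the .sha1 file from the Cloudera archive is normally laid out; verify_parcel is a hypothetical helper name.

```shell
#!/bin/sh
# verify_parcel PARCEL
# Compare the SHA-1 of PARCEL with the digest stored in PARCEL.sha.
# Assumption: the digest is the first whitespace-separated field of line 1.
verify_parcel() {
    expected=$(awk 'NR == 1 { print $1 }' "$1.sha") || return 2
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    if [ "$expected" = "$actual" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1"
        return 1
    fi
}

# Example:
#   verify_parcel /opt/cloudera/parcel-repo/CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
```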
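MySQL installation was covered in the environment preparation part, so it is skipped here.
The official Cloudera documentation provides a recommended MySQL configuration: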
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Extract mysql-connector-java-5.1.47-bin.jar from the mysql-connector-java-5.1.47.tar.gz package downloaded earlier, copy it to the /usr/share/java/ directory on the CM Server node, and rename it to mysql-connector-java.jar (create /usr/share/java/ manually if it does not exist):
tar zxvf mysql-connector-java-5.1.47.tar.gz
mkdir -p /usr/share/java/
cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar
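Create the databases and database users for the services you plan to install, following the table below. The databases must use utf8 encoding. Record each user name and its password when creating them:
Service | Database | User |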
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
Create the databases and their users:
# scm
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
# amon
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
# rman
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
# hue
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
# hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
# sentry
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
# nav
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
# navms
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
# oozie
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
# flush
FLUSH PRIVILEGES;
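The statements above repeat one pattern per service, so they can also be generated from the database/user pairs in the table. This is only a sketch: gen_db_sql and gen_all_sql are hypothetical helper names, and using the user name as the password simply mirrors the statements in this article, which is not advisable outside a test cluster.

```shell
#!/bin/sh
# gen_db_sql DB USER PASSWORD
# Print the CREATE DATABASE / GRANT pair for one service.
gen_db_sql() {
    printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$1"
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$2" "$3"
}

# gen_all_sql
# Print the statements for every database:user pair from the table.
# The password is set to the user name, as in the article (an assumption
# you will want to change).
gen_all_sql() {
    for pair in scm:scm amon:amon rman:rman hue:hue metastore:hive \
                sentry:sentry nav:nav navms:navms oozie:oozie
    do
        db=${pair%%:*}
        user=${pair##*:}
        gen_db_sql "$db" "$user" "$user"
    done
    echo 'FLUSH PRIVILEGES;'
}

# Example:
#   gen_all_sql | mysql -uroot -p
```

Generating the SQL keeps the database name and user name in one place, which makes the metastore/hive pair (the only one where they differ) harder to get wrong.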
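Cloudera Manager Server ships with a script that configures its database.
If the MySQL database and CM Server are on the same host, run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
If the MySQL database and CM Server are on different hosts, run the script with the -h option (the MySQL host) and the --scm-host option (the CM Server host), for example: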
[root@master02 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 10.172.54.51 --scm-host 10.172.54.52 scm scm
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
systemctl start cloudera-scm-server
Then wait for Cloudera Manager Server to start; this may take a while. You can follow the startup with tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log.
When the log line INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server. appears, the service has started successfully and the Cloudera Manager web UI can be opened in a browser.
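Instead of watching tail -f by hand, the wait can be scripted. The following sketch polls the log once per second for a fixed string; wait_for_log is a hypothetical helper, and the timeout value is up to you. Note that it matches any occurrence in the file, so a line left over from a previous start will also satisfy it.

```shell
#!/bin/sh
# wait_for_log LOGFILE PATTERN TIMEOUT_SECONDS
# Poll once per second until the fixed string PATTERN appears in LOGFILE.
# Returns 0 when found, 1 on timeout. (Hypothetical helper.)
wait_for_log() {
    waited=0
    while [ "$waited" -lt "$3" ]; do
        if grep -qF "$2" "$1" 2>/dev/null; then
            echo "found after ${waited}s"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "timed out after ${3}s"
    return 1
}

# Example:
#   wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log \
#       'Started Jetty server.' 600
```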
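Open a browser and go to http:// ; the default user name and password are both admin:
The first page is the Cloudera Manager welcome page. Click the [Continue] button in the lower-right corner to go to the next step:
Tick the box to accept the terms, then click [Continue]:
Here I choose the free edition:
After the edition is chosen, a second welcome page appears; this one is the welcome page for the cluster installation:
This step searches for and selects the hosts on which the CDH cluster will be installed. Enter the hostname of each node in the input box next to the host name label, separated by ASCII commas, click Search, and then tick the nodes to install CDH on in the result list: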
Cloudera Manager Agent
Choose Custom here and enter the URL of the Cloudera Manager YUM repository set up with httpd above:
CDH and other software
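If the earlier [Configure the local parcel repository] step was done correctly, [Use Parcels] is selected automatically here and the CDH version is loaded. Confirm it and click [Continue]:
Therefore, there is no need to install Cloudera Manager Agent manually.
I do not tick the JDK installation at this step, because the JDK was already installed during environment preparation. Untick it and continue:
This step configures SSH login between the cluster hosts. Enter the root user's password and choose a suitable [Number of simultaneous installations] for your cluster:
At this step the Agent is installed on the nodes automatically; wait a moment for it to finish:
This step is also automatic. The speed of the distribution step depends mainly on the network, so wait patiently:
Wait for the check to finish:
Cloudera recommends setting /proc/sys/vm/swappiness to at most 10. The current setting is 30. Use the sysctl command to change the setting at runtime and edit /etc/sysctl.conf so that the setting persists after a reboot. You can continue with the installation, but Cloudera Manager may report your hosts as unhealthy because of swapping.
Temporary change: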
sysctl vm.swappiness=10
cat /proc/sys/vm/swappiness
The change takes effect immediately, but the value reverts to 30 after a reboot.
Permanent change:
Add the following parameter to /etc/sysctl.conf:
vm.swappiness=10
Or:
echo 'vm.swappiness=10'>> /etc/sysctl.conf
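Appending with >> adds a duplicate line each time it is run, and it does not override an earlier vm.swappiness entry in a predictable way. A sketch of an idempotent alternative follows; set_conf_value is a hypothetical helper, it assumes GNU sed (as on CentOS), and it treats the key as a regular expression, so the dots in vm.swappiness match any character.

```shell
#!/bin/sh
# set_conf_value FILE KEY VALUE
# Replace an existing "KEY = ..." line in FILE, or append "KEY=VALUE"
# when no such line exists. Requires GNU sed for "sed -i".
# (Hypothetical helper; KEY is used as a regular expression.)
set_conf_value() {
    if grep -q "^[[:space:]]*$2[[:space:]]*=" "$1" 2>/dev/null; then
        sed -i "s|^[[:space:]]*$2[[:space:]]*=.*|$2=$3|" "$1"
    else
        echo "$2=$3" >> "$1"
    fi
}

# Example (run as root on every node):
#   set_conf_value /etc/sysctl.conf vm.swappiness 10
#   sysctl -p
```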
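Transparent huge page compaction is enabled and can cause significant performance problems. Run echo never > /sys/kernel/mm/transparent_hugepage/defrag and echo never > /sys/kernel/mm/transparent_hugepage/enabled to disable it, then add the same commands to an init script such as /etc/rc.local so that they are applied when the system reboots.
Just follow the prompt above.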
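The persistence half of that prompt can be scripted so that repeated runs do not pile up duplicate lines in the init script. This is a minimal sketch; append_once and persist_thp_off are hypothetical helper names, and the target file is an argument because the right init script depends on your system.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || echo "$2" >> "$1"
}

# persist_thp_off FILE
# Add both commands from the Cloudera Manager warning to FILE.
# (Hypothetical helper.)
persist_thp_off() {
    append_once "$1" 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'
    append_once "$1" 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
}

# Example (run as root on every node):
#   persist_thp_off /etc/rc.local
```

On CentOS 7 the rc.local script only runs at boot if it is executable, so a chmod +x on /etc/rc.d/rc.local may also be needed.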
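Here I choose Custom Services: ZooKeeper, HDFS, and YARN:
You can install the basic components first and add others when they are needed.
Installing every service at once tends to cause many problems during installation.
CDH proposes a role assignment automatically. If it looks unreasonable, adjust it manually, keeping the roles balanced across hosts: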
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-1.8.so, Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni.so, no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-64-1-4792575431304239050.8: libstdc++.so.6: : , /tmp/libleveldbjni-64-1-6079277982211108711.8: libstdc++.so.6: : ]
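Cause: this is related to a glibc upgrade on the current node. Since the leveldbjni library cannot be loaded, install one.
The way to install the leveldbjni library is rather unusual:
1) First download leveldbjni-all-1.8.jar
2) Extract the jar and find the libleveldbjni.so file under the META-INF/native/linux64 directory
If Cloudera Manager needs to be uninstalled for some reason, run the following steps on each node.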
systemctl stop cloudera-scm-server
systemctl stop cloudera-scm-agent
yum -y remove 'cloudera-manager-*'
yum clean all
umount cm_processes
umount /var/run/cloudera-scm-agent/process
rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*
rm -rf /tmp/.scmpreparenode.lock
rm -Rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
rm -rf /var/lib/hadoop-* /var/lib/impala /var/lib/solr /var/lib/zookeeper /var/lib/hue /var/lib/oozie /var/lib/pgsql /var/lib/sqoop2 /data/dfs/ /data/impala/ /data/yarn/ /dfs/ /impala/ /yarn/ /var/run/hadoop-*/ /var/run/hdfs-*/ /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog
systemctl stop mariadb
yum -y remove mariadb-*
rm -rf /var/lib/mysql
rm -rf /var/log/mysqld.log
rm -rf /usr/lib64/mysql
rm -rf /usr/share/mysql
rm -rf /opt/cloudera
ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
2019-08-27 20:35:50,469 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2019-08-27 20:35:50,600 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
at com.ning.http.client.providers.netty.future.NettyResponseFuture.abort(NettyResponseFuture.java:231)
at com.ning.http.client.providers.netty.request.NettyRequestSender.abort(NettyRequestSender.java:422)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:290)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:117)
at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:87)
at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
at com.ning.http.client.AsyncHttpClient$BoundRequestBuilder.execute(AsyncHttpClient.java:229)
at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfoFuture(ParcelDownloaderImpl.java:592)
at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfo(ParcelDownloaderImpl.java:544)
at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:357)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:464)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:459)
at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at com.ning.http.client.NameResolver$JdkNameResolver.resolve(NameResolver.java:28)
at com.ning.http.client.providers.netty.request.NettyRequestSender.remoteAddress(NettyRequestSender.java:358)
at com.ning.http.client.providers.netty.request.NettyRequestSender.connect(NettyRequestSender.java:369)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:283)
... 15 more
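This error does not affect usage: the cluster is offline, so archive.cloudera.com cannot be resolved, and the parcels come from the local repository instead.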
Error when installing Hive: org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported
[root@master01 ~]# rpm -qa|grep mysql-connector-java
mysql-connector-java-5.1.25-3.el7.noarch
The JDBC driver version is wrong; a driver of version 5.1.26 or later is required.
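The version check behind this diagnosis can be written as a small comparison. The sketch below relies on sort -V from GNU coreutils; version_ge and driver_ok are hypothetical helper names, and the 5.1.26 minimum is the one stated in this article.

```shell
#!/bin/sh
# version_ge A B
# Succeed when dotted version A is greater than or equal to B.
# Relies on "sort -V" (GNU coreutils version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# driver_ok VERSION
# Report whether VERSION meets the 5.1.26 minimum from this article.
driver_ok() {
    if version_ge "$1" 5.1.26; then
        echo "ok: $1"
    else
        echo "too old: $1"
        return 1
    fi
}

# Example, using the installed RPM's version field:
#   driver_ok "$(rpm -q --qf '%{VERSION}' mysql-connector-java)"
```

For the package shown above, the check reports 5.1.25 as too old, which matches the error.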