CentOS7 Cloudera Manager6 完全离线安装 CDH6 集群

本文是在CentOS7.4 下进行CDH6集群的完全离线部署。CDH5集群与CDH6集群的部署区别比较大。

说明:本文内容所有操作都是在root用户下进行的。

文件下载

首先一些安装CDH6集群的必须文件要先在外网环境先下载好。

Cloudera Manager 6.3.0

CM6 RPM:https://archive.cloudera.com/...
需要下载该链接下的所有RPM文件,保存到cloudera-repos目录下。

ASC文件:https://archive.cloudera.com/...
同时还需要下载一个asc文件,同样保存到cloudera-repos目录下:

[root@node01 upload]# tree cloudera-repos/
cloudera-repos/
├── allkeys.asc
├── cloudera-manager-agent-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-daemons-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-db-2-6.3.0-1281944.el7.x86_64.rpm
├── enterprise-debuginfo-6.3.0-1281944.el7.x86_64.rpm
└── oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm

MySQL JDBC驱动

要求使用5.1.26以上版本的jdbc驱动,可点击这里直接下载mysql-connector-java-5.1.47.tar.gz

配置Cloudera Manager yum库

注意:不要尝试使用FTP搭建CM的YUM库!

首先安装httpdcreaterepo

yum -y install httpd createrepo

启动httpd服务并设置开机自启动:

systemctl start httpd
systemctl enable httpd

然后进入到前面准备好的存放Cloudera Manager RPM包的目录cloudera-repos下:

cd /data6/upload/cloudera-repos/

生成RPM元数据:

createrepo .
chmod 777 -R cloudera-repos

然后将cloudera-repos目录移动到httpd的html目录下:

mv cloudera-repos /var/www/html/

确保可以通过浏览器查看到这些RPM包:

接着在创建cm6的repo文件(每个节点都需要配置):

cd /etc/yum.repos.d
vim cloudera-manager.repo

添加如下内容:

[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=http://node01/cloudera-repos/
gpgcheck=0
enabled=1

保存,退出,然后执行yum clean all && yum makecache命令:

[root@master02 ~]# yum clean all && yum makecache
Loaded plugins: fastestmirror, langpacks
Cleaning repos: ChinaUnicom-Packages cloudera-manager
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
Loaded plugins: fastestmirror, langpacks
ChinaUnicom-Packages                                                                                                                                                                                               | 3.6 kB  00:00:00     
cloudera-manager                                                                                                                                                                                                   | 2.9 kB  00:00:00     
(1/7): ChinaUnicom-Packages/group_gz                                                                                                                                                                               | 156 kB  00:00:00     
(2/7): ChinaUnicom-Packages/filelists_db                                                                                                                                                                           | 3.1 MB  00:00:00     
(3/7): ChinaUnicom-Packages/primary_db                                                                                                                                                                             | 3.1 MB  00:00:00     
(4/7): ChinaUnicom-Packages/other_db                                                                                                                                                                               | 1.2 MB  00:00:00     
(5/7): cloudera-manager/filelists_db                                                                                                                                                                               | 118 kB  00:00:00     
(6/7): cloudera-manager/other_db                                                                                                                                                                                   | 1.0 kB  00:00:00     
(7/7): cloudera-manager/primary_db                                                                                                                                                                                 | 8.6 kB  00:00:00     
Determining fastest mirrors
Metadata Cache Created

安装Cloudera Manager Server

这一步只需要在CM Server节点上操作。
执行下面的命令:

# 安装openjdk8
yum install oracle-j2sdk1.8
# 安装 cm manager(只需在server节点安装)
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
将会需要很多依赖包,所以说还是有必要搭一个局域网内yum源的,或者手动安装rpm包

配置本地Parcel存储库

Cloudera Manager Server安装完成后,进入到本地Parcel存储库目录:

cd /opt/cloudera/parcel-repo

将第一部分下载的CDH parcels文件上传至该目录下,然后执行修改sha文件:

mv /data6/upload/parcels/* /opt/cloudera/parcel-repo/
mv CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha1 CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha

然后执行下面的命令修改文件所有者:

chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo

最终/opt/cloudera/parcel-repo目录内容如下:

├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
└── manifest.json

安装数据库

MySQL的安装在环境准备部分中已经有说明,这里就跳过MySQL安装了。

数据库配置

CDH官方给的有一份推荐的MySQL的配置内容:

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit  = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES

配置mysql jdbc驱动

从前面下载好的mysql-connector-java-5.1.47.tar.gz包中解压出mysql-connector-java-5.1.47-bin.jar文件,将mysql-connector-java-5.1.47-bin.jar文件上传至CM Server节点上的/usr/share/java/目录下并重命名为mysql-connector-java.jar(如果/usr/share/java/目录不存在,需要手动创建):

tar zxvf mysql-connector-java-5.1.47.tar.gz
mkdir -p /usr/share/java/
cp mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar

创建CDH所需要的数据库

根据所需要安装的服务参照下表创建对应的数据库以及数据库用户,数据库必须使用utf8编码,创建数据库时要记录好用户名及对应密码:

服务名 数据库名 用户名
Cloudera Manager Server scm scm
Activity Monitor amon amon
Reports Manager rman rman
Hue hue hue
Hive Metastore Server metastore hive
Sentry Server sentry sentry
Cloudera Navigator Audit Server nav nav
Cloudera Navigator Metadata Server navms navms
Oozie oozie oozie

创建数据库及对应用户:

# scm
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';

# amon
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';

# rman
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';

# hue
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; 
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';

# hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';

# sentry
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;   
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';

# nav
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;      
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';

# navms
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';

# oozie
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';

# flush
FLUSH PRIVILEGES;

设置Cloudera Manager 数据库

Cloudera Manager Server包含一个配置数据库的脚本。

  • mysql数据库与CM Server是同一台主机
    执行命令:/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
  • mysql数据库与CM Server不在同一台主机上
    执行命令:/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h --scm-host scm scm
[root@master02 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 10.172.54.51 --scm-host 10.172.54.52 scm scm  
Enter SCM password: 
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing:  /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[                          main] DbCommandExecutor              INFO  Successfully connected to database.
All done, your SCM database is configured correctly!

安装CDH节点

启动Cloudera Manager Server服务

systemctl start cloudera-scm-server

然后等待Cloudera Manager Server启动,可能需要稍等一会儿,可以通过命令tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log去监控服务启动状态。

当看到INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.日志打印出来后,说明服务启动成功,可以通过浏览器访问Cloudera Manager WEB界面了。

访问Cloudera Manager WEB界面

打开浏览器,访问地址:http://:7180,默认账号和密码都为admin

欢迎页面

首先是Cloudera Manager的欢迎页面,点击页面右下角的【继续】按钮进行下一步:

### 接受条款

勾选接受条款,点击【继续】进行下一步:

### 版本选择

这里我就选择免费版了:

第二个欢迎界面

选择版本以后会出现第二个欢迎界面,不过这个是安装集群的欢迎页:

选择主机

这一步是要搜索并选择用于安装CDH集群的主机,在主机名称后面的输入框中输入各个节点的hostname,中间使用英文逗号分隔开,然后点击搜索,在结果列表中勾选要安装CDH的节点即可:

指定存储库

Cloudera Manager Agent

这里选择自定义,填写上面使用httpd搭建好的Cloudera Manager YUM 库URL:

CDH and other software

如果我们之前的【配置本地Parcel存储库】步骤操作无误的话,这里会自动选择【使用Parcel】,并加载出CDH版本,确认无误后点击【继续】:

因此,不需要自己手动安装 Cloudera Manager Agent了

JDK安装选项

这一步骤我就不再勾选安装JDK了,因为我在环境准备部分已经安装过了。取消勾选,然后继续:

SSH登录配置

用于配置集群主机之间的SSH登录,填写root用户的密码,根据集群配置填写合适的【同时安装数量】值即可:

安装Agent

到这一步会自动进行节点Agent的安装,稍等一会儿,即可安装完成:

安装Parcels

这一步同样是自动安装,分配步骤的速度主要取决于网络环境,耐心等待即可:

主机检查

等待检查完成即可:

Cloudera 建议将 /proc/sys/vm/swappiness 设置为最大值 10。当前设置为 30。使用 sysctl 命令在运行时更改该设置并编辑 /etc/sysctl.conf,以在重启后保存该设置。您可以继续进行安装,但 Cloudera Manager 可能会报告您的主机由于交换而运行状况不良。

解决方法:

临时修改:

sysctl vm.swappiness=10
cat /proc/sys/vm/swappiness

这里我们的修改已经生效,但是如果我们重启了系统,又会变成30.

永久修改:

/etc/sysctl.conf 文件里添加如下参数:

vm.swappiness=10

或者:

echo 'vm.swappiness=10'>> /etc/sysctl.conf

已启用透明大页面压缩,可能会导致重大性能问题。请运行echo never > /sys/kernel/mm/transparent_hugepage/defragecho never > /sys/kernel/mm/transparent_hugepage/enabled以禁用此设置,然后将同一命令添加到 /etc/rc.local 等初始化脚本中,以便在系统重启时予以设置。

安装上面的提示执行即可;

安装CDH集群

选择服务类型

这里我选择自定义服务,Zookeeper, HDFS,Yarn:

可以先安装基础组件,然后用到啥在安装啥

如果所有服务都安装,可能安装过程中会出现很多问题

角色分配

CDH会自动给出一个角色分配,如果觉得不合理,我们可以手动调整一下,注意角色分配均衡:

数据库设置

错误问题

no leveldbjni-1.8 in java.library.path

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-1.8.so, Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni.so, no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-64-1-4792575431304239050.8: libstdc++.so.6: : , /tmp/libleveldbjni-64-1-6079277982211108711.8: libstdc++.so.6: : ]

出错原因:当前节点的glibc升级有关。既然不存在leveldbjni的库,那便给他安装一个。

安装leveldbjni库的方式非常有趣:

1) 首先下载leveldbjni-all-1.8.jar

2)解压该jar包,在META-INFnativelinux64目录下找到libleveldbjni.so文件

3) 将libleveldbjni.so文件上传到1中java.library.path中

卸载Cloudera Manager

如果因为其他原因,需要卸载Cloudera Manager,在各节点执行如下步骤即可。

systemctl stop cloudera-scm-server
systemctl stop cloudera-scm-agent
yum -y remove 'cloudera-manager-*'

yum clean all
umount cm_processes
umount /var/run/cloudera-scm-agent/process

rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*
rm -rf /tmp/.scmpreparenode.lock
rm -Rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
 
rm -rf /var/lib/hadoop-* /var/lib/impala /var/lib/solr /var/lib/zookeeper /var/lib/hue /var/lib/oozie  /var/lib/pgsql  /var/lib/sqoop2  /data/dfs/  /data/impala/ /data/yarn/  /dfs/ /impala/ /yarn/  /var/run/hadoop-*/ /var/run/hdfs-*/ /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog

systemctl stop mariadb
yum -y remove mariadb-*
rm -rf /var/lib/mysql
rm -rf /var/log/mysqld.log
rm -rf /usr/lib64/mysql
rm -rf /usr/share/mysql
rm -rf /opt/cloudera

问题

ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest

2019-08-27 20:35:50,469 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2019-08-27 20:35:50,600 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
        at com.ning.http.client.providers.netty.future.NettyResponseFuture.abort(NettyResponseFuture.java:231)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.abort(NettyRequestSender.java:422)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:290)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:117)
        at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:87)
        at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
        at com.ning.http.client.AsyncHttpClient$BoundRequestBuilder.execute(AsyncHttpClient.java:229)
        at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfoFuture(ParcelDownloaderImpl.java:592)
        at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfo(ParcelDownloaderImpl.java:544)
        at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:357)
        at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:464)
        at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:459)
        at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName(InetAddress.java:1192)
        at java.net.InetAddress.getAllByName(InetAddress.java:1126)
        at java.net.InetAddress.getByName(InetAddress.java:1076)
        at com.ning.http.client.NameResolver$JdkNameResolver.resolve(NameResolver.java:28)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.remoteAddress(NettyRequestSender.java:358)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.connect(NettyRequestSender.java:369)
        at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:283)
        ... 15 more
不影响使用

安装hive报错:org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported

[root@master01 ~]# rpm -qa|grep mysql-connector-java
mysql-connector-java-5.1.25-3.el7.noarch
jdbc版本不对,要求使用5.1.26以上版本的jdbc驱动

你可能感兴趣的:(大数据,cdh5)