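Building a CDH 6.3.2 cluster on CentOS 7 installed via VMware Workstation Pro 15 on Windows 10

Background: starting with Cloudera Enterprise 6.3.3 there is no longer a free edition, so I decided to build a CDH 6.3.2 cluster on Windows 10 by installing the CentOS-7-x86_64-DVD-1810 image in VMware Workstation Pro 15. However, Cloudera only provides the CDH-6.3.1 rpm packages (used to install CM) and the CDH-6.3.2 parcel (used to build the cluster), as shown below: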

[Figure 1]

[Figure 2]

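For the detailed installation steps, see reference 1, "How to install CDH 6.3.3 on Redhat 7.6", and reference 2, "A truly offline deployment of a CDH 6.3.1 enterprise cluster (rpm + HTTP file deployment): the most detailed on the web, with accompanying videos, documents and installation packages, usable in production".

Here are the pitfalls I ran into:

1. Most online tutorials for CDH 6.x recommend matching versions, such as "CDH 6.3.0 rpm + CDH 6.3.0 parcel", but with what Cloudera officially provides the only option is "CDH 6.3.1 rpm + CDH 6.3.2 parcel". I have verified that this combination builds a CDH 6.3.2 cluster that runs normally. I will not discuss here what CDH 6.3.1/6.3.2 improve over CDH 6.3.0, but the version numbers of the CDH components are the same, see: https://archive.cloudera.com/cdh6/6.3.2/redhat7/yum/RPMS/noarch/

2. When using the scm_prepare_database.sh script to automatically generate the databases CDH needs (CM, AM, RM and so on), you may hit the error "Unable to find JDBC driver for database type: MySQL", as shown below: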

[root@cdh632_master01 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql cmf cmf '<password of your MySQL user>'
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing:  /usr/java/jdk1.8.0_181-cloudera/bin/java -cp :/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
2020-03-03 15:04:59,945 [main] INFO  com.cloudera.enterprise.dbutil.DbCommandExecutor  - Unable to find JDBC driver for database type: MySQL
2020-03-03 15:04:59,946 [main] ERROR com.cloudera.enterprise.dbutil.DbCommandExecutor  - JDBC Driver com.mysql.jdbc.Driver not found.
2020-03-03 15:04:59,946 [main] ERROR com.cloudera.enterprise.dbutil.DbCommandExecutor  - Exiting with exit code 3
--> Error 3, giving up (use --force if you wish to ignore the error)

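The cause usually pointed out online is that mysql-connector-java-5.1.47.jar has not been correctly renamed and placed in the fixed location on all hosts, which is fixed with: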

mkdir -p /usr/share/java && cp /root/jcz/mysql-connector-java-5.1.47.jar /usr/share/java/mysql-connector-java.jar

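In my case, however, the cause of the "JDBC driver jar not found" error was different: cloudera-manager-server was not installed on the machine where I ran the script. I had not installed it there because client hosts in a CDH cluster do not need the server, which also means this machine could never have started the server. The script should therefore be run on the host where cloudera-manager-server is installed.

3. After starting the agent, checking its status reported an error: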
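
Both causes described above can be checked before running scm_prepare_database.sh. The following is only a minimal sketch under stated assumptions: driver_present, server_installed and preflight are hypothetical helper names (not Cloudera tooling), and the default jar path is the one used in the cp command above.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; the names are illustrative only.

# Succeeds if the renamed JDBC driver jar exists at the given path
# (defaults to the location used earlier in this post).
driver_present() {
  local jar="${1:-/usr/share/java/mysql-connector-java.jar}"
  [ -f "$jar" ]
}

# Succeeds if the cloudera-manager-server rpm is installed on this host.
server_installed() {
  command -v rpm >/dev/null 2>&1 && rpm -q cloudera-manager-server >/dev/null 2>&1
}

# Prints one line per failed check; returns non-zero if any check failed.
preflight() {
  local rc=0
  driver_present "$@" || { echo "JDBC driver jar not found"; rc=1; }
  server_installed || { echo "cloudera-manager-server is not installed on this host"; rc=1; }
  return "$rc"
}
```

Running preflight on a host and seeing the second message would point to the situation I hit, rather than to a misplaced jar.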

[root@cdh632_worker02 ~]# systemctl status cloudera-scm-agent
● cloudera-scm-agent.service - Cloudera Manager Agent Service
   Loaded: loaded (/usr/lib/systemd/system/cloudera-scm-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Tue 2020-03-03 15:44:40 CST; 44s ago
  Process: 6057 ExecStart=/opt/cloudera/cm-agent/bin/cm agent (code=exited, status=0/SUCCESS)
 Main PID: 6057 (code=exited, status=0/SUCCESS)

Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread agent        INFO     Re-using pre-existing directory.../cgroups
Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread agent        INFO     Re-using pre-existing directory.../process
Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread tmpfs        INFO     Reusing mounted tmpfs at /var/r.../process
Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread main         ERROR    Top-level exception: Hostname i...aracter.
Mar 03 15:44:40 cdh632_worker02 cm[6057]: Traceback (most recent call last):
Mar 03 15:44:40 cdh632_worker02 cm[6057]: File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/main.py", line 105, in main_impl
Mar 03 15:44:40 cdh632_worker02 cm[6057]: ag.configure_service()
Mar 03 15:44:40 cdh632_worker02 cm[6057]: File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 608, in configure_service
Mar 03 15:44:40 cdh632_worker02 cm[6057]: raise Exception("Hostname is invalid; it contains an underscore character.")
Mar 03 15:44:40 cdh632_worker02 cm[6057]: Exception: Hostname is invalid; it contains an underscore character.
Hint: Some lines were ellipsized, use -l to show in full.

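The line "Exception: Hostname is invalid; it contains an underscore character." says that the hostname must not contain an underscore. See: https://stackoverflow.com/questions/17830232/cloudera-agent-giving-error-hostname-is-invalid-it-contains-an-underscore-ch

The fix is simply to rename the host by running: hostnamectl --static set-hostname cdh632-worker02. Note that there is no need to change the machine name in VMware Workstation Pro 15, nor to rename the many files automatically generated for this host under the Virtual Machines directory on Windows 10.

4. Next, after starting the server and the agents and opening master-ip:7180 in the browser to install the cluster, I suddenly remembered that I had not set up passwordless SSH again after the rename. The SSH setup I had done earlier was as follows: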
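
The underscore rule can be checked on every host before the agent is started. This is a small sketch, assuming bash; hostname_ok and suggest_hostname are hypothetical helper names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the underscore rule enforced by the CM agent.

# Succeeds only if the given name contains no underscore.
hostname_ok() {
  case "$1" in
    *_*) return 1 ;;
    *)   return 0 ;;
  esac
}

# Prints the name with every underscore replaced by a hyphen.
suggest_hostname() {
  printf '%s\n' "${1//_/-}"
}
```

For example, suggest_hostname cdh632_worker02 prints cdh632-worker02, which is the name passed to hostnamectl above.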

[root@worker11 ~]# ssh-keygen
[root@worker11 ~]# ls ~/.ssh/
id_rsa  id_rsa.pub
[root@worker11 ~]# netstat -tlunp | grep sshd
-bash: netstat: command not found
[root@worker11 ~]# yum install net-tools
[root@worker11 ~]# netstat -tlunp | grep sshd
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      43404/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      43404/sshd
[root@worker11 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@master11
Now try logging into the machine, with:   "ssh 'root@master11'"
[root@worker11 ~]# ls ~/.ssh/
authorized_keys (new)  id_rsa  id_rsa.pub  known_hosts (only created at this point)
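
If the key has to be copied again under the new hostnames, the commands can be generated for all hosts at once. The sketch below only prints the ssh-copy-id commands and does not run them; print_ssh_copy_cmds is a hypothetical helper, and any host list passed to it (for instance a third host such as cdh632-worker01) is an assumption, not taken from the transcript above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not run) one ssh-copy-id command
# per target host, using the public key given as the first argument.
print_ssh_copy_cmds() {
  local key="$1"; shift
  local host
  for host in "$@"; do
    printf 'ssh-copy-id -i %s root@%s\n' "$key" "$host"
  done
}
```

Calling print_ssh_copy_cmds /root/.ssh/id_rsa.pub cdh632-master01 cdh632-worker02 lists the commands so they can be reviewed before being run on each host.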

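Then, during the cluster installation, none of the three hosts could get past "Distribute" in the Install Parcels step, as shown below:

[Figure 3]

Clicking the three hosts shown in blue opens CM, where the result of "Inspect All Hosts" can be viewed, as shown below:

[Figure 4]

First of all, I was sure the JDK could not be the problem on any of the three hosts, because I had installed oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm, the package offered alongside the Cloudera rpm downloads. The agent status (systemctl status cloudera-scm-agent) showed no problems either, and I was baffled.

Because of this, I mistakenly uninstalled and reinstalled the server and the agent: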

[root@cdh632-master01 ~]# systemctl stop cloudera-scm-agent
[root@cdh632-master01 ~]# systemctl stop cloudera-scm-server
[root@cdh632-master01 ~]# yum -y remove cloudera-manager-agent
[root@cdh632-master01 ~]# yum -y remove cloudera-manager-server
[root@cdh632-master01 ~]# yum -y remove cloudera-manager-daemons
[root@cdh632-master01 ~]# cd /var/www/html/cloudera-repos/
[root@cdh632-master01 cloudera-repos]# sudo yum install -y cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
[root@cdh632-master01 cloudera-repos]# sudo yum install -y cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
[root@cdh632-master01 cloudera-repos]# sudo yum install -y cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
[root@cdh632-master01 cloudera-repos]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql cmf cmf '<password of your MySQL user>'
[root@cdh632-master01 cloudera-repos]# sed -i "s/server_host=localhost/server_host=cdh632-master01/g" /etc/cloudera-scm-agent/config.ini
[root@cdh632-master01 cloudera-repos]# systemctl start cloudera-scm-server
[root@cdh632-master01 cloudera-repos]# systemctl start cloudera-scm-agent
[root@cdh632-master01 cloudera-repos]# systemctl status cloudera-scm-server
[root@cdh632-master01 cloudera-repos]# systemctl status cloudera-scm-agent

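The same "Distribute" failure appeared again, and only then did I realize that uninstalling and reinstalling had been the wrong approach.

So I looked at the agent's detailed log (located at /var/log/cloudera-scm-agent/cloudera-scm-agent.log), which contained the error "Error, CM server guid updated, expected xxxx, received yyyy".

The fix for this error is: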

rm -rf /var/lib/cloudera-scm-agent/cm_guid
systemctl restart cloudera-scm-agent
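
Since this has to be checked on each host, the log check and the fix can be wrapped in two small functions. This is a sketch only: has_guid_mismatch and reset_agent_guid are hypothetical names, the default log path is the one mentioned above, and the search string is the fixed part of the error message quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the cm_guid fix.

# Succeeds if the given agent log contains the GUID-mismatch error.
has_guid_mismatch() {
  local log="${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}"
  grep -q 'CM server guid updated' "$log" 2>/dev/null
}

# Applies the same two commands as above: drop the stale GUID file
# and restart the agent so it registers with the server again.
reset_agent_guid() {
  rm -f /var/lib/cloudera-scm-agent/cm_guid
  systemctl restart cloudera-scm-agent
}
```

On a host, has_guid_mismatch && reset_agent_guid applies the fix only when the error is actually present in the log.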

After that, the CDH 6.3.2 cluster installed successfully.
