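1. Management: manage the cluster, for example adding and removing nodes.
2. Monitoring: monitor cluster health, covering the configured metrics and overall system operation.
3. Diagnostics: diagnose problems in the cluster and suggest solutions for them.
4. Integration: integrate the many Hadoop components.
CDH (Cloudera’s Distribution, including Apache Hadoop) is one of the many Hadoop distributions. It is maintained by Cloudera, built on a stable release of Apache Hadoop with many patches integrated, and can be used directly in production.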
Check the OS version: cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
Check the machine's hostname: hostname
S0
Full cluster environment:
IP | Hostname |
---|---|
192.168.3.9 | S0 |
192.168.3.18 | S1 |
192.168.3.161 | S2 |
192.168.3.136 | S3 |
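Every node must generate a key pair and send its public key to all other nodes to enable passwordless SSH login.
The following is done on the master node:
yum -y install openssh-clients # install the SSH client
ssh-keygen -t rsa # press Enter at every prompt to generate the key pair
ssh-copy-id 192.168.3.18 # copy the public key to node 192.168.3.18
ssh-copy-id 192.168.3.161 # copy the public key to node 192.168.3.161
ssh-copy-id 192.168.3.136 # copy the public key to node 192.168.3.136
Do the same on the other nodes, changing the target addresses accordingly.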
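Since every node repeats the same step with a different target list, the loop can be sketched as below. This is a minimal sketch, not part of the original procedure: the helper name `other_nodes` and the `NODES` variable are made up for illustration, and the addresses come from the cluster table above. The loop only prints the commands (dry run).

```shell
# Addresses from the cluster table; adjust to your environment.
NODES="192.168.3.9 192.168.3.18 192.168.3.161 192.168.3.136"

# other_nodes IP: print every cluster node except the one given (hypothetical helper).
other_nodes() {
    for node in $NODES; do
        [ "$node" = "$1" ] || printf '%s\n' "$node"
    done
}

# Dry run for S0: print the commands instead of executing them.
# Remove the leading "echo" to actually copy the key.
for target in $(other_nodes 192.168.3.9); do
    echo ssh-copy-id "$target"
done
```

On another node, pass that node's own address to `other_nodes` so it is skipped.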
hostnamectl set-hostname S0
S0 is the hostname; adjust it to each node's actual IP and hostname
vi /etc/sysconfig/network
host
vi /etc/hosts
Add the following entries:
192.168.3.9 S0
192.168.3.18 S1
192.168.3.161 S2
192.168.3.136 S3
Adjust the entries to your actual IPs and hostnames
Copy the hosts file to the other machines
scp /etc/hosts root@192.168.3.18:/etc/
scp /etc/hosts root@192.168.3.161:/etc/
scp /etc/hosts root@192.168.3.136:/etc/
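If the hosts file is edited by script rather than by hand, it helps to make the edit idempotent so that re-running it does not duplicate entries. The following is a minimal sketch under that assumption; `add_host` is a hypothetical helper, and the demonstration writes to a scratch file rather than to /etc/hosts.

```shell
# add_host FILE IP NAME: append "IP NAME" unless NAME already has an entry.
add_host() {
    if grep -q "[[:space:]]$3\$" "$1" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$2" "$3" >> "$1"
}

# Demonstration on a scratch file; pass /etc/hosts on a real node.
hosts_file=$(mktemp)
add_host "$hosts_file" 192.168.3.9 S0
add_host "$hosts_file" 192.168.3.18 S1
add_host "$hosts_file" 192.168.3.161 S2
add_host "$hosts_file" 192.168.3.136 S3
add_host "$hosts_file" 192.168.3.9 S0   # repeated call is a no-op
cat "$hosts_file"
```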
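Check the firewall status
firewall-cmd --state
Stop the firewall
systemctl stop firewalld.service
Prevent the firewall from starting at boot
systemctl disable firewalld.service
SELinux
Check the SELinux status
/usr/sbin/sestatus -v
If the SELinux status field reads enabled, SELinux is turned on
getenforce
This command can also be used to check
Disable SELinux temporarily: setenforce 0
Disable SELinux permanently by editing the configuration file; a reboot is required
vi /etc/selinux/config
# change SELINUX=enforcing to SELINUX=disabled
Copy the selinux file to the other machines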
scp /etc/sysconfig/selinux root@192.168.3.18:/etc/sysconfig/
scp /etc/sysconfig/selinux root@192.168.3.161:/etc/sysconfig/
scp /etc/sysconfig/selinux root@192.168.3.136:/etc/sysconfig/
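As an alternative to editing the file in vi on each machine, the same change can be made non-interactively. This is only a sketch: `disable_selinux` is a hypothetical helper, and the demonstration runs against a scratch file with two sample lines instead of the real /etc/selinux/config.

```shell
# disable_selinux FILE: rewrite the SELINUX= line to "disabled".
# The SELINUXTYPE= line is left alone because the pattern requires "=" right after SELINUX.
disable_selinux() {
    tmp=$(mktemp)
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch copy.
selinux_cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$selinux_cfg"
disable_selinux "$selinux_cfg"
cat "$selinux_cfg"
```

On a real node the argument would be /etc/selinux/config, and a reboot is still needed for the change to take effect.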
To stress this once more: reboot every server
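First prepare the Java environment: install the JDK and set the JAVA_HOME environment variable
/usr/java/jdk1.8.0_251-amd64
Note: the JDK must be installed under /usr/java/, otherwise Cloudera Manager cannot find it and reports an error
It can also be installed with rpm -i jdk-8u251-linux-x64.rpm
Configure the environment variables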
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_251-amd64
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Reload the profile so the settings take effect
source /etc/profile
With the environment variables configured, check that they took effect
echo $JAVA_HOME
java -version
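Because the JDK location matters to Cloudera Manager, a quick check of the JAVA_HOME prefix can catch a misplaced install early. The sketch below only tests the path prefix; `java_home_ok` is a hypothetical helper and does not verify that a working JDK actually exists at that path.

```shell
# java_home_ok DIR: succeed only if DIR lies under /usr/java/.
java_home_ok() {
    case "$1" in
        /usr/java/*) return 0 ;;
        *) return 1 ;;
    esac
}

if java_home_ok "${JAVA_HOME:-}"; then
    echo "JAVA_HOME is under /usr/java/"
else
    echo "warning: JAVA_HOME is not under /usr/java/"
fi
```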
yum install ntp -y
Enable the service at boot
chkconfig ntpd on
Synchronize the time
ntpdate -u s2c.time.edu.cn
Check the service status
ntpq -p
Install MySQL
wget -c http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql-community-server
Enable the service at boot
systemctl enable mysqld.service
Start the service and check its status
systemctl start mysqld.service
systemctl status mysqld.service
Look up the initial password
grep "password" /var/log/mysqld.log
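The grep above prints the whole log line. To pull out just the password, something like the following could be used. This is a sketch that assumes the MySQL 5.7 log wording "A temporary password is generated for root@localhost: ..." with the password as the last field; `temp_password` is a hypothetical helper, and the sample log line below is made up for the demonstration.

```shell
# temp_password LOGFILE: print the last temporary root password found in the log.
temp_password() {
    grep 'temporary password' "$1" | tail -n 1 | awk '{ print $NF }'
}

# Demonstration with a made-up log line; use /var/log/mysqld.log on a real host.
mysql_log=$(mktemp)
printf '%s\n' '2020-07-14T10:07:20.000000Z 1 [Note] A temporary password is generated for root@localhost: ExamplePass_1' > "$mysql_log"
temp_password "$mysql_log"
```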
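Log in to the database
mysql -uroot -p # press Enter, then type the initial password found above when prompted
Change the default password
ALTER USER 'root'@'localhost' IDENTIFIED BY 'new password';
If the statement is rejected (ERROR 1819: Your password does not satisfy the current policy requirements), the cause is MySQL's password rules, which are governed by the value of validate_password_policy
If no password policy is needed, disable it by adding the following setting to my.cnf; simple passwords are then accepted: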
vi /etc/my.cnf
validate_password = off
With a yum installation the file is in /etc by default; remember to restart the MySQL service after changing it
Allow remote login
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '1qaz@wsx' WITH GRANT OPTION;
FLUSH PRIVILEGES;
Install the database driver
mkdir -p /usr/share/java
cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar
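I chose CM 5; download from: http://archive.cloudera.com/cm5/cm/5/
The CDH version must match the CM version; download from: http://archive.cloudera.com/cdh5/parcels/5.16.2/
Three files need to be downloaded. Note: the .sha1 file must be renamed to .sha
Upload these three files to the master node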
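The rename can be done by hand with mv, or for a whole directory with a small loop like the one below. This is a sketch: `rename_sha1` is a hypothetical helper, and `example.parcel.sha1` is a placeholder file name used only for the demonstration in a scratch directory.

```shell
# rename_sha1 DIR: rename every *.sha1 file in DIR to *.sha.
rename_sha1() {
    for f in "$1"/*.sha1; do
        [ -e "$f" ] || continue          # no match: the glob stays literal, skip it
        mv "$f" "${f%.sha1}.sha"
    done
}

# Demonstration in a scratch directory with a placeholder file name.
parcel_dir=$(mktemp -d)
: > "$parcel_dir/example.parcel.sha1"
rename_sha1 "$parcel_dir"
ls "$parcel_dir"
```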
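Perform the following configuration on the master node (S0 in my case)
Create the installation directory and unpack the installation media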
mkdir /opt/cloudera-manager
tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
Install the database driver
mkdir -p /usr/share/java
cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar
The file must have exactly this name (mysql-connector-java.jar), otherwise the error "Unable to find JDBC driver for database type: MySQL" is reported
Create the system user cloudera-scm
useradd --system --home=/opt/cloudera-manager/cm-5.16.2/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
Create the server storage directory
mkdir /var/lib/cloudera-scm-server
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
Create the directory that holds the Hadoop parcels
mkdir -p /opt/cloudera/parcels
chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
Point the agent at the server
vi /opt/cloudera-manager/cm-5.16.2/etc/cloudera-scm-agent/config.ini
# Set server_host to the hostname of the Cloudera Manager server; in this example that is the server host.
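The same edit has to be made on every node, so a non-interactive version may be convenient. The sketch below assumes config.ini contains a line of the form `server_host=...` (the scratch file mimics that); `set_server_host` is a hypothetical helper, not something shipped with Cloudera Manager.

```shell
# set_server_host FILE HOST: point the agent config at the given server host.
set_server_host() {
    tmp=$(mktemp)
    sed "s/^server_host=.*/server_host=$2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch file that mimics the relevant part of config.ini.
agent_cfg=$(mktemp)
printf '[General]\nserver_host=localhost\nserver_port=7182\n' > "$agent_cfg"
set_server_host "$agent_cfg" S0
cat "$agent_cfg"
```

On a real node the file argument would be /opt/cloudera-manager/cm-5.16.2/etc/cloudera-scm-agent/config.ini.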
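Deploy the CDH offline installation packages
mkdir -p /opt/cloudera/parcel-repo
chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
Upload the CDH files to this directory
Because an external MySQL is used here, it has to be specified at this point: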
/opt/cloudera-manager/cm-5.16.2/share/cmf/schema/scm_prepare_database.sh mysql scm -hlocalhost -uroot -p1qaz@WSX --scm-host localhost scm scm
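# Format: scm_prepare_database.sh <database type> <name of the database to create> <database server address> <database user with create privileges> <that user's password> <CM server address> <user for accessing the new database> <password for that user>
For details, see: https://www.cnblogs.com/xiqing/p/9645724.html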
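To make the positional arguments easier to follow, the command can be assembled from named variables. This is an illustrative sketch: the variable names and `build_cmd` are made up, the values are the ones used in the command above, and the function only prints the command rather than running it.

```shell
SCRIPT=/opt/cloudera-manager/cm-5.16.2/share/cmf/schema/scm_prepare_database.sh
DB_TYPE=mysql            # database type
DB_NAME=scm              # database to create
DB_HOST=localhost        # database server address
ADMIN_USER=root          # user allowed to create databases
ADMIN_PASS='1qaz@WSX'    # that user's password
SCM_HOST=localhost       # CM server address
SCM_USER=scm             # user for the new database
SCM_PASS=scm             # password for that user

# build_cmd: print the full command line in the order the script expects.
build_cmd() {
    printf '%s %s %s -h%s -u%s -p%s --scm-host %s %s %s\n' \
        "$SCRIPT" "$DB_TYPE" "$DB_NAME" "$DB_HOST" \
        "$ADMIN_USER" "$ADMIN_PASS" "$SCM_HOST" "$SCM_USER" "$SCM_PASS"
}

build_cmd
```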
Start the Cloudera Manager Server
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-server start
Start the Cloudera Manager Agent
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-agent start
Create the following databases in the local MySQL instance; they are used later during installation
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'root';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'root';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'root';
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'root';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'root';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'root';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'root';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'root';
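The statements above all follow one pattern, so they can also be generated from a list of database:user pairs. A minimal sketch follows; `gen_sql` is a hypothetical helper, and the password 'root' is simply the one used in the statements above.

```shell
# gen_sql: print a CREATE DATABASE and a GRANT statement for each database:user pair.
gen_sql() {
    for pair in amon:amon rman:rman hue:hue metastore:hive \
                sentry:sentry nav:nav navms:navms oozie:oozie; do
        db=${pair%%:*}       # text before the colon
        user=${pair##*:}     # text after the colon
        printf "CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n" "$db"
        printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY 'root';\n" "$db" "$user"
    done
}

gen_sql
```

The output could be piped into the client, for example `gen_sql | mysql -uroot -p`.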
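On every machine other than the server, carry out the following steps to deploy the agent
Create the installation directory and unpack the installation media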
mkdir /opt/cloudera-manager
tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
Install the database driver
mkdir -p /usr/share/java
cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar
Create the system user cloudera-scm
useradd --system --home=/opt/cloudera-manager/cm-5.16.2/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
Create the server storage directory
mkdir /var/lib/cloudera-scm-server
chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
Create the directory that holds the Hadoop parcels
mkdir -p /opt/cloudera/parcels
chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
Point the agent at the server
vi /opt/cloudera-manager/cm-5.16.2/etc/cloudera-scm-agent/config.ini
# Set server_host to the hostname of the Cloudera Manager server; in this example that is the server host.
Start the Cloudera Manager Agent
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-agent start
Open ip:7180 in a browser
Username: admin Password: admin
At this point the Cloudera Manager installation is complete.
1. Starting the service fails with: pstree: command not found
/opt/cm-5.16.2/etc/init.d/cloudera-scm-agent: line 109: pstree: command not found
/opt/cm-5.16.2/etc/init.d/cloudera-scm-server: line 109: pstree: command not found
Cause: the psmisc package is missing
Fix: yum install -y psmisc
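This kind of failure can be caught before starting the services by checking that the required commands exist. The helper below is a hypothetical sketch; it checks only the commands passed to it, and pstree is the only one this guide is known to need.

```shell
# missing_cmds CMD...: print each command that is not found on PATH.
missing_cmds() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || printf '%s\n' "$c"
    done
}

# Prints "pstree" if psmisc is not installed, nothing otherwise.
missing_cmds pstree
```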
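2. The HDFS NameNode or DataNode fails to start
Delete the dfs directory on every node (this erases all HDFS data): rm -rf /dfs
On the master node, format the NameNode as root: hadoop namenode -format
Then reassign ownership of the dfs directory: chown hdfs:hadoop -R /dfs/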
3. HDFS permission error in the Hive client
Permission denied: user=user, access=READ_EXECUTE, inode="/user/root":root:supergroup:drwx------
hadoop fs -chmod -R 777 /user
vi /etc/profile
Add: export HADOOP_USER_NAME=hdfs (hdfs is the HDFS superuser)
source /etc/profile
4. Impala failure
select count(*) from impala_100yi;
Query: select count(*) from impala_100yi
Query submitted at: 2019-02-14 14:07:33 (Coordinator: http://cdh004:25000)
Query progress can be monitored at: http://cdh004:25000/query_plan?query_id=5248ba412c4dcffa:306f374700000000
WARNINGS: TransmitData() to 172.15.106.223:27000 failed: Invalid argument: Client connection negotiation failed: client connection to 172.15.106.223:27000: unable to find SASL plugin: PLAIN
Fix
yum install gcc python-devel cyrus-sasl* -y
Then restart the agents and the cluster services
5. Changing the cloudera-scm-server address
Stop cloudera-scm-agent on every node
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-agent stop
Stop the original server node
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-server stop
Update the server address on every node
vi /opt/cloudera-manager/cm-5.16.2/etc/cloudera-scm-agent/config.ini
# Set server_host to the hostname of the Cloudera Manager server; in this example that is the server host
Update the database address in db.properties
cd /opt/cloudera-manager/cm-5.16.2/etc/cloudera-scm-server
# Auto-generated by scm_prepare_database.sh on 2020年 07月 14日 星期二 18:07:20 CST
#
# For information describing how to configure the Cloudera Manager Server
# to connect to databases, see the "Cloudera Manager Installation Guide."
#
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=S0
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=root
com.cloudera.cmf.db.setupType=EXTERNAL
com.cloudera.cmf.db.password=1qaz@wsx
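Before restarting, it can be worth confirming which database host the server will actually use. The sketch below reads a single key from a properties file; `get_prop` is a hypothetical helper, and the demonstration uses a scratch file containing two of the lines shown above.

```shell
# get_prop FILE KEY: print the value of KEY from a key=value properties file.
get_prop() {
    awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2) }' "$1"
}

# Demonstration on a scratch file with lines taken from the listing above.
db_props=$(mktemp)
printf 'com.cloudera.cmf.db.type=mysql\ncom.cloudera.cmf.db.host=S0\n' > "$db_props"
get_prop "$db_props" com.cloudera.cmf.db.host
```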
On the new server node
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-server restart
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-agent restart
Start cloudera-scm-agent on every node
/opt/cloudera-manager/cm-5.16.2/etc/init.d/cloudera-scm-agent start