CDH 6.3.2 Installation Guide
For updated documentation, see https://gitee.com/baomili/bigdata-notes (the latest and most complete version)
https://blog.csdn.net/BoomLee
https://www.cloudera.com/products/open-source/apache-hadoop/key-cdh-components.html
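Cloudera Manager download: https://archive.cloudera.com/cm6/6.2.1/repo-as-tarball/
CDH download: https://archive.cloudera.com/cdh6/6.2.1/parcels/
# Since January 31, 2021, all Cloudera software requires a valid subscription and is only accessible behind a paywall.
# Workaround: use the repository below instead of the official Cloudera repository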
http://ro-bucharest-repo.bigstepcloud.com/cloudera-repos/
Component | Component Version | Changes Information |
---|---|---|
Apache Avro | 1.8.2 | Changes |
Apache Flume | 1.9.0 | Changes |
Apache Hadoop | 3.0.0 | Changes |
Apache HBase | 2.1.2 | Changes |
HBase Indexer | 1.5 | Changes |
Apache Hive | 2.1.1 | Changes |
Hue | 4.3.0 | Changes |
Apache Impala | 3.2.0 | Changes |
Apache Kafka | 2.1.0 | Changes |
Kite SDK | 1.0.0 | Changes |
Apache Kudu | 1.9.0 | Changes |
Apache Solr | 7.4.0 | Changes |
Apache Oozie | 5.1.0 | Changes |
Apache Parquet | 1.9.0 | Changes |
Parquet-format | 2.3.1 | Changes |
Apache Pig | 0.17.0 | Changes |
Apache Sentry | 2.1.0 | Changes |
Apache Spark | 2.4.0 | Changes |
Apache Sqoop | 1.4.7 | Changes |
Apache ZooKeeper | 3.4.5 | Changes |
Installation documentation: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/installation.html
# Check the operating system version
cat /etc/redhat-release
# Check memory
cat /proc/meminfo
# Count logical CPU cores
cat /proc/cpuinfo| grep "processor"| wc -l
# Count physical CPUs
cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l
# Check disk capacity
df -h
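When several nodes have to be compared, the checks above can be wrapped in small helpers. This is a minimal sketch: the function names `mem_total_mib` and `cpu_logical_count` are hypothetical and not part of any CDH tooling, and each takes an optional file path so it can be tried against a saved copy of /proc/meminfo or /proc/cpuinfo.

```shell
# Hypothetical helpers (not part of CDH) that summarize the checks above.

# Print MemTotal in whole MiB from a meminfo-style file (default: /proc/meminfo).
mem_total_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Count logical processors from a cpuinfo-style file (default: /proc/cpuinfo).
cpu_logical_count() {
  grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Usage on a node:
#   echo "$(hostname): $(cpu_logical_count) logical CPUs, $(mem_total_mib) MiB RAM"
```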
# vim /etc/sysconfig/network-scripts/ifcfg-ens33
IPADDR="192.168.219.150"
BOOTPROTO="static"
NETMASK="255.255.255.0"
GATEWAY="192.168.219.2"
hostnamectl set-hostname <hostname>
Add the host mappings
vim /etc/hosts
10.0.6.112 cdh-112
10.0.6.113 cdh-113
10.0.6.114 cdh-114
10.0.6.115 cdh-115
10.0.6.116 cdh-116
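Editing /etc/hosts by hand on every node is easy to get wrong, so the mapping above could also be applied with a small idempotent helper. This is only a sketch: `add_host_entry` is a hypothetical name, and the optional third argument exists so the function can be tried on a scratch file before touching /etc/hosts.

```shell
# Hypothetical helper: append "ip name" to a hosts file only if the
# hostname is not already mapped there.
add_host_entry() {
  local ip="$1" name="$2" file="${3:-/etc/hosts}"
  # Look for the hostname as a whole field on any non-comment line.
  if awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit(found ? 0 : 1) }' "$file" 2>/dev/null; then
    return 0   # already mapped, leave the file untouched
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on every node):
#   add_host_entry 10.0.6.112 cdh-112
#   add_host_entry 10.0.6.113 cdh-113
```

Running it twice with the same arguments leaves a single line, which makes it safe to call from a provisioning script.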
rpm -qa|grep jdk
java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64
java-1.7.0-openjdk-headless-1.7.0.251-2.6.21.1.el7.x86_64
java-1.8.0-openjdk-headless-1.8.0.242.b08-1.el7.x86_64
java-1.7.0-openjdk-1.7.0.251-2.6.21.1.el7.x86_64
copy-jdk-configs-3.3-10.el7_5.noarch
# Remove the packages listed above
rpm -e --nodeps java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.251-2.6.21.1.el7.x86_64
rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.242.b08-1.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.251-2.6.21.1.el7.x86_64
rpm -e --nodeps copy-jdk-configs-3.3-10.el7_5.noarch
rpm -ivh oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
# Default installation location
/usr/java/jdk1.8.0_181-cloudera
# Configure environment variables (append the following lines to the end of the file)
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
# Apply the environment variables
source /etc/profile
// Check the firewall status
# systemctl status firewalld.service
// Stop the firewall
# systemctl stop firewalld.service
// Disable firewall autostart at boot
# systemctl disable firewalld.service
# vim /etc/selinux/config
SELINUX=disabled (change this value)
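The same change can be made without opening an editor, which helps when it has to be repeated on every node. A minimal sketch, assuming GNU sed as shipped with CentOS 7; `disable_selinux_in_config` is a hypothetical helper name.

```shell
# Hypothetical helper: set SELINUX=disabled in the SELinux config file.
disable_selinux_in_config() {
  local file="${1:-/etc/selinux/config}"
  # Rewrite only the active SELINUX= line; SELINUXTYPE= and comments stay as they are.
  sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$file"
}

# Usage (as root):
#   disable_selinux_in_config
```

The file change only takes effect after a reboot; `setenforce 0` switches the running system to permissive mode in the meantime.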
Install on all nodes at the same time
# yum install -y ntp
// Check the status
# service ntpd status
// Configure ntpd
# vim /etc/ntp.conf
// Start
# service ntpd restart
// Enable at boot
# chkconfig ntpd on
// Synchronize the time once
ntpdate -u ntp.aliyun.com
// Set the time zone
# timedatectl set-timezone Asia/Shanghai
vim /etc/ntp.conf
// Master node configuration (ntp.conf)
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server ntp.aliyun.com
// Slave node configuration
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
# Point to the master node
server node-01
Write the system time to the hardware clock
hwclock --systohc
Add a cron job
*/10 * * * * /usr/sbin/ntpdate ntp.aliyun.com
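The cron entry above is normally added with `crontab -e`. For scripted setups, the sketch below merges the line into an existing crontab without duplicating it; `merge_cron_line` is a hypothetical helper that reads the current crontab on stdin and prints the merged result.

```shell
# Hypothetical helper: print the crontab read from stdin, followed by the
# given line unless an identical line is already present.
merge_cron_line() {
  local line="$1" existing
  existing="$(cat)"
  if [ -n "$existing" ]; then
    printf '%s\n' "$existing"
  fi
  # -x matches whole lines, -F treats the cron expression as a fixed string.
  if ! printf '%s\n' "$existing" | grep -qxF -- "$line"; then
    printf '%s\n' "$line"
  fi
}

# Usage (as root):
#   crontab -l 2>/dev/null \
#     | merge_cron_line '*/10 * * * * /usr/sbin/ntpdate ntp.aliyun.com' \
#     | crontab -
```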
Remove the bundled MySQL
rpm -qa | grep mysql
rpm -e --nodeps mysql // force removal
Download
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
Install with yum
rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum install mysql-server
Start MySQL
systemctl start mysqld.service
Log in to the database
mysql -uroot -p
Set the password
update mysql.user set password=PASSWORD('123456') where User='root';
Enable remote access for root
grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;
Delete a user
delete from mysql.user where user='ambari';
drop user "ambari"@"%";
Flush privileges and exit
flush privileges;
exit;
Create the directory
mkdir -p /usr/share/java
Download the driver
wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
tar zxvf mysql-connector-java-5.1.46.tar.gz
cd mysql-connector-java-5.1.46
cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
Generate the key pair
ssh-keygen -t rsa
Copy the public key to the other nodes (including this one)
ssh-copy-id -i cdh-112
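The copy has to be repeated for every node in the host mapping. A loop along the following lines could do that; it is a sketch only, with `for_each_node` as a hypothetical helper and the node list taken from the /etc/hosts entries earlier in this guide.

```shell
# Node list taken from the /etc/hosts mapping (adjust to your cluster).
NODES="cdh-112 cdh-113 cdh-114 cdh-115 cdh-116"

# Run a command once per node, appending the hostname as the last argument.
# Defaults to ssh-copy-id; pass another command (for example echo) for a dry run.
for_each_node() {
  local host
  [ "$#" -gt 0 ] || set -- ssh-copy-id
  for host in $NODES; do
    "$@" "$host" || echo "failed on $host" >&2
  done
}

# Usage:
#   for_each_node echo          # dry run, prints the hostnames
#   for_each_node               # runs ssh-copy-id <host> for each node
```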
yum install -y bind-utils psmisc cyrus-sasl-plain cyrus-sasl-gssapi fuse portmap fuse-libs /lib/lsb/init-functions httpd mod_ssl openssl-devel python-psycopg2 MySQL-python libxslt
useradd --system --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
1) Create the databases required by each component
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY '123456';
CREATE DATABASE hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
CREATE DATABASE oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
CREATE DATABASE hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
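If each component should also get its own database account, the statements can be generated rather than typed. The sketch below is an assumption, not part of the original procedure: `component_db_sql` is a hypothetical helper, the per-component user names are illustrative, and the `GRANT ... IDENTIFIED BY` form matches the MySQL 5.x syntax used elsewhere in this guide.

```shell
# Hypothetical helper: print CREATE DATABASE / GRANT statements for one component.
# Arguments: database name, user name, password.
component_db_sql() {
  local db="$1" user="$2" pass="$3"
  printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db"
  printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$user" "$pass"
}

# Usage (prompts for the MySQL root password):
#   for c in hive oozie hue; do component_db_sql "$c" "$c" 123456; done | mysql -uroot -p
```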
Node | cdh-112 | cdh-113 | cdh-114 |
---|---|---|---|
Services | cloudera-scm-server cloudera-scm-agent | cloudera-scm-agent | cloudera-scm-agent |
Create the directory
mkdir /opt/cloudera-manager
Extract
tar -zxvf cm6.3.1-redhat7.tar.gz
Copy the RPMs
cd cm6.3.1/RPMS/x86_64/
cp cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm /opt/cloudera-manager/
cp cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm /opt/cloudera-manager/
cp cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm /opt/cloudera-manager/
cd /opt/cloudera-manager/
ll
total 1185872
-rw-r--r-- 1 2001 2001 10483568 Sep 25 2019 cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001 1203832464 Sep 25 2019 cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001 11488 Sep 25 2019 cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
scp -r /opt/cloudera-manager/ root@cdh-112:/opt/
scp -r /opt/cloudera-manager/ root@cdh-113:/opt/
rpm -ivh /opt/cloudera-manager/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
rpm -ivh /opt/cloudera-manager/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
Alternatively, use yum -y install, for example: yum -y install cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
Specify the server host
vim /etc/cloudera-scm-agent/config.ini
server_host=cdh-112
[root@cdh-112]# rpm -ivh /opt/cloudera-manager/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
mkdir -p /opt/cloudera/parcel-repo
total 2033432
-rw-r--r-- 1 root root 2082186246 May 21 11:10 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
-rw-r--r-- 1 root root 40 May 21 10:56 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1
-rw-r--r-- 1 root root 33887 May 21 10:56 manifest.json
Rename the .sha1 file to .sha
mv CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha
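A corrupted or partial download is a common cause of parcel distribution failures, so it can be worth checking the parcel against its .sha file after the rename. A minimal sketch, assuming the .sha file holds only the hex SHA-1 digest (consistent with its 40-byte size in the listing); `verify_parcel_sha` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the SHA-1 of the parcel matches
# the digest stored in "<parcel>.sha".
verify_parcel_sha() {
  local parcel="$1" expected actual
  expected="$(tr -d '[:space:]' < "${parcel}.sha")"
  actual="$(sha1sum "$parcel" | awk '{ print $1 }')"
  [ "$expected" = "$actual" ]
}

# Usage:
#   cd /opt/cloudera/parcel-repo
#   verify_parcel_sha CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel && echo "checksum OK"
```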
vim /etc/cloudera-scm-server/db.properties
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=cdh-112:3306
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=123456
com.cloudera.cmf.db.setupType=EXTERNAL
/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -hlocalhost -uroot -p scm scm
[root@cdh-112 cloudera-manager]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -hlocalhost -uroot -p scm scm
Enter database password:
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly
# Official documentation: https://docs.cloudera.com/documentation/enterprise/latest/topics/prepare_cm_database.html
systemctl start cloudera-scm-server
Log location
/var/log/cloudera-scm-server/cloudera-scm-server.log
Enable at boot
chkconfig cloudera-scm-server on
systemctl start cloudera-scm-agent
Log location
/var/log/cloudera-scm-agent/cloudera-scm-agent.log
Enable at boot
chkconfig cloudera-scm-agent on
Initialization is slow after startup; wait a while before logging in to the web UI
http://cdh-112:7180
# Username/password: admin/admin
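Instead of guessing how long the server needs, a small retry loop can poll the web port until it answers. This is a sketch under the assumption that `curl` is installed; `wait_for` is a hypothetical helper that retries any command a fixed number of times.

```shell
# Hypothetical helper: retry a command until it succeeds.
# Arguments: number of attempts, delay in seconds, then the command to run.
wait_for() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: poll the login page every 5 seconds, for up to 5 minutes.
#   wait_for 60 5 curl -sf http://cdh-112:7180/ && echo "Cloudera Manager is up"
```

If the loop times out, the server log listed above is the first place to look.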
Click Continue; this step takes a while
Run on all nodes
sysctl vm.swappiness=10
echo 'vm.swappiness=10'>> /etc/sysctl.conf
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
After running these commands, click Run Again
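Appending to /etc/sysctl.conf with `>>` adds a duplicate line every time the step is repeated. The sketch below sets the key in place instead; `set_conf_value` is a hypothetical helper, it assumes GNU sed, and the key is used as a regular expression (the dot in `vm.swappiness` matches any character, which is harmless here).

```shell
# Hypothetical helper: set "key=value" in a config file, replacing an
# existing assignment or appending one if the key is absent.
set_conf_value() {
  local key="$1" value="$2" file="$3"
  if grep -q "^${key}[[:space:]]*=" "$file" 2>/dev/null; then
    sed -i "s|^${key}[[:space:]]*=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Usage (as root, on every node):
#   set_conf_value vm.swappiness 10 /etc/sysctl.conf
#   sysctl -p
```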
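The hive, oozie, and hue databases were created earlier
If the Oozie web page is unavailable:
upload ext-2.2.zip from the installation package to /var/lib/oozie/ and extract it there
Copy the MySQL driver to the Hive lib directory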
cp /usr/share/java/mysql-connector-java.jar /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/lib
Restart Hive
Drop the scm database
Empty /var/lib/cloudera-scm-server and /var/log/cloudera-scm-agent
Empty /opt/cloudera/parcel-cache and /opt/cloudera/parcels
On the server node, remove the files under /opt/cloudera/parcel-repo that were not downloaded manually
rm -rf /var/lib/cloudera-scm-agent/cm_guid
6. Uninstall
rpm -e --nodeps cloudera-manager-daemons-6.3.1-1466458.el7.x86_64
rpm -e --nodeps cloudera-manager-agent-6.3.1-1466458.el7.x86_64
rpm -e --nodeps cloudera-manager-server-6.3.1-1466458.el7.x86_64
Reference: https://blog.csdn.net/wzy0623/article/details/102946646
rm -rf /var/log/*
rm -rf /opt/cloudera*
rm -rf /etc/systemd/system/multi-user.target.wants/cloudera*
rm -rf /etc/default/cloudera*
rm -rf /etc/cloudera*
rm -rf /var/lib/cloudera*
rm -rf /var/log/cloudera*
rm -rf /usr/lib/systemd/system/cloudera*
rm -rf /run/cloudera*
rm -rf /sys/fs/cgroup/systemd/system.slice/cloudera*
rm -rf /etc/security/limits.d/cloudera*
rm -rf /var/lib/yum/repos/x86_64/7/cloudera*
rm -rf /var/cache/yum/x86_64/7/cloudera*
rm -rf /tmp/*
rm -rf /var/lib/hadoop-*
rm -rf /var/lib/impala
rm -rf /var/lib/solr
rm -rf /var/lib/zookeeper
rm -rf /var/lib/hue
rm -rf /var/lib/oozie
rm -rf /var/lib/pgsql
rm -rf /var/lib/sqoop2
rm -rf /data/dfs/
rm -rf /data/impala/
rm -rf /data/yarn/
rm -rf /dfs/
rm -rf /impala/
rm -rf /yarn/
rm -rf /var/run/hadoop-*/
rm -rf /var/run/hdfs-*/
rm -rf /usr/bin/hadoop*
rm -rf /usr/bin/zookeeper*
rm -rf /usr/bin/hbase*
rm -rf /usr/bin/hive*
rm -rf /usr/bin/hdfs
rm -rf /usr/bin/mapred
rm -rf /usr/bin/yarn
rm -rf /usr/bin/sqoop*
rm -rf /usr/bin/oozie
rm -rf /etc/hadoop*
rm -rf /etc/zookeeper*
rm -rf /etc/hive*
rm -rf /etc/hue
rm -rf /etc/impala
rm -rf /etc/sqoop*
rm -rf /etc/oozie
rm -rf /etc/hbase*
rm -rf /etc/hcatalog
rm -rf /var/lib/alternatives/impala-conf
rm -rf /var/lib/alternatives/impalad
rm -rf /var/lib/alternatives/impala-collect-diagnostics
rm -rf /var/lib/alternatives/impala-shell
rm -rf /var/lib/alternatives/impala-collect-minidumps
rm -rf /etc/alternatives/impala-shell
rm -rf /etc/alternatives/impalad
rm -rf /etc/alternatives/impala-collect-diagnostics
rm -rf /etc/alternatives/impala-conf
rm -rf /etc/alternatives/impala-collect-minidumps
rm -rf /var/log/impala*
rm -rf /var/lib/alternatives/zookeeper-client
rm -rf /var/lib/alternatives/zookeeper-server
rm -rf /var/lib/alternatives/zookeeper-conf
rm -rf /var/lib/alternatives/zookeeper-server-initialize
rm -rf /var/lib/alternatives/zookeeper-server-cleanup
rm -rf /var/lib/alternatives/zookeeper-security-migration
rm -rf /etc/alternatives/zookeeper-conf
rm -rf /etc/alternatives/zookeeper-server
rm -rf /etc/alternatives/zookeeper-server-cleanup
rm -rf /etc/alternatives/zookeeper-server-initialize
rm -rf /etc/alternatives/zookeeper-security-migration
rm -rf /etc/alternatives/zookeeper-client
rm -rf /var/log/zookeeper
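The cleanup list above is long and includes broad globs, so previewing what would be deleted before running it can prevent accidents. A minimal sketch: `remove_paths` and the `DRY_RUN` variable are hypothetical, and the helper only prints by default.

```shell
# Hypothetical helper: remove the given paths, or only list them when
# DRY_RUN is 1 (the default). Paths that do not exist are skipped.
remove_paths() {
  local path
  for path in "$@"; do
    [ -e "$path" ] || [ -L "$path" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf 'would remove: %s\n' "$path"
    else
      rm -rf -- "$path"
    fi
  done
}

# Usage:
#   remove_paths /etc/cloudera* /var/lib/cloudera* /opt/cloudera*            # preview
#   DRY_RUN=0 remove_paths /etc/cloudera* /var/lib/cloudera* /opt/cloudera*  # delete
```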
https://archive.apache.org/dist/flink/
Reference blog: https://blog.csdn.net/qq_31454379/article/details/110440037
Add the following mirrors to the Maven settings.xml for the Flink build
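<mirror>
    <id>alimaven</id>
    <mirrorOf>central</mirrorOf>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
</mirror>
<mirror>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
    <id>central</id>
    <name>Maven Repository Switchboard</name>
    <url>http://repo1.maven.org/maven2/</url>
    <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
    <id>repo2</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo2.maven.org/maven2/</url>
</mirror>
<mirror>
    <id>ibiblio</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://mirrors.ibiblio.org/pub/mirrors/maven2/</url>
</mirror>
<mirror>
    <id>jboss-public-repository-group</id>
    <mirrorOf>central</mirrorOf>
    <name>JBoss Public Repository Group</name>
    <url>http://repository.jboss.org/nexus/content/groups/public</url>
</mirror>
<mirror>
    <id>google-maven-central</id>
    <name>Google Maven Central</name>
    <url>https://maven-central.storage.googleapis.com</url>
    <mirrorOf>central</mirrorOf>
</mirror>
<mirror>
    <id>maven.net.cn</id>
    <name>one of the central mirrors in china</name>
    <url>http://maven.net.cn/content/groups/public/</url>
    <mirrorOf>central</mirrorOf>
</mirror>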
Download the flink-shaded source
https://archive.apache.org/dist/flink/flink-shaded-12.0/flink-shaded-12.0-src.tgz
Download the Flink source
https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-src.tgz
Extract
tar -xvf flink-shaded-12.0-src.tgz
Edit pom.xml
Add the following profile inside the profiles element
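<profile>
    <id>vendor-repos</id>
    <activation>
        <property>
            <name>vendor-repos</name>
        </property>
    </activation>
    <repositories>
        <repository>
            <id>cloudera-releases</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
        <repository>
            <id>HDPReleases</id>
            <name>HDP Releases</name>
            <url>https://repo.hortonworks.com/content/repositories/releases/</url>
            <snapshots><enabled>false</enabled></snapshots>
            <releases><enabled>true</enabled></releases>
        </repository>
        <repository>
            <id>HortonworksJettyHadoop</id>
            <name>HDP Jetty</name>
            <url>https://repo.hortonworks.com/content/repositories/jetty-hadoop</url>
            <snapshots><enabled>false</enabled></snapshots>
            <releases><enabled>true</enabled></releases>
        </repository>
        <repository>
            <id>mapr-releases</id>
            <url>https://repository.mapr.com/maven/</url>
            <snapshots><enabled>false</enabled></snapshots>
            <releases><enabled>true</enabled></releases>
        </repository>
    </repositories>
</profile>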
Run the build
mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=3.0.0-cdh6.3.2 -Dscala-2.12 -Drat.skip=true -T10C
Extract
tar -xvf flink-1.10.2-src.tgz
Change into the directory and build
mvn clean install -DskipTests -Dfast -Drat.skip=true -Dhadoop.version=3.0.0-cdh6.3.2 -Pvendor-repos -Dinclude-hadoop -Dscala-2.12 -T10C
Change into the flink-1.10.2/flink-dist/target/flink-1.10.2-bin directory
and package it
tar -zcf flink-1.10.2-bin-scala_2.12.tgz flink-1.10.2
git clone https://github.com/pkeropen/flink-parcel.git
Change into the directory
vim flink-parcel.properties
Change the contents to the following
# Flink download URL
FLINK_URL=https://archive.apache.org/dist/flink/flink-1.10.2/flink-1.10.2-bin-scala_2.12.tgz
# Flink version
FLINK_VERSION=1.10.2
# Extended version string
EXTENS_VERSION=BIN-SCALA_2.12
# Operating system version (CentOS in this example)
OS_VERSION=7
# CDH minor version range
CDH_MIN_FULL=5.2
CDH_MAX_FULL=6.3.3
# CDH major version range
CDH_MIN=5
CDH_MAX=6
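Make build.sh executable
chmod +x build.sh
Place the flink-1.10.2-bin-scala_2.12.tgz packaged in the previous step in the flink-parcel root directory
Build the parcel
If cm_ext cannot be downloaded, download it manually and place it in this directory
./build.sh parcel
Generate the CSD
On-YARN version
./build.sh csd_on_yarn
Standalone version
./build.sh csd_standalone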
-rw-r--r-- 1 root root 7737 Jul 7 15:17 FLINK-1.10.2.jar
-rw-r--r-- 1 root root 8260 Jul 7 15:17 FLINK_ON_YARN-1.10.2.jar
cp FLINK-1.10.2-BIN-SCALA_2.12_build/* /var/www/html/flink-1.10.2/
yum -y install httpd
Start
service httpd start
Copy the Flink parcel build output to /var/www/html/flink (create the flink directory yourself)
vim /etc/httpd/conf/httpd.conf
Modify
AddType application/x-compress .Z
# .parcel is added to the line below
AddType application/x-gzip .gz .tgz .parcel
Restart the service
service httpd restart
The role log reports an error
java.lang.ClassNotFoundException: org.apache.hadoop.yarn.exceptions.YarnException
The Hadoop dependency is missing. Go to the flink-shaded-10.0 directory built earlier and copy the dependency into Flink's lib directory
cp /opt/module/flink/flink-shaded-10.0/flink-shaded-hadoop-2-parent/flink-shaded-hadoop-2-uber/target/flink-shaded-hadoop-2-uber-3.0.0-cdh6.3.2-10.0.jar /opt/cloudera/parcels/FLINK/lib/flink/lib/
org.apache.flink.configuration.IllegalConfigurationException: Kerberos login configuration is invalid; keytab is unreadable
If Kerberos is not enabled, these two settings must be removed; otherwise startup fails