1 Preface
This is an article I wrote a while ago; I have tidied it up and am republishing it.
Installing Ambari on ARM machines runs into quite a few problems, mainly the following:
- Ambari depends on Node.js 0.10.44, while aarch64 machines only support v4.x and above.
- Ambari depends on PhantomJS 1.9.8, while aarch64 machines only support v2.1.0 and above.
- Some of the third-party open-source projects that Ambari depends on are not available for aarch64.
We therefore deploy a highly available cluster from the open-source community editions of the Hadoop components instead.
2 Cluster Architecture Design
2.1 Base Environment
Node role | IP address | Hostname | OS | Base software |
---|---|---|---|---|
Master | 192.168.100.60 | bigdata1 | CentOS 7.4 (aarch64) | JDK 1.8 (arm64), Scala 2.11.11 |
Master_backup | 192.168.100.61 | bigdata2 | CentOS 7.4 (aarch64) | JDK 1.8 (arm64), Scala 2.11.11 |
Slave01 | 192.168.100.62 | bigdata3 | CentOS 7.4 (aarch64) | JDK 1.8 (arm64), Scala 2.11.11 |
Slave02 | 192.168.100.63 | bigdata4 | CentOS 7.4 (aarch64) | JDK 1.8 (arm64), Scala 2.11.11 |
Slave03 | 192.168.100.64 | bigdata5 | CentOS 7.4 (aarch64) | JDK 1.8 (arm64), Scala 2.11.11 |
2.2 Hadoop Components
Hostname | Hadoop components | Services |
---|---|---|
bigdata1 | Hadoop, HBase, Spark, Zeppelin | NameNode, ResourceManager, DFSZKFailoverController, HMaster, HistoryServer, ZeppelinServer |
bigdata2 | Hadoop, HBase | NameNode, ResourceManager, DFSZKFailoverController, HMaster |
bigdata3 | Hadoop, HBase, ZooKeeper | DataNode, NodeManager, QuorumPeerMain, JournalNode, HRegionServer |
bigdata4 | Hadoop, HBase, ZooKeeper | DataNode, NodeManager, QuorumPeerMain, JournalNode, HRegionServer |
bigdata5 | Hadoop, HBase, ZooKeeper | DataNode, NodeManager, QuorumPeerMain, JournalNode, HRegionServer |
2.3 Software Versions
Software | Version |
---|---|
CentOS | 7.4 (aarch64) |
JDK | 1.8 (arm64) |
Scala | 2.11.11 |
Hadoop | 2.7.3 |
HBase | 1.1.2 |
Spark | 2.1.0 (built for Hadoop 2.7) |
ZooKeeper | 3.4.6 |
Zeppelin | 0.7.3 |
Kafka | 2.11-0.10.1.1 |
Confluent | 3.1.2 |
Hue | 4.2.0 |
3 Download the Main Hadoop Components
Download Oracle JDK (arm64)
wget --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u162-b12/0da788060d494f5095bf8624735fa2f1/jdk-8u162-linux-arm64-vfp-hflt.tar.gz
Download Hadoop
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Download HBase
wget https://archive.apache.org/dist/hbase/1.1.2/hbase-1.1.2-bin.tar.gz
Download ZooKeeper
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
Download Spark
wget https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz
Download Kafka
wget https://archive.apache.org/dist/kafka/0.10.1.1/kafka_2.11-0.10.1.1.tgz
Download Phoenix
wget https://archive.apache.org/dist/phoenix/phoenix-4.7.0-HBase-1.1/bin/phoenix-4.7.0-HBase-1.1-bin.tar.gz
Download Scala
wget https://downloads.lightbend.com/scala/2.11.11/scala-2.11.11.tgz
Download Zeppelin
wget http://apache.claz.org/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz
Download Hue
wget https://github.com/cloudera/hue/archive/release-4.2.0.tar.gz
Upload local files to the server
# note: scp's port option is uppercase -P; the target path here is only an example
scp -P 50300 confluent-3.1.2-2.11.tar.gz bigdata@bigdata1:/home/bigdata/
After extracting each downloaded archive, create an ln -s symlink to it, so the environment variables below can point to version-independent paths; see the sketch below.
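A minimal sketch of the extract-and-symlink step (archive names follow the downloads above; the remaining components are handled the same way):
tar -zxvf hadoop-2.7.3.tar.gz && ln -s hadoop-2.7.3 hadoop
tar -zxvf hbase-1.1.2-bin.tar.gz && ln -s hbase-1.1.2 hbase
tar -zxvf zookeeper-3.4.6.tar.gz && ln -s zookeeper-3.4.6 zookeeper
tar -zxvf spark-2.1.0-bin-hadoop2.7.tgz && ln -s spark-2.1.0-bin-hadoop2.7 spark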
4 Base Environment Configuration
4.0 Configure /etc/hosts, install NTP, and disable the firewall
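A minimal sketch of this step on CentOS 7, run as root on every node (IPs and hostnames from the table in 2.1):
cat >> /etc/hosts << 'EOF'
192.168.100.60 bigdata1
192.168.100.61 bigdata2
192.168.100.62 bigdata3
192.168.100.63 bigdata4
192.168.100.64 bigdata5
EOF
yum install -y ntp
systemctl enable ntpd && systemctl start ntpd
systemctl stop firewalld && systemctl disable firewalld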
4.1 Add the bigdata account
groupadd bigdata
useradd -g bigdata bigdata
4.2 Configure passwordless SSH between all cluster nodes
Create the authorized_keys file (its permissions must be 600 or 644, otherwise it will not take effect); a sketch follows.
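A typical way to set this up for the bigdata user (repeat ssh-copy-id on every node so each host can reach all the others):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id bigdata@bigdata1
ssh-copy-id bigdata@bigdata2
ssh-copy-id bigdata@bigdata3
ssh-copy-id bigdata@bigdata4
ssh-copy-id bigdata@bigdata5
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys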
4.3 Environment variables (edit the bigdata user's .bash_profile)
export SCALA_HOME=/usr/scala/default
export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/home/bigdata/hadoop
export HBASE_HOME=/home/bigdata/hbase
export SPARK_HOME=/home/bigdata/spark
export HADOOP_CONF_DIR=/home/bigdata/hadoop/etc/hadoop
export HBASE_CONF_DIR=/home/bigdata/hbase/conf
export HADOOP_LOG_DIR=/home/bigdata/log/hdfs
export ZOOKEEPER_HOME=/home/bigdata/zookeeper
export ZEPPELIN_HOME=/home/bigdata/zeppelin
export KAFKA_HOME=/home/bigdata/kafka
export CONFLUENT_HOME=/home/bigdata/confluent
export YARN_LOG_DIR=$HADOOP_LOG_DIR
export HUE_HOME=/home/bigdata/hue
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$CONFLUENT_HOME/bin:$PATH
All of the following installation steps are performed as the bigdata user.
5 Install ZooKeeper
5.0 Configure zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/bigdata/zkdata
dataLogDir=/home/bigdata/zklogs
# the port at which the clients will connect
clientPort=2181
server.1=bigdata3:2888:3888
server.2=bigdata4:2888:3888
server.3=bigdata5:2888:3888
In the configured dataDir, create the myid file on every ZooKeeper node (this file is required; startup fails without it). The number written into each node's myid must match the server.N index configured in zoo.cfg (1, 2, 3).
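For example, on the three ZooKeeper nodes:
# on bigdata3
mkdir -p /home/bigdata/zkdata && echo 1 > /home/bigdata/zkdata/myid
# on bigdata4
mkdir -p /home/bigdata/zkdata && echo 2 > /home/bigdata/zkdata/myid
# on bigdata5
mkdir -p /home/bigdata/zkdata && echo 3 > /home/bigdata/zkdata/myid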
5.1 Install on bigdata3, bigdata4, and bigdata5
Error 1: two of the nodes start normally, while the third reports:
/home/bigdata/zkdata/version-2/acceptedEpoch.tmp (Permission denied)
Fix: the version-2 directory turned out to be owned by root; change its owner to bigdata.
5.2 Start ZooKeeper on bigdata3, bigdata4, and bigdata5
zkServer.sh start
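The ensemble can then be checked on each node; one should report "leader" and the other two "follower":
zkServer.sh status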
6 Install Hadoop
6.0 Configure hadoop-env.sh and yarn-env.sh
export JAVA_HOME=/usr/java/default
6.1 Configure core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://bigdatacluster</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/bigdata/tmp/hadoop</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>bigdata3:2181,bigdata4:2181,bigdata5:2181</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
6.2 Configure hdfs-site.xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/hdfs/nn</value>
  <final>true</final>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/hdfs/dn</value>
  <final>true</final>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>bigdatacluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.bigdatacluster</name>
  <value>bigdata1,bigdata2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bigdatacluster.bigdata1</name>
  <value>bigdata1:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.bigdatacluster.bigdata1</name>
  <value>bigdata1:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.bigdatacluster.bigdata2</name>
  <value>bigdata2:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.bigdatacluster.bigdata2</name>
  <value>bigdata2:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://bigdata3:8485;bigdata4:8485;bigdata5:8485/bigdatacluster</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/data/hdfs/journal</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.bigdatacluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/bigdata/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
6.3 Configure mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
6.4 Configure yarn-site.xml
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>rm-cluster</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.automatic-failover.recover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>bigdata1</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>bigdata2</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>bigdata3:2181,bigdata4:2181,bigdata5:2181</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm1</name>
  <value>bigdata1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>bigdata2:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
  <value>bigdata1:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
  <value>bigdata2:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>bigdata1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>bigdata2:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>bigdata1:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>bigdata2:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>bigdata1:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>bigdata2:8088</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
Configure httpfs-site.xml (when Hue is integrated and the Hadoop cluster runs in HA mode, HDFS must be accessed through HttpFS)
<property>
  <name>httpfs.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>httpfs.proxyuser.hue.groups</name>
  <value>*</value>
</property>
6.5 Start the Hadoop cluster
6.5.1 Make sure ZooKeeper is already running on the three slave nodes
6.5.2 Start the JournalNode on the three slave nodes
hadoop-daemon.sh start journalnode
6.5.3 On bigdata1, for the first run, format HDFS and then the ZKFC znode in ZooKeeper
hdfs namenode -format
hdfs zkfc -formatZK
# type these commands by hand; copy-pasted versions may carry a non-ASCII dash and will not be recognized
6.5.4 On bigdata2, initialize the directory and synchronize the metadata of the two master nodes
# Method 1: in our environment this could not connect to bigdata1:9000, so method 2 was used instead
hdfs namenode -bootstrapStandby
# Method 2: copy the formatted metadata from bigdata1 directly to bigdata2
scp -r nn bigdata2:/data/hdfs/
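For reference, method 1 normally works once the freshly formatted NameNode on bigdata1 is already running; a likely cause of the connection failure above is bootstrapping before it was started:
# on bigdata1: start the formatted NameNode first
hadoop-daemon.sh start namenode
# then on bigdata2:
hdfs namenode -bootstrapStandby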
6.5.5 Start ZKFC on bigdata1 and bigdata2 to monitor the NameNodes
# on bigdata1, start ZKFC to monitor the NameNode
hadoop-daemon.sh start zkfc
# on bigdata2, start ZKFC to monitor the NameNode
hadoop-daemon.sh start zkfc
6.5.6 Start HDFS on bigdata1
start-dfs.sh
Error:
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Fix: add Hadoop native .so libraries compiled for aarch64 (placed under $HADOOP_HOME/lib/native).
6.5.7 Start YARN on bigdata1
start-yarn.sh
6.5.8 Start the standby ResourceManager on bigdata2
yarn-daemon.sh start resourcemanager
Check the web UIs:
http://bigdata1:50070
http://bigdata1:8088/cluster
7 Install HBase
7.0 Raise the ulimit limits for HBase
echo "bigdata - nofile 32768" >> /etc/security/limits.conf
echo "bigdata - nproc 32000" >> /etc/security/limits.conf
echo "session required pam_limits.so" >> /etc/pam.d/common-session
7.1 Configure hbase-env.sh
# The java implementation to use. Java 1.7+ required.
export JAVA_HOME=/usr/java/default
# set this path, otherwise the HA HDFS nameservice cannot be resolved
export HADOOP_HOME=/home/bigdata/hadoop
# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=false
Configure hbase-site.xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>bigdata3:2181,bigdata4:2181,bigdata5:2181</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/home/bigdata/zkdata</value>
  <description>Property from ZooKeeper config zoo.cfg. The directory where the snapshot is stored.</description>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://bigdatacluster/hbase</value>
  <description>The directory shared by RegionServers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distributed setups with managed ZooKeeper;
    true: fully-distributed with unmanaged ZooKeeper Quorum (see hbase-env.sh).</description>
</property>
Configure the regionservers file:
bigdata3
bigdata4
bigdata5
7.2 Start HBase
On bigdata1:
start-hbase.sh
On bigdata2 (backup master):
hbase-daemon.sh start master
http://bigdata1:16010/master-status
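Optionally, a quick smoke test from the HBase shell (the table name t1 below is only an example):
hbase shell
# inside the shell:
create 't1', 'cf'
put 't1', 'row1', 'cf:a', 'value1'
scan 't1'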
8 Install Spark on YARN in the HA Environment (on bigdata1)
8.1 Configure spark-env.sh
export SCALA_HOME=/usr/scala/default
export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/home/bigdata/hadoop
export HBASE_HOME=/home/bigdata/hbase
export HADOOP_CONF_DIR=/home/bigdata/hadoop/etc/hadoop
export HBASE_CONF_DIR=/home/bigdata/hbase/conf
export HADOOP_LOG_DIR=/home/bigdata/log/hdfs
8.2 Configure spark-defaults.conf
spark.master yarn
spark.driver.memory 2g
spark.executor.memory 2g
spark.eventLog.enabled true
# in an HA Hadoop setup, make sure to use the HDFS nameservice name here
spark.eventLog.dir hdfs://bigdatacluster/spark-logs
# history server configuration
spark.history.provider org.apache.spark.deploy.history.FsHistoryProvider
spark.history.fs.logDirectory hdfs://bigdatacluster/spark-logs
spark.history.fs.update.interval 10s
spark.history.ui.port 18080
8.3 Create the Spark log directory on HDFS
hdfs dfs -mkdir /spark-logs
8.4 Start the Spark history server
start-history-server.sh
Web UI:
http://bigdata1:18080/
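To verify Spark on YARN end to end, the bundled SparkPi example can be submitted; the examples jar path below assumes the stock spark-2.1.0-bin-hadoop2.7 distribution:
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn --deploy-mode cluster \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.1.0.jar 100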
9 Install Zeppelin (on bigdata1)
9.0 Modify zeppelin-site.xml
# change the server port to 28080
<property>
  <name>zeppelin.server.port</name>
  <value>28080</value>
  <description>Server port.</description>
</property>
Modify zeppelin-env.sh
export JAVA_HOME=/usr/java/default
export HADOOP_CONF_DIR=/home/bigdata/hadoop/etc/hadoop
#### HBase interpreter configuration ####
export HBASE_HOME=/home/bigdata/hbase
export HBASE_CONF_DIR=/home/bigdata/hbase/conf
Start the service
zeppelin-daemon.sh start
http://bigdata1:28080
10 Install and Deploy Kafka (on bigdata1)
10.1 First make sure the ZooKeeper ensemble is installed and running
10.2 Configure server.properties
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
## listen on all local network interfaces
listeners=PLAINTEXT://0.0.0.0:9092
## advertised to ZooKeeper and handed out to clients for connecting
advertised.listeners=PLAINTEXT://bigdata1:9092
num.network.threads=3
# The number of threads doing disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
# Kafka data (log) directory
log.dirs=/home/bigdata/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
#log.flush.interval.ms=1000
log.retention.hours=168
#log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# root directory for all kafka znodes.
# address of the ZooKeeper ensemble installed earlier
zookeeper.connect=bigdata3:2181,bigdata4:2181,bigdata5:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
10.3 Start Kafka
kafka-server-start.sh -daemon config/server.properties
10.4 Test messaging
## 1. Create a topic
kafka-topics.sh --create --topic TestTopic003 --partitions 1 --replication-factor 1 --zookeeper bigdata3:2181,bigdata4:2181,bigdata5:2181
## 2. Produce messages
kafka-console-producer.sh --topic TestTopic003 --broker-list bigdata1:9092
This is a message
This is another message
## 3. Consume messages
kafka-console-consumer.sh --topic TestTopic003 --from-beginning --bootstrap-server bigdata1:9092
11 Install Confluent
11.1 Upload the local 3.1.2 package to the server and configure the environment variables
11.2 Deploy the hbase-sink connector
nohup schema-registry-start $CONFLUENT_HOME/etc/schema-registry/schema-registry.properties > nohup_schema_registry.log 2> nohup_schema_registry.err &
nohup connect-standalone $CONFLUENT_HOME/etc/schema-registry/connect-avro-standalone.properties $CONFLUENT_HOME/etc/kafka-connect-hbase/hbase-sink.properties > nohup_standalone.log 2> nohup_standalone.err &
kafka-avro-console-producer \
--broker-list bigdata1:9092 --topic test \
--property value.schema='{"type":"record","name":"record","fields":[{"name":"id","type":"int"}, {"name":"name", "type": "string"}]}'
{"id": 1, "name": "sz”}
{"id": 2, "name": "bj”}
{"id": 3, "name": “aa”}
{"id": 4, "name": "bb”}
Alternatively, run the Schema Registry and the standalone connector in the foreground for debugging:
schema-registry-start $CONFLUENT_HOME/etc/schema-registry/schema-registry.properties
connect-standalone $CONFLUENT_HOME/etc/schema-registry/connect-avro-standalone.properties $CONFLUENT_HOME/etc/kafka-connect-hbase/hbase-sink.properties
Then feed the same example records into the Avro console producer as above.
Recorded error:
When a remote Kafka client writes data to the cluster:
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
Fix: the remote client cannot resolve the broker. Either change the hostname to an IP address in Kafka's server.properties, or add the IP-to-hostname mapping to the hosts file on the remote client; both are sketched below.
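Both fixes in concrete form (the IP comes from the table in section 2.1; pick whichever matches how clients reach the broker):
# Option 1: in server.properties on the broker, advertise an IP address instead of a hostname
#   advertised.listeners=PLAINTEXT://192.168.100.60:9092
# Option 2: on the remote client, add the hostname mapping to its hosts file
echo "192.168.100.60 bigdata1" >> /etc/hosts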
12 Install Hue
This component is more complicated to install; it has to be built from source.
12.1 Install the build dependencies
sudo yum install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel gmp-devel
Note that installing ant pulls in OpenJDK as a dependency; if Oracle JDK is already installed, simply remove OpenJDK after ant has been installed.
12.2 Install Maven 3+ and configure its environment variables
12.3 Build and deploy (this takes a while, be patient)
tar -zxvf release-4.2.0.tar.gz
ln -s hue-release-4.2.0 hue
cd hue-release-4.2.0
To build with the Simplified Chinese locale, edit settings.py under hue-release-4.2.0/desktop/core/src/desktop:
# comment out English and set Simplified Chinese
#LANGUAGE_CODE = 'en-us'
LANGUAGE_CODE='zh_CN'
# start the build
make apps
12.4 Configure Hue
Go to the hue-release-4.2.0/desktop/conf directory:
cp pseudo-distributed.ini hue.ini
Edit hue.ini:
[desktop]
# Set this to a random string, the longer the better.
# This is used for secure hashing in the session store.
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn
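For the HA HDFS access discussed in 12.5, the relevant hue.ini entries look roughly like the sketch below, assuming HttpFS runs on bigdata1 with its default port 14000 (verify against the Hue documentation for your version):
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://bigdatacluster
      webhdfs_url=http://bigdata1:14000/webhdfs/v1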
12.5 Start the HttpFS service (required for an HA cluster)
HttpFS is needed so that Hue can reach HDFS through the HA nameservice; start its Bootstrap process with:
$HADOOP_HOME/sbin/httpfs.sh start
12.6 Start Hue
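A common way to start Hue after make apps has finished, as a sketch (the supervisor script is generated under build/env by the build):
cd /home/bigdata/hue
nohup build/env/bin/supervisor > hue.log 2>&1 &
The web UI then listens on port 8888 by default (configurable via http_host/http_port in hue.ini).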
References
1. "Hadoop 2.7.3 High-Availability (HA) Cluster Deployment", HanBert
2. "Spark On YARN: Install, Configure, and Run Spark on Top of a Hadoop YARN Cluster", Florent Houbart
3. Hue official installation documentation