Ambari Server High Availability Setup Guide

I. Preface

This article explains how to make the Ambari Server itself highly available. Note that this is about the Ambari Server, not the applications running on the Hadoop cluster. As of Ambari 2.7.x, Hortonworks provides no built-in high-availability implementation for Ambari Server, so this article shows how to build one from existing open-source components. In the rest of the article, "Ambari" refers to Ambari Server.

For reference, Hortonworks already documents high availability for the Hadoop cluster components in the following guides (using HDP 2.6.5 as the example):

  • HDFS NameNode HA: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hadoop-high-availability/content/ch_HA-NameNode.html
  • YARN ResourceManager HA: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hadoop-high-availability/content/ch_HA-ResourceManager.html
  • HBase HA: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hadoop-high-availability/content/ch_HA-HBase.html
  • Hive Metastore HA: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hadoop-high-availability/content/ch_HA-Hive.html
  • HiveServer2 HA: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hadoop-high-availability/content/ch_multiple_hs2s.html
  • Oozie HA: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_hadoop-high-availability/content/ha-nn-deploy-oozie.html

 

II. Deployment

1. Environment

  • Operating system: CentOS 7.3
  • MariaDB: 10.1.x
  • HAProxy: 1.5.x
  • Keepalived: 1.3.5
  • Ambari: 2.6.2.x

For compatibility questions, see: https://supportmatrix.hortonworks.com/

 

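2. Overall architecture

A brief description of the overall architecture of the Ambari HA deployment:

1) DB cluster

First, for the upper-layer application service to be highly available, the data persistence layer beneath it must be reliable. MariaDB Galera Cluster is therefore used as Ambari's relational database. Galera's replication addresses data reliability directly, and it supports multi-node writes natively.

With the Galera cluster in place, one problem remains. Ambari has no failover mechanism of its own: its database configuration is a plain, standard JDBC configuration (Ambari connects to the database through C3P0). Ambari therefore needs a single stable and reliable database endpoint, that is, a VIP.

For Galera Cluster, see: http://galeracluster.com/documentation-webpages/

For the MariaDB Galera Cluster used in this article, see: https://mariadb.com/kb/zh-cn/getting-started-with-mariadb-galera-cluster/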

 

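2) Load-balancing proxy (VIP)

HAProxy is therefore introduced as a proxy. It exposes a single VIP to clients (the frontend) and connects to each database node behind it. Access to the distributed database becomes transparent to Ambari: it only needs to connect to the VIP on the corresponding port (3306 by default). Even if a database node fails and can no longer serve requests, failover happens without any change to Ambari's configuration.

For HAProxy, see: http://www.haproxy.org/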

 

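3) Load balancer HA

At this point, however, if the HAProxy instance providing the VIP fails, Ambari's database access is affected all the same. (However large a courtyard is and however many houses stand in it, none of it is usable once the gate is gone and nobody can get in.) Some mechanism is needed to make HAProxy itself highly available, so Keepalived is introduced as well.

Keepalived ensures that if the HAProxy instance currently serving traffic goes down, the VIP entry point is taken over by HAProxy on another node. In effect this is an active-passive HAProxy cluster. Keepalived alone is used here because it already meets the requirements. For more complex environments, Pacemaker + Corosync is recommended for unified cluster resource management and cluster communication.

For Keepalived, see: http://www.keepalived.org/manpage.html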

 

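3. Application configuration

The configuration for each component described above follows. Note:

  1. Treat the configuration below as a reference; a production environment needs to be tuned to its actual conditions.

  2. A standard minimal 3-node cluster is used:

  • Node 1 (node1): AmbariServer-active, MariaDB-1, HAProxy-1, Keepalived-1
  • Node 2 (node2): AmbariServer-standby, MariaDB-2, HAProxy-2, Keepalived-2
  • Node 3 (node3): MariaDB-3, HAProxy-3, Keepalived-3

1) MariaDB configuration (/etc/my.cnf.d/server.cnf)

  • The configuration is the same on node 1, node 2, and node 3 (note: with the MySQL version, the node hostname has to be set differently on each node):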
# this is read by the standalone daemon and embedded servers
[server]

# this is only for the mysqld standalone daemon
[mysqld]

#
# * Galera-related settings
#
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_cluster_name="mariadb_cluster"
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://node1.io,node2.io,node3.io
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_doublewrite=1
#
# Allow server to accept connections on all interfaces.
#
bind-address=0.0.0.0
#
# Optional setting
#wsrep_slave_threads=1
innodb_flush_log_at_trx_commit=0

# this is only for embedded server
[embedded]

# This group is only read by MariaDB servers, not by MySQL.
[mariadb]
port=3307

2) HAProxy configuration (/etc/haproxy/haproxy.cfg)

  • The configuration is the same on node 1, node 2, and node 3:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
    
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# mariadb configuration
#---------------------------------------------------------------------
listen mariadb
  mode tcp
  bind *:3306
  timeout client 28801s
  timeout server 28801s
  balance source
  option tcpka
  option mysql-check user haproxy
  server mariadb-1 node1.io:3307 check weight 1
  server mariadb-2 node2.io:3307 backup check weight 1
  server mariadb-3 node3.io:3307 backup check weight 1
    
#---------------------------------------------------------------------
# ambari-server configuration
#---------------------------------------------------------------------
frontend ambari-server_frontend
  bind *:8080
  mode http
  default_backend ambari-server
backend ambari-server
  mode http
  server ambari-server-1 node1.io:8082 cookie ambari-server-1 check on-marked-up shutdown-backup-sessions
  server ambari-server-2 node2.io:8082 backup cookie ambari-server-2 check

frontend ambari-server-ow_frontend
  bind *:8440
  mode tcp
  default_backend ambari-server-ow
backend ambari-server-ow
  mode tcp
  server ambari-server-ow-1 node1.io:8442 check on-marked-up shutdown-backup-sessions
  server ambari-server-ow-2 node2.io:8442 backup check

frontend ambari-server-tw_frontend
  bind *:8441
  mode tcp
  default_backend ambari-server-tw
backend ambari-server-tw
  mode tcp
  server ambari-server-tw-1 node1.io:8443 check on-marked-up shutdown-backup-sessions
  server ambari-server-tw-2 node2.io:8443 backup check

#---------------------------------------------------------------------
# status configuration
#---------------------------------------------------------------------
listen status
  bind *:8000
  stats enable
  stats uri /admin
  stats auth admin:admin
  stats hide-version
  stats admin if TRUE
  stats refresh 10s


3) Keepalived configuration (/etc/keepalived/keepalived.conf)

  • Node 1 configuration (master):
global_defs {
   router_id LVS_DEVEL
}

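vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    # No positive weight here: with the default weight of 0, a failed check
    # puts this instance into FAULT state so the VIP moves to a backup node.
    # (With "weight 2" a failed check leaves the priority at 150, still above
    # the backups' 100, so the VIP would never move.)
}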

vrrp_instance haproxy {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        # Put the cluster's virtual IP here; keepalived expects an IP address.
        node1.io
    }
    track_script {
        chk_haproxy
    }
}
  • Node 2 and node 3 configuration (backup):
global_defs {
   router_id LVS_DEVEL
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    timeout 2
    fall 3
}

vrrp_instance haproxy {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        # Put the cluster's virtual IP here; keepalived expects an IP address.
        node1.io
    }
    track_script {
        chk_haproxy
    }
}
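
To see which node currently holds the VIP, inspect the addresses on the interface named in the configuration (eth0). The sketch below is an assumption-laden illustration: `holds_vip` is a hypothetical helper, and the address in the usage comment is a made-up example rather than the author's VIP.

```shell
#!/bin/sh
# Sketch only: holds_vip is an illustrative helper.

# Succeed if address $1 appears in `ip -o -4 addr show` output read on stdin.
# In that one-line format the fourth field is "<address>/<prefix>".
holds_vip() {
    awk -v vip="$1" '
        { split($4, parts, "/"); if (parts[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Usage (192.0.2.100 is a placeholder address):
#   ip -o -4 addr show dev eth0 | holds_vip 192.0.2.100 && echo "this node holds the VIP"
```

A simple failover drill is to run `systemctl stop keepalived` on node 1 and repeat the check on node 2 and node 3: exactly one of them should report the VIP.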

 

4) Ambari Server configuration (/etc/ambari-server/conf/ambari.properties)

  • Add the following to the configuration file on both node 1 and node 2:
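# HTTP port moved off 8080 so that HAProxy can bind it (8082 in the HAProxy backend above)
client.api.port=8082
security.server.one_way_ssl.port=8442
security.server.two_way_ssl.port=8443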
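  • The Ambari Server JDBC configuration uses the node1 host as the database connection target. This is the same address configured as the Keepalived virtual IP above, and the connection should go to HAProxy's port 3306 rather than MariaDB's own 3307.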
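
As an illustration of the JDBC side, the sketch below prints the database properties that would point Ambari at the HAProxy frontend. It rests on assumptions: the property names `server.jdbc.hostname`, `server.jdbc.port` and `server.jdbc.url` are the ones `ambari-server setup` normally writes, the database name `ambari` is a guess, and `ambari_jdbc_props` is a hypothetical helper. Check the output against your own ambari.properties before changing anything.

```shell
#!/bin/sh
# Sketch only: ambari_jdbc_props is an illustrative helper.

# Print the JDBC-related properties for a given host, port and database name.
# $1 = host that carries the VIP, $2 = HAProxy frontend port, $3 = database name
ambari_jdbc_props() {
    printf 'server.jdbc.hostname=%s\n' "$1"
    printf 'server.jdbc.port=%s\n' "$2"
    printf 'server.jdbc.url=jdbc:mysql://%s:%s/%s\n' "$1" "$2" "$3"
}

# Usage:
#   ambari_jdbc_props node1.io 3306 ambari
#   grep '^server.jdbc' /etc/ambari-server/conf/ambari.properties   # compare
```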

 

III. Notes

1. Restart the services after changing the configuration

1) MariaDB

systemctl restart mariadb

2) HAProxy

systemctl restart haproxy

3) Keepalived

systemctl restart keepalived

4) Ambari Server

ambari-server restart

 

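2. Port mapping

With the configuration above, the ports of each component map as follows:

| Component    | Default port | Actual port | HAProxy port | Description                            |
|--------------|--------------|-------------|--------------|----------------------------------------|
| MariaDB      | 3306         | 3307        | 3306         | MariaDB TCP port                       |
| AmbariServer | 8080         | 8082        | 8080         | Server HTTP access port                |
| AmbariServer | 8440         | 8442        | 8440         | HTTPS: handshake port                  |
| AmbariServer | 8441         | 8443        | 8441         | HTTPS: heartbeat and registration port |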

 

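3. Configuration file notes

1) HAProxy - Ambari nodes

By default only one Ambari Server can serve requests at any given time, so Ambari Server can be treated as a stateful service (Ambari's own session persistence works quite well). In the HAProxy configuration, the additional Ambari Server node must therefore be marked as backup, and the non-backup node must carry the "on-marked-up shutdown-backup-sessions" option. This ensures that whenever the active and backup nodes are both up, all traffic goes through the Ambari Server on the active node rather than the one on the backup node.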

2) HAProxy - Ambari ports

By default, Ambari Agent talks to Ambari Server on the default ports 8440 and 8441 listed above. If the HAProxy proxy ports differ from these two defaults, the Ambari Agents need additional configuration. For the steps, see: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-administration/content/optional_changing_the_ambari_server_agent_comm_port.html
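
On the agent side, the server address and ports live in the `[server]` section of /etc/ambari-agent/conf/ambari-agent.ini (`hostname`, `url_port`, `secured_url_port`). The sketch below rewrites only the server hostname. It assumes agents are meant to reach Ambari through the VIP, which the setup implies but does not state outright; `set_agent_server_host` and the name `vip.example` are hypothetical.

```shell
#!/bin/sh
# Sketch only: set_agent_server_host is an illustrative helper.

# Copy ambari-agent.ini from stdin to stdout, replacing the hostname key
# inside the [server] section with $1. Other sections are left untouched.
set_agent_server_host() {
    awk -v host="$1" '
        /^\[/ { in_server = ($0 == "[server]") }
        in_server && /^hostname[ \t]*=/ { print "hostname=" host; next }
        { print }
    '
}

# Usage (vip.example is a placeholder for the name or address of the VIP):
#   set_agent_server_host vip.example \
#       < /etc/ambari-agent/conf/ambari-agent.ini > /tmp/ambari-agent.ini
#   # review /tmp/ambari-agent.ini, copy it into place, then:
#   ambari-agent restart
```

Writing to a temporary file first keeps the original configuration intact until the result has been reviewed.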

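3) HAProxy - MariaDB

MariaDB's default global and session timeout is 28800s (8 hours): an idle connection is destroyed by MariaDB only after it has been idle for more than 8 hours. The corresponding idle-connection timeout in Ambari Server is 14400s (4 hours). The HAProxy connection timeouts must therefore be greater than both, hence "timeout client 28801s" and "timeout server 28801s". Otherwise HAProxy times out idle connections before MariaDB or Ambari Server does. In testing, once HAProxy had timed out a connection, JDBC warnings appeared in the Ambari Server log, the old idle connections could not be reused, and new connections were created, with a risk of exceeding the connection limit (MariaDB's default maximum is 151 connections).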
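
The rule above can be checked mechanically. The following is a small sketch, with `haproxy_timeout_ok` as a hypothetical helper, that compares the HAProxy timeout against the MariaDB and Ambari idle timeouts (all values in seconds).

```shell
#!/bin/sh
# Sketch only: haproxy_timeout_ok is an illustrative helper.

# Succeed if the HAProxy timeout is strictly greater than both other timeouts.
# $1 = HAProxy "timeout client"/"timeout server" in seconds
# $2 = MariaDB wait_timeout in seconds
# $3 = Ambari Server idle-connection timeout in seconds
haproxy_timeout_ok() {
    [ "$1" -gt "$2" ] && [ "$1" -gt "$3" ]
}

# Usage (reads the live MariaDB value; 14400 is the Ambari figure quoted above):
#   wait_timeout="$(mysql -N -B -e "SHOW VARIABLES LIKE 'wait_timeout'" | awk '{ print $2 }')"
#   haproxy_timeout_ok 28801 "$wait_timeout" 14400 && echo "timeouts are consistent"
```

If `wait_timeout` has been raised on the MariaDB side, the two HAProxy timeouts need to be raised with it.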

 

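4. Notes on cloud platforms

The HAProxy + Keepalived approach above only suits ordinary bare-metal environments. On a cloud platform, skip it (stop those services) and use the platform's load balancer instead. The reason: in a cloud environment a VM's IP belongs to a port on the virtual switching device, so an IP cannot be configured directly on the VM's network interface.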
