11g RAC Load Balancing and Failover Configuration

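  • Oracle RAC provides two ways to balance load:
    • 1. Client-side load balancing (Client-Side LB)
    • 2. Server-side load balancing (Server-Side LB)
  • Oracle RAC failover comes in two forms, depending on where it is implemented:
    • I. Client-Side TAF (handled by the client)
    • II. Server-Side TAF (handled by the server)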

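Oracle RAC provides two ways to balance load:

1. Connection balancing: users are distributed across the nodes according to some algorithm.
2. Services: load is distributed at the application layer, which can also be seen as distributing load by business function.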

Connection Balancing
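This kind of load balancing works at the level of user connections. When a user requests a connection, the load on each node determines which instance receives it. Once the connection is established, all of the session's work is done on that instance and is never redistributed to another node.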
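The connect-time decision described above can be pictured with a minimal sketch. It assumes, purely for illustration, that "load" is just the number of sessions per instance; the `Instance` class and `assign_connection` function are hypothetical names, not Oracle APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for a RAC instance and the sessions bound to it."""
    name: str
    sessions: list = field(default_factory=list)


def assign_connection(instances, session_id):
    """Choose the least-loaded instance once, at connect time.

    The session is then bound to that instance and is never moved.
    """
    target = min(instances, key=lambda inst: len(inst.sessions))
    target.sessions.append(session_id)
    return target.name


instances = [Instance("breath1"), Instance("breath2")]
placement = {sid: assign_connection(instances, sid) for sid in range(1, 5)}
print(placement)
```

A long-running session that becomes heavy later is not rebalanced in this model, which mirrors the point that balancing happens only when the connection is created.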

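1. Client-side load balancing (Client-Side LB)

Client-side load balancing (Client-Side LB) dates back to Oracle 8i. It is configured in the client's tnsnames.ora file by adding the LOAD_BALANCE=ON entry.
When the client initiates a connection, it picks one address at random from the address list, so connection requests are spread across the instances by random selection.

Configuration example (client):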

breath =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
    (LOAD_BALANCE = ON)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = breath)
    )
  )
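To see why random selection spreads connections roughly evenly, here is a small simulation. It is only a sketch of the idea behind LOAD_BALANCE=ON, not the Oracle Net implementation; the `pick_address` helper and the fixed seed are assumptions made for the example.

```python
import random
from collections import Counter

# The two addresses from the tnsnames.ora entry above.
ADDRESSES = ["breath01-vip:1522", "breath02-vip:1522"]


def pick_address(addresses, rng):
    """Pick one address uniformly at random, ignoring actual node load."""
    return rng.choice(addresses)


rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(pick_address(ADDRESSES, rng) for _ in range(10_000))
print(counts)
```

Each address ends up with close to half of the connections, but the choice never takes the real load of a node into account, which is the gap server-side load balancing fills.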

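2. Server-side load balancing (Server-Side LB)

Server-side load balancing was introduced in Oracle 9i. It relies on the listener collecting load information: while the database is running, the PMON background process gathers the system load and registers it with the listener.
PMON refreshes this information at intervals of between 1 and 10 minutes, and the higher the load on a node, the more frequent the updates, so that the listener has an accurate picture of each node's load.
If the listener is down, PMON checks every second whether it has been restarted. Besides these automatic periodic updates, you can also register manually with the alter system register command.
The automatic updates are visible in the listener log; the excerpts below record them clearly.
Note that the first registration PMON performs when the instance starts is logged as service_register, and the subsequent updates are logged as service_update.
View the log: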

[grid@breath01 ~]$  tail -100 /u01/app/grid/diag/tnslsnr/breath01/listener/trace/listener.log | grep service_update
08-NOV-2017 09:28:18 * service_update * breath1 * 0
08-NOV-2017 09:33:56 * service_update * +ASM1 * 0
08-NOV-2017 09:33:57 * service_update * breath1 * 0
08-NOV-2017 09:38:19 * service_update * breath1 * 0
08-NOV-2017 09:38:22 * service_update * breath1 * 0
08-NOV-2017 09:38:34 * service_update * breath1 * 0
08-NOV-2017 09:39:04 * service_update * breath1 * 0
08-NOV-2017 09:48:22 * service_update * breath1 * 0
08-NOV-2017 09:48:31 * service_update * breath1 * 0
08-NOV-2017 09:48:58 * service_update * breath1 * 0
08-NOV-2017 09:58:25 * service_update * breath1 * 0
08-NOV-2017 09:58:37 * service_update * breath1 * 0
[grid@breath01 ~]$ cat /u01/app/grid/diag/tnslsnr/breath01/listener/trace/listener.log | grep service_register | tail -10
02-NOV-2017 10:52:07 * service_register * LsnrAgt * 0
02-NOV-2017 10:52:24 * service_register * +ASM1 * 0
02-NOV-2017 10:52:45 * service_register * breath1 * 0
02-NOV-2017 17:20:49 * service_register * breath1 * 0
03-NOV-2017 15:12:33 * service_register * LsnrAgt * 0
03-NOV-2017 15:13:01 * service_register * breath1 * 0
03-NOV-2017 15:13:03 * service_register * +ASM1 * 0
07-NOV-2017 09:39:32 * service_register * LsnrAgt * 0
07-NOV-2017 09:40:16 * service_register * +ASM1 * 0
07-NOV-2017 09:41:17 * service_register * breath1 * 0
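The update intervals can be read off the log mechanically. The sketch below parses a few of the lines shown above and computes the gaps between consecutive service_update entries for one instance. The function names are hypothetical, and the date parsing assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# A few lines copied from the listener log excerpt above.
SAMPLE = """\
08-NOV-2017 09:28:18 * service_update * breath1 * 0
08-NOV-2017 09:33:56 * service_update * +ASM1 * 0
08-NOV-2017 09:33:57 * service_update * breath1 * 0
08-NOV-2017 09:38:19 * service_update * breath1 * 0
"""


def parse_events(text):
    """Return (timestamp, event, instance) tuples for well-formed lines."""
    events = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(" * ")]
        if len(parts) != 4:
            continue  # skip lines that are not registration events
        stamp = datetime.strptime(parts[0], "%d-%b-%Y %H:%M:%S")
        events.append((stamp, parts[1], parts[2]))
    return events


def update_gaps(events, instance):
    """Seconds between consecutive service_update events of one instance."""
    times = [t for t, kind, inst in events
             if kind == "service_update" and inst == instance]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = update_gaps(parse_events(SAMPLE), "breath1")
print(gaps)
```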

Configuration
Add the following to the tnsnames.ora file on all nodes:
[oracle@breath01 ~]$ cat $ORACLE_HOME/network/admin/tnsnames.ora
BREATH_REMOTE =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
)

On one of the nodes, set the database remote_listener parameter.

SYS@breath1>alter system set remote_listener='BREATH_REMOTE' scope=both sid='*';
Check the parameter setting:
SYS@breath1>show parameter remote_lis

NAME                     TYPE    VALUE
------------------------------------ ----------- ------------------------------
remote_listener              string  BREATH_REMOTE
SYS@breath2>show parameter remote_lis

NAME                     TYPE    VALUE
------------------------------------ ----------- ------------------------------
remote_listener              string  BREATH_REMOTE

Listener status (shown for node 1; the other nodes look the same)

[grid@breath01 ~]$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 09-NOV-2017 14:52:18

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                07-NOV-2017 09:39:32
Uptime                    2 days 5 hr. 12 min. 46 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /u01/app/grid/diag/tnslsnr/breath01/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.0.2.111)(PORT=1522)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.0.2.113)(PORT=1522)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "breath" has 2 instance(s).
  Instance "breath1", status READY, has 2 handler(s) for this service...
  Instance "breath2", status READY, has 1 handler(s) for this service...
Service "breathXDB" has 2 instance(s).
  Instance "breath1", status READY, has 1 handler(s) for this service...
  Instance "breath2", status READY, has 1 handler(s) for this service...
The command completed successfully

Log

[grid@breath01 ~]$ cat /u01/app/grid/diag/tnslsnr/breath01/listener/trace/listener.log | grep service_register | tail -10
02-NOV-2017 10:52:45 * service_register * breath1 * 0
02-NOV-2017 17:20:49 * service_register * breath1 * 0
03-NOV-2017 15:12:33 * service_register * LsnrAgt * 0
03-NOV-2017 15:13:01 * service_register * breath1 * 0
03-NOV-2017 15:13:03 * service_register * +ASM1 * 0
07-NOV-2017 09:39:32 * service_register * LsnrAgt * 0
07-NOV-2017 09:40:16 * service_register * +ASM1 * 0
07-NOV-2017 09:41:17 * service_register * breath1 * 0
09-NOV-2017 14:49:51 * service_register * breath1 * 0
09-NOV-2017 14:51:49 * service_register * breath2 * 0    ------ node 2 registers with the listener

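Oracle RAC failover comes in two forms, depending on where it is implemented:

I. Client-Side TAF (handled by the client)

Configured in the client's tnsnames.ora.
Depending on how existing connections are treated, it falls into two subtypes:
1. Client-Side Connect-Time Failover
Enable FAILOVER=ON in the client's tnsnames.ora (it is on by default, so writing it or not makes no difference). The client walks through all addresses in the DESCRIPTION and tries each one until a connection succeeds. However, if the instance behind an already established connection fails, the client notices the failure.
2. TAF
Set the FAILOVER_MODE parameter in the client's tnsnames.ora. When the instance behind an established connection fails, the client automatically redirects its requests to a surviving instance. The client does not notice, which makes the failover transparent.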
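Connect-time failover amounts to walking the address list until one address accepts the connection. The sketch below shows that loop under stated assumptions: `connect_with_failover` and `fake_connect` are hypothetical helpers, and the fake transport simply pretends node 1 is down.

```python
def connect_with_failover(addresses, try_connect):
    """Try each address in order and return the first successful connection."""
    errors = []
    for address in addresses:
        try:
            return try_connect(address)
        except ConnectionError as exc:
            errors.append(f"{address}: {exc}")
    raise ConnectionError("all addresses failed: " + "; ".join(errors))


def fake_connect(address):
    """Hypothetical transport used only for this example: node 1 is down."""
    if address.startswith("breath01"):
        raise ConnectionError("no listener")
    return f"session on {address}"


result = connect_with_failover(
    ["breath01-vip:1522", "breath02-vip:1522"], fake_connect
)
print(result)
```

This only helps while a connection is being opened. A session that already exists on a failed instance is not covered, which is the case TAF addresses.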

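II. Server-Side TAF (handled by the server)

Configured through a service on the server; the client needs no extra parameters. All requests are failed over by the service according to its predefined policy. The service designates preferred and available instances: connections normally go to the preferred instance, and when that instance fails they are transparently moved to the available instance.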

Important:
Do not set the GLOBAL_DBNAME parameter in the SID_LIST_listener_name section of the listener.ora file. A statically configured global database name disables TAF.
In short, leave GLOBAL_DBNAME out of listener.ora, because this parameter disables TAF.

Client tnsnames.ora configuration:

breath_test2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
    (LOAD_BALANCE = ON)
    (FAILOVER=ON)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = breath)
     (FAILOVER_MODE=
        (TYPE=session)
        (METHOD=basic)
        (RETRIES=180)
        (DELAY=5)
     )
    )
  )
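RETRIES and DELAY together bound how long a failed-over session keeps trying to reconnect. The sketch below treats RETRIES as the number of attempts and DELAY as the wait in seconds after each failed attempt, which is a simplification; `reconnect` is a hypothetical helper and the sleep function is injected so that nothing actually waits.

```python
def reconnect(try_connect, retries, delay, sleep):
    """Attempt to reconnect up to `retries` times.

    After every failed attempt, wait `delay` seconds via `sleep`.
    Return the number of the successful attempt, or None if all fail.
    """
    for attempt in range(1, retries + 1):
        if try_connect():
            return attempt
        sleep(delay)
    return None


# Case 1: the surviving instance accepts the third attempt.
waited = []
outcomes = iter([False, False, True])
attempt_no = reconnect(lambda: next(outcomes), retries=180, delay=5,
                       sleep=waited.append)
print(attempt_no, sum(waited))

# Case 2: nothing ever comes back, so every retry is used up.
worst = []
gave_up = reconnect(lambda: False, retries=180, delay=5, sleep=worst.append)
print(gave_up, sum(worst))
```

Under this model, RETRIES=180 with DELAY=5 gives a reconnect window of about 900 seconds before the client gives up.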

Test:

[oracle@breath admin]$ sqlplus system/oracle@breath_test2
SQL> select INSTANCE_NAME,HOST_NAME from v$instance;

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
breath1      breath01.example.com

SQL> col MACHINE for a30           
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

MACHINE                FAILOVER_TYPE FAILOVER_M FAI   COUNT(*)
------------------------------ ------------- ---------- --- ----------
breath01.example.com           NONE      NONE   NO      47
breath.example.com         SESSION       BASIC  NO       1

SQL> select sid from v$mystat where rownum<2;

       SID
----------
       156
SQL> create table test1(id number,name varchar2(20));

Table created.
SQL> insert into test1 values(1,'breath_client');

1 row created.
SQL> commit;
commit
*
ERROR at line 1:
ORA-25405: transaction status unknown
SQL> select * from test1;

no rows selected

SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

MACHINE                FAILOVER_TYPE FAILOVER_M FAI   COUNT(*)
------------------------------ ------------- ---------- --- ----------
breath.example.com         SESSION       BASIC  YES      1
breath02.example.com           NONE      NONE   NO      48

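Server-Side TAF
Server-side transparent failover is implemented by configuring a service; the client needs no configuration.
RAC performance can be improved in two ways: 1. Increase Cache Fusion capacity, for example with better interconnect hardware such as a gigabit-class private network or technologies such as InfiniBand. 2. Reduce Cache Fusion traffic as far as possible, so that the instances depend less on each other. Services grew out of the second idea.
The application-level load distribution mentioned earlier works the same way: create several services with different preferred instances, then have different applications connect to different services, which spreads the load.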

A service can be created in several ways: dbca, OEM, or the srvctl command line. The srvctl approach is shown below.
The service cannot be created as the grid user; the attempt fails as follows:
[grid@breath01 ~]$ srvctl add service -d breath -s breath_TAF1 -r "breath1" -a "breath2" -P basic -m basic -e select -w 3 -z 100
PRCD-1026 : Failed to create service breath_TAF1 for database breath
PRKH-1014 : Current user "grid" is not the oracle owner user "oracle" of oracle home "/u01/app/oracle/product/11.2.0/dbhome_1"

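Switch to the oracle user:
[grid@breath01 ~]$ su - oracle
Password:
Create two services (to isolate workloads and spread the load):
breath_TAF1: preferred instance breath1, available instance breath2
breath_TAF2: preferred instance breath2, available instance breath1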
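The preferred/available arrangement can be modelled in a few lines. This is a simplified sketch, not Clusterware's placement logic: it assumes each service runs on exactly one instance and picks the first candidate that is up, and `place_service` and `placement` are hypothetical names.

```python
# Service name -> (preferred instances, available instances), as listed above.
SERVICES = {
    "breath_TAF1": (["breath1"], ["breath2"]),
    "breath_TAF2": (["breath2"], ["breath1"]),
}


def place_service(preferred, available, running):
    """Return the first preferred, then available, instance that is running."""
    for instance in list(preferred) + list(available):
        if instance in running:
            return instance
    return None  # no candidate instance is up


def placement(running):
    """Map every service to the instance it would run on."""
    return {name: place_service(pref, avail, running)
            for name, (pref, avail) in SERVICES.items()}


normal = placement({"breath1", "breath2"})
degraded = placement({"breath2"})
print(normal)
print(degraded)
```

With both instances up, the two services sit on different nodes and the workloads stay apart. When breath1 is lost, breath_TAF1 moves to breath2 and both services share the surviving node.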

[oracle@breath01 ~]$ srvctl add service -d breath -s breath_TAF1 -r "breath1" -a "breath2" -P basic -m basic -e select -w 3 -z 100
[oracle@breath01 ~]$ srvctl add service -d breath -s breath_TAF2 -r "breath2" -a "breath1" -P basic -m basic -e select -w 3 -z 100

-d Unique name for the database: the database name
-s Service name: a service name of your choice
-r "" Comma separated list of preferred instances: the preferred instance names
-a "" Comma separated list of available instances: the available (standby) instances
-P {NONE | BASIC | PRECONNECT} TAF policy specification: the TAF policy used when the service is created; reportedly the service is registered in the OCR only with PRECONNECT
-m Failover method (NONE or BASIC)
-e Failover type (NONE, SESSION, or SELECT)
-w Failover delay
-z Failover retries
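When several services follow the same pattern, the command line can be assembled from the options listed above. The helper below is a hypothetical convenience that only builds and prints the command string; it does not run srvctl, and its default values simply repeat the ones used in this article.

```python
import shlex


def srvctl_add_service(db, service, preferred, available,
                       policy="basic", method="basic",
                       failover_type="select", delay=3, retries=100):
    """Build a 'srvctl add service' command line from the documented options."""
    args = [
        "srvctl", "add", "service",
        "-d", db,                      # database unique name
        "-s", service,                 # service name
        "-r", ",".join(preferred),     # preferred instances
        "-a", ",".join(available),     # available instances
        "-P", policy,                  # TAF policy
        "-m", method,                  # failover method
        "-e", failover_type,           # failover type
        "-w", str(delay),              # failover delay
        "-z", str(retries),            # failover retries
    ]
    return shlex.join(args)


cmd1 = srvctl_add_service("breath", "breath_TAF1", ["breath1"], ["breath2"])
cmd2 = srvctl_add_service("breath", "breath_TAF2", ["breath2"], ["breath1"])
print(cmd1)
print(cmd2)
```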

[grid@breath01 ~]$ srvctl start service -d breath -s breath_TAF1,breath_TAF2
[grid@breath01 ~]$ srvctl config service -d breath
Service name: breath_TAF1
Service is enabled
Server pool: breath_breath_TAF1
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 100
TAF failover delay: 3
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition: 
Preferred instances: breath1
Available instances: breath2
Service name: breath_TAF2
Service is enabled
Server pool: breath_breath_TAF2
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 100
TAF failover delay: 3
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition: 
Preferred instances: breath2
Available instances: breath1

Modify the service_names parameter; by default each service is added on one node only.
For example, on node 2:

SYS@breath2>show parameter service

NAME                     TYPE    VALUE
------------------------------------ ----------- ------------------------------
service_names                string  breath_TAF2

SYS@breath2>ALTER SYSTEM SET service_names='breath_TAF1,breath_TAF2' sid='*';
SYS@breath2>show parameter service

NAME                     TYPE    VALUE
------------------------------------ ----------- ------------------------------
service_names                string  breath_TAF1,breath_TAF2


[grid@breath01 ~]$ lsnrctl status
Service "breath_TAF1" has 2 instance(s).
  Instance "breath1", status READY, has 1 handler(s) for this service...
  Instance "breath2", status READY, has 2 handler(s) for this service...
Service "breath_TAF2" has 2 instance(s).
  Instance "breath1", status READY, has 1 handler(s) for this service...
  Instance "breath2", status READY, has 2 handler(s) for this service...
The command completed successfully


[root@breath01 ~]# crsctl stat res -t
..........
ora.breath.breath_taf1.svc
      1        ONLINE  ONLINE       breath01
ora.breath.breath_taf2.svc
      1        ONLINE  ONLINE       breath02

View the services' TAF policy in the dba_services view:

set line 300
col name format a15  
col failover_method format a11 heading 'METHOD' 
col failover_type format a10 heading 'TYPE' 
col failover_retries format 9999999 heading 'RETRIES' 
col FAILOVER_DELAY for 999999 heading 'DELAY'
col goal format a10 
col clb_goal format a8 
select name, failover_method, failover_type, failover_retries,FAILOVER_DELAY,goal, clb_goal from dba_services where name in ('breath_TAF1','breath_TAF2');

NAME        METHOD      TYPE    RETRIES   DELAY GOAL       CLB_GOAL
--------------- ----------- ---------- -------- ------- ---------- --------
breath_TAF2 BASIC       SELECT      100       3 NONE       LONG
breath_TAF1 BASIC       SELECT      100       3 NONE       LONG

There are two ways to modify a service's TAF policy:

   1. srvctl modify service
   Modify breath_TAF2:
   [oracle@breath01 ~]$ srvctl modify service -d breath -s breath_TAF2  -P basic -m basic -e session -w 10 -z 50
   SYS@breath1>select name, failover_method, failover_type, failover_retries,FAILOVER_DELAY,goal, clb_goal from dba_services where name in ('breath_TAF1','breath_TAF2');

NAME        METHOD      TYPE    RETRIES   DELAY GOAL       CLB_GOAL
--------------- ----------- ---------- -------- ------- ---------- --------
breath_TAF2 BASIC       SESSION      50      10 NONE       LONG        
breath_TAF1 BASIC       SELECT      100       3 NONE       LONG

2. The dbms_service package
Modify breath_TAF1:

BEGIN
  dbms_service.modify_service(
    service_name     => 'breath_TAF1',
    failover_method  => dbms_service.failover_method_basic,
    failover_type    => dbms_service.failover_type_session,
    failover_retries => 180,
    failover_delay   => 5
  );
END;
/

SYS@breath1>select name, failover_method, failover_type, failover_retries,FAILOVER_DELAY,goal, clb_goal from dba_services where name in ('breath_TAF1','breath_TAF2');

NAME        METHOD      TYPE    RETRIES   DELAY GOAL       CLB_GOAL
--------------- ----------- ---------- -------- ------- ---------- --------
breath_TAF2 BASIC       SESSION      50      10 NONE       LONG
breath_TAF1 BASIC       SESSION     180       5 NONE       LONG

Test:
Client configuration

breath_TAF1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
    (LOAD_BALANCE = OFF)
    (FAILOVER=OFF)       ---- disable client-side FAILOVER to simplify the test
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = breath_TAF1)
    )
  )

breath_TAF2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
    (ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
    (LOAD_BALANCE = OFF)
    (FAILOVER=OFF)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = breath_TAF2)
    )
  )


[oracle@breath admin]$ sqlplus system/oracle@breath_taf1
SYSTEM@breath_taf1>select INSTANCE_NAME,VERSION from v$instance;

INSTANCE_NAME    VERSION
---------------- -----------------
breath1      11.2.0.3.0
SYSTEM@breath_taf1>select sid,failover_type,failover_method,failed_over from v$session where sid=(select sid from v$mystat where rownum=1);

       SID FAILOVER_TYPE FAILOVER_M FAI
---------- ------------- ---------- ---
    27 SESSION   BASIC      NO

Shut down the breath1 instance on the server:
[grid@breath01 ~]$ srvctl  stop instance -d breath -i breath1

Back in the same client session window (it was never disconnected):
SYSTEM@breath_taf1>select INSTANCE_NAME,VERSION from v$instance;

INSTANCE_NAME    VERSION
---------------- -----------------
breath2      11.2.0.3.0

SYSTEM@breath_taf1>select sid,failover_type,failover_method,failed_over from v$session where sid=(select sid from v$mystat where rownum=1);

       SID FAILOVER_TYPE FAILOVER_M FAI
---------- ------------- ---------- ---
    28 SELECT    BASIC      YES

FAILED_OVER is now YES, which shows that the session went through a failover.
