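1. Connection Balancing: user connections are distributed across the nodes according to some algorithm.
2. Service: load is spread at the application layer, which can also be seen as distributing load by business function.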
Connection Balancing
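This kind of load balancing works at the level of user connections. When a user asks to establish a connection, the connection is assigned to an instance based on the load of each node. Once the connection is established, every operation in the session runs on that instance and is never redistributed to another node.
Client-side load balancing (Client-Side LB) is the method used in Oracle 8i. It is configured in the client's tnsnames.ora file by adding the LOAD_BALANCE=ON entry.
When the client initiates a connection, it picks one address at random from the address list, so connection requests are spread across the instances by a random algorithm rather than by actual load.
Configuration example (client):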
breath =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
(LOAD_BALANCE = ON)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = breath)
)
)
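The random selection described above can be illustrated with a short simulation. This is only a sketch of the idea, not Oracle Net's actual implementation: the function names pick_address and simulate are made up for illustration, and the two hosts are taken from the example entry.

```python
import random
from collections import Counter

# Hosts taken from the tnsnames.ora entry above.
ADDRESSES = ["breath01-vip", "breath02-vip"]


def pick_address(addresses, load_balance=True, rng=random):
    """Return the address a client would try first (illustrative model).

    With LOAD_BALANCE=ON the choice is random; with it OFF the list is
    walked in order, so the first entry is always tried first.
    """
    if load_balance:
        return rng.choice(addresses)
    return addresses[0]


def simulate(n_connections, seed=0):
    """Count how many of n_connections land on each address."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return Counter(
        pick_address(ADDRESSES, True, rng) for _ in range(n_connections)
    )


if __name__ == "__main__":
    # Roughly half of the connections should go to each VIP.
    print(simulate(10_000))
```

Because the choice ignores how busy each node is, the split tends toward an even share per address regardless of real load, which is the limitation that server-side balancing addresses.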
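Server-side load balancing was introduced in Oracle 9i. It relies on the listener collecting load information: while the database is running, the PMON background process gathers the system's load information and registers it with the listener.
PMON refreshes this information at intervals of between 1 minute and 10 minutes. The higher the load on a node, the more frequent the updates, so that the listener has an accurate view of each node's load.
If the listener is down, PMON checks every second whether it has been restarted. Besides this automatic periodic update, a user can also register manually with the alter system register command.
The automatic updates can be seen in the listener log; the log excerpts below record them clearly.
Note that the first registration PMON performs at instance startup is logged as service_register, and the later updates are logged as service_update.
Check the log: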
[grid@breath01 ~]$ tail -100 /u01/app/grid/diag/tnslsnr/breath01/listener/trace/listener.log | grep service_update
08-NOV-2017 09:28:18 * service_update * breath1 * 0
08-NOV-2017 09:33:56 * service_update * +ASM1 * 0
08-NOV-2017 09:33:57 * service_update * breath1 * 0
08-NOV-2017 09:38:19 * service_update * breath1 * 0
08-NOV-2017 09:38:22 * service_update * breath1 * 0
08-NOV-2017 09:38:34 * service_update * breath1 * 0
08-NOV-2017 09:39:04 * service_update * breath1 * 0
08-NOV-2017 09:48:22 * service_update * breath1 * 0
08-NOV-2017 09:48:31 * service_update * breath1 * 0
08-NOV-2017 09:48:58 * service_update * breath1 * 0
08-NOV-2017 09:58:25 * service_update * breath1 * 0
08-NOV-2017 09:58:37 * service_update * breath1 * 0
[grid@breath01 ~]$ cat /u01/app/grid/diag/tnslsnr/breath01/listener/trace/listener.log | grep service_register | tail -10
02-NOV-2017 10:52:07 * service_register * LsnrAgt * 0
02-NOV-2017 10:52:24 * service_register * +ASM1 * 0
02-NOV-2017 10:52:45 * service_register * breath1 * 0
02-NOV-2017 17:20:49 * service_register * breath1 * 0
03-NOV-2017 15:12:33 * service_register * LsnrAgt * 0
03-NOV-2017 15:13:01 * service_register * breath1 * 0
03-NOV-2017 15:13:03 * service_register * +ASM1 * 0
07-NOV-2017 09:39:32 * service_register * LsnrAgt * 0
07-NOV-2017 09:40:16 * service_register * +ASM1 * 0
07-NOV-2017 09:41:17 * service_register * breath1 * 0
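To check how often an instance actually refreshes its registration, the gaps between service_update entries can be computed from the log. The sketch below assumes the four-field "timestamp * event * instance * code" layout seen in the excerpts above; the function names are hypothetical, and the sample lines are copied from the first excerpt.

```python
from collections import defaultdict
from datetime import datetime

# A few lines copied from the listener.log excerpt above.
SAMPLE = """\
08-NOV-2017 09:28:18 * service_update * breath1 * 0
08-NOV-2017 09:33:56 * service_update * +ASM1 * 0
08-NOV-2017 09:33:57 * service_update * breath1 * 0
08-NOV-2017 09:38:19 * service_update * breath1 * 0
"""


def parse_events(text, event="service_update"):
    """Return {instance: [timestamps]} for log lines of the given event."""
    events = defaultdict(list)
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(" * ")]
        # Skip anything that is not a 4-field line of the wanted event type.
        if len(parts) != 4 or parts[1] != event:
            continue
        # %b expects English month abbreviations (assumes an English locale).
        ts = datetime.strptime(parts[0], "%d-%b-%Y %H:%M:%S")
        events[parts[2]].append(ts)
    return events


def gaps_in_seconds(timestamps):
    """Seconds between consecutive timestamps, in chronological order."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for instance, stamps in parse_events(SAMPLE).items():
        print(instance, gaps_in_seconds(stamps))
```

To run it against a real log, read the file contents and pass them to parse_events in place of SAMPLE.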
Configuration
Add the following to the tnsnames.ora file on all nodes:
[oracle@breath01 ~]$ cat $ORACLE_HOME/network/admin/tnsnames.ora
BREATH_REMOTE =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
)
On one of the nodes, set the database remote_listener parameter.
SYS@breath1>alter system set remote_listener='BREATH_REMOTE' scope=both sid='*';
Check the parameter setting
SYS@breath1>show parameter remote_lis
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string BREATH_REMOTE
SYS@breath2>show parameter remote_lis
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string BREATH_REMOTE
Listener status (node 1; the other nodes look the same)
[grid@breath01 ~]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 09-NOV-2017 14:52:18
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 07-NOV-2017 09:39:32
Uptime 2 days 5 hr. 12 min. 46 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/11.2.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/breath01/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.0.2.111)(PORT=1522)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.0.2.113)(PORT=1522)))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "breath" has 2 instance(s).
Instance "breath1", status READY, has 2 handler(s) for this service...
Instance "breath2", status READY, has 1 handler(s) for this service...
Service "breathXDB" has 2 instance(s).
Instance "breath1", status READY, has 1 handler(s) for this service...
Instance "breath2", status READY, has 1 handler(s) for this service...
The command completed successfully
Log
[grid@breath01 ~]$ cat /u01/app/grid/diag/tnslsnr/breath01/listener/trace/listener.log | grep service_register | tail -10
02-NOV-2017 10:52:45 * service_register * breath1 * 0
02-NOV-2017 17:20:49 * service_register * breath1 * 0
03-NOV-2017 15:12:33 * service_register * LsnrAgt * 0
03-NOV-2017 15:13:01 * service_register * breath1 * 0
03-NOV-2017 15:13:03 * service_register * +ASM1 * 0
07-NOV-2017 09:39:32 * service_register * LsnrAgt * 0
07-NOV-2017 09:40:16 * service_register * +ASM1 * 0
07-NOV-2017 09:41:17 * service_register * breath1 * 0
09-NOV-2017 14:49:51 * service_register * breath1 * 0
09-NOV-2017 14:51:49 * service_register * breath2 * 0 ------ node 2 has registered with this listener
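Client-side configuration through tnsnames.ora
Depending on how existing connections are handled, this splits into two subtypes:
1. Client-Side Connect-Time Failover
Enable the FAILOVER=ON parameter in the client tnsnames.ora (it is on by default, so writing it out makes no difference). The client walks through all addresses in the DESCRIPTION and tries each one until a connection succeeds. However, if the instance behind an already established connection crashes, the client does notice it.
2. TAF
Enable the FAILOVER_MODE parameter in the client tnsnames.ora. When the instance behind an established connection crashes, the client automatically sends its requests to another healthy instance. The client does not notice the switch, which gives transparent failover.
Server-side configuration through a service: the client does not need to configure any parameter. All requests are handled by the service, which fails over according to its defined policy. A service specifies primary and standby instances. Connections normally go to the primary instance, and when that instance crashes they are moved transparently to the standby instance.
Important:
Do not set the GLOBAL_DBNAME parameter in the SID_LIST_listener_name section of the listener.ora file. A statically configured global database name disables TAF.
Client tnsnames.ora configuration: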
breath_test2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
(LOAD_BALANCE = ON)
(FAILOVER=ON)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = breath)
(FAILOVER_MODE=
(TYPE=session)
(METHOD=basic)
(RETRIES=180)
(DELAY=5)
)
)
)
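The connect-time failover behaviour, combined with the RETRIES and DELAY settings from FAILOVER_MODE, can be sketched as a loop over the address list. This is a simplified model rather than Oracle's client logic: try_connect is a hypothetical callable standing in for a real connection attempt, and fake_connect exists only to simulate nodes that are down.

```python
import time


def connect_with_failover(addresses, try_connect, retries=0, delay=0.0,
                          sleep=time.sleep):
    """Walk the address list until one attempt succeeds (simplified model).

    try_connect(address) is a hypothetical callable that returns a
    connection object or raises ConnectionError. After a full pass over
    the list fails, wait `delay` seconds and retry, up to `retries` times.
    """
    last_error = None
    for attempt in range(retries + 1):
        for address in addresses:
            try:
                return try_connect(address)
            except ConnectionError as exc:
                last_error = exc  # remember the failure, try the next one
        if attempt < retries:
            sleep(delay)
    raise ConnectionError(f"all addresses failed: {last_error}")


def fake_connect(down):
    """Build a stand-in connector that fails for the addresses in `down`."""
    def _connect(address):
        if address in down:
            raise ConnectionError(f"{address} unreachable")
        return f"session@{address}"
    return _connect


if __name__ == "__main__":
    vips = ["breath01-vip", "breath02-vip"]
    # Node 1 is down, so the connection lands on node 2.
    print(connect_with_failover(vips, fake_connect({"breath01-vip"})))
```

In this model, RETRIES=180 and DELAY=5 from the entry above would keep a client retrying for roughly 15 minutes before giving up.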
Test:
[oracle@breath admin]$ sqlplus system/oracle@breath_test2
SQL> select INSTANCE_NAME,HOST_NAME from v$instance;
INSTANCE_NAME HOST_NAME
---------------- ----------------------------------------------------------------
breath1 breath01.example.com
SQL> col MACHINE for a30
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;
MACHINE FAILOVER_TYPE FAILOVER_M FAI COUNT(*)
------------------------------ ------------- ---------- --- ----------
breath01.example.com NONE NONE NO 47
breath.example.com SESSION BASIC NO 1
SQL> select sid from v$mystat where rownum<2;
SID
----------
156
SQL> create table test1(id number,name varchar2(20));
Table created.
SQL> insert into test1 values(1,'breath_client');
1 row created.
SQL> commit;
commit
*
ERROR at line 1:
ORA-25405: transaction status unknown
SQL> select * from test1;
no rows selected
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;
MACHINE FAILOVER_TYPE FAILOVER_M FAI COUNT(*)
------------------------------ ------------- ---------- --- ----------
breath.example.com SESSION BASIC YES 1
breath02.example.com NONE NONE NO 48
The transcript does not show it, but the instance serving the session was presumably taken down between the insert and the commit. The commit fails with ORA-25405 and the inserted row is gone, because TAF re-establishes the session but does not carry over an uncommitted transaction. FAILED_OVER = YES confirms that the session was moved to the other instance.
Server Service TAF
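Server-side transparent failover is implemented by configuring a service; the client needs no configuration.
RAC performance can be improved in two ways: 1. Increase Cache Fusion capacity, for example with better interconnect hardware such as a gigabit private network or RDMA technology such as InfiniBand. 2. Reduce Cache Fusion traffic as much as possible by reducing the dependencies between instances. Services grew out of the second idea.
The application-layer load distribution mentioned earlier is also based on services: several services are created with different "preferred instances", and different applications connect to different services, which spreads the load.
A service can be created in several ways: dbca, OEM, or the srvctl command line. The srvctl approach is shown below.
A service cannot be created as the grid user; the attempt fails with the following error:
[grid@breath01 ~]$ srvctl add service -d breath -s breath_TAF1 -r "breath1" -a "breath2" -P basic -m basic -e select -w 3 -z 100
PRCD-1026 : Failed to create service breath_TAF1 for database breath
PRKH-1014 : Current user "grid" is not the oracle owner user "oracle" of oracle home "/u01/app/oracle/product/11.2.0/dbhome_1"
Switch to the oracle user: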
[grid@breath01 ~]$ su - oracle
Password:
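Create two services (to isolate workloads and spread the load):
breath_TAF1: preferred instance breath1, available instance breath2
breath_TAF2: preferred instance breath2, available instance breath1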
[oracle@breath01 ~]$ srvctl add service -d breath -s breath_TAF1 -r "breath1" -a "breath2" -P basic -m basic -e select -w 3 -z 100
[oracle@breath01 ~]$ srvctl add service -d breath -s breath_TAF2 -r "breath2" -a "breath1" -P basic -m basic -e select -w 3 -z 100
-d Unique name for the database
-s Service name (user-defined)
-r "<list>" Comma separated list of preferred instances
-a "<list>" Comma separated list of available instances
-P {NONE | BASIC | PRECONNECT} TAF policy specification (according to the author's notes, for TAF only PRECONNECT registers the service in the OCR)
-m Failover method (NONE or BASIC)
-e Failover type (NONE, SESSION, or SELECT)
-w Failover delay
-z Failover retries
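The preferred/available arrangement set by -r and -a can be pictured with a small model. This is only a sketch under simplifying assumptions: place_service is a hypothetical helper, it is stateless, and it ignores details of real Clusterware behaviour, such as a relocated service normally staying on its new instance until it is moved back.

```python
def place_service(preferred, available, up_instances):
    """Return the instances a service would run on (simplified model).

    The service runs on every preferred instance that is up. If none is
    up, it falls back to the first available instance that is up.
    """
    running = [inst for inst in preferred if inst in up_instances]
    if running:
        return running
    for inst in available:
        if inst in up_instances:
            return [inst]
    return []  # no instance can offer the service


# The two services created above, as (preferred, available) lists.
SERVICES = {
    "breath_TAF1": (["breath1"], ["breath2"]),
    "breath_TAF2": (["breath2"], ["breath1"]),
}

if __name__ == "__main__":
    for up in ({"breath1", "breath2"}, {"breath2"}):
        for name, (pref, avail) in SERVICES.items():
            print(sorted(up), name, "->", place_service(pref, avail, up))
```

With both instances up, each service stays on its own preferred instance, which gives the workload isolation; with breath1 down, both services end up on breath2.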
[grid@breath01 ~]$ srvctl start service -d breath -s breath_TAF1,breath_TAF2
[grid@breath01 ~]$ srvctl config service -d breath
Service name: breath_TAF1
Service is enabled
Server pool: breath_breath_TAF1
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 100
TAF failover delay: 3
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: breath1
Available instances: breath2
Service name: breath_TAF2
Service is enabled
Server pool: breath_breath_TAF2
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 100
TAF failover delay: 3
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: breath2
Available instances: breath1
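Modify the service_names parameter; by default each service is added on only one node. Be aware that Oracle's RAC documentation advises against setting service_names by hand for services managed through srvctl, because it makes every instance offer both services regardless of the preferred and available settings.
For example, on node 2: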
SYS@breath2>show parameter service
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string breath_TAF2
SYS@breath2>ALTER SYSTEM SET service_names='breath_TAF1,breath_TAF2' sid='*';
SYS@breath2>show parameter service
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
service_names string breath_TAF1,breath_TAF2
[grid@breath01 ~]$ lsnrctl status
Service "breath_TAF1" has 2 instance(s).
Instance "breath1", status READY, has 1 handler(s) for this service...
Instance "breath2", status READY, has 2 handler(s) for this service...
Service "breath_TAF2" has 2 instance(s).
Instance "breath1", status READY, has 1 handler(s) for this service...
Instance "breath2", status READY, has 2 handler(s) for this service...
The command completed successfully
[root@breath01 ~]# crsctl stat res -t
..........
ora.breath.breath_taf1.svc1 ONLINE ONLINE breath01
ora.breath.breath_taf2.svc1 ONLINE ONLINE breath02
View the service TAF policy through the dba_services view:
set line 300
col name format a15
col failover_method format a11 heading 'METHOD'
col failover_type format a10 heading 'TYPE'
col failover_retries format 9999999 heading 'RETRIES'
col FAILOVER_DELAY for 999999 heading 'DELAY'
col goal format a10
col clb_goal format a8
select name, failover_method, failover_type, failover_retries,FAILOVER_DELAY,goal, clb_goal from dba_services where name in ('breath_TAF1','breath_TAF2');
NAME METHOD TYPE RETRIES DELAY GOAL CLB_GOAL
--------------- ----------- ---------- -------- ------- ---------- --------
breath_TAF2 BASIC SELECT 100 3 NONE LONG
breath_TAF1 BASIC SELECT 100 3 NONE LONG
There are two ways to modify the service TAF policy:
1. srvctl modify service
Modify breath_TAF2:
[oracle@breath01 ~]$ srvctl modify service -d breath -s breath_TAF2 -P basic -m basic -e session -w 10 -z 50
SYS@breath1>select name, failover_method, failover_type, failover_retries,FAILOVER_DELAY,goal, clb_goal from dba_services where name in ('breath_TAF1','breath_TAF2');
NAME METHOD TYPE RETRIES DELAY GOAL CLB_GOAL
--------------- ----------- ---------- -------- ------- ---------- --------
breath_TAF2 BASIC SESSION 50 10 NONE LONG
breath_TAF1 BASIC SELECT 100 3 NONE LONG
2. Through the dbms_service package
Modify breath_TAF1:
begin
  dbms_service.modify_service(
    service_name     => 'breath_TAF1',
    failover_method  => dbms_service.failover_method_basic,
    failover_type    => dbms_service.failover_type_session,
    failover_retries => 180,
    failover_delay   => 5
  );
end;
/
SYS@breath1>select name, failover_method, failover_type, failover_retries,FAILOVER_DELAY,goal, clb_goal from dba_services where name in ('breath_TAF1','breath_TAF2');
NAME METHOD TYPE RETRIES DELAY GOAL CLB_GOAL
--------------- ----------- ---------- -------- ------- ---------- --------
breath_TAF2 BASIC SESSION 50 10 NONE LONG
breath_TAF1 BASIC SESSION 180 5 NONE LONG
Test:
Client configuration
breath_TAF1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
(LOAD_BALANCE = OFF)
(FAILOVER=OFF) ---- disables client-side FAILOVER to make the test easier to read
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = breath_TAF1)
)
)
breath_TAF2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = breath01-vip)(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = breath02-vip)(PORT = 1522))
(LOAD_BALANCE = OFF)
(FAILOVER=OFF)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = breath_TAF2)
)
)
[oracle@breath admin]$ sqlplus system/oracle@breath_taf1
SYSTEM@breath_taf1>select INSTANCE_NAME,VERSION from v$instance;
INSTANCE_NAME VERSION
---------------- -----------------
breath1 11.2.0.3.0
SYSTEM@breath_taf1>select sid,failover_type,failover_method,failed_over from v$session where sid=(select sid from v$mystat where rownum=1);
SID FAILOVER_TYPE FAILOVER_M FAI
---------- ------------- ---------- ---
27 SESSION BASIC NO
On the server, stop the breath1 instance
[grid@breath01 ~]$ srvctl stop instance -d breath -i breath1
Back in the same client session window (which was not disconnected):
SYSTEM@breath_taf1>select INSTANCE_NAME,VERSION from v$instance;
INSTANCE_NAME VERSION
---------------- -----------------
breath2 11.2.0.3.0
SYSTEM@breath_taf1>select sid,failover_type,failover_method,failed_over from v$session where sid=(select sid from v$mystat where rownum=1);
SID FAILOVER_TYPE FAILOVER_M FAI
---------- ------------- ---------- ---
28 SELECT BASIC YES
FAILED_OVER is now YES, which confirms that the session went through a failover.