Oracle High Availability: A Summary of RAC, Data Guard, and Streams

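Oracle Database high availability rests mainly on three component technologies: RAC, Data Guard, and Streams.

First, let's see how the official documentation describes RAC, Data Guard (DG), and Streams.

The following excerpts are taken from the Oracle 12c official documentation.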

Real Application Clusters Administration and Deployment Guide

Overview of Oracle RAC

Provides an introduction to Oracle RAC and its functionality.

Non-cluster Oracle databases have a one-to-one relationship between the Oracle database and the instance. Oracle RAC environments, however, have a one-to-many relationship between the database and instances. An Oracle RAC database can have up to 100 instances,[1] all of which access one database. All database instances must use the same interconnect, which can also be used by Oracle Clusterware.

[1] With Oracle Database 10g release 2 (10.2) and later releases, Oracle Clusterware supports 100 nodes in an Oracle Clusterware standard Cluster, with the option to run 100 database instances belonging to one production database on these nodes.

Oracle RAC is a unique technology that provides high availability and scalability for all application types. The Oracle RAC infrastructure is also a key component for implementing the Oracle enterprise grid computing architecture. Having multiple instances access a single database prevents the server from being a single point of failure. Oracle RAC enables you to combine smaller commodity servers into a cluster to create scalable environments that support mission critical business applications. Applications that you deploy on Oracle RAC databases can operate without code changes.

Data Guard Concepts and Administration

1 Introduction to Oracle Data Guard

Oracle Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions. Oracle Data Guard maintains these standby databases as copies of the production database. Then, if the production database becomes unavailable because of a planned or an unplanned outage, Oracle Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage. Oracle Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability. Oracle Data Guard transport services are also used by other Oracle features such as Oracle Streams and Oracle GoldenGate for efficient and reliable transmission of redo from a source database to one or more remote destinations.

With Oracle Data Guard, administrators can optionally improve production database performance by offloading resource-intensive backup and reporting operations to standby systems.

Oracle Streams High Availability Environments

13.1 Overview of Oracle Streams High Availability Environments

Configuring a high availability solution requires careful planning and analysis of failure scenarios. Database backups and physical standby databases provide physical copies of a source database for failover protection. Oracle Data Guard, in SQL apply mode, implements a logical standby database in a high availability environment. Because Oracle Data Guard is designed for a high availability environment, it handles most failure scenarios. However, some environments might require the flexibility available in Oracle Streams, so that they can take advantage of the extended feature set offered by Oracle Streams.

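This passage is key: it explains what database backups, Data Guard, and Streams each do and when to use them. Oracle's guidance here is to use a Data Guard logical standby in SQL apply mode for database-level high availability.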

13.2.1 Oracle Streams Replica Database

Like Oracle Data Guard in SQL apply mode, Oracle Streams can capture database changes, propagate them to destinations, and apply the changes at these destinations. Oracle Streams is optimized for replicating data. Oracle Streams can capture changes at a source database, and the captured changes can be propagated asynchronously to replica databases. This optimization can reduce the latency and can enable the replicas to lag the primary database by no more than a few seconds.

Nevertheless, you might choose to use Oracle Streams to configure and maintain a logical copy of your production database. Although using Oracle Streams might require additional work, it offers increased flexibility that might be required to meet specific business requirements. A logical copy configured and maintained using Oracle Streams is called a replica, not a logical standby, because it provides many capabilities that are beyond the scope of the normal definition of a standby database. Some of the requirements that can best be met using an Oracle Streams replica are listed in the following sections.

13.2.1.1 Updates at the Replica Database

The greatest difference between a replica database and a standby database is that a replica database can be updated and a standby database cannot. Applications that must update data can run against the replica, including jobs and reporting applications that log reporting activity. Replica databases also allow local applications to operate autonomously, protecting local applications from WAN failures and reducing latency for database operations.

13.2.1.2 Heterogeneous Platform Support

The production and the replica do not need to be running on the exact same platform. This provides more flexibility in using computing assets, and facilitates migration between platforms.

13.2.1.3 Multiple Character Sets

Oracle Streams replicas can use different character sets than the production database. Data is automatically converted from one character set to another before being applied. This ability is extremely important if you have global operations and you must distribute data in multiple countries.

13.2.1.4 Mining the Online Redo Logs to Minimize Latency

If the replica is used for near real-time reporting, Oracle Streams can lag the production database by no more than a few seconds, providing up-to-date and accurate queries. Changes can be read from the online redo logs as the logs are written, rather than from the redo logs after archiving.

13.2.1.5 Fast Failover

Oracle Streams replicas can be open to read/write operations at all times. If a primary database fails, then Oracle Streams replicas are able to instantly resume processing. A small window of data might be left at the primary database, but this data will be automatically applied when the primary database recovers. This ability can be important if you value fast recovery time over no lost data. Assuming the primary database can eventually be recovered, the data is only temporarily unavailable.

13.2.1.6 Single Capture for Multiple Destinations

In a complex environment, changes need only be captured once. These changes can then be sent to multiple destinations. When a capture process is used to capture changes, this ability enables more efficient use of the resources needed to mine the redo logs for changes.

13.2.2 When Not to Use Oracle Streams

As mentioned previously, there are scenarios in which you might choose to use Oracle Streams to meet some of your high availability requirements. One of the rules of high availability is to keep it simple. Oracle Data Guard is designed for high availability and is easier to implement than an Oracle Streams-based high availability solution. If you decide to leverage the flexibility offered by Oracle Streams, then you must be prepared to invest in the expertise and planning required to make an Oracle Streams-based solution robust. You might need to write scripts to implement much of the automation and management tools provided with Oracle Data Guard.


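As the highlighted passages above show, Oracle's own position is somewhat conflicted: one rule of high availability is to keep it simple, so Data Guard is recommended, yet in certain special scenarios Oracle Streams offers more flexibility. I once read an article claiming that Streams had been deprecated because it was hard to use, impractical, and rarely adopted, but the latest official white paper still covers the related technologies, Oracle Advanced Streams and GoldenGate.

In China, RAC plus Data Guard is the most common combination, and Streams is almost never used.

The sections below look at RAC, Data Guard, Streams, and dual-host hot standby in more detail.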

RAC

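In a non-cluster Oracle database, the database and the instance have a one-to-one relationship. In an Oracle RAC environment, the relationship between the database and its instances is one-to-many. An Oracle RAC database can have up to 100 instances (Oracle Database 10g Release 2 and later), all of which access the same database. All instances must use the same interconnect, which Oracle Clusterware can also use.

Oracle RAC is a unique technology that provides high availability and scalability for all types of applications. The Oracle RAC infrastructure is also a key component of the Oracle enterprise grid computing architecture. Because multiple instances access a single database, the server is no longer a single point of failure. Oracle RAC lets you combine small commodity servers into a scalable environment that supports mission-critical business applications.

To summarize, Oracle RAC provides instance-level redundancy to remove the server as a single point of failure, which delivers high availability and scalability, and it can cut costs by building the cluster from several small commodity servers.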

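Why use RAC
With Oracle Real Application Clusters (RAC), you can take full advantage of clusters of low-cost, standard, modular servers such as blade servers. RAC provides automatic workload management for services. A service is a group or classification of applications, made up of business components that correspond to application workloads. Services in RAC enable continuous, uninterrupted database operation and support multiple services on multiple instances. You assign a service to run on one or more instances, plus alternate instances that act as backups. If a primary instance fails, Oracle moves the service from the failed instance to a surviving alternate instance. Oracle also automatically balances connection load across the instances hosting a service. RAC uses multiple low-cost computers as if they were one large computer for database processing, and Oracle positions it as the only viable alternative to large SMP boxes for all types of applications. Because RAC is based on a shared-disk architecture, it can grow and shrink on demand without manually partitioning data among the servers in the cluster. RAC also offers push-button addition and removal of servers in a cluster, so servers can easily be added to or removed from the database.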
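
The service behavior described above (preferred instances, alternate instances, relocation on failure, and connection balancing) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Service` and `Cluster` classes and the instance names are hypothetical and are not part of any Oracle API, and "least connections" stands in for whatever balancing policy a real cluster applies.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical model of a RAC service (not an Oracle API)."""
    name: str
    preferred: list   # instances the service normally runs on
    available: list   # alternate instances used only after a failure
    running_on: set = field(default_factory=set)


class Cluster:
    def __init__(self, instances):
        self.up = set(instances)
        self.services = []
        self.connections = {name: 0 for name in instances}

    def add_service(self, service):
        # Start the service on every preferred instance that is currently up.
        service.running_on = {i for i in service.preferred if i in self.up}
        self.services.append(service)

    def fail_instance(self, instance):
        """Mark an instance as down and relocate the services it hosted."""
        self.up.discard(instance)
        for svc in self.services:
            if instance in svc.running_on:
                svc.running_on.remove(instance)
                # Use the first surviving alternate that is not already hosting the service.
                for alt in svc.available:
                    if alt in self.up and alt not in svc.running_on:
                        svc.running_on.add(alt)
                        break

    def connect(self, service_name):
        """Route a new connection to the least-loaded instance hosting the service."""
        svc = next(s for s in self.services if s.name == service_name)
        if not svc.running_on:
            raise RuntimeError(f"service {service_name} is not running anywhere")
        target = min(sorted(svc.running_on), key=lambda i: self.connections[i])
        self.connections[target] += 1
        return target


cluster = Cluster(["rac1", "rac2", "rac3"])
oltp = Service("oltp", preferred=["rac1", "rac2"], available=["rac3"])
cluster.add_service(oltp)
cluster.connect("oltp")        # goes to rac1
cluster.connect("oltp")        # goes to rac2
cluster.fail_instance("rac1")  # the service relocates to rac3
```

After `rac1` fails, the service runs on `rac2` and `rac3`, and the next connection lands on the idle `rac3`.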

Data Guard

Oracle Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable production Oracle databases to survive disasters and data corruptions. Oracle Data Guard maintains these standby databases as copies of the production database.

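Oracle Data Guard is a software infrastructure with management, monitoring, and automation capabilities. It works with a production database and one or more standby databases to protect data against failures, errors, and corruption that could otherwise destroy the database. It provides tools that automate the creation, management, and monitoring of the databases and other components in a Data Guard configuration, thereby protecting critical data. It also automates the process of maintaining a copy of an Oracle production database, called a standby database, which can be used when the production database is taken offline for routine maintenance or becomes corrupted. In a Data Guard configuration the production database is called the primary database. A standby database is a synchronized copy of the primary database. Using a backup copy of the primary database, you can create from one to nine standby databases. The primary database and its standby databases together form a Data Guard configuration. Each standby database is associated with only one primary database.
Note: The cascaded redo log destinations feature can be used to include more than nine standby databases in a configuration. The limit of nine reflects older releases; Oracle Database 11g Release 2 and later support up to 30 standby databases.
It is strongly recommended that standby redo log files be configured on all databases in a Data Guard configuration, including the primary database, to facilitate role transitions.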
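
To make the primary/standby relationship and role transitions more concrete, here is a minimal simulation. It is a sketch under stated assumptions, not how Data Guard is implemented: the `Database` and `DataGuardConfig` classes are hypothetical, redo is modeled as a plain list of change labels, and `failover` simply promotes the standby that has received the most redo.

```python
class Database:
    """Hypothetical stand-in for one database in a Data Guard configuration."""

    def __init__(self, name, role):
        self.name = name
        self.role = role      # "primary" or "standby"
        self.applied = []     # redo records applied so far


class DataGuardConfig:
    def __init__(self, primary_name):
        self.primary = Database(primary_name, "primary")
        self.standbys = []

    def add_standby(self, name):
        standby = Database(name, "standby")
        # A standby starts out as a copy of the primary (created from a backup).
        standby.applied = list(self.primary.applied)
        self.standbys.append(standby)
        return standby

    def commit(self, change, reachable=None):
        """Apply a change on the primary and ship redo to the reachable standbys."""
        self.primary.applied.append(change)
        for standby in self.standbys:
            if reachable is None or standby.name in reachable:
                # Send everything the standby is missing, which also closes earlier gaps.
                missing = self.primary.applied[len(standby.applied):]
                standby.applied.extend(missing)

    def switchover(self, name):
        """Planned role change: the target receives all outstanding redo first."""
        target = next(s for s in self.standbys if s.name == name)
        target.applied.extend(self.primary.applied[len(target.applied):])
        old_primary = self.primary
        old_primary.role, target.role = "standby", "primary"
        self.standbys.remove(target)
        self.standbys.append(old_primary)
        self.primary = target

    def failover(self):
        """Unplanned role change: promote the standby holding the most redo."""
        target = max(self.standbys, key=lambda s: len(s.applied))
        lost = len(self.primary.applied) - len(target.applied)
        target.role = "primary"
        self.standbys.remove(target)
        self.primary = target
        return lost  # number of changes that never reached the new primary


config = DataGuardConfig("prod")
config.add_standby("stby1")
config.add_standby("stby2")
config.commit("tx1")
config.commit("tx2", reachable={"stby1"})  # stby2 is temporarily unreachable
config.commit("tx3", reachable=set())      # no standby receives this change
lost_changes = config.failover()           # stby1 becomes the new primary
```

The difference between the two role changes is the point of the sketch: a switchover is planned and loses nothing, while a failover can lose whatever redo had not yet been shipped.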


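To summarize, RAC provides instance-level high availability, but the data still lives in one place on one shared storage system. Oracle Data Guard adds database-level redundancy and closes the data-safety gap that RAC leaves open: copies of the database are kept on different storage at different sites, so a natural disaster or storage failure no longer makes the database unrecoverable, and the production system can stay online and available.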

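One remaining caveat is that Data Guard places strict requirements on the Oracle Database software version and on related configuration: the databases in a Data Guard configuration are expected to run the same database version and use consistent directory layouts (or have differing paths mapped explicitly), and the parameters for the primary database, the standby databases, and role switching must be set sensibly.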

Streams

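A stream is a flow of information, either within a database or from one database to another. Oracle Streams is a set of processes and database structures used to share data and messages in a data stream. The unit of information placed in a stream is called an event:
• DDL or DML changes formatted as LCRs (logical change records)
• User-created events
Events are staged in queues and propagated between queues.
Most people think of Oracle Streams as a replication system in which every database is updatable, regardless of platform or version. Its features include:
• All sites are active and updatable
• Automatic conflict detection with optional resolution
• Support for data transformation
• Flexible configurations: multi-way, hub-and-spoke, and so on
• Different database platforms, versions, and schemas
• High availability for applications (update conflicts can be avoided or managed)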
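
The capture, staging, propagation, and apply flow, together with update-conflict detection, can be sketched as follows. This is an illustrative model only: the `LCR` fields and the `Site` class are hypothetical simplifications rather than the Oracle Streams API, and a conflict is detected here by comparing the row's current value with the old value carried in the change record.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class LCR:
    """Simplified logical change record for a single-row update (hypothetical shape)."""
    table: str
    key: int
    old_value: object
    new_value: object


class Site:
    def __init__(self, name):
        self.name = name
        self.rows = {}         # (table, key) -> value
        self.queue = deque()   # staging queue for captured events
        self.conflicts = []    # remote changes that could not be applied

    def update(self, table, key, value):
        """Local change: modify the row and capture it as an LCR in the queue."""
        old = self.rows.get((table, key))
        self.rows[(table, key)] = value
        self.queue.append(LCR(table, key, old, value))

    def propagate(self, destinations):
        """Drain the local queue, delivering each event to every destination."""
        while self.queue:
            lcr = self.queue.popleft()
            for dest in destinations:
                dest.apply(lcr)

    def apply(self, lcr):
        """Apply a remote change; a mismatch on the old value is an update conflict."""
        current = self.rows.get((lcr.table, lcr.key))
        if current != lcr.old_value:
            self.conflicts.append(lcr)
            return False
        # Applied changes are not re-captured, so they do not loop back to the source.
        self.rows[(lcr.table, lcr.key)] = lcr.new_value
        return True


site_a, site_b = Site("A"), Site("B")
site_a.update("emp", 1, "alice")
site_a.propagate([site_b])            # B now holds the same row

site_a.update("emp", 1, "alice-A")    # both sites change the same row...
site_b.update("emp", 1, "alice-B")
site_a.propagate([site_b])            # ...so B records a conflict instead of applying
```

What happens to a recorded conflict (keep the local value, overwrite it, or merge) is left to the caller here, which mirrors the "optional resolution" item in the list above.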


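To summarize, Oracle Data Guard can do a great deal and fits many scenarios, but some scenarios fall outside its reach. That is where Oracle Streams comes in: it is not tied to a particular database platform or version, and it provides database replication and backup at the data-stream level.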

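Dual-host hot standby
Dual-host hot standby refers specifically to hot standby (high availability) between two servers in a high-availability system. The name comes from the fact that two-node high availability is so widely used in China. By switchover behavior, two-node high availability comes in two modes: active-standby and active-active. In active-standby mode, one server is in the active state for a given service while the other server is in the standby state for that service. In active-active mode, two different services run on the two servers, and each server acts as the standby for the other (that is, Active-Standby for one service and Standby-Active for the other).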
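
The two modes differ only in how services are placed on the two servers, which a short sketch can show. The `HACluster` class and the node and service names below are hypothetical, and failure detection (heartbeats and so on) is assumed to happen elsewhere and is reduced to a `node_failed` call.

```python
class HACluster:
    """Two-node failover cluster; node and service names are hypothetical."""

    def __init__(self, placement):
        # placement maps service -> (active node, standby node)
        self.placement = dict(placement)
        self.down = set()

    def node_failed(self, node):
        """Record a node failure and move its active services to their standby node."""
        self.down.add(node)
        for service, (active, standby) in list(self.placement.items()):
            if active == node and standby not in self.down:
                # Swap roles: the standby node takes over the service.
                self.placement[service] = (standby, active)

    def active_node(self, service):
        """Return the node currently serving the service, or None if it is offline."""
        active, _ = self.placement[service]
        return None if active in self.down else active


# Active-standby: node2 is idle until node1 fails.
active_standby = HACluster({"db": ("node1", "node2")})
active_standby.node_failed("node1")

# Active-active: each node runs one service and backs up the other.
active_active = HACluster({"db": ("node1", "node2"), "app": ("node2", "node1")})
active_active.node_failed("node2")
```

In the active-active case both services end up on `node1` after `node2` fails, so the surviving server has to be sized to carry both workloads.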

Dual-host hot standby is built in three main ways: shared storage (a disk array), full redundancy, and replication.

Shared storage (disk array)

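With shared storage, the disk array guarantees data integrity and continuity after a switchover. User data normally resides on the disk array, and when the primary host goes down the standby host carries on using the same data on the array.

Figure: the traditional single-storage approach

Because this approach relies on a single storage device, practitioners often point to the disk array as a single point of failure. Storage is generally quite reliable, however, so if storage failure is set aside this is also the most widely adopted hot standby approach in the industry.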

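As this shows, dual-host hot standby on shared storage looks a lot like RAC, but RAC is the better of the two for high availability: a hot standby architecture cannot use the resources of the standby host, whereas RAC can put every node to work under the control of its management components, for example by defining rules so that DML runs on one instance's server and queries run on another's.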

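Full redundancy

Full redundancy means two hosts and two storage systems. Traditional dual-host hot standby on a single storage device does have a storage single point of failure, and storage high availability, which adds storage redundancy, is gaining acceptance among users. Dual-host hot standby was originally designed to handle planned downtime and unplanned outages of servers; it does nothing about server downtime caused by planned or unplanned outages of the storage. Since the storage is the only device holding data in a dual-host hot standby setup, a storage failure usually brings down the whole system.

Storage hot standby

With advances in technology and the growth of cloud storage and cloud computing, storage hot standby has matured and is developing rapidly, and dual-host hot standby has moved on to a fully redundant form with no single point of failure.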

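This approach has the following characteristics:

1. Data is replicated directly between the storage systems rather than over the network.

2. Replication between the two storage systems is fully real time, with no delay.

3. Switching between the primary and standby storage takes less than 500 ms, so the system sees no storage delay.

4. Drive letters and partitions do not change when the primary and standby storage switch.

5. Server switchover does not affect initialization, incremental synchronization, or data replication between the storage systems.

6. Planned downtime of one storage device does not affect the operation of the dual-host hot standby system as a whole.

7. The storage devices use data deduplication to perform incremental synchronization.

8. It is a truly fully redundant solution that stays available 24x7, including across switchovers.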
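
Item 7 does not say how storage vendors implement deduplication-based incremental synchronization, so the following is only one plausible illustration: compare per-block digests and copy just the blocks whose digest differs. The function names, the tiny `BLOCK_SIZE`, and the use of SHA-256 are all assumptions made for this sketch.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def block_digests(data: bytes):
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def incremental_sync(primary: bytes, replica: bytes):
    """Return the updated replica and the indexes of the blocks that were copied."""
    src = block_digests(primary)
    dst = block_digests(replica)
    blocks = []
    copied = []
    for index, digest in enumerate(src):
        start = index * BLOCK_SIZE
        if index < len(dst) and dst[index] == digest:
            # Digest matches: the replica already holds this block.
            blocks.append(replica[start:start + BLOCK_SIZE])
        else:
            # Changed or new block: copy it from the primary.
            blocks.append(primary[start:start + BLOCK_SIZE])
            copied.append(index)
    return b"".join(blocks), copied


primary_volume = b"AAAABBBBCCCCDD"
replica_volume = b"AAAAXXXXCCCC"
synced, copied_blocks = incremental_sync(primary_volume, replica_volume)
```

Only the changed second block and the new fourth block are transferred; the two unchanged blocks are reused from the replica.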

Replication

This approach relies on data synchronization to keep the data on the primary and standby servers consistent.

To summarize,

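References:

1. Oracle 12c official documentation

2. Baidu Baike

3. Oracle 18c Maximum Availability Architecture

https://www.oracle.com/technetwork/database/availability/maximum-availability-wp-18c-4403435.pdf

4. Rosanu, "oracle集群(RAC)和主备数据同步(DataGuard)思路" (Oracle clustering (RAC) and primary/standby data synchronization (Data Guard): an approach)

https://blog.csdn.net/rosanu_blog/article/details/68108262

5. Dave, "Oracle Data Guard 理论知识" (Oracle Data Guard theory)

This article is a bit dated. It was written in the 10g era, and although it has been updated with some 11g changes, it is not entirely reliable: some of the theory it covers has since changed or been deprecated. It is still worth a read if you want the theory.

https://blog.csdn.net/tianlesoftware/article/details/5514082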
 
