This article explains how to build a PostgreSQL HA architecture that preserves data consistency by using chained cascading replication.
First of all, asynchronous replication is expected to cause 'time-delay data inconsistency', and some types of applications can accept this kind of inconsistency. Within seconds, the data on the standby machine becomes consistent with the master machine.
On the other hand, 'failure caused data inconsistency' may not be acceptable to applications. After a failover, an asynchronously replicated standby machine may contain a transaction commit WAL record which is NOT in the WAL records of the promoted master machine. The async standby machine may already have served that transaction's data to a customer through a read-only query.
Consider the following "parallel replication" topology:
A Master machine - MA1
A hot-standby machine connected to MA1 through sync replication - SB1
Another hot-standby machine connected to MA1 through async replication - SB2
Suppose the following worst-case scenario happens:
Here, T1, T2,… represent points in time on the timeline
T1. MA1 issues a Commit for a transaction (xid: TX1)
T2. MA1 flushes the TX1 WAL records to its on-disk WAL
T3. The WAL sender from MA1 to SB1 is corrupted, and MA1 CANNOT send the TX1 WAL records to SB1
T4. MA1 sends the TX1 WAL records to SB2 through another WAL sender (async replication)
T5. TX1 is NOT yet visible as committed on MA1; MA1 is still waiting for the ack from SB1
T6. SB2 applies the TX1 WAL records and answers a read-only query
T7. MA1 fails
T8. Somehow, SB2 fails as well
T9. Failover executes; SB1 becomes the new master, call it MA-SB1
Because the WAL sender from MA1 to SB1 was corrupted at T3, there is no commit record for transaction TX1 on MA-SB1. Since SB1 is now the new master MA-SB1, SB2 should sync with MA-SB1. However, at T6, SB2 applied the TX1 WAL records and answered a read-only query. This is a case of 'failure caused data inconsistency'! The application has to detect and rectify this kind of data inconsistency itself.
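The timeline above can be replayed with a small toy model. This is only a sketch: each machine's WAL is reduced to a plain list of transaction ids, and the function name and variables are made up for illustration; nothing here talks to a real PostgreSQL server.

```python
def parallel_scenario():
    """Replay T1..T9 of the parallel topology on a toy model.

    Each machine's WAL is modelled as a list of transaction ids.
    """
    wal = {"MA1": [], "SB1": [], "SB2": []}

    wal["MA1"].append("TX1")           # T1-T2: MA1 flushes the TX1 commit record
    ma1_to_sb1_ok = False              # T3: WAL sender MA1 -> SB1 is broken
    if ma1_to_sb1_ok:
        wal["SB1"].append("TX1")
    wal["SB2"].append("TX1")           # T4: the async WAL sender MA1 -> SB2 still works
    seen_by_reader = list(wal["SB2"])  # T6: SB2 answers a read-only query

    new_master = "SB1"                 # T7-T9: MA1 and SB2 fail, SB1 is promoted
    # Transactions a customer has already seen but the new master never received
    orphaned = [xid for xid in seen_by_reader if xid not in wal[new_master]]
    return wal, orphaned


wal, orphaned = parallel_scenario()
print(orphaned)  # ['TX1']
```

The non-empty result is exactly the 'failure caused data inconsistency': TX1 was served to a reader by SB2 but does not exist on MA-SB1.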
Consider the following "chained cascading replication" topology:
A Master machine - MA1
A hot-standby machine connected to MA1 through sync replication - SB1
Another hot-standby machine connected to SB1 through cascading async replication - SB2
Now the same scenario happens:
Here, T1, T2,… represent points in time on the timeline
T1. MA1 issues a Commit for a transaction (xid: TX1)
T2. MA1 flushes the TX1 WAL records to its on-disk WAL
T3. The WAL sender from MA1 to SB1 is corrupted, and MA1 CANNOT send the TX1 WAL records to SB1
T4. SB2 can only receive WAL records from SB1 (async replication), so the on-disk WAL of SB2 is a SUBSET of the on-disk WAL of SB1
T5. TX1 is NOT yet visible as committed on MA1; MA1 is still waiting for the ack from SB1
T6. The TX1 WAL records DO NOT EXIST on SB2, so a read-only query about TX1 returns NO record
T7. MA1 fails
T8. Failover executes; SB1 becomes the new master, call it MA-SB1. The on-disk WAL of SB2 is still a SUBSET of the on-disk WAL of SB1
T9. SB2 is upgraded to sync replication from MA-SB1
The 'failure caused data inconsistency' that happened in the "parallel replication" topology CANNOT happen in the "chained cascading replication" topology, because SB2 only ever receives WAL records that SB1 already has.
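The subset property of the chained topology can be checked with a toy model as well. This is a minimal sketch under simplifying assumptions: the Node class, its pull() method and the link_ok flag are hypothetical names invented for this example, a WAL is just a list of transaction ids, and TX0 is an extra earlier transaction added only to make the lists non-empty.

```python
class Node:
    """Toy replication node: its WAL is a list of transaction ids."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream  # the node this one streams WAL from
        self.wal = []
        self.link_ok = True       # False models a broken WAL sender

    def pull(self):
        """Copy the WAL records the upstream has and this node lacks."""
        if self.upstream is None or not self.link_ok:
            return
        self.wal.extend(self.upstream.wal[len(self.wal):])


ma1 = Node("MA1")
sb1 = Node("SB1", upstream=ma1)  # sync replication from MA1
sb2 = Node("SB2", upstream=sb1)  # cascading async replication from SB1

ma1.wal.append("TX0")            # an earlier transaction, replicated normally
sb1.pull()
sb2.pull()

ma1.wal.append("TX1")            # T1-T2: MA1 flushes the TX1 commit record
sb1.link_ok = False              # T3: WAL sender MA1 -> SB1 is broken
sb1.pull()                       # SB1 receives nothing
sb2.pull()                       # T4: SB2 can only get what SB1 has

sb1.upstream = None              # T7-T8: MA1 fails, SB1 becomes MA-SB1

print(sb1.wal, sb2.wal)          # ['TX0'] ['TX0']
print(set(sb2.wal) <= set(sb1.wal))  # True
```

Because SB2 pulls only from SB1, its WAL can never get ahead of SB1's, so nothing SB2 has shown to a reader is missing on the promoted master.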
To make use of PostgreSQL hot-standby HA, a load-balancing component is needed. For PostgreSQL, pgpool2 is a popular one. I will talk about it in the next article.
Postgresql HA cluster-Sync Rep+pg_rewind - part 1
http://my.oschina.net/u/2399919/blog/469330