PGSQL Standby Replica

Standby replicas and quorum commit

Sync Standby

Under normal circumstances, PostgreSQL’s replication latency is a few tens of KB/10ms, which is negligible for regular business.

When the primary fails, data that has not yet completed replication will be lost! Replication latency can be a problem when dealing with critical and sophisticated business queries. Or, in a replica, immediately read-your-write after the primary writes, which can also be very sensitive to replication latency.

Sync standbys can solve such problems. A simple way to configure a sync standby is to use the pg_conf = crit template, which automatically enables synchronous replication.

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test
    pg_conf: crit.yml

After the cluster is created, you can also execute pg edit-config <cluster.name> on the meta node, edit the cluster configuration file, change the value of the synchronous_mode to true and apply it.

$ pg edit-config pg-test
---
+++
-synchronous_mode: false
+synchronous_mode: true
 synchronous_mode_strict: false

Apply these changes? [y/N]: y

Quorum Commit

By default, synchronous replication picks an instance from all candidate replicas as a sync standby. Any primary transaction is only considered successfully committed and returned when replicated to the replica and flushed to the disk. A quorum commit can be used if more persistent data is expected. For example, in a 1primary & 3 replicas cluster, at least two replicas successfully flush to disk before a commit is confirmed.

When using quorum commit, you need to modify the synchronous_standby_names in PostgreSQL and the value of synchronous_node_count in Patroni. Assuming that the three replicas are pg-test-2, pg-test-3, and pg-test-4, the following should be configured.

  • synchronous_standby_names = ANY 2 (pg-test-2, pg-test-3, pg-test-4)
  • synchronous_node_count : 2
pg-test:
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary } # pg-test-1
    10.10.10.11: { pg_seq: 2, pg_role: replica } # pg-test-2
    10.10.10.12: { pg_seq: 3, pg_role: replica } # pg-test-3
    10.10.10.13: { pg_seq: 4, pg_role: replica } # pg-test-4
  vars:
    pg_cluster: pg-test

Execute pg edit-config pg-test and modify the config as follows.

$ pg edit-config pg-test
---
+++
@@ -82,10 +82,12 @@
     work_mem: 4MB
+    synchronous_standby_names: 'ANY 2 (pg-test-2, pg-test-3, pg-test-4)'
 
-synchronous_mode: false
+synchronous_mode: true
+synchronous_node_count: 2
 synchronous_mode_strict: false

Apply these changes? [y/N]: y

After the application, the configuration takes effect, and two Sync Standby appear. When the cluster has Failover or expansion and contraction, please adjust these parameters to avoid service unavailability.

+ Cluster: pg-test (7080814403632534854) +---------+----+-----------+-----------------+
| Member    | Host        | Role         | State   | TL | Lag in MB | Tags            |
+-----------+-------------+--------------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.10 | Leader       | running |  1 |           | clonefrom: true |
| pg-test-2 | 10.10.10.11 | Sync Standby | running |  1 |         0 | clonefrom: true |
| pg-test-3 | 10.10.10.12 | Sync Standby | running |  1 |         0 | clonefrom: true |
| pg-test-4 | 10.10.10.13 | Replica      | running |  1 |         0 | clonefrom: true |
+-----------+-------------+--------------+---------+----+-----------+-----------------+

Last modified 2022-06-04: fii en docs batch 2 (61bf601)