Mastering PostgreSQL Replication: A Step-by-Step Guide for High Availability
Imagine this: It’s 3:00 AM. Your phone buzzes. The production database is down. The disk failed.
If you have a replication setup, you simply promote your standby server and go back to sleep. If you don't... well, you're in for a very long night of restoring from cold backups.
Replication is one of the most critical skills for any backend developer or DBA. It transforms your database from a single point of failure into a resilient, high-performance cluster.
In this guide, we’ll cover why you need it, the types available, and a complete step-by-step tutorial to set up a robust Primary-Standby architecture in 10 minutes.
Why Do We Need Replication?
Beyond just "sleeping better at night," replication solves three massive architectural problems:
- High Availability (HA): Hardware fails. AWS zones go down. Replication ensures that if your Primary node dies, a Standby node is ready to take over immediately.
- Read Scaling (Load Balancing): As your app grows,
SELECTqueries often become the bottleneck. Replication allows you to direct all write traffic (INSERT,UPDATE) to the Primary, while distributing read traffic across multiple Read Replicas. - Analytics Isolation: Running heavy analytical queries on your main database can kill performance for users. With replication, you can run those heavy reports on a Standby node without affecting the live application.
The Two Flavors: Physical vs. Logical
PostgreSQL offers two main ways to replicate data. Choosing the right one is crucial.
1. Physical Streaming Replication (The Standard)
This is the most common setup. It copies the physical "Write-Ahead Log" (WAL) files from the Primary to the Standby.
- Pros: It’s an exact binary clone. It’s extremely fast and easy to manage.
- Cons: The Standby is read-only. You cannot write to it. You also cannot replicate just one table; it’s all or nothing.
- Best For: High Availability and Disaster Recovery.
2. Logical Replication
This decodes the data changes into a stream of logical modifications (e.g., "Row inserted into Table A").
- Pros: Flexible. You can replicate specific tables or replicate between different OS versions (e.g., Linux to Windows). The subscriber database can be writable.
- Cons: Slightly higher overhead and complexity.
- Best For: Data migration (ETL) or moving data to data warehouses.
🏗️ The Tutorial: Setting Up Physical Replication
Let's get our hands dirty. We will set up Physical Streaming Replication.
The Setup
- Primary Server (Node A):
192.168.1.10 - Standby Server (Node B):
192.168.1.20 - OS: Ubuntu/Debian (Paths may vary slightly for CentOS/RHEL)
- Postgres Version: 14, 15, or 16 (Must be the same on both)
Step 1: Configure the Primary (Node A)
First, we need to create a user that has the special permission to stream WAL files.
Restart Postgres:
sudo systemctl restart postgresql
Edit pg_hba.conf:We need to explicitly allow the Standby server to connect. Add this line to the bottom of the file:
# TYPE DATABASE USER ADDRESS METHOD
host replication rep_user 192.168.1.20/32 md5
Edit postgresql.conf:Locate your config file (usually /etc/postgresql/15/main/postgresql.conf) and set these parameters:
# Allow the server to listen on all IPs
listen_addresses = '*'
# Enable replication mode
wal_level = replica
# Optional: How many standby connections allowed
max_wal_senders = 10
Create the Replication User:Log into Postgres and run:
CREATE USER rep_user WITH REPLICATION ENCRYPTED PASSWORD 'secret_password';
Step 2: Configure the Standby (Node B)
⚠️ Warning: This step involves wiping the data directory on the Standby. Ensure this is a fresh server or you have backups.
-h: Primary IP-U: Replication User-D: Standby Data Directory-R: Crucial. Creates thestandby.signalfile and connection settings automatically.
Start the Standby:
sudo systemctl start postgresql
The Magic Command: pg_basebackupThis utility copies the data from the Primary and configures the Standby for us automatically.
sudo -u postgres pg_basebackup -h 192.168.1.10 -D /var/lib/postgresql/15/main/ -U rep_user -P -R
(Type 'secret_password' when prompted)
Clear the Existing Data:
# Be very careful with this command!
sudo rm -rf /var/lib/postgresql/15/main/*
Stop Postgres on the Standby:
sudo systemctl stop postgresql
Step 3: Verification
Is it working? Let's check.
On the Primary Server, run this SQL query:
SELECT * FROM pg_stat_replication;
If you see a row with the state streaming, congratulations! You have a working cluster.
Test it out:
- Create a table on the Primary:
CREATE TABLE test (id int); - Go to the Standby and run:
SELECT * FROM test; - The table should be there instantly!
How to Handle a Failover
If your Primary server crashes, the Standby is still in "Read-Only" mode. To promote it to be the new Primary (writable), run this command on the Standby server:
sudo -u postgres pg_ctl promote -D /var/lib/postgresql/15/main/
The Standby will immediately sever ties with the old Primary and accept write connections.
Conclusion
Setting up replication used to be a complex task involving manual WAL file management. With tools like pg_basebackup, it’s now accessible to everyone.
Start with this simple Primary-Standby setup. As you scale, you can explore synchronous replication for zero-data-loss guarantees or logical replication for complex data flows.
Did this guide save your uptime? Follow for more !