Mastering PostgreSQL Replication: A Step-by-Step Guide for High Availability

Mastering PostgreSQL Replication: A Step-by-Step Guide for High Availability
Photo by Caspar Camille Rubin / Unsplash

Imagine this: It’s 3:00 AM. Your phone buzzes. The production database is down. The disk failed.

If you have a replication setup, you simply promote your standby server and go back to sleep. If you don't... well, you're in for a very long night of restoring from cold backups.

Replication is one of the most critical skills for any backend developer or DBA. It transforms your database from a single point of failure into a resilient, high-performance cluster.

In this guide, we’ll cover why you need it, the types available, and a complete step-by-step tutorial to set up a robust Primary-Standby architecture in 10 minutes.

Why Do We Need Replication?

Beyond just "sleeping better at night," replication solves three massive architectural problems:

  1. High Availability (HA): Hardware fails. AWS zones go down. Replication ensures that if your Primary node dies, a Standby node is ready to take over immediately.
  2. Read Scaling (Load Balancing): As your app grows, SELECT queries often become the bottleneck. Replication allows you to direct all write traffic (INSERT, UPDATE) to the Primary, while distributing read traffic across multiple Read Replicas.
  3. Analytics Isolation: Running heavy analytical queries on your main database can kill performance for users. With replication, you can run those heavy reports on a Standby node without affecting the live application.

The Two Flavors: Physical vs. Logical

PostgreSQL offers two main ways to replicate data. Choosing the right one is crucial.

1. Physical Streaming Replication (The Standard)

This is the most common setup. It copies the physical "Write-Ahead Log" (WAL) files from the Primary to the Standby.

  • Pros: It’s an exact binary clone. It’s extremely fast and easy to manage.
  • Cons: The Standby is read-only. You cannot write to it. You also cannot replicate just one table; it’s all or nothing.
  • Best For: High Availability and Disaster Recovery.

2. Logical Replication

This decodes the data changes into a stream of logical modifications (e.g., "Row inserted into Table A").

  • Pros: Flexible. You can replicate specific tables or replicate between different OS versions (e.g., Linux to Windows). The subscriber database can be writable.
  • Cons: Slightly higher overhead and complexity.
  • Best For: Data migration (ETL) or moving data to data warehouses.

🏗️ The Tutorial: Setting Up Physical Replication

Let's get our hands dirty. We will set up Physical Streaming Replication.

The Setup

  • Primary Server (Node A): 192.168.1.10
  • Standby Server (Node B): 192.168.1.20
  • OS: Ubuntu/Debian (Paths may vary slightly for CentOS/RHEL)
  • Postgres Version: 14, 15, or 16 (Must be the same on both)

Step 1: Configure the Primary (Node A)

First, we need to create a user that has the special permission to stream WAL files.

Restart Postgres:

sudo systemctl restart postgresql

Edit pg_hba.conf:We need to explicitly allow the Standby server to connect. Add this line to the bottom of the file:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    replication     rep_user        192.168.1.20/32         md5

Edit postgresql.conf:Locate your config file (usually /etc/postgresql/15/main/postgresql.conf) and set these parameters:

# Allow the server to listen on all IPs
listen_addresses = '*'

# Enable replication mode
wal_level = replica

# Optional: How many standby connections allowed
max_wal_senders = 10

Create the Replication User:Log into Postgres and run:

CREATE USER rep_user WITH REPLICATION ENCRYPTED PASSWORD 'secret_password';

Step 2: Configure the Standby (Node B)

⚠️ Warning: This step involves wiping the data directory on the Standby. Ensure this is a fresh server or you have backups.

    • -h: Primary IP
    • -U: Replication User
    • -D: Standby Data Directory
    • -R: Crucial. Creates the standby.signal file and connection settings automatically.

Start the Standby:

sudo systemctl start postgresql

The Magic Command: pg_basebackupThis utility copies the data from the Primary and configures the Standby for us automatically.

sudo -u postgres pg_basebackup -h 192.168.1.10 -D /var/lib/postgresql/15/main/ -U rep_user -P -R

(Type 'secret_password' when prompted)

Clear the Existing Data:

# Be very careful with this command!
sudo rm -rf /var/lib/postgresql/15/main/*

Stop Postgres on the Standby:

sudo systemctl stop postgresql

Step 3: Verification

Is it working? Let's check.

On the Primary Server, run this SQL query:

SELECT * FROM pg_stat_replication;

If you see a row with the state streaming, congratulations! You have a working cluster.

Test it out:

  1. Create a table on the Primary: CREATE TABLE test (id int);
  2. Go to the Standby and run: SELECT * FROM test;
  3. The table should be there instantly!

How to Handle a Failover

If your Primary server crashes, the Standby is still in "Read-Only" mode. To promote it to be the new Primary (writable), run this command on the Standby server:

sudo -u postgres pg_ctl promote -D /var/lib/postgresql/15/main/

The Standby will immediately sever ties with the old Primary and accept write connections.

Conclusion

Setting up replication used to be a complex task involving manual WAL file management. With tools like pg_basebackup, it’s now accessible to everyone.

Start with this simple Primary-Standby setup. As you scale, you can explore synchronous replication for zero-data-loss guarantees or logical replication for complex data flows.

Did this guide save your uptime? Follow for more !

Subscribe to Digital Lab

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.
jamie@example.com
Subscribe