Just somewhere to keep my notes while I'm playing.

Tuesday, December 22, 2009

Point-In-Time Recover in Postgresql

The way you do point in time recovery in Postgresql is to restore a known backup, and then reapply WAL logs until you reach the time you wish to stop at. The database must be in Archive Log Mode (just like Oracle), and you must have all the logs to complete the recovery.

Putting the database in Archive Log Mode


Step 1:
Back up your database
pg_dump -h localhost -p 5432 -U timj -F c -v -f "D:\pgbackup\cddb_21Dec2009_2.backup" cddb

Now let's put the database in archive log mode:

Edit postgresql.conf:
fsync = on
synchronous_commit = on
wal_sync_method = fsync
full_page_writes = on
archive_mode = on
archive_command = 'copy "%p" "D:\\pglogs\\%f"'
archive_timeout = 3600 # switch logs every hour

And remember to bounce the database.

Taking a base backup


I now have to take a base backup of my system.
To do this I logged in as postgres and issued the following:
SELECT pg_start_backup('label2');

I backed the database up using a normal file system backup with the datbase up, and then issued:
SELECT pg_stop_backup();

Next - put in a transaction that we will want recovering:

insert into artist (name) values ('Vicky Beeching');

As this is my first time doing this, I forced a log switch to help with the recovery!

SELECT pg_switch_xlog();

The Disaster!

After my backup, I created a record in the artist table that should be there post recovery. The table has a time trigger on it, so I knew exactky the time I wanted to recover to. I then deleted Nick Drake's entry from the artist table - could there be a bigger mistake than that!

The Recovery

Step 1: Take the database down. This may seem like the obvious first step, but there is a gotchya. The default checkpoint time-out in Postgresql is 5 minutes. This means that if you are relying on recoving your database up to the last good transaction, and if that transaction was less than 5 minutes before hand, it may not have been written to a log file in the pg_xlog directory. Therefore, it may be worth while logging on as an admin user and issuing the command checkpoint; before you shut down the instance.

Step 2: Take a backup!
With the database down, copy off the data directory and any other tablespaces you have in use.

Step 3: Delete your database!
We are going to rebuild the database using the good copy we created in our initial backup, the archived WAL logs and the WAL logs from pg_xlog, so we need to get rid of everything in the existing database.

Step 4: Copy across your backed up database
Copy your good backup into the live data directory.

Step 5: Delete everything from pg_xlog
The files in the restored pg_xlog directory will be from before the backup, and so of no use to you.

Step 6: Copy across the content of your pg_xlog directory
From the backup you took after you discovered you needed to recover, copy the files to your live pg_xlog directory.

Step 7: Create a recovery.conf file
Something like this:
restore_command = 'copy "D:\\pglogs\\%f" "%p"' # Windows
recovery_target_time ('2009-12-22 15:34:56')
The copy command should be in the syntax of whatever OS you are using, very much like the entry in postgresql.conf

Step 8: Restart the database
You might want to restrict who can access while you are checking the recovery.
At the end of the recovery recovery.conf is renamed recovery.done

Step 9: Take a fresh baseline baclup!
It is really good practice to do this before your users start posting transactions, if at all possible.

No comments: