PITR
PITR(Point-in-Time Recovery),基于时间点的恢复,可以恢复basebackup时,由pg_backup_start创建点REDO点到之后任意时间的数据。
一、PIRT的恢复流程
- 使用内部函数read_backup_label 从backup_label文件中读取CHECKPOINT LOCATION的值。
- 从postgresql.conf中读取restore_command和recovery_target_time的值。
- 用从CHECKPOINT LOCATION获取的REDO点重放 WAL 数据,执行restore_command中配置的命令,将归档日志从归档区域复制到临时区域,并从中读取 WAL 数据,复制到临时区域中的日志文件会在使用后被删除。
如果参数recovery_target_time设置了时间戳,会从REDO点读取并重放WAL数据,直到设置的时间戳为止。如果postgresql.conf中没有配置recovery_target,则将重放至归档日志的末尾。
- 当恢复过程完成时,会在pg_wal中创建时间线历史文件,如00000002.history。如果启用了日志归档功能,则还会在归档目录中创建相同的命名文件。
PIRT的恢复过程与常规的完全恢复过程基本一致,区别有两点:
- 读取WAL/归档日志的位置不同:常规的恢复模式下,从数据目录下的pg_wal中读取,PITR模式下,从 archive_command 指定的归档目录下获取
- 读取checkpoint的位置不同:常规的恢复模式下,从pg_control中读取,PITR模式下从backup_label文件中获取。
例如在上一次的全量恢复后,在12:15:00时间点又犯了一个错误,错误发生在时间线ID为2的数据库集簇上。在这种情况下,为了恢复数据库,需要在postgresql.conf中指定参数 recovery_target_time和recovery_target_timeline。重启PostgreSQL服务器并进入PITR模式,数据库会沿着时间线标识2进行恢复。
| recovery_target_time = "2023-2-1 12:15:00 GMT"
recovery_target_timeline = 2
|
When you do PITR more than once, you should explicitly set a timelineId for using the appropriate timeline history file.
二、PITR的恢复实验
基于上一篇已经恢复过一次的数据库进行操作。
准备数据
| postgres=# create table tbl(id int, name varchar(20));
CREATE TABLE
postgres=# insert into tbl values(1,'a'),(2,'b'),(3,'c');
INSERT 0 3
|
1. recovery_target_time
recovery_target_time(timestramp):指定按时间戳恢复。
1. 删除数据
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18 | postgres=# select current_timestamp;
current_timestamp
-------------------------------
2023-02-02 23:26:59.785955+08
(1 row)
postgres=# delete from tbl;
DELETE 3
postgres=# select * from tbl;
id | name
----+------
(0 rows)
postgres=# select pg_switch_wal(); -- 切换日志,保证wal日志同步到归档目录
pg_switch_wal
---------------
0/70134A0
(1 row)
|
2. 恢复数据
清理当前数据
| postgres@ubuntu:~$ pg_ctl stop
postgres@ubuntu:~$ rm -rf $PGDATA/*
|
恢复数据并指定recovery_target_time
| postgres@ubuntu:~$ touch $PGDATA/recovery.signal
postgres@ubuntu:~$ tar xf backup.tar/base.tar.gz -C $PGDATA
postgres@ubuntu:~$ tar xf backup.tar/pg_wal.tar.gz -C $PGDATA
postgres@ubuntu:~$ echo "recovery_target_time = '2023-02-02 23:26:59.785955+08'" >> $PGDATA/postgresql.conf
|
3. 启动数据库
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21 | postgres@ubuntu:~$ pg_ctl start
waiting for server to start....2023-02-02 23:30:06.035 CST [5328] LOG: starting PostgreSQL 12.13 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2023-02-02 23:30:06.036 CST [5328] LOG: listening on IPv4 address "127.0.0.1", port 5432
2023-02-02 23:30:06.054 CST [5328] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2023-02-02 23:30:06.095 CST [5329] LOG: database system was interrupted; last known up at 2023-01-30 22:52:08 CST
2023-02-02 23:30:06.126 CST [5329] LOG: restored log file "00000002.history" from archive
cp: cannot stat '/home/postgres/archive/00000003.history': No such file or directory
2023-02-02 23:30:06.131 CST [5329] LOG: starting point-in-time recovery to 2023-02-02 23:26:59.785955+08
2023-02-02 23:30:06.146 CST [5329] LOG: restored log file "00000002.history" from archive
2023-02-02 23:30:06.189 CST [5329] LOG: restored log file "000000010000000000000004" from archive
2023-02-02 23:30:06.231 CST [5329] LOG: redo starts at 0/4000028
2023-02-02 23:30:06.248 CST [5329] LOG: consistent recovery state reached at 0/4000138
2023-02-02 23:30:06.249 CST [5328] LOG: database system is ready to accept read only connections
2023-02-02 23:30:06.302 CST [5329] LOG: restored log file "000000010000000000000005" from archive
done
server started
postgres@ubuntu:~$ 2023-02-02 23:30:06.380 CST [5329] LOG: restored log file "000000020000000000000006" from archive
2023-02-02 23:30:06.441 CST [5329] LOG: restored log file "000000020000000000000007" from archive
2023-02-02 23:30:06.471 CST [5329] LOG: recovery stopping before commit of transaction 488, time 2023-02-02 23:27:19.159422+08
2023-02-02 23:30:06.471 CST [5329] LOG: recovery has paused
2023-02-02 23:30:06.471 CST [5329] HINT: Execute pg_wal_replay_resume() to continue.
|
连接到数据库验证数据是否恢复
| postgres=# select * from tbl; -- 删除的数据恢复成功
id | name
----+------
1 | a
2 | b
3 | c
(3 rows)
|
接触暂停状态
| postgres=# select pg_wal_replay_resume();
pg_wal_replay_resume
----------------------
(1 row)
|
2. recovery_target_name
recovery_target_name(string):从pg_create_restore_point()所创建恢复点,进行恢复。
1. 创建还原点
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21 | postgres=# select * from tbl;
id | name
----+------
1 | a
2 | b
3 | c
(3 rows)
postgres=# select pg_create_restore_point('rp1'); -- 创建还原点
pg_create_restore_point
-------------------------
0/7013400
(1 row)
postgres=# drop table tbl;
DROP TABLE
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/701AD88
(1 row)
|
2. 恢复数据
| postgres@ubuntu:~$ touch $PGDATA/recovery.signal
postgres@ubuntu:~$ tar xf backup.tar/base.tar.gz -C $PGDATA
postgres@ubuntu:~$ tar xf backup.tar/pg_wal.tar.gz -C $PGDATA
postgres@ubuntu:~$ echo "recovery_target_name = 'rp1'" >> $PGDATA/postgresql.conf
|
3. 启动数据库
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23 | postgres@ubuntu:~$ pg_ctl start
waiting for server to start....2023-02-02 23:48:09.528 CST [81978] LOG: starting PostgreSQL 12.13 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
2023-02-02 23:48:09.559 CST [81978] LOG: listening on IPv4 address "127.0.0.1", port 5432
2023-02-02 23:48:09.584 CST [81978] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2023-02-02 23:48:09.648 CST [81979] LOG: database system was interrupted; last known up at 2023-01-30 22:52:08 CST
2023-02-02 23:48:09.669 CST [81979] LOG: restored log file "00000002.history" from archive
2023-02-02 23:48:09.674 CST [81979] LOG: restored log file "00000003.history" from archive
cp: cannot stat '/home/postgres/archive/00000004.history': No such file or directory
2023-02-02 23:48:09.678 CST [81979] LOG: starting point-in-time recovery to "rp1"
2023-02-02 23:48:09.684 CST [81979] LOG: restored log file "00000003.history" from archive
2023-02-02 23:48:09.720 CST [81979] LOG: restored log file "000000010000000000000004" from archive
2023-02-02 23:48:09.744 CST [81979] LOG: restored log file "00000002.history" from archive
2023-02-02 23:48:09.805 CST [81979] LOG: redo starts at 0/4000028
2023-02-02 23:48:09.821 CST [81979] LOG: consistent recovery state reached at 0/4000138
2023-02-02 23:48:09.822 CST [81978] LOG: database system is ready to accept read only connections
done
server started
postgres@ubuntu:~$ 2023-02-02 23:48:09.857 CST [81979] LOG: restored log file "000000010000000000000005" from archive
2023-02-02 23:48:09.915 CST [81979] LOG: restored log file "000000020000000000000006" from archive
2023-02-02 23:48:09.952 CST [81979] LOG: restored log file "000000030000000000000007" from archive
2023-02-02 23:48:09.975 CST [81979] LOG: recovery stopping at restore point "rp1", time 2023-02-02 23:44:09.828934+08
2023-02-02 23:48:09.975 CST [81979] LOG: recovery has paused
2023-02-02 23:48:09.975 CST [81979] HINT: Execute pg_wal_replay_resume() to continue.
|
连接到数据库验证数据是否恢复
| postgres=# select * from tbl; -- 删除的数据恢复成功
id | name
----+------
1 | a
2 | b
3 | c
(3 rows)
|
接触暂停状态
| postgres=# select pg_wal_replay_resume();
pg_wal_replay_resume
----------------------
(1 row)
|
此时数据库时间线为4
| postgres@ubuntu:~$ cat $PGDATA/pg_wal/00000004.history
1 0/6000000 no recovery target specified
2 0/70132E8 before 2023-02-02 23:27:19.159422+08
3 0/7013400 at restore point "rp1"
|
3. recovery_target_xid
recovery_target_xid(string):按事物ID进行恢复。
4. revovery_target_lsn
revovery_target_lsn(lsn_log):按继续进行的WAL日志位置的LSN进行恢复。