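Suppose a full backup was taken at 1:00 a.m. and a database was dropped by mistake at 4:00 p.m. the same day; the goal is to recover to the point just before the drop.
Approach
recovery_target_xid = '' # recover up to a transaction ID
recovery_target_lsn = '' # recover up to a log sequence number (LSN)
Recovering with one of these two parameters gives a more precise target. Only one recovery target may be set at a time.
I. Backup environment
### Backup command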
[postgres@postgresql ~]$ pg_basebackup -D /data/backupsets/ -R -Ft -Pv -Upostgres
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_8831"
32243/32243 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
# Backup set
[postgres@postgresql backupsets]$ ll
total 48872
-rw-------. 1 postgres postgres 178976 Jul 14 17:41 backup_manifest
-rw-------. 1 postgres postgres 33079808 Jul 14 17:41 base.tar
-rw-------. 1 postgres postgres 16778752 Jul 14 17:41 pg_wal.tar
II. Simulate dropping the database
testdb=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database testdb;
DROP DATABASE
postgres=# select pg_walfile_name(pg_current_wal_lsn());
pg_walfile_name
--------------------------
00000001000000000000001E
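The WAL file currently in use is 00000001000000000000001E, so the drop must be recorded in this file. Use pg_waldump to inspect it and find the transaction ID or LSN just before the drop.
III. Restore the database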
[postgres@postgresql ~]$ psql
psql (13.6)
Type "help" for help.
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/1E0008D8
(1 row)
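-- Switch WAL manually before shutting down the database, so that the segment containing the mistaken operation is present in the archive directory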
[postgres@postgresql ~]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
[postgres@postgresql ~]$ cd $PGDATA
[postgres@postgresql data]$ rm -rf *
--- For convenience, the original data directory is simply wiped here
1. Restore the full backup
[postgres@postgresql backupsets]$ tar xf base.tar -C /data/pg13.6/data/
2. Configuration before WAL replay
Archived files
[postgres@postgresql pgarchive]$ ll
total 131080
-rw-------. 1 postgres postgres 16777216 Jul 14 17:35 000000010000000000000018
-rw-------. 1 postgres postgres 16777216 Jul 14 17:35 000000010000000000000019
-rw-------. 1 postgres postgres 16777216 Jul 14 17:37 00000001000000000000001A
-rw-------. 1 postgres postgres 16777216 Jul 14 17:37 00000001000000000000001B
-rw-------. 1 postgres postgres 340 Jul 14 17:37 00000001000000000000001B.00000028.backup
-rw-------. 1 postgres postgres 16777216 Jul 14 17:41 00000001000000000000001C
-rw-------. 1 postgres postgres 16777216 Jul 14 17:41 00000001000000000000001D
-rw-------. 1 postgres postgres 340 Jul 14 17:41 00000001000000000000001D.00000028.backup
-rw-------. 1 postgres postgres 16777216 Jul 14 18:32 00000001000000000000001E
-rw-------. 1 postgres postgres 16777216 Jul 14 18:38 00000001000000000000001F
Find the recovery point
[postgres@postgresql pg_wal]$ pg_waldump 00000001000000000000001E
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/1E000028, prev 0/1D000100, desc: RUNNING_XACTS nextXid 511 latestCompletedXid 510 oldestRunningXid 511
rmgr: Heap len (rec/tot): 59/ 1511, tx: 511, lsn: 0/1E000060, prev 0/1E000028, desc: DELETE off 7 flags 0x00 KEYS_UPDATED , blkref #0: rel 1664/0/1262 blk 0 FPW
rmgr: Standby len (rec/tot): 54/ 54, tx: 0, lsn: 0/1E000648, prev 0/1E000060, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 510 oldestRunningXid 511; 1 xacts: 511
rmgr: Standby len (rec/tot): 54/ 54, tx: 0, lsn: 0/1E000680, prev 0/1E000648, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 510 oldestRunningXid 511; 1 xacts: 511
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/1E0006B8, prev 0/1E000680, desc: CHECKPOINT_ONLINE redo 0/1E000680; tli 1; prev tli 1; fpw true; xid 0:512; oid 24590; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 511; online
rmgr: Database len (rec/tot): 38/ 38, tx: 511, lsn: 0/1E000730, prev 0/1E0006B8, desc: DROP dir 1663/16398
rmgr: Transaction len (rec/tot): 66/ 66, tx: 511, lsn: 0/1E000758, prev 0/1E000730, desc: COMMIT 2022-07-14 17:42:26.981178 CST; inval msgs: catcache 21; sync
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/1E0007A0, prev 0/1E000758, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 511 oldestRunningXid 512
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/1E0007D8, prev 0/1E0007A0, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 511 oldestRunningXid 512
rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn: 0/1E000810, prev 0/1E0007D8, desc: CHECKPOINT_ONLINE redo 0/1E0007D8; tli 1; prev tli 1; fpw true; xid 0:512; oid 24590; multi 1; offset 0; oldest xid 478 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 512; online
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/1E000888, prev 0/1E000810, desc: RUNNING_XACTS nextXid 512 latestCompletedXid 511 oldestRunningXid 512
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/1E0008C0, prev 0/1E000888, desc: SWITCH
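Judging from the dumped WAL, recovery only needs to stop before transaction 511 (a drop database shows up in the WAL as a Database record whose description contains "DROP dir"). If the file holds many records, narrow them down by timestamp. LSN 0/1E000028 is the last record written before transaction 511 begins, i.e. the final LSN after transaction 510.
So add the following to postgresql.auto.conf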
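With a long WAL file, picking the record out by eye is error-prone. The following is a minimal sketch, assuming the text layout of pg_waldump shown above; the function name drop_xid_lsn is hypothetical and it only filters text on stdin.

```shell
#!/bin/bash
# Hypothetical helper: read pg_waldump text on stdin and print "xid lsn"
# for every Database-rmgr DROP record (the trace left by DROP DATABASE).
drop_xid_lsn() {
    sed -n -E 's#^rmgr: Database .*tx: *([0-9]+), lsn: *([0-9A-Fa-f]+/[0-9A-Fa-f]+),.* desc: DROP .*#\1 \2#p'
}
```

Used as pg_waldump 00000001000000000000001E | drop_xid_lsn, it prints 511 0/1E000730 for the DROP record listed above. The recovery target must lie before that LSN.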
[postgres@postgresql data]$ vim postgresql.auto.conf
restore_command='cp /pgarchive/%f %p'
recovery_target_lsn = '0/1E000028'
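Note: the restore is done on the original machine, the archive directory lives outside $PGDATA, and the archived files are complete, so there is no need to extract /data/backupsets/pg_wal.tar into the archive directory.
IV. Start the database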
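Whether the archive really is complete can be checked before starting recovery. The following is a sketch only, assuming the default 16 MB segment size (256 segments per log ID) and a single timeline; the function name wal_gaps is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report missing WAL segments in an archive directory.
# Assumes the default 16 MB segment size, where the last 8 hex digits of a
# segment name run from 00000000 to 000000FF before the middle 8 digits advance.
wal_gaps() {
    dir=$1
    prev=""
    # Only plain segment names (24 hex digits); .backup history files are skipped.
    for name in $(ls "$dir" | grep -E '^[0-9A-F]{24}$' | sort); do
        log=$(printf '%d' "0x$(echo "$name" | cut -c9-16)")
        seg=$(printf '%d' "0x$(echo "$name" | cut -c17-24)")
        n=$((log * 256 + seg))
        if [ -n "$prev" ] && [ "$n" -ne $((prev + 1)) ]; then
            echo "gap before $name"
        fi
        prev=$n
    done
}
```

Calling wal_gaps /pgarchive prints nothing when the segments are contiguous, as they are in the listing above (18 through 1F).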
[postgres@postgresql data]$ pg_ctl start
waiting for server to start....2022-07-14 18:58:23.398 CST [15123] LOG: redirecting log output to logging collector process
2022-07-14 18:58:23.398 CST [15123] HINT: Future log output will appear in directory "log".
done
server started
[postgres@postgresql data]$ psql
psql (13.6)
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
testdb | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(4 rows)
postgres=# \c testdb
You are now connected to database "testdb" as user "postgres".
testdb=# select * from test;
id | name | phone | country | numberrange
----+----------------+----------------+----------+-------------
1 | Wade Sykes | 1-917-342-3132 | Turkey | 3
2 | Barrett Boyer | 1-264-304-0665 | Germany | 9
3 | Alana Kaufman | (213) 254-4997 | India | 0
4 | Emmanuel Lopez | (543) 493-0137 | Germany | 9
5 | Timon Bauer | 1-269-448-2772 | Pakistan | 6
(5 rows)
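At this point the database is still read-only: replay has paused at the recovery target (recovery_target_action defaults to pause). Run select pg_wal_replay_resume(); to end WAL replay and open the database for writes.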
[postgres@postgresql data]$ psql
psql (13.6)
Type "help" for help.
postgres=# create database xx;
ERROR: cannot execute CREATE DATABASE in a read-only transaction
postgres=# select pg_wal_replay_resume();
pg_wal_replay_resume
----------------------
(1 row)
postgres=# create database xx;
CREATE DATABASE
postgres=#
V. Schedule the backup job (for reference only)
Create the backup.sh backup script
[postgres@postgresql backupsets]$ cd /data/scripts/
[postgres@postgresql scripts]$ cat backup.sh
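#!/bin/bash
DATE=$(date +%Y%m%d)
BACKDIR=/data/backupsets/
# Retention period in days
REV_DATE=1
function green_echo(){
echo -e "\e[40;32;1m$1\e[0m"
}
function red_echo(){
echo -e "\e[40;31;1m$1\e[0m"
}
# Take a full backup into a directory named after the backup date
function backup(){
pg_basebackup -D "${BACKDIR}bkdata_$DATE/" -R -Ft -Pv -Upostgres
}
# Start the backup; stop here if it fails so that old backups are not pruned
green_echo "Begin backup data At Date:$(date)"
if ! backup;then
red_echo "backup fail please check log!!!"
exit 1
fi
green_echo "End backup data At Date:$(date)"
# Delete backup directories older than REV_DATE days (only bkdata_* directly under BACKDIR)
if cd "$BACKDIR";then
find "$BACKDIR" -mindepth 1 -maxdepth 1 -type d -name 'bkdata_*' -mtime +$REV_DATE |xargs -r rm -rvf
green_echo "delete Success"
else
red_echo "delete fail please check log!!!"
exit 1
fi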
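The retention step depends on how find interprets -mtime: +N matches entries whose age, counted in whole 24-hour periods, is greater than N, so +1 keeps roughly the last two days. The sketch below isolates that step so it can be tried on a scratch directory first; the function name prune_backups is hypothetical and GNU find is assumed.

```shell
#!/bin/bash
# Hypothetical helper: delete bkdata_* directories directly under $1 whose
# modification time is more than $2 whole days old. The backup root itself
# is never a candidate because of -mindepth 1.
prune_backups() {
    backdir=$1
    days=$2
    find "$backdir" -mindepth 1 -maxdepth 1 -type d -name 'bkdata_*' \
        -mtime +"$days" -exec rm -rf {} +
}
```

For a seven-day retention it would be called as prune_backups /data/backupsets 7.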
Configure the crontab schedule
[postgres@postgresql scripts]$ crontab -l
30 03 * * * /bin/bash /data/scripts/backup.sh >/data/scripts/backup.log 2>&1
-bash-4.2$ cat backupdump.sh
#!/bin/bash
# Operation type: backup or restore
type=$1
# Name of the database to operate on
dbname=$2
# Backup file name, in the form <registered name>_yyyyMMddHHmmss.sql
backupFileName=$3
# IP address of the database server
dbhost=$4
# Fixed storage directory: /home/backup/
if [ ! -d "/home/backup/" ];then
mkdir "/home/backup/"
fi
backupFile="/home/backup/${backupFileName}"
echo "${backupFile}"
cd /usr/pgsql-14/bin || exit 1
if [ "$type" == "backup" ];then
PGPASSWORD="postgres" ./pg_dump -h "${dbhost}" -U postgres "${dbname}" > "${backupFile}"
elif [ "$type" == "restore" ];then
# Drop and recreate the public schema first; otherwise rows added after the backup point would survive the restore
PGPASSWORD="postgres" ./psql -h "${dbhost}" -U postgres -d "${dbname}" -c "drop schema public cascade;create schema public;"
PGPASSWORD='postgres' ./psql -h "${dbhost}" -U postgres -d "${dbname}" -f "${backupFile}"
else
echo "No matching operation type"
exit 1
fi
exit 0
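The script expects the caller to supply a file name of the form <name>_yyyyMMddHHmmss.sql. One way to build it, as a small sketch; the function name backup_name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build "<prefix>_yyyyMMddHHmmss.sql" from the current time.
backup_name() {
    printf '%s_%s.sql\n' "$1" "$(date +%Y%m%d%H%M%S)"
}
```

An invocation could then look like ./backupdump.sh backup testdb "$(backup_name testdb)" 192.168.57.110, where the database name and host are example values.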
-bash-4.2$ ls
14 backupdump.sh
[root@mysql1 data]# vim pg_hba.conf
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
host all all 0.0.0.0/0 md5
local replication all trust
host replication all 0.0.0.0/0 md5
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
The remote backup access set up earlier had been turned off, so it is re-enabled here
[root@mysql1 data]# vim /pgsql/data/postgresql.conf
[root@mysql1 data]# systemctl restart postgresql.service
Run the following on the backup server
The backup directory /pgsql/backup/ must be empty
mkdir -p /pgsql/backup/
Take the backup
pg_basebackup -D /pgsql/backup/ -Ft -Upostgres -h192.168.57.110 -R
root@zhaohuakang:/pgsql/backup# ls /pgsql/backup/
backup_manifest base.tar pg_wal.tar
Restore from the full backup
Run the following on the backup server to restore
Create the directory that holds the archived WAL
chown -R postgres. /pgsql/backup/
su - postgres
pg_ctl stop
mkdir /archive/
chown postgres.postgres /archive/
rm -rf /archive/*
rm -rf /pgsql/data/*
Extract the backup files into the data directory to restore
tar xf /pgsql/backup/base.tar -C /pgsql/data/
tar xf /pgsql/backup/pg_wal.tar -C /archive/
Edit the configuration file
postgres@zhaohuakang:~$ vim /pgsql/data/postgresql.conf
restore_command = 'cp /archive/%f %p'
recovery_target = 'immediate'
Start the server
pg_ctl start
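A worked example: using PITR to recover from an accidental drop
Backups run every day at 2:00; data is dropped by mistake at 10:00 the next day. How can it be recovered?
Recovery process
Back up the data and the archived WAL
Restore procedure
Restore the full backup
Replay the archived WAL: first the WAL contained in the backup, then the archives written between 2:00 and 10:00, then the online redo (the current WAL segment)
Enable archiving on the primary server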
[root@mysql1 data]# vim postgresql.conf
archive_mode = on
archive_command = '[ ! -f /archive/%f ] && cp %p /archive/%f'
[root@mysql1 data]# systemctl restart postgresql.service
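The [ ! -f ... ] && cp ... form makes the archive command fail when the target file already exists, so an archived segment is never overwritten. The same rule written as a function, as a sketch only; the name archive_wal is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy a WAL segment ($1) to the archive ($2), but fail
# with a non-zero status instead of overwriting an existing archived file.
archive_wal() {
    src=$1
    dst=$2
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    cp "$src" "$dst"
}
```

A non-zero return tells PostgreSQL that archiving failed, so it keeps the segment and retries later rather than silently replacing the archived copy.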
Create test data
postgres=# create database testdb;
postgres=# \c testdb
testdb=# create table t1(id int);
testdb=# insert into t1 values(1);
Take a remote backup of the database from the backup server
rm -rf /pgsql/backup/*
pg_basebackup -D /pgsql/backup/ -Ft -Upostgres -h192.168.57.110 -R
chown -R postgres. /pgsql/backup/
Keep generating test data on the database
testdb=# insert into t1 values(2);
testdb=# insert into t1 values(3);
Simulate dropping the database
testdb=# \c db1;
db1=# drop database testdb;
On discovering the failure, stop user access
Check the current WAL file
db1=# select pg_walfile_name(pg_current_wal_lsn());
pg_walfile_name
000000010000000000000029
Check the current transaction ID
db1=# select txid_current();
txid_current
898
Recover from the failure
Switch WAL on the primary server so that the current segment is archived
db1=# select pg_switch_wal();
Stop the service on the server being restored and prepare for the restore
su - postgres
pg_ctl stop
rm -rf /archive/*
rm -rf /pgsql/data/*
Extract the backup files into the data directory to restore
tar xf /pgsql/backup/base.tar -C /pgsql/data/
tar xf /pgsql/backup/pg_wal.tar -C /archive/
Copy the primary server's archived WAL to the restore (test) server; this is still run on the restore server, but as root
root@zhaohuakang:/backup# rsync -a 192.168.57.110:/archive/ /archive/
Find the transaction ID at the point of failure
root@zhaohuakang:/backup# pg_waldump /archive/000000010000000000000029 |grep DROP
rmgr: Database len (rec/tot): 38/ 38, tx: 897, lsn: 0/29001128, prev 0/290010B0, desc: DROP dir 1663/16543
The DROP belongs to transaction 897; the transaction before it is 896
Edit the configuration file
postgres@zhaohuakang:~$ vim /pgsql/data/postgresql.conf
restore_command = 'cp /archive/%f %p'
recovery_target_xid = '896'
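Targeting 896 relies on transaction 896 having committed and on it being the last commit before the drop. An alternative, sketched here under the assumption that the standard recovery_target_inclusive parameter is used, is to name the offending transaction itself and stop just before it. The function name xid_target_conf is hypothetical; it only prints the settings.

```shell
#!/bin/bash
# Hypothetical helper: print recovery settings that stop replay just BEFORE
# the given transaction commits (recovery_target_inclusive = false).
xid_target_conf() {
    printf "restore_command = 'cp /archive/%%f %%p'\n"
    printf "recovery_target_xid = '%s'\n" "$1"
    printf "recovery_target_inclusive = false\n"
}
```

The output of xid_target_conf 897 would take the place of the two settings above rather than be added alongside them, since only one recovery target may be set.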
Start the server
su - postgres
pg_ctl start
Verify the data
postgres@zhaohuakang:~$ psql
postgres=# \c testdb
testdb=# select * from t1;
id
1
2
3
Writes are not possible yet
testdb=# insert into t1 values(3);
ERROR: cannot execute INSERT in a read-only transaction
Return to normal read-write mode
testdb=# select pg_wal_replay_resume();
Dump the single database, including the CREATE DATABASE command, and send it to the original database server
pg_dump -U postgres -C -f /backup/testdb testdb
scp /backup/testdb 192.168.57.110:/backup/testdb
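While replay is paused, the restored cluster is still in archive recovery; its state can be checked with the pg_controldata command. To switch it to normal operation, running pg_ctl promote is enough.
Import into the original database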
cd /backup/
psql
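The cluster state mentioned above can be read straight from pg_controldata output. A minimal sketch, assuming the English "Database cluster state:" label; the function name cluster_state is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: extract the cluster state from pg_controldata text on stdin.
cluster_state() {
    sed -n 's/^Database cluster state: *//p'
}
```

Used as pg_controldata /pgsql/data | cluster_state, it prints a value such as "in archive recovery" while recovery is under way and "in production" once the server has been promoted.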