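The system uses an open-source CAS single sign-on server that stores its large values as PostgreSQL large objects (lo); lo generally performs a little better than bytea. The developers said that user data is purged periodically, but in practice the large objects associated with the deleted user data were never removed. So a script is needed to clean them up on a schedule.
I. Background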
DB: PostgreSQL 9.3.0
cas=# select oid,rolname from pg_authid where oid in (10,327299);
  oid   | rolname
--------+----------
     10 | postgres
 327299 | usr_cas
(2 rows)
cas=# select lomowner,count(1) from pg_largeobject_metadata group by 1;
 lomowner | count
----------+--------
       10 | 292408
   327299 | 382123
(2 rows)
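II. Cleanup
Two sets of large objects need cleaning up: those owned by postgres and those owned by usr_cas. The former were created while the application connected as postgres and must all be deleted. Among the latter, some belong to user data that has already been deleted, and those orphans must be deleted as well.
1. Deleting with lo_unlink
Large objects are normally deleted with the built-in lo_unlink() function, so I ran the following command, which failed with "out of shared memory":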
cas=# select lo_unlink(oid) from pg_largeobject_metadata where lomowner = 10;
WARNING: out of shared memory
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
cas=# show max_locks_per_transaction ;
max_locks_per_transaction
---------------------------
64
(1 row)
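The hint is fairly clear: a single statement unlinks every large object inside one transaction, and each unlinked object takes a lock that is held until the transaction ends, so the shared lock table runs out of space and the statement fails. One fix is to raise max_locks_per_transaction, which defaults to 64. The other is to avoid deleting everything in one transaction and delete in batches instead. Since there is not that much data to delete, I went with the latter.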
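To judge roughly how large a batch can be, it helps to know how many locks the server is sized to hold. According to the PostgreSQL documentation, the shared lock table tracks locks on about max_locks_per_transaction * (max_connections + max_prepared_transactions) objects. The sketch below only does that arithmetic. The function name lock_table_slots is made up for illustration, and the values 100 and 0 are the stock defaults for the other two settings, assumed here rather than read from this server. The real limit can be somewhat higher because of spare shared memory, so treat the result as a rough guide.

```shell
#!/bin/bash
# Nominal number of objects the shared lock table is sized for:
#   max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock_table_slots is a hypothetical helper name, not a PostgreSQL tool.
lock_table_slots() {
    local locks_per_xact=$1 max_connections=$2 max_prepared=$3
    echo $(( locks_per_xact * (max_connections + max_prepared) ))
}

# max_locks_per_transaction = 64 is from this server; max_connections = 100
# and max_prepared_transactions = 0 are assumed defaults.
lock_table_slots 64 100 0   # prints 6400
```

The actual values can be checked with show max_connections; and show max_prepared_transactions; in psql.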
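-- Run the following command repeatedly: each run deletes 20,000 objects, so about 15 runs clear all 292,408. It can also be put in a script and run in one go.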
cas=# select lo_unlink(oid) from pg_largeobject_metadata where lomowner = 10 limit 20000;
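The repeated runs can be scripted as a loop. The following is a minimal sketch, not the script actually used here: the function names unlink_batch and unlink_all and the PSQL and DBNAME variables are hypothetical, and wrapping the LIMIT query in count(lo_unlink(oid)) is just one way to learn how many objects a batch removed. Each psql -c call runs as its own transaction, so the locks taken by one batch are released before the next batch starts.

```shell
#!/bin/bash
# Hypothetical settings; override them from the environment if needed.
PSQL=${PSQL:-psql}
DBNAME=${DBNAME:-cas}

# Unlink at most $2 large objects owned by role OID $1 in one transaction
# and print how many were unlinked.
unlink_batch() {
    local owner=$1 batch=$2
    "$PSQL" -d "$DBNAME" -At -c \
        "select count(lo_unlink(oid)) from (select oid from pg_largeobject_metadata where lomowner = ${owner} limit ${batch}) t;"
}

# Repeat unlink_batch until a batch removes nothing; print the total removed.
unlink_all() {
    local owner=$1 batch=$2 n total=0
    while :; do
        n=$(unlink_batch "$owner" "$batch") || return 1
        [ "$n" -gt 0 ] || break
        total=$(( total + n ))
    done
    echo "$total"
}
```

For the case above, the call would be unlink_all 10 20000, where 10 is the OID of the postgres role.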
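2. Deleting with vacuumlo
With the postgres-owned objects gone, the next step is the large objects owned by usr_cas. Writing a script that checks them one by one is tedious and unlikely to be efficient; the bundled vacuumlo utility does the job instead. It compares the OIDs of all large objects against the values stored in oid columns of user tables and deletes every large object that is not referenced. This matters for table design: a column that references large objects could store the OID as int, but vacuumlo only inspects columns of type oid (or lo), so objects referenced only from an int column would be treated as orphans and removed. The oid type is therefore recommended. If the tool is not installed, build it from the source tree with make && make install under contrib/vacuumlo.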
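Before running vacuumlo it can be worth listing which columns it is likely to treat as large object references. The sketch below approximates that selection by looking for ordinary user-table columns whose type is named oid or lo. The query is an assumption about how the columns are chosen, not a copy of vacuumlo's own query, and the names candidate_columns_sql, list_candidate_columns and PSQL are hypothetical.

```shell
#!/bin/bash
PSQL=${PSQL:-psql}   # hypothetical override point for the psql binary

# Print a query listing user-table columns whose type is oid or lo.
candidate_columns_sql() {
    cat <<'SQL'
select n.nspname, c.relname, a.attname, t.typname
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
join pg_attribute a on a.attrelid = c.oid
join pg_type t on t.oid = a.atttypid
where a.attnum > 0
  and not a.attisdropped
  and c.relkind = 'r'
  and t.typname in ('oid', 'lo')
  and n.nspname !~ '^pg_'
  and n.nspname <> 'information_schema'
order by 1, 2, 3;
SQL
}

# Run the query against database $1 (default: cas), one row per line.
list_candidate_columns() {
    candidate_columns_sql | "$PSQL" -d "${1:-cas}" -At
}
```

Any column that holds large object OIDs but does not show up in this list is one whose objects vacuumlo would probably consider unreferenced.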
Its help output:
[postgres@kenyon-primary ~]$ vacuumlo --help
vacuumlo removes unreferenced large objects from databases.
Usage:
vacuumlo [OPTION]... DBNAME...
Options:
-l LIMIT commit after removing each LIMIT large objects
-n don't remove large objects, just show what would be done
-v write a lot of progress messages
-V, --version output version information, then exit
-?, --help show this help, then exit
Connection options:
-h HOSTNAME database server host or socket directory
-p PORT database server port
-U USERNAME user name to connect as
-w never prompt for password
-W force password prompt
Report bugs to .
Usage:
-- Dry run: only show what would be cleaned up, without removing anything
[postgres@kenyon-primary ~]$ vacuumlo -n cas -v
Connected to database "cas"
Test run: no large objects will be removed!
Checking expiration_policy in public.serviceticket
Checking service in public.serviceticket
Checking expiration_policy in public.ticketgrantingticket
Checking authentication in public.ticketgrantingticket
Checking services_granted_access_to in public.ticketgrantingticket
Would remove 382143 large objects from database "cas".
-- Actual cleanup; the -l option makes vacuumlo commit after every LIMIT removals
[postgres@kenyon-primary ~]$ vacuumlo cas -v -l 1000
Connected to database "cas"
Checking expiration_policy in public.serviceticket
Checking service in public.serviceticket
Checking expiration_policy in public.ticketgrantingticket
Checking authentication in public.ticketgrantingticket
Checking services_granted_access_to in public.ticketgrantingticket
Successfully removed 382143 large objects from database "cas".
After the cleanup, check the size again:
cas=# select pg_size_pretty(pg_database_size('cas'));
pg_size_pretty
----------------
1.3 GB
(1 row)
-- The space has not been released yet; run vacuum full analyze
cas=# vacuum full analyze verbose pg_largeobject;
INFO: vacuuming "pg_catalog.pg_largeobject"
INFO: scanned index "pg_largeobject_loid_pn_index" to remove 88928 row versions
DETAIL: CPU 0.01s/0.24u sec elapsed 0.26 sec.
INFO: "pg_largeobject": removed 88928 row versions in 6833 pages
DETAIL: CPU 0.00s/0.02u sec elapsed 0.02 sec.
INFO: index "pg_largeobject_loid_pn_index" now contains 948117 row versions in 4120 pages
DETAIL: 88928 index row versions were removed.
1516 index pages have been deleted, 1269 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_largeobject": found 88928 removable, 52 nonremovable row versions in 6891 out of 109226 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 2329 unused item pointers.
0 pages are entirely empty.
CPU 0.03s/0.32u sec elapsed 0.35 sec.
INFO: analyzing "pg_catalog.pg_largeobject"
INFO: "pg_largeobject": scanned 30000 of 109226 pages, containing 260529 live rows and 0 dead rows; 30000 rows in sample, 947568 estimated total rows
VACUUM
cas=# select pg_size_pretty(pg_relation_size('pg_largeobject'));
pg_size_pretty
----------------
8192 KB
(1 row)
Peace at last. Finally, wrap the steps in a script and run it periodically:
[postgres@kenyon-primary ~]$ more cas_rm_lo.sh
#!/bin/bash
######################################################
##
## purpose:Remove unreferenced large objects from cas and free space
##
## author :Kenyon
##
## created:2014-01-22
##
#####################################################
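source /home/postgres/.bash_profile
# Remove large objects no longer referenced by any oid column, committing every 1000 removals
vacuumlo cas -l 1000 -v
# Rewrite the large object catalogs so the freed space goes back to the OS
# (VACUUM FULL holds an exclusive lock on each table while it runs)
psql -d cas -c "vacuum full analyze verbose pg_largeobject;"
psql -d cas -c "vacuum full analyze verbose pg_largeobject_metadata;"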
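Because VACUUM FULL rewrites the table under an exclusive lock, a scheduled job may prefer to skip it when there is nothing to clean. The sketch below is one possible variant, not the script used above: it does a dry run first, reads the count from the "Would remove N large objects" line that vacuumlo -n -v prints (as in the output shown earlier), and only cleans and vacuums when N is greater than zero. The function names parse_would_remove and clean_if_needed are hypothetical.

```shell
#!/bin/bash
# Read vacuumlo -n -v output on stdin and print N from the line
# 'Would remove N large objects from database "..."'.
parse_would_remove() {
    sed -n 's/.*Would remove \([0-9][0-9]*\) large objects.*/\1/p'
}

# Hypothetical wrapper: clean up and VACUUM FULL database $1 only when the
# dry run reports at least one unreferenced large object.
clean_if_needed() {
    local db=$1 orphans
    orphans=$(vacuumlo -n -v "$db" | parse_would_remove)
    if [ "${orphans:-0}" -gt 0 ]; then
        vacuumlo -l 1000 -v "$db"
        psql -d "$db" -c "vacuum full analyze pg_largeobject;"
        psql -d "$db" -c "vacuum full analyze pg_largeobject_metadata;"
    else
        echo "no unreferenced large objects in $db, skipping"
    fi
}
```

In a scheduled script this would be called as clean_if_needed cas after sourcing the profile.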
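III. Summary
When using open-source tools that store data as large objects, check whether the application also deletes the associated large objects when it purges user data.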