Bucardo Status Pitfalls

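Suppose Bucardo is used for PostgreSQL multi-master or master-slave replication: Bucardo is deployed on server A and syncs A's data to server B in near real time. If server B goes down, Bucardo's state is affected, but you only discover this by looking at the detailed per-sync status rather than the summary.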


1. Environment:
PostgreSQL 9.3.5
Bucardo 5.3.0

A and B are synchronized through Bucardo.


2. Status monitoring
-- Stop the database on server B. The summary still reports sync_adm as Good, which is misleading
[postgres@his-db02 ~]$ bucardo status
PID of Bucardo MCP: 25822
 Name       State    Last good    Time     Last I/D    Last bad    Time  
==========+========+============+========+===========+===========+=======
 sync_adm | Good   | 11:18:12   | 9m 30s | 0/0       | none      |     

-- Check the detailed status of sync_adm
[postgres@his-db02 ~]$ bucardo status sync_adm
======================================================================
Last good                : Jan 23, 2015 11:18:12 (time to run: 1s)
Rows deleted/inserted    : 0 / 0
Sync name                : sync_adm
Current state            : Good
Source relgroup/database : herd_adm / source_db_adm
Tables in sync           : 1
Status                   : Stalled
Check time               : None
Overdue time             : 00:00:00
Expired time             : 00:00:00
Stayalive/Kidsalive      : Yes / Yes
Rebuild index            : No
Autokick                 : Yes
Onetimecopy              : No
Post-copy analyze        : Yes
Last error:              : 
======================================================================
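The detailed output shows Status as Stalled even though Current state is still Good: synchronization between the two sides has stopped. At this point Bucardo has to be restarted, after which it resumes syncing automatically. If all you need is master-slave replication, this is a weakness compared with PostgreSQL's built-in streaming replication. The advantage of Bucardo is that it replicates at table granularity, which is finer than what built-in streaming replication offers.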
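Because the summary view can read Good while the sync is actually stalled, a monitoring script would need to look at the Status field of the detailed output. The following is a minimal sketch in Python that parses text shaped like the `bucardo status sync_adm` output above. The function names `parse_sync_status` and `is_stalled` are hypothetical, and the sketch assumes the "label : value" layout shown here, which may differ in other Bucardo versions.

```python
import re


def parse_sync_status(text):
    """Parse 'label : value' lines from detailed sync status text into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon that is preceded by whitespace (the padded separator).
        parts = re.split(r"\s+:\s*", line.rstrip(), maxsplit=1)
        if len(parts) != 2:
            continue  # separator rows such as '=====' carry no field
        label = parts[0].strip().rstrip(":")  # 'Last error:' has its own trailing colon
        fields[label] = parts[1].strip()
    return fields


def is_stalled(fields):
    """Return True when the Status field reports a stalled sync."""
    return fields.get("Status", "").lower() == "stalled"


if __name__ == "__main__":
    import subprocess

    # Assumption: the bucardo CLI is on PATH and the sync is named sync_adm.
    out = subprocess.run(
        ["bucardo", "status", "sync_adm"], capture_output=True, text=True
    ).stdout
    info = parse_sync_status(out)
    print("stalled" if is_stalled(info) else "ok", info.get("Current state"))
```

Checking both `Current state` and `Status` keeps the alert tied to the field that actually changed in this test.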

3. Log
(25862) [Fri Jan 23 10:57:17 2015] KID (sync_adm) Totals: deletes=1 inserts=1 conflicts=0
(25862) [Fri Jan 23 10:57:51 2015] KID (sync_adm) Delta count for source_db_adm.public.test_ken : 1
(25862) [Fri Jan 23 10:57:51 2015] KID (sync_adm) Totals: deletes=1 inserts=1 conflicts=0
(25822) [Fri Jan 23 11:17:49 2015] MCP Ping failed for database target_db_adm, trying to reconnect
(25862) [Fri Jan 23 11:17:49 2015] KID (sync_adm) Kid has died, error is: Ping failed for database "target_db_adm" Line: 5413 
(25862) [Fri Jan 23 11:17:49 2015] KID (sync_adm) Ping failed for database target_db_adm
(25862) [Fri Jan 23 11:17:49 2015] KID (sync_adm) Kid 25862 exiting at cleanup_kid. Sync "sync_adm" public.test_ken Reason: Ping failed for database "target_db_adm" Line: 5413 
(25822) [Fri Jan 23 11:17:49 2015] MCP Starting check_sync_health
(25822) [Fri Jan 23 11:17:49 2015] MCP Database target_db_adm failed ping
(25822) [Fri Jan 23 11:17:49 2015] MCP Warning: Killed (line 44): DBI connect('dbname=admin;port=5432;host=192.168.2.90','postgres',...) failed: could not connect to server: Connection refused
        Is the server running on host "192.168.2.90" and accepting
        TCP/IP connections on port 5432? at Bucardo.pm line 5644
(25822) [Fri Jan 23 11:17:49 2015] MCP Database target_db_adm is unreachable, marking as stalled
(25822) [Fri Jan 23 11:17:49 2015] MCP Marked sync sync_adm as stalled
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) New kid, sync "sync_adm" alive=1 Parent=25848 PID=29392 kicked=1 
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) Kid has died, error is: DBI connect('dbname=admin;port=5432;host=192.168.2.90','postgres',...) failed: could not connect to server: Connection refused         Is the server running on host "192.168.2.90" and accepting    TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213 
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) Missing target_db_adm database handle
(29392) [Fri Jan 23 11:17:50 2015] KID (sync_adm) Kid 29392 exiting at cleanup_kid. Sync "sync_adm" Reason: DBI connect('dbname=admin;port=5432;host=192.168.2.90','postgres',...) failed: could not connect to server: Connection refused   Is the server running on host "192.168.2.90" and accepting        TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213 
(25822) [Fri Jan 23 11:17:50 2015] MCP Starting check_sync_health
(25822) [Fri Jan 23 11:17:50 2015] MCP Skipping stalled sync sync_adm
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) New kid, sync "sync_adm" alive=1 Parent=25848 PID=29519 kicked=1 
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) Kid has died, error is: DBI connect('dbname=admin;port=5432;host=192.168.2.90','postgres',...) failed: could not connect to server: Connection refused         Is the server running on host "192.168.2.90" and accepting    TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213 
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) Missing target_db_adm database handle
(29519) [Fri Jan 23 11:18:01 2015] KID (sync_adm) Kid 29519 exiting at cleanup_kid. Sync "sync_adm" Reason: DBI connect('dbname=admin;port=5432;host=192.168.2.90','postgres',...) failed: could not connect to server: Connection refused   Is the server running on host "192.168.2.90" and accepting        TCP/IP connections on port 5432? at Bucardo.pm line 5644 Line: 3213 
(25822) [Fri Jan 23 11:18:01 2015] MCP Starting check_sync_health
(25822) [Fri Jan 23 11:18:01 2015] MCP Skipping stalled sync sync_adm
(29652) [Fri Jan 23 11:18:12 2015] KID (sync_adm) New kid, sync "sync_adm" alive=1 Parent=25848 PID=29652 kicked=1 
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Local epoch: 1421983129.73297  DB epoch: 1421983089.30327
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Local time: Fri Jan 23 11:18:49 2015  DB time: 2015-01-23 11:18:09.303267+08
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Local timezone: CST (+0800)  DB timezone: PRC
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Postgres version: 90305
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Database port: 5432
(25822) [Fri Jan 23 11:18:49 2015] MCP Database "target_db_adm" Database host: 192.168.2.90
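The log above records the moment the MCP marks the sync as stalled ("Marked sync sync_adm as stalled"). As a rough sketch, a script could scan the log for that message. The function name `find_stalled_events` is hypothetical, and the regular expressions assume the "(pid) [timestamp] message" line layout seen in this excerpt; other Bucardo versions or log settings may format lines differently.

```python
import re

# "(pid) [timestamp] message", as seen in the log excerpt above.
LINE_RE = re.compile(r"^\((\d+)\) \[([^\]]+)\] (.*)$")
# The MCP message emitted when a sync is marked as stalled.
STALLED_RE = re.compile(r"^MCP Marked sync (\S+) as stalled$")


def find_stalled_events(log_text):
    """Return (timestamp, sync_name) for each 'marked as stalled' log line."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # continuation lines of multi-line errors have no prefix
        _pid, timestamp, message = m.groups()
        s = STALLED_RE.match(message.strip())
        if s:
            events.append((timestamp, s.group(1)))
    return events
```

This only detects the transition into the stalled state; it makes no assumption about what the log prints when a sync recovers.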
