When starting the Vertica database as the dbadmin user, the following error is reported:
Spread error; can't determine which DBs are running; attempting to continue... '
*** Starting database: noas ***
Enter password for
[email protected] (2 attempts left):
Spread does not seem to be running on 10.41.20.113. The database will not be started on this host.
The following host(s) are not available: 10.41.20.113.
You should get them running first. Operation can not be completed.
Database start up failed. K-safe parameters not met.
Press RETURN to continue
Check the background log that admintools writes during Vertica startup: /opt/vertica/log/adminTools-dbadmin.log
Nov 4 23:06:50 [adminTools] Use unique ports = False
Nov 4 23:06:50 [adminExec.__init__] Unique Ports: False
Nov 4 23:06:52 [spreadAdmin.gRDBFS]: EOF (spread not running?) -- this shows that spread is not running
Nov 4 23:06:52 [spreadAdmin.gRDBFS]: EOF output was: 'Spread library version is 4.1.0^M
SP_error: (-2) Could not connect. Is Spread running?^M
^M
Bye.^M
', '<class 'vertica.utils.pexpect.EOF'>'
Nov 4 23:06:55 [adminCtrl.removeRunningDbs]: items '[('noas', 'noas', 0)]' runningDBs '[]'
Nov 4 23:06:55 [adminCtrl.dbStart]: candidates to start (no active) '[('noas', 'noas', 0)]'
Nov 4 23:06:56 [vsql.Passwords.Instance().getpass ] get password '[]'
Nov 4 23:06:58 [adminExec.getPortNo] no port assignment for noas, using 5433
Nov 4 23:06:58 Participating hosts:
Nov 4 23:06:58 10.41.20.113
Nov 4 23:06:58 10.41.20.116
Nov 4 23:06:58 [SSH.login user=dbadmin host=10.41.20.116]
Nov 4 23:07:01
[email protected]: ['cat /home/dbadmin/noas/v_noas_node0001_catalog/Epoch.log | head -3']
Nov 4 23:07:01
[email protected]: -- Executing -- cat /home/dbadmin/noas/v_noas_node0002_catalog/Epoch.log | head -3
Nov 4 23:07:01
[email protected]: cat /home/dbadmin/noas/v_noas_node0002_catalog/Epoch.log | head -3
Nov 4 23:07:01
[email protected]: ['0', ["Last good epoch: 0x8bf ended at '2013-11-14 11:54:22.087909+08'", 'Last good catalog version: 0x1419', 'K-safety: 0']]
Nov 4 23:07:01 [getLastGoodEpoch] trying to parse: Last good epoch: 0x8bf ended at '2013-11-14 11:54:22.087909+08'
Last good catalog version: 0x1419
K-safety: 0
Nov 4 23:07:01 [getLastGoodEpoch] found epoch '2239' at timestamp '2013-11-14 11:54:22.087909+08' with catalog version '5145' and ksafety '0'
Nov 4 23:07:01 [getLastGoodEpoch] trying to parse: Last good epoch: 0x8bf ended at '2013-11-14 11:54:22.087909+08'
Last good catalog version: 0x1419
K-safety: 0
Nov 4 23:07:01 [getLastGoodEpoch] found epoch '2239' at timestamp '2013-11-14 11:54:22.087909+08' with catalog version '5145' and ksafety '0'
Nov 4 23:07:01 KSafety: 0 Total hosts:2 Down hosts:[]
Nov 4 23:07:01
[email protected]: /etc/init.d/spreadd status
Nov 4 23:07:01
[email protected]: ['0', ['4938', '\r\x1b[80C\x1b[10D\x1b[1;32mrunning\x1b[m\x0f']]
Nov 4 23:07:01 Spread does not seem to be running on 10.41.20.113. The database will not be started on this host.
Nov 4 23:07:01
The following host(s) are not available: 10.41.20.113.
You should get them running first. Operation can not be completed.
Nov 4 23:07:01 Database start up failed. K-safe parameters not met.
Nov 4 23:09:42 [SSH.logout]
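Most of the log above is unrelated to the failure, so it can help to look only at the lines that mention spread. The following is a minimal sketch, not part of Vertica: the function name `spread_log_lines` is hypothetical, and the default path is the log file named earlier in this note.

```shell
# Hypothetical helper: print only the spread-related lines of an
# adminTools log, each prefixed with its line number.
# The default path is the log file examined in this note.
spread_log_lines() {
    log_file="${1:-/opt/vertica/log/adminTools-dbadmin.log}"
    # -i: match "Spread", "spread" and "spreadd" alike; -n: show line numbers
    grep -i -n 'spread' "$log_file"
}
```

Calling `spread_log_lines` with no argument reads the default log; pass another path to inspect a copy of the file.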
Analysis showed that the spreadd process had not been started on the other machine in the Vertica cluster:
linux113:/opt # /etc/init.d/spreadd status
dead
If having trouble starting spread, try using 'spreadd stop' to clear state,
check /opt/vertica/config/vspread.conf and spread logs in /tmp/spread_*
and /var/log/spreadd.log, also check that /var has not run out of space.
Solution:
Start the spreadd process:
linux113:/opt # /etc/init.d/spreadd start
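
The check and the fix can be combined so that spreadd is started only when it is actually down. This is a rough sketch under stated assumptions: the function names `spread_is_running` and `ensure_spreadd` are hypothetical, and the status test relies only on the words "dead" and "running" seen in the `/etc/init.d/spreadd status` output quoted in this note, so other versions of the init script may print something different.

```shell
# Hypothetical helper: decide from the text printed by
# "/etc/init.d/spreadd status" whether spread is up.
# Assumes the output contains "dead" or "running", as in this note.
spread_is_running() {
    case "$1" in
        *dead*)    return 1 ;;  # checked first: the hint text follows "dead"
        *running*) return 0 ;;
        *)         return 1 ;;  # unknown output is treated as not running
    esac
}

# Hypothetical helper: start spreadd on the local host only if it is down.
# Run it as root, like the commands in the session above.
ensure_spreadd() {
    status_text=$(/etc/init.d/spreadd status 2>&1)
    if spread_is_running "$status_text"; then
        echo "spreadd is already running"
    else
        /etc/init.d/spreadd start
    fi
}
```

Run `ensure_spreadd` on each host that the log lists as unavailable, then try starting the database again from admintools.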