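[Abstract] After I changed the host IP, GaussDB would not start. I dug this hole myself, so I had to fill it in myself.
I am mildly obsessive about tidiness, so even the few dozen virtual machines in my test environment have to be organized by network segment. That is why I changed the host IP of a previously installed single-node GaussDB environment. After the change the database broke and would not start, which is how this article came about.
Before: 192.168.0.11/16
After: 192.168.10.5/16
Startup error: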
[root@G0 ~]# su - omm
Last login: Mon Dec 23 14:13:19 CST 2019 on pts/1
[omm@G0 ~]$ cd /opt/gaussdb/gaussdb100/bin/
[omm@G0 bin]$ zctl.py -t start
Can not get instance '/opt/gaussdb/data' process pid
[omm@G0 bin]$
Trace the instance startup log:
[root@G0 run]# tailf /opt/gaussdb/data/log/run/zengine.rlog
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|77309418914|INFO>[LOG] file '/opt/gaussdb/data/log/zenith_alarm.log' is added [srv_param.c:488]
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[LOG] file '/opt/gaussdb/data/log/run/zengine.rlog' is added [cm_log.c:643]
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] LSNR_ADDR = 127.0.0.1,192.168.0.11
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] LSNR_PORT = 1888
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] DATA_BUFFER_SIZE = 500m
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] SHARED_POOL_SIZE = 150M
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] LOG_BUFFER_SIZE = 64M
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] LOG_BUFFER_COUNT = 8
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] TEMP_BUFFER_SIZE = 150M
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] SESSIONS = 1500
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] CONTROL_FILES = (/opt/gaussdb/data/data/cntl1, /opt/gaussdb/data/data/cntl2, /opt/gaussdb/data/data/cntl3)
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] DBWR_PROCESSES = 8
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] INSTANCE_NAME = zenith
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7586|INFO>[PARAM] ENABLE_SYSDBA_LOGIN = TRUE
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|206158437794|INFO>starting instance(normal)
UTC+8 2019-12-25 12:44:55.948|ZENGINE|00000|7587|INFO>timer thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7601|INFO>rollback thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7603|INFO>rmon thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7604|INFO>job master thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7602|INFO>rollback thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7598|INFO>smon thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7593|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7594|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7589|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7590|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7591|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7595|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7596|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7597|INFO>ckpt thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7600|INFO>index page recycle thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7592|INFO>dbwr thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7599|INFO>stats thread started
UTC+8 2019-12-25 12:44:55.996|ZENGINE|00000|7588|INFO>lgwr thread started
UTC+8 2019-12-25 12:44:56.009|ZENGINE|00000|7586|INFO>local ip: 127.0.0.1
UTC+8 2019-12-25 12:44:56.009|ZENGINE|00000|7586|INFO>local ip: 192.168.0.11
UTC+8 2019-12-25 12:44:56.015|ZENGINE|00000|7605|INFO>reactor thread started
UTC+8 2019-12-25 12:44:59.021|ZENGINE|00000|18446743974925311394|ERROR>GS-00310 : Failed to bind socket for 192.168.0.11:1888, error code 99 [cs_listener.c:207]
UTC+8 2019-12-25 12:44:59.021|ZENGINE|00000|7586|ERROR>failed to create lsnr sockets for listener type 1
UTC+8 2019-12-25 12:44:59.021|ZENGINE|00000|206158437794|ERROR>failed to start lsnr for LSNR_ADDR
UTC+8 2019-12-25 12:44:59.204|ZENGINE|00000|7601|INFO>rollback thread closed
UTC+8 2019-12-25 12:44:59.405|ZENGINE|00000|7602|INFO>rollback thread closed
UTC+8 2019-12-25 12:44:59.605|ZENGINE|00000|7598|INFO>smon thread closed
UTC+8 2019-12-25 12:44:59.807|ZENGINE|00000|7603|INFO>rmon thread closed
UTC+8 2019-12-25 12:45:00.007|ZENGINE|00000|7599|INFO>stats thread closed
UTC+8 2019-12-25 12:45:00.998|ZENGINE|00000|7604|INFO>job master thread closed
UTC+8 2019-12-25 12:45:00.998|ZENGINE|00000|7600|INFO>index_recycle thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7597|INFO>ckpt thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7589|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7590|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7591|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7592|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7593|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7594|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7595|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.009|ZENGINE|00000|7596|INFO>dbwr thread closed
UTC+8 2019-12-25 12:45:01.211|ZENGINE|00000|7588|INFO>lgwr thread closed
UTC+8 2019-12-25 12:45:01.230|ZENGINE|00000|7605|INFO>reactor thread closed
UTC+8 2019-12-25 12:45:01.230|ZENGINE|00000|13847455598648738|ERROR>failed to start lsnr
UTC+8 2019-12-25 12:45:01.230|ZENGINE|00000|7586|ERROR>Instance Startup Failed
Key error messages:
UTC+8 2019-12-25 12:44:59.021|ZENGINE|00000|18446743974925311394|ERROR>GS-00310 : Failed to bind socket for 192.168.0.11:1888, error code 99 [cs_listener.c:207]
UTC+8 2019-12-25 12:44:59.021|ZENGINE|00000|7586|ERROR>failed to create lsnr sockets for listener type 1
UTC+8 2019-12-25 12:44:59.021|ZENGINE|00000|206158437794|ERROR>failed to start lsnr for LSNR_ADDR
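As the log shows, the database still tried to bind the old IP during startup. On Linux, error code 99 is EADDRNOTAVAIL ("Cannot assign requested address"), which means the host no longer owns 192.168.0.11. So where is the old address recorded? Judging by which files the database reads at each stage of startup, it has to be the parameter file.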
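A long run log is easier to read once it is filtered down to its ERROR entries. The following is a minimal sketch of that step; the function name `show_errors` is made up for illustration, and the only assumption about the log format is the `|ERROR>` marker visible in the lines quoted above.

```shell
#!/bin/sh
# Print only the ERROR entries of a zengine run log.
# show_errors is a hypothetical helper name; it relies solely on the
# "|ERROR>" marker that appears in the log lines quoted above.
show_errors() {
    # -F treats the pattern as a fixed string, so "|" is not an alternation.
    grep -F '|ERROR>' "$1"
}

# Example (path taken from the session above):
#   show_errors /opt/gaussdb/data/log/run/zengine.rlog
```

Matching on the severity marker rather than on the word ERROR alone avoids picking up INFO lines that merely mention an error.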
Checking the instance parameter file shows that the LSNR_ADDR parameter does indeed still record the old IP:
[omm@G0 ~]$ vi /opt/gaussdb/data/cfg/zengine.ini
LOG_BUFFER_SIZE = 64M
DBWR_PROCESSES = 8
LOG_BUFFER_COUNT = 8
SESSIONS = 1500
INSTANCE_NAME = zenith
LSNR_ADDR = 127.0.0.1,192.168.0.11
LSNR_PORT = 1888
ENABLE_SYSDBA_LOGIN = TRUE
SHARED_POOL_SIZE = 150M
TEMP_BUFFER_SIZE = 150M
DATA_BUFFER_SIZE = 500m
CONTROL_FILES = (/opt/gaussdb/data/data/cntl1, /opt/gaussdb/data/data/cntl2, /opt/gaussdb/data/data/cntl3)
[omm@G0 ~]$
Here we change it to the new IP (192.168.10.5).
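The edit can be made by hand in vi, as in this article. For anyone who prefers to script it, here is a rough sketch that rewrites only the LSNR_ADDR line and keeps a backup first. `replace_lsnr_ip` is a hypothetical helper, not a GaussDB tool, and it assumes the `KEY = value` layout shown in the parameter file above.

```shell
#!/bin/sh
# Replace one address in the LSNR_ADDR line of a zengine.ini-style file.
# replace_lsnr_ip is a hypothetical helper, not part of GaussDB.
# Usage: replace_lsnr_ip FILE OLD_IP NEW_IP
replace_lsnr_ip() {
    file=$1; old=$2; new=$3
    # Keep a backup of the parameter file before touching it.
    cp -p "$file" "$file.bak" || return 1
    awk -v old="$old" -v new="$new" '
        /^[ \t]*LSNR_ADDR[ \t]*=/ {
            # Split "LSNR_ADDR = a,b,c" into the key and the address list.
            eq = index($0, "=")
            key = substr($0, 1, eq - 1)
            sub(/[ \t]+$/, "", key)
            n = split(substr($0, eq + 1), addrs, ",")
            out = ""
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", addrs[i])
                if (addrs[i] == old) addrs[i] = new   # exact match only
                out = out (i > 1 ? "," : "") addrs[i]
            }
            print key " = " out
            next
        }
        { print }   # every other parameter passes through unchanged
    ' "$file.bak" > "$file"
}

# Example, using the values from this article:
#   replace_lsnr_ip /opt/gaussdb/data/cfg/zengine.ini 192.168.0.11 192.168.10.5
```

Comparing whole list entries instead of running a plain text substitution means that 192.168.0.11 cannot accidentally match inside a longer address such as 192.168.0.110.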
After the change, start the database again:
[omm@G0 bin]$ zctl.py -t start
Successfully started instance.
[omm@G0 bin]$
Tracing the startup log again shows that the database now starts normally:
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|77309418988|INFO>[LOG] file '/opt/gaussdb/data/log/zenith_alarm.log' is added [srv_param.c:488]
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[LOG] file '/opt/gaussdb/data/log/run/zengine.rlog' is added [cm_log.c:643]
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] LSNR_ADDR = 127.0.0.1,192.168.10.5
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] LSNR_PORT = 1888
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] DATA_BUFFER_SIZE = 500m
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] SHARED_POOL_SIZE = 150M
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] LOG_BUFFER_SIZE = 64M
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] LOG_BUFFER_COUNT = 8
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] TEMP_BUFFER_SIZE = 150M
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] SESSIONS = 1500
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] CONTROL_FILES = (/opt/gaussdb/data/data/cntl1, /opt/gaussdb/data/data/cntl2, /opt/gaussdb/data/data/cntl3)
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] DBWR_PROCESSES = 8
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] INSTANCE_NAME = zenith
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7660|INFO>[PARAM] ENABLE_SYSDBA_LOGIN = TRUE
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|206158437868|INFO>starting instance(normal)
UTC+8 2019-12-25 12:46:37.353|ZENGINE|00000|7661|INFO>timer thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7674|INFO>index page recycle thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7676|INFO>rollback thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7677|INFO>rmon thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7675|INFO>rollback thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7671|INFO>ckpt thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7666|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7667|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7662|INFO>lgwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7663|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7664|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7668|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7669|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7670|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7673|INFO>stats thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7678|INFO>job master thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7665|INFO>dbwr thread started
UTC+8 2019-12-25 12:46:37.403|ZENGINE|00000|7672|INFO>smon thread started
UTC+8 2019-12-25 12:46:37.405|ZENGINE|00000|7660|INFO>local ip: 127.0.0.1
UTC+8 2019-12-25 12:46:37.405|ZENGINE|00000|7660|INFO>local ip: 192.168.10.5
UTC+8 2019-12-25 12:46:37.418|ZENGINE|00000|7679|INFO>reactor thread started
UTC+8 2019-12-25 12:46:37.418|ZENGINE|00000|7680|INFO>tcp-lsnr thread started
UTC+8 2019-12-25 12:46:37.418|ZENGINE|00000|7681|INFO>uds-lsnr thread started
UTC+8 2019-12-25 12:46:37.418|ZENGINE|00000|7660|INFO>SSL disabled: server certificate or private key file is not available.
UTC+8 2019-12-25 12:46:37.418|ZENGINE|00000|7660|INFO>start to alter database MOUNT
UTC+8 2019-12-25 12:46:37.713|ZENGINE|00000|7660|INFO>[ARCH] Init arch is_archive 0
UTC+8 2019-12-25 12:46:37.713|ZENGINE|00000|7660|INFO>[ARCH] Initialization complete
UTC+8 2019-12-25 12:46:37.713|ZENGINE|00000|7660|INFO>sucessfully alter database MOUNT
UTC+8 2019-12-25 12:46:37.713|ZENGINE|00000|206158437868|INFO>start to alter database OPEN
UTC+8 2019-12-25 12:46:38.491|ZENGINE|00000|140733193395692|INFO>[ARCH] Start ARCH thread for ARCHIVE_DEST_1[/opt/gaussdb/data/archive_log]
UTC+8 2019-12-25 12:46:38.491|ZENGINE|00000|11909436955911265772|INFO>The last shutdown is a inconsistent shutdown
UTC+8 2019-12-25 12:46:38.491|ZENGINE|00000|7660|INFO>database start recovery
UTC+8 2019-12-25 12:46:38.491|ZENGINE|00000|7660|INFO>recovery from file:2,point:4065140,lfn:8278
UTC+8 2019-12-25 12:46:38.491|ZENGINE|00000|7660|INFO>recovery expected least end with file:2,point:4065280,lfn:8318
UTC+8 2019-12-25 12:46:39.735|ZENGINE|00000|140733193395692|INFO>[RCY] recovery real end with file:2,point:4065396,lfn:8350
UTC+8 2019-12-25 12:46:39.735|ZENGINE|00000|140733193395692|INFO>[RCY] current lfn 8350, rcy point lfn 8278, consistent point 8318, lrp point lfn 8318
UTC+8 2019-12-25 12:46:40.220|ZENGINE|00000|7675|INFO>rollback thread closed
UTC+8 2019-12-25 12:46:40.220|ZENGINE|00000|7676|INFO>rollback thread closed
UTC+8 2019-12-25 12:46:40.258|ZENGINE|00000|140716013526508|INFO>no valid standby configuration
UTC+8 2019-12-25 12:46:40.258|ZENGINE|00000|7660|INFO>[DB] sse42 available 1
UTC+8 2019-12-25 12:46:40.258|ZENGINE|00000|7660|INFO>sucessfully alter database OPEN
UTC+8 2019-12-25 12:46:40.273|ZENGINE|00000|7660|INFO>instance started
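At this point the database has started successfully.
Curious readers will surely ask, though: what is the LSNR_ADDR parameter actually for?
The official documentation describes it as follows:
LSNR_ADDR Description: sets the IP address on which the server listens.
Value range: a valid IPv4 or IPv6 address.
Default: 127.0.0.1
Put plainly, it records the database's listening addresses, and it supports up to 8 IPs. When connecting to the database, a client can specify any one of the IPs listed in it. The query result below shows that the parameter can be changed online, but the change only takes effect after a restart.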
SQL> select name,value,RUNTIME_VALUE,DEFAULT_VALUE,ISDEFAULT,MODIFIABLE,EFFECTIVE
from dv_parameters
where name like 'LSNR_ADDR';
NAME       VALUE                    RUNTIME_VALUE            DEFAULT_VALUE   ISDEFAULT            MODIFIABLE           EFFECTIVE
---------- ------------------------ ------------------------ --------------- -------------------- -------------------- ----------
LSNR_ADDR  127.0.0.1,192.168.10.11  127.0.0.1,192.168.10.11  127.0.0.1       FALSE                TRUE                 reboot
1 rows fetched.
SQL>
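Since a new LSNR_ADDR value is only applied at the next restart, and an address the host does not own makes the listener fail to bind, it can be worth checking the configured list against the host's real addresses before restarting. The sketch below is one possible pre-flight check. `missing_addrs` is a hypothetical helper name, and the list of local addresses is passed in by the caller rather than discovered by the function.

```shell
#!/bin/sh
# Print every address from a comma-separated LSNR_ADDR value that is not
# present in a whitespace-separated list of addresses the host owns.
# missing_addrs is a hypothetical helper name.
# Usage: missing_addrs "127.0.0.1,192.168.0.11" "127.0.0.1 192.168.10.5"
missing_addrs() {
    wanted=$1
    # Normalize newlines and tabs to spaces and pad both ends, so that
    # each address can be matched as a whole word.
    local_addrs=" $(printf '%s' "$2" | tr '\n\t' '  ') "
    # Turn the comma-separated list into one address per line.
    printf '%s\n' "$wanted" | tr ',' '\n' | while read -r addr; do
        [ -n "$addr" ] || continue
        case "$local_addrs" in
            *" $addr "*) ;;              # owned by the host, nothing to report
            *) printf '%s\n' "$addr" ;;  # binding this address would fail
        esac
    done
}
```

On a Linux host whose `hostname` command supports `-I`, a call such as `missing_addrs "127.0.0.1,192.168.0.11" "127.0.0.1 $(hostname -I)"` would list the stale address; the loopback address is added by hand because `hostname -I` leaves it out. Empty output means every configured address is present on the host.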