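On one project delivery the software stack was Solaris 10 U11 + Cluster 3.3 U2 + Oracle 11g. In the final step, creating the database resource, I ran into a problem. The original command was: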
# clresource create -g oracleha-rg \
-t SUNW.oracle_server \
-p Connect_string=ora_monitor/ha_monitor \
-p ORACLE_SID=RWDB \
-p ORACLE_HOME=/u01/app/oracle/product/11.2.0 \
-p Alert_log_file=/u01/app/oracle/diag/rdbms/rwdb/RWDB/trace/alert_RWDB.log \
-p resource_dependencies=oradbset \
oracledb-rs
Starting the database by hand had tested fine beforehand, but starting it through the cluster failed with:
clresource: (C748634) Resource group oracleha-rg failed to start on chosen node and might fail over to other node(s)
The detailed errors in /var/opt/SUNWscor/oracle_server/message_log.oracledb-rs were:
Executing command: /opt/SUNWscor/oracle_server/bin/oracle_server_manage startup FALSE
Jan 09 12:39:44 SC[SUNWscor.oracle_server.start]:oracleha-rg:oracledb-rs: Could not start server
Jan 09 12:39:45 SC[SUNWscor.oracle_server.stop]:oracleha-rg:oracledb-rs: Using method 'run_setuid_prog' to execute shutdown commands
Jan 09 12:39:45 SC[SUNWscor.oracle_server.stop]:oracleha-rg:oracledb-rs: Server is not running. Calling shutdown abort to clear shared memory (if any)
Shutting down Oracle instance: RWDB : /u01/app/oracle/product/11.2.0.
And also:
SQL> ORA-27102: out of memory
SVR4 Error: 22: Invalid argument
SQL> Disconnected
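This was odd: a manual startup did not report out of memory, but a startup through the cluster did. I checked /etc/project, and project.max-shm-memory was already set to 28G (the machine has 32G of RAM). That should be fine, because our memory_target is below that limit. So why would a cluster-initiated startup still report insufficient memory?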
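The /etc/project check described above can be scripted. This is only a minimal sketch, assuming the standard colon-separated project file layout (name:id:comment:users:groups:attributes). `project_attr` is a hypothetical helper, oracleproj is the oracle user's project named later in this post, and the project ID 100 in the sample line is made up for illustration. On Solaris 10 it needs a POSIX awk such as nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk.

```shell
#!/usr/bin/env bash
# Assumption: $AWK points at a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris 10).
AWK=${AWK:-awk}

# Print the value of one attribute of one project.
#   $1 = project name, $2 = attribute name, $3 = project file (default /etc/project)
# A matching line is assumed to look like this (ID 100 is hypothetical):
#   oracleproj:100::oracle::project.max-shm-memory=(privileged,30064771072,deny)
project_attr() {
    "$AWK" -F: -v proj="$1" -v attr="$2" '
        $1 == proj {
            # Field 6 holds the attributes, separated by ";"
            n = split($6, kv, ";")
            for (i = 1; i <= n; i++) {
                eq = index(kv[i], "=")
                if (eq > 0 && substr(kv[i], 1, eq - 1) == attr) {
                    print substr(kv[i], eq + 1)
                    found = 1
                }
            }
        }
        END { exit !found }   # non-zero status when the attribute is not set
    ' "${3:-/etc/project}"
}

# Usage on a real node:
#   project_attr oracleproj project.max-shm-memory
#   project_attr user.root  project.max-shm-memory || echo "not set in /etc/project"
```

A non-zero exit status for user.root would mean the limit is not set in the file for that project, so whatever default the system applies is in effect.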
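Later I searched the Solaris Cluster product area of the support site for the keyword "out of memory" and found document ID 1007002.1, "Solaris Cluster HA-Oracle (SUNW.oracle_server) Resource Fails to Start Database due to Error "ORA-27102: out of memory"". It points out that if no project name is specified for the cluster resource, the cluster starts the database under the root user's project.
So when recreating the Oracle server resource, I added one more parameter: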
# clresource create -g oracleha-rg \
-t SUNW.oracle_server \
-p Connect_string=ora_monitor/ha_monitor \
-p ORACLE_SID=RWDB \
-p ORACLE_HOME=/u01/app/oracle/product/11.2.0 \
-p Alert_log_file=/u01/app/oracle/diag/rdbms/rwdb/RWDB/trace/alert_RWDB.log \
-p resource_dependencies=oradbset \
-p Resource_project_name=oracleproj \
oracledb-rs
That solved the problem. Here oracleproj is the oracle user's project.
If no project is specified when the Oracle server resource is created, the cluster falls back to the system default, user.root:
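To confirm the fix took effect, the resource property and the project definition can be checked together. This is a rough sketch rather than an official procedure: `show_resource_project` is a hypothetical wrapper, the resource and project names are the ones used in this post, and the guard exists only so the script fails cleanly when run somewhere other than a cluster node.

```shell
#!/usr/bin/env bash
# Names taken from this post; adjust for your own cluster.
RS=oracledb-rs
PROJ=oracleproj

show_resource_project() {
    if ! command -v clresource >/dev/null 2>&1; then
        echo "clresource not found: run this on a cluster node" >&2
        return 1
    fi
    # Which project the resource is configured to run under
    clresource show -p Resource_project_name "$RS"
    # The attributes configured for that project, including project.max-shm-memory
    projects -l "$PROJ"
}

# Usage on a cluster node:
#   show_resource_project
```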
root@MSPRG-AP1 # prctl -n project.max-shm-memory -i project 1
project: 1: user.root
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
project.max-shm-memory
        privileged      7.64GB      -   deny                                 -
        system          16.0EB    max   deny                                 -
As you can see, the default max-shm-memory for this project is 7.64G. If the database's memory_target is below this value, the problem is never triggered.
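The comparison between memory_target and the project limit can be sketched as a few shell helpers. This is an illustration under stated assumptions: the function names are hypothetical, the size suffixes are treated as 1024-based, the privileged line is assumed to have the layout shown in the prctl output above, and the 16g memory_target in the usage comment is an example value, not the one from this system. On Solaris 10 it needs a POSIX awk (nawk or /usr/xpg4/bin/awk).

```shell
#!/usr/bin/env bash
# Assumption: $AWK points at a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris 10).
AWK=${AWK:-awk}

# Convert a size such as "7.64GB", "28G" or "16g" to bytes (1024-based units assumed).
size_to_bytes() {
    "$AWK" -v s="$1" 'BEGIN {
        s = toupper(s)
        if (match(s, /^[0-9.]+/) == 0) exit 1
        n = substr(s, 1, RLENGTH) + 0
        u = substr(s, RLENGTH + 1, 1)
        p = (u == "") ? 0 : index("KMGTPE", u)   # 0 means plain bytes
        while (p-- > 0) n *= 1024
        printf "%.0f\n", n
    }'
}

# Read prctl output on stdin and print the privileged value, e.g. "7.64GB".
privileged_limit() {
    "$AWK" '$1 == "privileged" { print $2; exit }'
}

# Succeed (status 0) when size $1 fits within limit $2.
shm_fits() {
    local t l
    t=$(size_to_bytes "$1") && l=$(size_to_bytes "$2") || return 2
    "$AWK" -v t="$t" -v l="$l" 'BEGIN { exit !(t <= l) }'
}

# Usage on a real node (16g is an example memory_target):
#   limit=$(prctl -n project.max-shm-memory -i project user.root | privileged_limit)
#   shm_fits 16g "$limit" || echo "memory_target exceeds the limit of user.root"
```

Run against user.root, a failing check would indicate that the resource needs Resource_project_name set to a project with a larger limit.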