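Last night a RAC node rebooted. The application was not affected, but the cause still had to be found.
1. Check the database log alert.log. It shows the database simply starting up again, with no log entries at all leading up to the restart.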
2012-11-11 06:00:00.091000 +08:00
Setting Resource Manager plan SCHEDULER[0x310D]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Starting background process VKRM
VKRM started with pid=60, OS id=23499
2012-11-11 06:00:06.599000 +08:00
Begin automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"
2012-11-11 06:01:16.131000 +08:00
End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"
2012-11-11 22:28:42.709000 +08:00
Adjusting the default value of parameter parallel_max_servers from 1280 to 985 due to the value of parameter processes (1000)
Starting ORACLE instance (normal)
****************** Huge Pages Information *****************
Huge Pages memory pool detected (total: 35840 free: 35840)
DFLT Huge Pages allocation successful (allocated: 3001)
***********************************************************
2012-11-11 22:28:43.755000 +08:00
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
2012-11-11 22:28:50.135000 +08:00
Private Interface 'bond1:1' configured from GPnP for use as a private interconnect.
  [name='bond1:1', type=1, ip=169.254.61.86, mac=00-1b-21-d5-26-b0, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface 'bond0' configured from GPnP for use as a public interface.
  [name='bond0', type=1, ip=10.4.124.235, mac=e4-1f-13-80-57-c1, net=10.4.124.224/27, mask=255.255.255.224, use=public/1]
Public Interface 'bond0:1' configured from GPnP for use as a public interface.
  [name='bond0:1', type=1, ip=10.4.124.245, mac=e4-1f-13-80-57-c1, net=10.4.124.224/27, mask=255.255.255.224, use=public/1]
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side pfile /oracle/app/oracle/product/11.2.0/db_1/dbs/initSMPDB3.ora
System parameters with non-default values:
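To make the "nothing logged before the restart" observation concrete, the alert log can be scanned for startup markers together with the timestamp of the last entry written before each one. The following is a minimal Python sketch, not an Oracle tool. The name `find_startups` is made up for illustration, the timestamp format is taken from the excerpt above, and it assumes that a clean shutdown would have written a line beginning with "Shutting down instance".

```python
import re
from datetime import datetime

# Timestamp lines as they appear in the excerpt above,
# e.g. "2012-11-11 22:28:42.709000 +08:00" on a line of its own.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ [+-]\d{2}:\d{2}$")
STARTUP_MARK = "Starting ORACLE instance"
SHUTDOWN_MARK = "Shutting down instance"  # assumption: written on a clean shutdown


def find_startups(lines):
    """Yield (previous_timestamp, startup_timestamp, saw_shutdown) per startup."""
    prev_ts = None       # timestamp of the entry before the current one
    cur_ts = None        # timestamp of the entry being read
    saw_shutdown = False
    for raw in lines:
        line = raw.strip()
        match = TS_RE.match(line)
        if match:
            prev_ts = cur_ts
            cur_ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        elif line.startswith(SHUTDOWN_MARK):
            saw_shutdown = True
        elif line.startswith(STARTUP_MARK):
            yield prev_ts, cur_ts, saw_shutdown
            saw_shutdown = False


# Lines copied from the alert.log excerpt above.
SAMPLE = [
    "2012-11-11 06:01:16.131000 +08:00",
    'End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"',
    "2012-11-11 22:28:42.709000 +08:00",
    "Adjusting the default value of parameter parallel_max_servers from 1280 to 985 due to the value of parameter processes (1000)",
    "Starting ORACLE instance (normal)",
]

for before, started, clean in find_startups(SAMPLE):
    print(f"startup at {started}, previous entry at {before}, shutdown logged: {clean}")
```

A startup with no shutdown line and a long silent gap before it, as here, suggests the instance did not go down on its own, so the next place to look is outside the database.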
ASM log
2012-11-11 22:28:05.078000 +08:00
* instance_number obtained from CSS = 3, checking for the existence of node 0...
* node 0 does not exist. instance_number = 3
Starting ORACLE instance (normal)
2. Check the Linux system logs, /var/log/error and messages.
error: the suspicious entry is the "Memory for crash kernel" message
Nov 11 22:22:21 dtydb5 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
Nov 11 22:22:45 dtydb5 automount[17304]: lookup_read_master: lookup(nisplus): couldn't locate nis+ table auto.master
Nov 11 22:26:32 dtydb5 ntpd[19555]: 10.7.0.81 is inappropriate address for the fudge command, line ignored
Nov 11 22:26:33 dtydb5 logger: Oracle HA daemon is enabled for autostart.
Nov 11 22:26:34 dtydb5 logger: exec /oracle/11.2.0/grid/perl/bin/perl -I/oracle/11.2.0/grid/perl/lib /oracle/11.2.0/grid/bin/crswrapexece.pl /oracle/11.2.0/grid/crs/install/s_crsconfig_dtydb5_env.txt /oracle/11.2.0/grid/bin/ohasd.bin "reboot"
Nov 11 22:27:07 dtydb5 smartd[20467]: Problem creating device name scan list
Nov 11 22:27:56 dtydb5 multipathd: asm!.asm_ctl_spec: failed to store path info
Nov 11 22:27:56 dtydb5 multipathd: uevent trigger error
Nov 11 22:27:56 dtydb5 multipathd: asm!.asm_ctl_vmb: failed to store path info
Nov 11 22:27:56 dtydb5 multipathd: uevent trigger error
Nov 11 22:27:56 dtydb5 multipathd: asm!.asm_ctl_vdbg: failed to store path info
messages: syslogd restarted at 22:22:18, which is probably nothing to worry about
Nov 11 22:22:18 dtydb5 syslogd 1.4.1: restart.
Nov 11 22:22:19 dtydb5 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Nov 11 22:22:19 dtydb5 kernel: Linux version 2.6.18-194.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Tue Mar 16 21:52:39 EDT 2010
Nov 11 22:22:19 dtydb5 kernel: Command line: ro root=/dev/rootvg/LogVol00 rhgb quiet
Nov 11 22:22:19 dtydb5 kernel: BIOS-provided physical RAM map:
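The same question can be asked of the messages file: when did the kernel boot, and what was the last thing logged before that? Below is a rough Python sketch under a few assumptions: the kernel boot banner ("kernel: Linux version") is used as the boot marker rather than the syslogd restart line, since syslogd can also restart for other reasons such as log rotation; the helper names are made up for illustration; and the year has to be supplied because classic syslog timestamps do not carry one.

```python
from datetime import datetime

BOOT_MARK = "kernel: Linux version"
# Logger start-up lines written during boot; they say nothing about
# what the system was doing before it went down.
STARTUP_NOISE = ("syslogd", "klogd")


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog line, or return None."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_boots(lines, year):
    """Return (last_time_before_boot, boot_time) for every kernel boot banner."""
    boots = []
    last_seen = None
    for raw in lines:
        line = raw.rstrip("\n")
        when = parse_syslog_time(line, year)
        if when is None:
            continue  # skip lines without a parseable timestamp
        if BOOT_MARK in line:
            boots.append((last_seen, when))
        if not any(word in line for word in STARTUP_NOISE):
            last_seen = when
    return boots


# Lines from the messages excerpt above (the kernel banner is shortened).
SAMPLE = [
    "Nov 11 22:22:18 dtydb5 syslogd 1.4.1: restart.",
    "Nov 11 22:22:19 dtydb5 kernel: klogd 1.4.1, log source = /proc/kmsg started.",
    "Nov 11 22:22:19 dtydb5 kernel: Linux version 2.6.18-194.el5",
]

for before, booted in find_boots(SAMPLE, 2012):
    print(f"boot at {booted}, last entry before boot: {before}")
```

Run against the full file, the pair of timestamps brackets the window in which the node went down. If the last entry before the boot is ordinary activity rather than shutdown messages, the reboot was most likely not an orderly one.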
crsd log: /oracle/11.2.0/grid/log/dtydb5/crsd/crsdOUT.log
2012-11-11 22:28:14 Changing directory to /oracle/11.2.0/grid/log/dtydb5/crsd
2012-11-11 22:28:14 CRSD REBOOT
/oracle/11.2.0/grid/log/dtydb5/crsd/crsd.l01
2012-11-11 22:20:20.413: [UiServer][1171753280] {3:22096:3634} Sending message to PE. ctx= 0xd671ea0
2012-11-11 22:20:20.414: [ CRSPE][1169652032] {3:22096:3634} Processing PE command id=593485. Description: [Stat Resource : 0x2aaaadda9a60]
2012-11-11 22:20:20.418: [UiServer][1171753280] {3:22096:3634} Done for ctx=0xd671ea0
2012-11-11 22:28:14.786: [ default][900772256] First attempt: init CSS context succeeded.
[ clsdmt][1087560000]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=dtydb5DBG_CRSD))
2012-11-11 22:28:14.791: [ clsdmt][1087560000]PID for the Process [21647], connkey 1
2012-11-11 22:28:14.792: [ clsdmt][1087560000]Creating PID [21647] file for home /oracle/11.2.0/grid host dtydb5 bin crs to /oracle/11.2.0/grid/crs/init/
2012-11-11 22:28:14.792: [ clsdmt][1087560000]Writing PID [21647] to the file [/oracle/11.2.0/grid/crs/init/dtydb5.pid]
2012-11-11 22:28:15.308: [ default][1087560000] Policy Engine is not initialized yet!
2012-11-11 22:28:15.308: [ default][900772256] CRS Daemon Starting
2012-11-11 22:28:15.311: [ default][900772256] ENV Logging level for Module: AGENT 1
2012-11-11 22:28:15.311: [ default][900772256] ENV Logging level for Module: AGFW 0
2012-11-11 22:28:15.311: [ default][900772256] ENV Logging level for Module: CLSFRAME 0
2012-11-11 22:27:08.498: [ default][3640775072] OHASD Daemon Starting. Command string :reboot
2012-11-11 22:27:08.500: [ default][3640775072] Initializing OLR
2012-11-11 22:27:08.520: [ OCRRAW][3640775072]proprioo: for disk 0 (/oracle/11.2.0/grid/cdata/dtydb5.olr), id match (1), total id sets, (1) need recover (0), my votes (0), total votes (0), commit_lsn (4630), lsn (4630)
2012-11-11 22:27:08.520: [ OCRRAW][3640775072]proprioo: my id set: (931531576, 1028247821, 0, 0, 0)
2012-11-11 22:27:08.520: [ OCRRAW][3640775072]proprioo: 1st set: (931531576, 1028247821, 0, 0, 0)
2012-11-11 22:27:08.520: [ OCRRAW][3640775072]proprioo: 2nd set: (0, 0, 0, 0, 0)
2012-11-11 22:27:08.551: [ default][3640775072] Running mode check...
2012-11-11 22:27:08.551: [ default][3640775072] OHASD running as the Privileged user
2012-11-11 22:27:08.551: [ default][3640775072] Loading debug levels...
2012-11-11 22:27:08.553: [ default][3640775072] OCR Logging level for Module: AGFW 0
2012-11-11 22:27:08.554: [ default][3640775072] OCR Logging level for Module: CLSFRAME 0
2012-11-11 22:27:08.554: [ default][3640775072] OCR Logging level for Module: CLSVER 0
2012-11-11 22:27:08.554: [ default][3640775072] OCR Logging level for Module: CLUCLS 0
2012-11-11 22:27:08.555: [ default][3640775072] OCR Logging level for Module: CRSAPP 0
2012-11-11 22:27:08.555: [ default][3640775072] OCR Logging level for Module: CRSCCL 0
2012-11-11 22:27:08.548 [ohasd(19651)]CRS-2112:The OLR service started on node dtydb5.
2012-11-11 22:27:08.620 [ohasd(19651)]CRS-1301:Oracle High Availability Service started on node dtydb5.
2012-11-11 22:27:08.647 [ohasd(19651)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2012-11-11 22:27:10.481 [/oracle/11.2.0/grid/bin/oraagent.bin(20785)]CRS-5815:Agent '/oracle/11.2.0/grid/bin/oraagent_grid' could not find any base type entry points for type 'ora.daemon.type'. Details at (:CRSAGF00108:) {0:2:2} in /oracle/11.2.0/grid/log/dtydb5/agent/ohasd/oraagent_grid/oraagent_grid.log.
2012-11-11 22:27:10.592 [/oracle/11.2.0/grid/bin/oraagent.bin(20785)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/oracle/11.2.0/grid/log/dtydb5/agent/ohasd/oraagent_grid/oraagent_grid.log"
2012-11-11 22:27:11.496
2012-11-11 22:27:11.496 [/oracle/11.2.0/grid/bin/orarootagent.bin(20781)]CRS-5016:Process "/oracle/11.2.0/grid/bin/acfsload" spawned by agent "/oracle/11.2.0/grid/bin/orarootagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/oracle/11.2.0/grid/log/dtydb5/agent/ohasd/orarootagent_root/orarootagent_root.log"
2012-11-11 22:27:26.622 [/oracle/11.2.0/grid/bin/oraagent.bin(20912)]CRS-5815:Agent '/oracle/11.2.0/grid/bin/oraagent_grid' could not find any base type entry points for type 'ora.daemon.type'. Details at (:CRSAGF00108:) {0:5:2} in /oracle/11.2.0/grid/log/dtydb5/agent/ohasd/oraagent_grid/oraagent_grid.log.
2012-11-11 22:27:29.974 [gpnpd(20934)]CRS-2328:GPNPD started on node dtydb5.
After checking, there were no network or disk problems, and nothing else wrong either.
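4. With nothing wrong at the system level, the only thing left to check was the server hardware.
Logging in to the server's management port through its web interface turned up the entry below, which pretty much settles it: the hardware reported "CPU 4:Cache error occurred." This one can only be handled by a hardware engineer.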
E 30 11/11/2012 22:19:21 OEM Event OEM Event CPU 4:Cache error occurred.
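Putting the key timestamps from all of the excerpts on one timeline shows why the hardware event is the most plausible cause. The snippet below is only an illustrative Python sketch: the events are copied by hand from the logs quoted in this post (fractional seconds dropped), the source labels and the helper name `build_timeline` are made up, and it assumes the management controller's clock roughly agrees with the OS clock, which is not guaranteed.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (source, timestamp, message), copied by hand from the excerpts in this post.
EVENTS = [
    ("database alert.log", "2012-11-11 22:28:42", "Starting ORACLE instance (normal)"),
    ("ASM log", "2012-11-11 22:28:05", "Starting ORACLE instance (normal)"),
    ("messages", "2012-11-11 22:22:18", "syslogd 1.4.1: restart."),
    ("crsd.l01", "2012-11-11 22:20:20", "last entry before the gap"),
    ("crsdOUT.log", "2012-11-11 22:28:14", "CRSD REBOOT"),
    ("ohasd log excerpt", "2012-11-11 22:27:08", "OHASD Daemon Starting. Command string :reboot"),
    ("management event log", "2012-11-11 22:19:21", "CPU 4:Cache error occurred."),
]


def build_timeline(events):
    """Return (time, source, message) tuples sorted by time."""
    parsed = [(datetime.strptime(ts, FMT), src, msg) for src, ts, msg in events]
    return sorted(parsed)  # tuples compare on the timestamp first


timeline = build_timeline(EVENTS)
start = timeline[0][0]
for when, src, msg in timeline:
    # Offset of each event from the earliest one on the timeline.
    print(f"{when:%H:%M:%S}  +{when - start}  {src}: {msg}")
```

Sorted this way, the CPU cache error comes first, the last crsd entry follows about a minute later, and everything after that is the node booting and the clusterware, ASM and database coming back up. Because the management controller keeps its own clock, a difference of a minute or so against the OS-side logs should not be over-interpreted.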