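This morning a colleague reported that the application could not connect to the database, so I started by checking the database: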
[oracle@db ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Dec 12 08:54:39 2013
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
ERROR:
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 30: Read-only file system
Additional information: 9925
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 30: Read-only file system
Additional information: 9925
Enter user-name:
ERROR:
ORA-01017: invalid username/password; logon denied
Enter user-name:
ERROR:
ORA-01017: invalid username/password; logon denied
SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus
A search online suggested that the permissions on the Oracle binaries were wrong, with the following fix:
-rwx------ 1 oracle oinstall 3337 Jan 15 2013 onsctl
-rwxr-xr-x 1 oracle oinstall 46 Nov 7 2000 oracg
-rwsr-s--x 1 oracle oinstall 112468376 Jan 15 2013 oracle
-rwxr-x--- 1 oracle oinstall 0 Mar 12 2008 oracleO
-r-sr-s--- 1 root oinstall 14931 Mar 11 2008 oradism
-rwxr-x--- 1 oracle oinstall 0 Mar 12 2008 oradismO
[oracle@db bin]$ chmod 6755 /u01/app/oracle/product/10.2.0/bin/oracle
chmod: changing permissions of `/u01/app/oracle/product/10.2.0/bin/oracle': Read-only file system
[oracle@db bin]$
I tried to change the permissions, but chmod failed with "Read-only file system".
Next I checked the operating system:
[root@db ~]# mount
/dev/mapper/LVMgroup-root on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
mount: warning /etc/mtab is not writable (e.g. read-only filesystem).
It's possible that information reported by mount(8) is not up to date.
For actual information about system mount points check the /proc/mounts file.
[root@db ~]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 ro,data=ordered 0 0
/dev /dev tmpfs rw 0 0
/proc /proc proc rw 0 0
/sys /sys sysfs rw 0 0
/proc/bus/usb /proc/bus/usb usbfs rw 0 0
devpts /dev/pts devpts rw 0 0
/dev/sda1 /boot ext3 rw,data=ordered 0 0
tmpfs /dev/shm tmpfs rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
/etc/auto.misc /misc autofs rw,fd=7,pgrp=3377,timeout=300,minproto=5,maxproto=5,indirect 0 0
-hosts /net autofs rw,fd=13,pgrp=3377,timeout=300,minproto=5,maxproto=5,indirect 0 0
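The root filesystem (/) had been remounted read-only: mount still reports rw because /etc/mtab is stale, but /proc/mounts shows ro on /dev/root. Nothing could be written to it, so Oracle could not create its audit file, and that is why logins failed.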
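The check above can be scripted. Below is a minimal sketch, not the procedure I actually ran: a hypothetical helper, list_ro_mounts, that reads a file in /proc/mounts format and prints every mount point whose options include ro. It uses only shell builtins, so it should still work when external tools such as df cannot be found.

```shell
# list_ro_mounts [FILE]: print mount points that are mounted read-only.
# FILE defaults to /proc/mounts. The function name is hypothetical.
list_ro_mounts() {
    local file="${1:-/proc/mounts}"
    local dev mnt fstype opts rest
    # Each line is: device mountpoint fstype options dump pass
    while read -r dev mnt fstype opts rest; do
        # Wrap the comma-separated options in commas so that "ro"
        # only matches as a whole option, not as part of another word.
        case ",$opts," in
            *,ro,*) printf '%s\n' "$mnt" ;;
        esac
    done < "$file"
}
```

Calling list_ro_mounts with no argument inspects the live system. Against the /proc/mounts output above it would report only /, because the rootfs entry is rw and the /dev/root entry is ro.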
[root@db ~]# dmesg
EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
sd 0:0:0:0: timing out command, waited 1080s
sd 0:0:0:0: SCSI error: return code = 0x06000008
end_request: I/O error, dev sda, sector 74923301
EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
************ *********** **********
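dmesg shows that the EXT3 filesystem is in trouble: its journal has aborted, and there are SCSI timeouts and I/O errors on sda underneath it.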
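To pull these lines out of a long kernel log, something like the following could be used. It is only a sketch: the function name storage_errors is hypothetical, and the pattern list is an assumption based on the messages seen in this incident, not a complete catalogue of disk or filesystem errors.

```shell
# storage_errors: read kernel log text on stdin and print only the lines
# that look like filesystem or disk errors.
# The patterns are taken from the messages in this incident (an assumption,
# not an exhaustive list).
storage_errors() {
    grep -E 'EXT3-fs error|Journal has aborted|SCSI error|I/O error'
}
```

Typical usage would be dmesg | storage_errors. Note that grep exits with status 1 when nothing matches, which callers using set -e need to allow for.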
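Some online resources say the filesystem can be repaired with fsck, but that command failed as well. (fsck can itself damage data blocks, particularly when run against a mounted filesystem, so it is better to skip fsck here and reboot the server instead.)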
[root@db ~]# fsck
fsck 1.39 (29-May-2006)
e2fsck 1.39 (29-May-2006)
/dev/LVMgroup/root: recovering journal
fsck.ext3: Bad magic number in super-block while trying to re-open /dev/LVMgroup/root
e2fsck: io manager magic bad!
[root@db ~]#
Rebooting the server from the command line did not work either.
[root@db ~]# df -h
-bash: df: command not found
[root@db ~]# reboot
-bash: reboot: command not found
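"command not found" means bash could not locate an executable for df or reboot, while the shell itself and its builtins kept working. To see which commands are still usable in such a state, a small helper like the one below can be used. This is a sketch under my own assumptions: the name cmd_kind is hypothetical, and it relies only on the POSIX command -v builtin.

```shell
# cmd_kind NAME: report how the shell would resolve NAME.
#   external        an executable file found via PATH
#   shell-internal  a builtin, function, keyword or alias
#   missing         the shell cannot find it at all
# The function name is hypothetical.
cmd_kind() {
    local path
    path=$(command -v "$1" 2>/dev/null) || { echo missing; return 0; }
    case "$path" in
        /*) echo external ;;        # command -v printed an absolute path
        *)  echo shell-internal ;;  # command -v printed just a name
    esac
}
```

On the broken server, cmd_kind df would presumably have printed missing, whereas cmd_kind cd prints shell-internal on any system, since cd never needs to be loaded from disk.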
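At this point the picture was clear: the Linux system itself was broken. I checked with the operations team, and they confirmed that the server had indeed failed.
Fortunately this was a temporary environment, so reinstalling the database was enough.
In a production environment this would have been a serious problem. Remember: back up your databases regularly to a separate system!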