客户的HP unix系统发生错误,刷卡业务不能正常进行,从报错上看是读不到/oracle目录的某个文件, 但这是套HP unix的10.2.0.5的ASM rac系统,怎么会有文件系统的文件,初步怀疑肯定加数据文件的时候有误操作 1.业务层报错 -900~DDM_SJDIYDJY_JYD.OraTransmit: ORA-00376: file 202 cannot be read at this time ORA-01110: data file 202: '/oracle/product_1 s/DATADG'%% DDM_SJDIYDJY_JYD.OraTransmit%% ORA-00376: file 202 cannot be read at this time ORA-01110: data file 202: '/oracle/product _1 s/DATADG'~~~ 2.从alert log提示看 ORA-17503:KSFDOPN:2 Failed to oprn file '+DATADG/rac2/onlinelog/group5 HPUX-ia64 ERROR:28 No Space left on device Additional information:9925 3.登录系统查看/oracle目录的文件系统已经几乎用完 # bdf Filesystem kbytes used avail %used Mounted on /dev/vg00/lvol3 1048576 843648 203368 81% / /dev/vg00/lvol1 1835008 299040 1524048 16% /stand /dev/vg00/lvol8 8912896 2907904 5962000 33% /var /dev/vg00/lvol7 10240000 6512136 3698872 64% /usr /dev/vg00/lvol4 10485760 1293536 9121832 12% /tmp /dev/vg00/lvol6 8912896 6512824 2381744 23% /opt /dev/vg00/lvol5 131072 5824 124280 4% /home /dev/vg00/lvol9 32914560 32285414 629146 99% /oracle 根据ORA-01110: data file 202: '/oracle/product _1 s/DATADG'报错的file_id 为202大致可以判定可能是某个临时文件读取的错误 注:在v$tempfile与v$sort_usage,db_files之间的关系,对于临时文件的绝对文件号(AFN),Oracle的分配规则和常规数据文件并不相同,临时文件的绝对文件号应该等于db_files + file#. 3.登录RAC系统查看临时文件确定之前的判定 rac2$[/oracle/admin/dysb/bdump]sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Mon May 26 15:06:31 2014 NUMBER SQL> select file_name,file_id from dba_temp_files; FILE_NAME FILE_ID -------------------------------------------------- ---------- +DATADG/dysb/tempfile/temp.292.827668181 1 /oracle/product/db_1/dbs/DATADG 2 果不其然,文件系统有个DATADG的临时文件,工程师在加临时文件的时候没有按照+DATADG的ASM路径来添加,所以文件加到了文件系统的目录 SQL> select file_name,file_id,status from dba_temp_files; FILE_NAME FILE_ID STATUS -------------------------------------------------- ---------- --------- +DATADG/dysb/tempfile/temp.292.827668181 1 AVAILABLE /oracle/product/db_1/dbs/DATADG 2 AVAILABLE 4.查看temp file 状态 根据ORA-00376: file 202 cannot be read at this time报错,错误提示很明显,文件无法读取,可能是file#=2的临时文件此刻offline了 于是查看v$tempfile的file状态,2号临时文件确实被人OFFLINE掉了 SQL> SELECT FILE#,NAME,STATUS FROM V$TEMPFILE; FILE# NAME STATUS ---------- ---------------------------------------- ------- 1 +DATADG/dysb/tempfile/temp.292.827668181 ONLINE 2 /oracle/product/db_1/dbs/DATADG OFFLINE VARCHAR2(513) SQL> select file#,name,status,bytes from v$tempfile; FILE# NAME STATUS BYTES ---------- ---------------------------------------- ------- ---------- 1 +DATADG/dysb/tempfile/temp.292.827668181 ONLINE 2.1475E+10 2 /oracle/product/db_1/dbs/DATADG OFFLINE 0 NUMBER SQL> select file_name,file_id,status,tablespace_name from dba_temp_files; FILE_NAME FILE_ID STATUS TABLESPACE_NAME ----------------------------------------- ---------- --------- ---------------- +DATADG/dysb/tempfile/temp.292.827668181 1 AVAILABLE TEMP /oracle/product/db_1/dbs/DATADG 2 AVAILABLE TEMP SQL> alter tablespace temp tempfile online; Tablespace altered. SQL> select file_name,file_id,status,tablespace_name from dba_temp_files; FILE_NAME STATUS TABLESPACE_NAME -------------------------------------------------- ---------- ------------------------------ +DATADG/dysb/tempfile/temp.292.827668181 AVAILABLE TEMP /oracle/product/db_1/dbs/DATADG AVAILABLE TEMP VARCHAR2(513) SQL> select file_id,status from v$tempfile; FILE_ID STATUS --------- ------- 1 ONLINE 2 ONLINE 5.于是将2号temp file online ,通知业务测试,开发人员说刷卡业务依旧不行 6.尝试删除文件系统的临时文件,但是报错 SQL> ALTER DATABASE TEMPFILE '/oracle/product/db_1/dbs/DATADG ' DROP INCLUDING DATAFILES; ALTER DATABASE TEMPFILE '/oracle/product/db_1/dbs/DATADG' DROP INCLUDING DATAFILES * ERROR at line 1: ORA-25152: TEMPFILE cannot be dropped at this time 次报错应该是有会话还在使用file#=2的临时文件系统,于是找到使用file#=2的会话杀掉他们就OK了,既可以删除没有创建正确的临时表空间文件 7.查找使用临时文件的会话 注:在v$tempfile与v$sort_usage,db_files之间的关系,对于临时文件的绝对文件号(AFN),Oracle的分配规则和常规数据文件并不相同,临时文件的绝对文件号应该等于db_files + file#. file#=2的临时文件的file_id应该为200+2 即 202 SQL> select c.spid, b.tablespace, b.segfile#, b.segblk#, round(((b.blocks * d.VALUE) / 1024 / 1024), 2) size_mb, a.SID, a.serial#, a.username, a.osuser, a.program, a.status from v$session a, v$sort_usage b, v$process c, v$parameter d where b.segfile# = &seg_temp_file_id and d.name = 'db_block_size' and a.saddr = b.session_addr and a.paddr = c.addr order by b.tablespace, b.segfile#, b.segblk#, b.blocks / Enter value for seg_temp_file_id: 202 old 13: where b.segfile# = &seg_temp_file_id new 13: where b.segfile# = 202 SPID TABLESPACE SEGFILE# SEGBLK# SIZE_MB SID SERIAL# USERNAME OSUSER PROGRAM STATUS ------------ ------------ ---------- ---------- ---------- ---- ---------- ---------- --------- ----------------- -------- 11362 TEMP 202 1331081 1 1050 36 DYWX tuxedo Trade@ydtux INACTIVE 20232 TEMP 202 1331209 1 1029 149 DYCXJM oracle JDBC Thin Client INACTIVE 8.kill掉占用202号临时文件的会话 SQL> alter system kill session '1050,36'; System altered. SQL> alter system kill session '1029,149'; 9.删除文件系统目录的临时文件 SQL> alter database tempfile '/oracle/product/db_1/dbs/DATADG' drop including datafiles; Database altered. VARCHAR2(513) SQL> select file#,status from v$tempfile; FILE# STATUS ---------- ------- 1 ONLINE 10.查看空间,文件系统空间已经释放,刷卡业务正常 # bdf Filesystem kbytes used avail %used Mounted on /dev/vg00/lvol3 1048576 843648 203368 81% / /dev/vg00/lvol1 1835008 299040 1524048 16% /stand /dev/vg00/lvol8 8912896 2907904 5962000 33% /var /dev/vg00/lvol7 10240000 6512136 3698872 64% /usr /dev/vg00/lvol4 10485760 1293536 9121832 12% /tmp /dev/vg00/lvol6 8912896 6512824 2381744 23% /opt /dev/vg00/lvol5 131072 5824 124280 4% /home /dev/vg00/lvol9 32914560 8874368 23040192 29% /oracle 对数据库的操作要按照标准的规范来操作,操作需谨慎