[Fengge's Practical Tips] 20 Essential Scripts and Commands for Quickly Resolving Oracle Database Failures

1. Operating system performance (usually the first thing to check when a failure occurs)
top, topas, vmstat, iostat, free, nmon
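
As a rough sketch of that first check, the function below runs a few of the non-interactive tools listed above and skips any that are not installed (topas and nmon, for example, are not present on every platform). The function name os_snapshot and the sampling arguments (1-second interval, 3 samples) are assumptions for illustration, not part of the original article.

```shell
# Run a few non-interactive OS monitoring tools once each and label the output.
# Hypothetical helper: pass tool names to override the default list.
os_snapshot() {
    [ "$#" -gt 0 ] || set -- vmstat iostat free
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "== $tool: not found, skipped =="
            continue
        fi
        echo "== $tool =="
        case "$tool" in
            vmstat|iostat) "$tool" 1 3 ;;   # 1-second interval, 3 samples
            free)          "$tool" -m ;;    # Linux only, values in MB
            *)             echo "no preset arguments for $tool" ;;
        esac
    done
}
```

Calling os_snapshot with no arguments uses the default list; interactive tools such as top are left out on purpose because they need a terminal.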

2. Universal restart procedure

In an emergency, if the database has to be restarted:
tail -n 100 -f <corresponding path>/alert_fgedu.log
alter system switch logfile;
alter system checkpoint;
shutdown immediate;
-- Fengge's tip: if the database cannot be shut down normally, use shutdown abort to force it down.
startup

3. Kill processes in bulk (emergency recovery when the database hangs)

3.1. Kill all LOCAL=NO processes
ps -ef|grep LOCAL=NO|grep $ORACLE_SID|grep -v grep|awk '{print $2}' |xargs kill -9
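
Because kill -9 cannot be undone, it can help to review the PID list before piping it to xargs. The sketch below is one way to do that. It assumes the usual Unix naming of dedicated server processes, oracle<SID> followed by (LOCAL=NO), and the function name local_no_pids is hypothetical. Matching the full process name also avoids hitting another instance whose SID merely starts with the same characters.

```shell
# Print the PIDs of LOCAL=NO server processes for exactly one SID.
# Reads `ps -ef` output on stdin; $1 is the instance SID.
local_no_pids() {
    awk -v sid="$1" 'index($0, "oracle" sid " (LOCAL=NO)") { print $2 }'
}
```

Usage: run ps -ef | local_no_pids "$ORACLE_SID" to inspect the list, and only then append | xargs kill -9.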

3.2. Kill processes in bulk by user
select 'alter system kill session ''' || s.sid || ',' || s.serial# ||
       '''; -- kill -9  ' || p.spid
  from v$session s, v$process p
 where s.PADDR = p.addr and s.username='&username';

4. Kill database sessions (emergency method)

4.1. Kill a session by SID
SELECT /*+ rule */ sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess 
FROM v$session s, v$process p WHERE sid='&sid' AND s.paddr = p.addr order by 1;

4.2. Kill sessions by SQL_ID
SELECT /*+ rule */ sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess 
FROM v$session s, v$process p WHERE sql_id='&sql_id' AND s.paddr = p.addr order by 1;

4.3. Kill sessions by wait event
SELECT /*+ rule */ sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess 
FROM v$session s, v$process p WHERE event='&event' AND s.paddr = p.addr order by 1;

4.4. Kill sessions by user
-- username must be qualified: v$process also has a USERNAME column
SELECT /*+ rule */ sid, s.serial#, 'kill -9 '||spid, event, blocking_session b_sess 
FROM v$session s, v$process p WHERE s.username='&username' AND s.paddr = p.addr order by 1;
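
The queries in this section return sid and serial# alongside a ready-made kill -9 command. If you prefer to end the session from inside the database instead, the two numbers can be turned into an ALTER SYSTEM KILL SESSION statement. The helper below is a minimal sketch; its name kill_session_sql is hypothetical, and it only builds the statement text after checking that both arguments are plain numbers.

```shell
# Build an "alter system kill session" statement from a sid and serial#.
# Rejects empty or non-numeric arguments so that stray text never reaches sqlplus.
kill_session_sql() {
    for v in "$1" "$2"; do
        case "$v" in
            ''|*[!0-9]*)
                echo "usage: kill_session_sql <sid> <serial#>" >&2
                return 1 ;;
        esac
    done
    printf "alter system kill session '%s,%s' immediate;\n" "$1" "$2"
}
```

The printed statement can be pasted into a SQL*Plus session connected as a privileged user.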


5. Performance report collection and automatic diagnostic reports (essential for performance analysis)

5.1. statspack (Fengge's tip: suited to 9i and earlier versions)
spcreate.sql, execute statspack.snap
spreport.sql spdrop.sql

5.2. Using the AWR performance monitoring tool (Fengge's tip: for 10g/11g/12c/18c/19c)

How to generate a performance report (txt and html formats are supported):
@$ORACLE_HOME/rdbms/admin/awrrpt.sql

or
-- On RAC, this variant lets you specify the instance id
@$ORACLE_HOME/rdbms/admin/awrrpti.sql
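
awrrpt.sql normally prompts for its inputs. A common way to run it unattended is to predefine the substitution variables it asks for. The sketch below prints such a SQL*Plus script. It assumes the variable names report_type, num_days, begin_snap, end_snap and report_name, which you should verify against the awrrpt.sql shipped with your release. The function name awr_script and the output file naming are hypothetical.

```shell
# Print a SQL*Plus script that runs awrrpt.sql without prompting.
# $1 = begin snapshot id, $2 = end snapshot id, $3 = report type (html or text).
awr_script() {
    begin_snap="$1"; end_snap="$2"; report_type="${3:-html}"
    case "$report_type" in
        html) ext=html ;;
        *)    ext=txt ;;
    esac
    # "?" is SQL*Plus shorthand for ORACLE_HOME.
    cat <<EOF
define report_type = '$report_type'
define num_days = 1
define begin_snap = $begin_snap
define end_snap = $end_snap
define report_name = 'awr_${begin_snap}_${end_snap}.$ext'
@?/rdbms/admin/awrrpt.sql
exit
EOF
}
```

Usage, assuming snapshots 100 and 101 exist: awr_script 100 101 html | sqlplus -s / as sysdba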

5.3. ADDM automatic diagnostic report (Fengge's tip: for 10g/11g/12c/18c/19c)
@$ORACLE_HOME/rdbms/admin/addmrpt.sql

or
-- On RAC, this variant lets you specify the instance id
@$ORACLE_HOME/rdbms/admin/addmrpti.sql


6. Check tablespace usage regularly (a tablespace at 100% disrupts the business)

--from:www.fgedu.net.cn/oracle.html
col "TS-name" format a15
col "TS-bytes(m)" format 999999990.99
col "TS-per" format a8
-- Outer join: a completely full tablespace has no rows in dba_free_space
select upper(d.tablespace_name) "TS-name",
       d.tot_grootte_mb "TS-bytes(m)",
       d.tot_grootte_mb - nvl(f.total_bytes, 0) "TS-used (m)",
       nvl(f.total_bytes, 0) "TS-free(m)",
       to_char(round((d.tot_grootte_mb - nvl(f.total_bytes, 0)) / d.tot_grootte_mb * 100,
                     2),
               '990.99') "TS-per"
  from (select tablespace_name,
               round(sum(bytes) / (1024 * 1024), 2) total_bytes,
               round(max(bytes) / (1024 * 1024), 2) max_bytes
          from sys.dba_free_space
         group by tablespace_name) f,
       (select dd.tablespace_name,
               round(sum(dd.bytes) / (1024 * 1024), 2) tot_grootte_mb
          from sys.dba_data_files dd
         group by dd.tablespace_name) d
 where d.tablespace_name = f.tablespace_name(+)
 order by 5 desc;
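
For a regular check, the query result can be filtered against a threshold. The sketch below assumes the result has already been reduced to two columns per line, tablespace name and used percentage, for example by spooling only those two columns. The function name ts_over_threshold and the default limit of 90 are hypothetical.

```shell
# Read "<tablespace> <used-percent>" lines on stdin and print the ones
# whose usage is at or above the threshold given in $1 (default 90).
ts_over_threshold() {
    awk -v limit="${1:-90}" \
        'NF >= 2 && $2 + 0 >= limit + 0 { printf "%s %.2f\n", $1, $2 }'
}
```

Lines whose second field is not numeric, such as column headings, evaluate to 0 and are dropped.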


7. Capture the SQL statements that consume excessive CPU
set lin 1000
set pagesize 1000
col USERNAME format a16
col MACHINE format a16
col SQL_TEXT format a200
SELECT a.username,a.machine,a.program,a.sid,a.serial#,a.status,c.piece,c.sql_text FROM v$session a,v$process b,v$sqltext c WHERE b.spid='&spid' AND b.addr=a.paddr AND a.sql_address=c.address(+) ORDER BY c.piece;


8. Check wait events (the first thing to check inside the database)

col event for a45
SELECT  inst_id,EVENT, SUM(DECODE(WAIT_TIME, 0, 0, 1)) "Prev", SUM(DECODE(WAIT_TIME, 0, 1, 0)) "Curr", COUNT(*) "Tot" , sum(SECONDS_IN_WAIT) SECONDS_IN_WAIT
FROM GV$SESSION_WAIT
WHERE event NOT
IN ('smon timer','pmon timer','rdbms ipc message','SQL*Net message from client','gcs remote message')
    AND event NOT LIKE '%idle%'
    AND event NOT LIKE '%Idle%'
    AND event NOT LIKE '%Streams AQ%'
GROUP BY inst_id,EVENT
ORDER BY 1,5 desc;

Tip from Fengge's tutorials (www.fgedu.net.cn): some abnormal wait events are common in the database and deserve close analysis, for example: row cache lock, buffer busy waits, library cache lock, read by other session, latch:shared pool, gc buffer busy, cursor: pin S on X, direct path read, log file sync, enq: TX - index contention, latch free, enq: TX - row lock contention, and so on.
 
9. Find sessions by wait event

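Once you have the abnormal wait event, use it to look up the session details: which sessions are waiting and which SQL they are running, together with the user name, the machine name, and whether the session is blocked.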
SELECT /*+rule */ sid, s.serial#, spid, event, sql_id, seconds_in_wait ws, row_wait_obj# obj,
s.username, s.machine, BLOCKING_INSTANCE||'.'||blocking_session b_sess 
FROM v$session s, v$process p 
WHERE event='&event_name' AND s.paddr = p.addr order by 6;

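10. Query the details of a session
Once you have the session list, the following SQL returns the details of one session, such as the SQL_ID it executed last and its logon time.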

SELECT s.sid, s.serial#, spid, event, sql_id, PREV_SQL_ID, seconds_in_wait ws, row_wait_obj# obj,
s.username, s.machine, module,blocking_session b_sess,logon_time  
FROM v$session s, v$process p 
WHERE sid = '&sid' AND s.paddr = p.addr;

11. Query object information

Both of the previous two queries show the ID of the object the session is waiting on. The following SQL returns the details of that object.

col OBJECT_NAME for a30
select owner,object_name,subobject_name,object_type 
from dba_objects 
where object_id=&oid;

12. Look up a SQL statement by SQL_ID or HASH_VALUE

select sql_id,sql_fulltext 
from v$sqlarea 
where (sql_id='&sqlid' or hash_value=to_number('&hashvalue') ) 
and rownum<2;

13. Check session blocking: how many sessions each blocking session is holding up

select count(*),blocking_session 
from v$session 
where blocking_session is not null 
group by blocking_session;

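14. Query database locks

The following SQL shows the locks held by a given session, the TM and TX locks that exist, and a query that joins sessions with locks.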

set linesize 180
col username for a15
col owner for a15
col OBJECT_NAME for a30
col SPID for a10

14.1. Locks held by a given session
select /*+rule*/SESSION_ID,OBJECT_ID,ORACLE_USERNAME,OS_USER_NAME,PROCESS,LOCKED_MODE 
from gv$locked_object where session_id=&sid;

14.2. TM and TX locks
select /*+rule*/* from v$lock 
where ctime >100 and type in ('TX','TM') order by 3,9;

14.3. Locks in the database
select /*+rule*/s.sid,p.spid,l.type,round(max(l.ctime)/60,0) lock_min,s.sql_id,s.USERNAME,b.owner,b.object_type,b.object_name 
from v$session s, v$process p,v$lock l,v$locked_object o,dba_objects b 
where  o.SESSION_ID=s.sid and s.sid=l.sid and o.OBJECT_ID=b.OBJECT_ID 
and s.paddr = p.addr and l.ctime >100 and l.type in ('TX','TM','FB') 
group by s.sid,p.spid,l.type,s.sql_id,s.USERNAME,b.owner,b.object_type,b.object_name 
order by 9,1,3;

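15. Collecting diagnostic information

Fengge's tip: after the database hangs, you need to analyze the cause in detail or hand information over to second-line support. The script below collects systemstate dump and hanganalyze information. If sqlplus cannot log in, add the -prelim option.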
--systemstate dump
sqlplus -prelim / as sysdba
oradebug setmypid
oradebug unlimit;
oradebug dump systemstate 266;
--wait for 1 min
oradebug dump systemstate 266;
--wait for 1 min
oradebug dump systemstate 266;
oradebug tracefile_name;

--hanganalyze
oradebug setmypid
oradebug unlimit;
oradebug dump hanganalyze 3
--wait for 1 min
oradebug dump hanganalyze 3
--wait for 1 min
oradebug dump hanganalyze 3
oradebug tracefile_name
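
The wait between dumps is easy to forget under pressure. The sketch below prints the hanganalyze sequence above with the pauses written out as SQL*Plus host commands, so the whole sequence can be fed to one session. The function name hang_dump_script, its parameters, and the use of host sleep for the pause are assumptions for illustration.

```shell
# Print the hanganalyze command sequence with explicit pauses between dumps.
# $1 = hanganalyze level (default 3), $2 = number of dumps (default 3),
# $3 = pause between dumps in seconds (default 60).
hang_dump_script() {
    level="${1:-3}"; rounds="${2:-3}"; pause="${3:-60}"
    echo "oradebug setmypid"
    echo "oradebug unlimit"
    i=1
    while [ "$i" -le "$rounds" ]; do
        echo "oradebug dump hanganalyze $level"
        if [ "$i" -lt "$rounds" ]; then
            echo "host sleep $pause"   # client-side pause before the next dump
        fi
        i=$((i + 1))
    done
    echo "oradebug tracefile_name"
    echo "exit"
}
```

Usage: hang_dump_script 3 3 60 | sqlplus -prelim / as sysdba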

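There are many more scripts like these. See "Fengge Oracle Database Inspection Tool 2.6 (Web Edition)", and follow:

WeChat official account: "风哥教程" (itpux_com)

QQ groups: 189070296, 150201289

Two introductory video courses are also available, giving newcomers a quick way to learn Oracle/MySQL:

Oracle Database: From Beginner to High Salary training course (11g to 19c) https://edu.51cto.com/sd/0398f

MySQL Database: From Beginner to High Salary training course (MySQL 5.7 to 8.0) https://edu.51cto.com/sd/daaa7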
