Troubleshooting a case of high I/O traffic

1. Introduction

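This post records an Oracle alert caused by high I/O traffic on a database host. The Linux command iotop was used to quickly find the processes doing the most I/O on the host, and the Oracle views v$process and v$session were then used to map those processes to the database sessions behind them.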

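2. Analyzing host I/O with iotop

Run iotop as root, showing only processes that are actually doing I/O (-o), per process rather than per thread (-P), with accumulated totals instead of bandwidth (-a):

iotop -oPa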

The output is as follows:

  PID  PRIO  USER    DISK READ>  DISK WRITE  SWAPIN      IO    COMMAND 
78491 be/4 ********     75.66 G    996.00 K  0.00 %  0.24 % *************
68302 be/4 ********     75.09 G    996.00 K  0.00 %  0.22 % *************
74078 be/4 ********     74.77 G    996.00 K  0.00 %  0.21 % *************
66484 be/4 ********     74.73 G   1992.00 K  0.00 %  0.58 % *************
76430 be/4 ********     73.92 G    996.00 K  0.00 %  0.23 % *************
73801 be/4 ********     73.20 G   1992.00 K  0.00 %  0.40 % *************
  663 be/4 ********     29.22 G      0.00 B  0.00 % 51.55 % *************
91884 be/4 ********    709.26 M   1412.13 M  0.00 %  1.61 % *************
62789 be/4 ********    660.98 M      0.00 B  0.00 % 59.23 % *************
54332 be/4 ********    511.54 M      0.00 B  0.00 %  3.26 % *************
52662 be/4 ********    137.42 M      0.00 B  0.00 %  0.34 % *************
66191 be/4 ********     30.80 M      0.00 B  0.00 % 80.65 % *************
19729 be/4 ********     23.02 M     69.38 M  0.00 %  4.22 % *************
16375 be/4 ********     10.86 M      0.00 B  0.00 %  2.85 % *************

Clearly, the processes with PIDs 78491, 68302, 74078, 66484, 76430, and 73801 account for most of the I/O: each has read roughly 73 to 76 GB.
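
Picking the heavy readers out of the listing by eye works for a dozen rows; for a longer capture the same filter can be scripted. The sketch below is an illustration and not part of the original investigation: `top_readers` is a hypothetical helper name, and it assumes the column layout shown above (PID in field 1, accumulated DISK READ value and unit in fields 4 and 5) with single-letter units B, K, M, G, T.

```shell
# top_readers MIN_GB
# Reads iotop process lines on stdin and prints the PID of every process
# whose accumulated DISK READ is at least MIN_GB (units treated as powers of 1024).
top_readers() {
    awk -v min="$1" '
        $1 ~ /^[0-9]+$/ {                 # skip headers and summary lines
            v = $4 + 0
            if      ($5 == "B") v = v / (1024 * 1024 * 1024)
            else if ($5 == "K") v = v / (1024 * 1024)
            else if ($5 == "M") v = v / 1024
            else if ($5 == "T") v = v * 1024
            else if ($5 != "G") next      # unknown unit: ignore the line
            if (v >= min + 0) print $1
        }'
}

# Example with three rows copied from the listing above; prints 78491.
top_readers 50 <<'EOF'
78491 be/4 ********     75.66 G    996.00 K  0.00 %  0.24 % *************
  663 be/4 ********     29.22 G      0.00 B  0.00 % 51.55 % *************
91884 be/4 ********    709.26 M   1412.13 M  0.00 %  1.61 % *************
EOF
```

On a live system the input could come from batch mode, for example `sudo iotop -b -o -P -a -n 2 -d 5 | top_readers 50 | sort -un`, assuming the batch output keeps the same columns.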

3. Analyzing database sessions with v$process and v$session

Query:

SELECT SID
       || ','
       || SERIAL#                SID#,
       MACHINE,
       USERNAME,
       EVENT,
       STATE,
       WAIT_TIME_MICRO,
       BLOCKING_INSTANCE
       || ( CASE
              WHEN BLOCKING_INSTANCE IS NULL THEN NULL
              ELSE ','
            END )
       || BLOCKING_SESSION       BL_SESS,
       FINAL_BLOCKING_INSTANCE
       || ( CASE
              WHEN FINAL_BLOCKING_INSTANCE IS NULL THEN NULL
              ELSE ','
            END )
       || FINAL_BLOCKING_SESSION FI_BL_SESS,
       FINAL_BLOCKING_SESSION_STATUS,
       STATUS,
       COMMAND,
       SQL_ID,
       SQL_CHILD_NUMBER,
       PREV_SQL_ID,
       PREV_CHILD_NUMBER,
       WAIT_CLASS,
       WAIT_TIME,
       SECONDS_IN_WAIT,
       TIME_REMAINING_MICRO,
       TIME_SINCE_LAST_WAIT_MICRO,
       P1,
       P1RAW,
       P2,
       P2RAW,
       P3,
       P3RAW,
       PADDR,
       TADDR,
       SADDR,
       OSUSER,
       PROCESS,
       CREATOR_ADDR,
       AUDSID,
       ROW_WAIT_OBJ#,
       ROW_WAIT_FILE#,
       ROW_WAIT_BLOCK#,
       ROW_WAIT_ROW#,
       PROGRAM,
       COMMAND,
       PLSQL_ENTRY_OBJECT_ID,
       PLSQL_ENTRY_SUBPROGRAM_ID,
       PLSQL_OBJECT_ID,
       PLSQL_SUBPROGRAM_ID,
       SQL_EXEC_START,
       LOGON_TIME,
       CLIENT_IDENTIFIER,
       SERVICE_NAME,
       RESOURCE_CONSUMER_GROUP
FROM   V$SESSION
WHERE  paddr IN (SELECT addr
                 FROM   v$process
                 WHERE  spid = &spid);
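
`&spid` is a SQL*Plus substitution variable, so the query prompts for one OS process ID per run. To check several PIDs from iotop at once, the predicate can be changed to `spid IN (...)`. Because `v$process.spid` is a VARCHAR2 column, quoting the values avoids an implicit conversion. The helper below is a sketch (the name `spid_in_list` is made up for illustration) that turns a set of PIDs into such a list.

```shell
# spid_in_list PID...
# Prints the given PIDs as a quoted, comma-separated list for use in
#   WHERE spid IN ( <list> )
# Rejects anything that is not purely digits, so the output is safe to paste into SQL.
spid_in_list() {
    _list=""
    for _pid in "$@"; do
        case "$_pid" in
            ''|*[!0-9]*) echo "not a PID: $_pid" >&2; return 1 ;;
        esac
        _list="${_list:+$_list,}'$_pid'"
    done
    printf '%s\n' "$_list"
}

# The six PIDs identified with iotop:
spid_in_list 78491 68302 74078 66484 76430 73801
```

The printed list can be pasted into the subquery on v$process in place of `&spid`.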

The query returns:

SQL> SELECT SID||','||SERIAL# SID#,MACHINE,USERNAME,EVENT,STATE,WAIT_TIME_MICRO,BLOCKING_INSTANCE||(CASE WHEN BLOCKING_INSTANCE IS NULL THEN NULL ELSE ',' END)||BLOCKING_SESSION BL_SESS,FINAL_BLOCKING_INSTANCE||(CASE WHEN FINAL_BLOCKING_INSTANCE IS NULL THEN NULL ELSE ',' END)||FINAL_BLOCKING_SESSION FI_BL_SESS,FINAL_BLOCKING_SESSION_STATUS,STATUS,COMMAND,SQL_ID,SQL_CHILD_NUMBER,PREV_SQL_ID,PREV_CHILD_NUMBER,WAIT_CLASS,WAIT_TIME,SECONDS_IN_WAIT,TIME_REMAINING_MICRO,TIME_SINCE_LAST_WAIT_MICRO,P1,P1RAW,P2,P2RAW,P3,P3RAW,PADDR,TADDR,SADDR,OSUSER,PROCESS,CREATOR_ADDR,AUDSID,ROW_WAIT_OBJ#,ROW_WAIT_FILE#,ROW_WAIT_BLOCK#,ROW_WAIT_ROW#,PROGRAM,COMMAND,PLSQL_ENTRY_OBJECT_ID,PLSQL_ENTRY_SUBPROGRAM_ID,PLSQL_OBJECT_ID,PLSQL_SUBPROGRAM_ID,SQL_EXEC_START,LOGON_TIME,CLIENT_IDENTIFIER,SERVICE_NAME,RESOURCE_CONSUMER_GROUP FROM V$SESSION where paddr in (select addr from v$process where spid in (78491,68302,74078,66484,76430,73801));

SID#         MACHINE            USERNAME     EVENT                        STATE           WT_MILLI BL_SESS  FI_BL_SE FI_BL_S_S  STATUS COMMAND SQL_ID        SCH PREV_SQL_ID   PCH WAIT_CLASS WAIT_TIME SECONDS_IN_WAIT                  P1 P1RAW                             P2 P2RAW                             P3 P3RAW            PADDR            TADDR            SADDR            OSUSER     PROCESS      CREATOR_ADDR         AUDSID ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# PROGRAM                          COMMAND PLSQL_ENTRY_OBJECT_ID PLSQL_ENTRY_SUBPROGRAM_ID PLSQL_OBJECT_ID PLSQL_SUBPROGRAM_ID SQL_EXEC_START      LOGON_TIME          CLIENT_IDENTIFIER    SERVICE_NA RESOURCE_CONSUMER_GROUP
------------ ------------------ ------------ ---------------------------- ---------- ------------- -------- -------- ---------- ------ ------- ------------- --- ------------- --- ---------- --------- --------------- ------------------- ---------------- ------------------- ---------------- ------------------- ---------------- ---------------- ---------------- ---------------- ---------- ------------ ---------------- ---------- ------------- -------------- --------------- ------------- -------------------------------- ------- --------------------- ------------------------- --------------- ------------------- ------------------- ------------------- -------------------- ---------- --------------------------------
1551,34255   ***********        SYS          Backup Restore Throttle slee WAITED KNO       1000073                   NOT IN WAI ACTIVE       0                   6yu0rb6um4wqm   0 Other            100               2                1551 000000000000060F               34255 00000000000085CF                   0 00               00000011356ED348                  0000001105ED21B0 oracle     61122        00000011356ED348 4294967295            -1              0               0             0 rman@*********** (TNS V1-V3)           0                  9319                       303                                                         2019-08-25 19:02:34                      SYS$USERS
2702,17483   ***********        SYS          Backup Restore Throttle slee WAITING           355797                   UNKNOWN    ACTIVE       0                   6yu0rb6um4wqm   0 Other              0               0                2702 0000000000000A8E               17483 000000000000444B                   0 00               00000011056F44D8                  00000011360C7450 oracle     61122        00000011056F44D8 4294967295            -1              0               0             0 rman@*********** (TNS V1-V3)           0                  9319                       303                                                         2019-08-25 19:02:56                      SYS$USERS
3470,33841   ***********        SYS          Backup Restore Throttle slee WAITED KNO       1000075                   NOT IN WAI ACTIVE       0                   6yu0rb6um4wqm   0 Other            100               3                3470 0000000000000D8E               33841 0000000000008431                   0 00               00000011457AAAC8                  0000001116210B10 oracle     61122        00000011457AAAC8 4294967295            -1              0               0             0 rman@*********** (TNS V1-V3)           0                  9319                       303                                                         2019-08-25 19:03:18                      SYS$USERS
10402,16591  ***********        SYS          Backup Restore Throttle slee WAITING                0                   UNKNOWN    ACTIVE       0                   6yu0rb6um4wqm   0 Other              0               0               10402 00000000000028A2               16591 00000000000040CF                   0 00               00000011457A8958                  0000001136DE76D0 oracle     61122        00000011457A8958 4294967295            -1              0               0             0 rman@*********** (TNS V1-V3)           0                  9319                       303                                                         2019-08-25 19:01:27                      SYS$USERS
11172,23095  ***********        SYS          Backup Restore Throttle slee WAITING           393751                   UNKNOWN    ACTIVE       0                   6yu0rb6um4wqm   0 Other              0               0               11172 0000000000002BA4               23095 0000000000005A37                   0 00               00000011356EC290                  0000001136F37710 oracle     61122        00000011356EC290 4294967295            -1              0               0             0 rman@*********** (TNS V1-V3)           0                  9319                       303                                                         2019-08-25 19:02:11                      SYS$USERS
11945,64993  ***********        SYS          Backup Restore Throttle slee WAITING           510444                   UNKNOWN    ACTIVE       0                   6yu0rb6um4wqm   0 Other              0               1               11945 0000000000002EA9               64993 000000000000FDE1                   0 00               00000011156D5D70                  0000001157068D20 oracle     61122        00000011156D5D70 4294967295            -1              0               0             0 rman@*********** (TNS V1-V3)           0                  9319                       303                                                         2019-08-25 19:01:49                      SYS$USERS

6 rows selected.

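The output shows that the high I/O traffic comes from a backup: all six sessions belong to rman and are in a Backup Restore Throttle wait. The colleague responsible for backups confirmed that a backup was indeed running, with six I/O channels each limited to 200 MB/s (up to 1200 MB/s in aggregate). The following command checks whether an RMAN backup process exists: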

ps -ef | grep rman
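
`ps -ef | grep rman` also matches the grep command itself. A slightly stricter check matches on the command name instead. In addition, the PROCESS column of V$SESSION holds the OS PID of the client program (61122 for all six sessions above), so that PID can be looked up directly when the RMAN client runs on the same host. The sketch below assumes the procps `ps` found on Linux; `filter_rman` is a hypothetical helper name.

```shell
# filter_rman
# Reads "pid comm args..." lines on stdin and keeps those whose command
# name is exactly "rman", so a "grep rman" process is not reported.
filter_rman() {
    awk '$2 == "rman"'
}

# List RMAN client processes on this host (no header line).
ps -eo pid=,comm=,args= | filter_rman

# Or look up the client PID reported in V$SESSION.PROCESS directly
# (61122 in the output above; prints nothing if no such process exists).
ps -o pid=,comm=,args= -p 61122 || true
```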
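
4. Using iotop

iotop requires root privileges or the NET_ADMIN capability; run as the oracle user it fails with the error below. The built-in help (iotop -h) lists the available options:
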
[oracle@dgg-xxxxxxxx-or ~]$ iotop
Netlink error: Operation not permitted (1)

The Linux kernel interfaces that iotop relies on now require root privileges
or the NET_ADMIN capability. This change occurred because a security issue
(CVE-2011-2494) was found that allows leakage of sensitive data across user
boundaries. If you require the ability to run iotop as a non-root user, please
configure sudo to allow you to run iotop as root.

Please do not file bugs on iotop about this.
[oracle@dgg-dggaud10-or ~]$ iotop -h
Usage: /usr/sbin/iotop [OPTIONS]

DISK READ and DISK WRITE are the block I/O bandwidth used during the sampling
period. SWAPIN and IO are the percentages of time the thread spent respectively
while swapping in and waiting on I/O more generally. PRIO is the I/O priority at
which the thread is running (set using the ionice command).

Controls: left and right arrows to change the sorting column, r to invert the
sorting order, o to toggle the --only option, p to toggle the --processes
option, a to toggle the --accumulated option, q to quit, any other key to force
a refresh.

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -o, --only            only show processes or threads actually doing I/O
  -b, --batch           non-interactive mode
  -n NUM, --iter=NUM    number of iterations before ending [infinite]
  -d SEC, --delay=SEC   delay between iterations [1 second]
  -p PID, --pid=PID     processes/threads to monitor [all]
  -u USER, --user=USER  users to monitor [all]
  -P, --processes       only show processes, not all threads
  -a, --accumulated     show accumulated I/O instead of bandwidth
  -k, --kilobytes       use kilobytes instead of a human friendly unit
  -t, --time            add a timestamp on each line (implies --batch)
  -q, --quiet           suppress some lines of header (implies --batch)
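
Combining the options listed above, a non-interactive capture suitable for a log file might look like the following sketch. It assumes sudo is configured to allow iotop, as the error message suggests; `iotop_capture`, its defaults, and the log path are illustrative choices.

```shell
# iotop_capture [ITERATIONS] [DELAY_SECONDS]
# Runs iotop in batch mode (-b) with timestamps (-t): only processes doing
# I/O (-o), per process (-P), accumulated totals (-a).
# Defaults: 3 samples, 5 seconds apart.
iotop_capture() {
    sudo iotop -b -o -P -a -t -n "${1:-3}" -d "${2:-5}"
}

# Example (not run here): append one capture to a log file.
# iotop_capture 3 5 >> /tmp/iotop.log
```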
