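First, a quick introduction to the top command. top provides a dynamic, real-time view of a running system. It can display system summary information as well as a list of the current processes or threads.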
$ top -h
procps-ng 3.3.12
Usage:
top -hv | -bcHiOSs -d secs -n max -u|U user -p pid(s) -o field -w [cols]
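-hv Help/Version: either option prints the version and usage help, then exits
The following command-line options override the defaults:
-b Batch-mode: non-interactive output, suitable for sending to a file or another program
-c Command-line/Program-name: show each process's full command line instead of its program name
-H Threads-mode: show individual threads. Without this option, top shows the sum of all threads in each process. In interactive mode, toggle it with "H".
-i Idle-process: idle-task toggle. When this toggle is Off, tasks that have not used any CPU since the last update are not displayed.
-O Output-field-names: print the available field names (for use with -o), then quit
-S Cumulative-time: cumulative mode
-s Secure-mode: secure mode
-d Delay-time: delay between updates, in seconds
-n Number of iterations (refreshes) before top exits
-w Limit the output width (number of columns)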
The defaults are as follows (entries marked with * can be overridden from the command line):
Global-defaults
    A - Alt display      Off (full-screen)
  * d - Delay time       1.5 seconds
  * H - Threads mode     Off (summarize as tasks)
    I - Irix mode        On  (no, `solaris' smp)
  * p - PID monitoring   Off (show all processes)
  * s - Secure mode      Off (unsecured)
    B - Bold enable      On  (yes, bold globally)
Summary-Area-defaults
    l - Load Avg/Uptime  On  (thus program name)
    t - Task/Cpu states  On  (1+1 lines, see `1')
    m - Mem/Swap usage   On  (2 lines worth)
    1 - Single Cpu       Off (thus multiple cpus)
Task-Area-defaults
    b - Bold hilite      Off (use `reverse')
  * c - Command line     Off (name, not cmdline)
  * i - Idle tasks       On  (show all tasks)
    J - Num align right  On  (not left justify)
    j - Str align right  Off (not right justify)
    R - Reverse sort     On  (pids high-to-low)
  * S - Cumulative time  Off (no, dead children)
  * u - User filter      Off (show euid only)
  * U - User filter      Off (show any uid)
    V - Forest view      On  (show as branches)
    x - Column hilite    Off (no, sort field)
    y - Row hilite       On  (yes, running tasks)
    z - color/mono       On  (show colors)
To monitor CPU usage, we can look at the output of top -bi -n 1.
Below is the output of the command watch top -bi -n 1:
Every 2.0s: top -bi -n 1 MyServer: Fri Oct 18 08:45:14 2019
top - 08:45:14 up 36 days, 1:50, 5 users, load average: 0.07, 0.05, 0.01
Tasks: 146 total, 1 running, 144 sleeping, 1 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2062096 total, 350188 free, 316304 used, 1395604 buff/cache
KiB Swap: 524284 total, 523764 free, 520 used. 1550992 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
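The same snapshot can be taken from a program instead of watch. The following is a minimal Python sketch, not code from the project discussed in this post: the function names snapshot and task_rows are hypothetical, and it assumes a procps-ng top is on the PATH and that task rows follow the header line starting with PID, as in the output above.

```python
import subprocess

# Same flags as in the text: batch mode, hide idle tasks, one iteration.
TOP_CMD = ["top", "-bi", "-n", "1"]


def snapshot():
    """Run top once and return its raw text output (assumes top is installed)."""
    result = subprocess.run(TOP_CMD, capture_output=True, text=True, check=True)
    return result.stdout


def task_rows(top_output):
    """Return the task lines that follow the 'PID USER ...' header line.

    The summary area (load average, Tasks, %Cpu(s), memory) is skipped.
    An empty list means no non-idle task was listed.
    """
    lines = top_output.splitlines()
    for index, line in enumerate(lines):
        if line.split()[:1] == ["PID"]:
            return [row.strip() for row in lines[index + 1:] if row.strip()]
    return []
```

On an idle machine, task_rows(snapshot()) may well return an empty list, matching the output above where nothing appears below the header.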
When I start a thread that spins in a busy loop:
Every 2.0s: top -bi -n 1 MyServer: Fri Oct 18 08:45:55 2019
top - 08:45:55 up 36 days, 1:51, 5 users, load average: 0.12, 0.06, 0.01
Tasks: 148 total, 1 running, 146 sleeping, 1 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2062096 total, 339368 free, 327092 used, 1395636 buff/cache
KiB Swap: 524284 total, 523764 free, 520 used. 1540204 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5595 d 20 0 3100664 33628 24520 S 100.0 1.6 0:04.71 java
Of course, top -cbi -n 1 shows the full command line:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4584 root 20 0 3142032 158940 27872 S 0.3 7.7 1:00.08 java -cp .:bin:SpringDependent/emcat/ref/tomcat-annotations-api-9.0.26.jar:SpringDependent/emcat/ref/tomcat-embed-core-+
Use a regular expression to match the %CPU and %MEM columns:
^.*\s+(\d+\.\d+)\s+(\d+\.\d+)\s+.*$
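To illustrate how this pattern behaves, here is a small Python sketch that applies it to one task row taken from the output earlier in this post. The pattern is the one above with every decimal point escaped; the helper name parse_row is hypothetical and only for illustration.

```python
import re

# The pattern from the text, with the decimal point escaped in both groups.
ROW = re.compile(r"^.*\s+(\d+\.\d+)\s+(\d+\.\d+)\s+.*$")


def parse_row(line):
    """Return (cpu_percent, mem_percent) for a task row, or None otherwise."""
    match = ROW.match(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


row = "5595 d 20 0 3100664 33628 24520 S 100.0 1.6 0:04.71 java"
print(parse_row(row))  # (100.0, 1.6)

# The column header contains no decimal numbers, so it is rejected.
print(parse_row("PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND"))  # None
```

The pattern relies on %CPU and %MEM being the only two adjacent whitespace-separated decimal numbers in a row, which is why the summary lines (where each number is followed by a label such as us or sy) do not match.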
With that, the monitor can be implemented in code. Project: https://github.com/develon2015/CPUWarning
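Sample 174 CPU:100.0 Mem: 1.7
Sample 175 CPU:100.0 Mem: 1.7
Average CPU usage: 100.1840909090909 %
CPU overloaded (100.0%); checking the time of the last warning to decide whether to send an alert email this time
Sending email -- (Sat Oct 19 00:53:28 EDT 2019)
Alert email sent to [email protected] : CPU overload warning -> Server CPU is severely overloaded (100.1840909090909%); administrator, please handle this immediately.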
top - 00:53:27 up 36 days, 17:58, 5 users, load average: 0.97, 0.39, 0.15
Tasks: 148 total, 1 running, 147 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2062096 total, 134600 free, 362592 used, 1564904 buff/cache
KiB Swap: 524284 total, 523508 free, 776 used. 1515476 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13211 d 20 0 3100664 34784 25480 S 100.0 1.7 2:02.85 java
FROM CPUWarning.
Sample 0 CPU:106.7 Mem: 1.7
Sample 1 CPU:106.7 Mem: 3.0999999999999996
Sample 2 CPU:93.8 Mem: 1.7
Sample 3 CPU:93.8 Mem: 1.7
Sample 4 CPU:106.7 Mem: 1.7
Sample 5 CPU:100.0 Mem: 1.7
Sample 6 CPU:100.0 Mem: 1.7
Sample 7 CPU:100.0 Mem: 1.7
Sample 8 CPU:106.7 Mem: 1.7
...
Sample 158 CPU:6.7 Mem: 2.9
Sample 159 CPU:0.0 Mem: 0.0
Sample 160 CPU:0.0 Mem: 0.0
Sample 161 CPU:0.0 Mem: 0.0
Sample 162 CPU:0.0 Mem: 0.0
Sample 163 CPU:0.0 Mem: 0.0
Average CPU usage: 36.94268292682926 %
Alert cleared -- (Sat Oct 19 00:55:28 EDT 2019)
Currently in a safe state (CPU 0.0 %) -- (Sat Oct 19 00:55:30 EDT 2019)
top - 00:55:30 up 36 days, 18:00, 5 users, load average: 0.35, 0.40, 0.18
Tasks: 146 total, 1 running, 145 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2062096 total, 134324 free, 362808 used, 1564964 buff/cache
KiB Swap: 524284 total, 523508 free, 776 used. 1515260 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
Currently in a safe state (CPU 0.0 %) -- (Sat Oct 19 00:55:33 EDT 2019)
top - 00:55:32 up 36 days, 18:00, 5 users, load average: 0.32, 0.39, 0.18
Tasks: 146 total, 1 running, 145 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.0 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2062096 total, 134324 free, 362808 used, 1564964 buff/cache
KiB Swap: 524284 total, 523508 free, 776 used. 1515260 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
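The log output above suggests a cycle of sampling, averaging, comparing against a threshold, and rate-limiting alerts by the time of the last warning. The Python sketch below shows one way such logic could look. It is not the project's implementation: the class name CpuAlarm, the threshold, the window size, and the cooldown are all hypothetical values chosen for illustration, and sending the email is left out.

```python
import time
from collections import deque


class CpuAlarm:
    """Rolling-average CPU alarm with a cooldown between alerts (illustrative)."""

    def __init__(self, threshold=90.0, window=5, cooldown_s=600.0,
                 clock=time.monotonic):
        self.threshold = threshold           # average %CPU that counts as overload
        self.samples = deque(maxlen=window)  # keeps only the most recent samples
        self.cooldown_s = cooldown_s         # minimum seconds between two alerts
        self.clock = clock                   # injectable so the logic is testable
        self.last_alert = None               # time of the last alert, if any
        self.alarmed = False                 # True while the overload persists

    def average(self):
        """Mean of the samples in the window, or 0.0 if there are none yet."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def add_sample(self, cpu_percent):
        """Record a sample and return 'alert', 'suppressed', 'cleared' or 'ok'."""
        self.samples.append(cpu_percent)
        if len(self.samples) < self.samples.maxlen:
            return "ok"  # window not full yet, too early to judge
        if self.average() >= self.threshold:
            self.alarmed = True
            now = self.clock()
            if self.last_alert is None or now - self.last_alert >= self.cooldown_s:
                self.last_alert = now
                return "alert"       # caller would send the email here
            return "suppressed"      # overloaded, but an alert went out recently
        if self.alarmed:
            self.alarmed = False
            return "cleared"         # average dropped back below the threshold
        return "ok"
```

Averaging over a window keeps a single spike from triggering an alert, and the cooldown keeps a sustained overload from producing one email per sample.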