https://blog.csdn.net/sunny05296/article/details/80178470
https://blog.csdn.net/sunny05296/article/details/112707420
Oracle sqlplus connection error caused by a full disk: ORA-09925: Unable to create audit trail file
[oracle@localhost ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.2.0.1.0 Production on Sat Jan 16 20:25:08 2021
Copyright (c) 1982, 2016, Oracle. All rights reserved.
ERROR:
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 28: No space left on device
Additional information: 9925
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 28: No space left on device
Additional information: 9925
Enter user-name:
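This error has several possible causes; check for each of the following:
1) Insufficient disk space causes the write to fail. Check whether the filesystem holding Oracle is full: df -h && du -sh $ORACLE_HOME
2) The audit dump (adump) directory does not exist. Check that it exists: ls -l $ORACLE_BASE/admin/$ORACLE_SID/adump
3) The audit dump (adump) directory is not writable. Check that its owner, group, and permissions are correct: ls -ld $ORACLE_BASE/admin/$ORACLE_SID/adump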
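The second and third checks can be scripted. The following is a minimal sketch, not part of the original procedure: the function name check_audit_dir is made up for illustration, and the -w test reflects the user running the script, so it should be run as the oracle user.

```shell
# Classify the state of an audit dump directory.
# Prints one of: missing, not-writable, ok
# (check_audit_dir is a hypothetical helper name.)
check_audit_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif [ ! -w "$dir" ]; then
        # -w is evaluated for the current user; run this as oracle
        echo "not-writable"
    else
        echo "ok"
    fi
}

# Example call, using the path layout from the checklist above:
# check_audit_dir "$ORACLE_BASE/admin/$ORACLE_SID/adump"
```

If the instance writes audit files somewhere else, that directory would need to be passed instead.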
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 8.9M 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/centos-root 96G 96G 20K 100% /
/dev/sda1 497M 143M 355M 29% /boot
tmpfs 783M 0 783M 0% /run/user/0
In my environment, the failure was caused by the disk being 100% full.
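To catch this condition before connections start failing, the Use% column can be read programmatically. This is a rough sketch: the function names and the default threshold of 90 are assumptions chosen for illustration.

```shell
# Print the use percentage (digits only) of the filesystem holding a path.
# (fs_use_pct and fs_is_nearly_full are hypothetical helper names.)
fs_use_pct() {
    # -P forces POSIX single-line output, so field 5 is the capacity column
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the threshold (default 90, an
# arbitrary value picked for this sketch).
fs_is_nearly_full() {
    [ "$(fs_use_pct "$1")" -ge "${2:-90}" ]
}

# Example call:
# fs_is_nearly_full "$ORACLE_BASE" 90 && echo "disk almost full"
```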
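Check how much space the trace files occupy (if sqlplus can still connect, show parameter trace_en displays the trace parameter settings).
The Oracle diag directory contains a large number of trace files (.trc).
.trc: logs produced by the Oracle database at run time. When an error occurs at startup or during operation, the system automatically writes a trace file to the designated directory for later inspection. These files need periodic cleanup; old ones can be deleted without affecting the running database.
.trm: produced alongside .trc files, one .trm per .trc. A .trm file contains structural metadata about its .trc file.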
[oracle@localhost ~]$ du -sh $ORACLE_BASE
90G /opt/oracle
[oracle@localhost diag]$ du -sh $ORACLE_BASE/diag
58G /opt/oracle/diag
[oracle@localhost ~]$ for i in $(find $ORACLE_BASE -name trace); do du -sh $i; done
0 /opt/oracle/product/12.2.0.1.0/dbhome_1/network/trace
27G /opt/oracle/diag/rdbms/orcl/ORCL/trace
68M /opt/oracle/diag/tnslsnr/localhost/listener/trace
[oracle@localhost ~]$
[oracle@localhost ~]$ find /opt/oracle/diag/rdbms/orcl/ORCL/trace |wc -l
2277
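Count the trace files older than 7 days:
$ find /opt/oracle/diag/rdbms/orcl/ORCL/trace -ctime +7 |wc -l
Delete the trace files older than 7 days:
$ find /opt/oracle/diag/rdbms/orcl/ORCL/trace -ctime +7 -delete
Alternatively, restricted to regular files and safe for unusual file names:
$ find /opt/oracle/diag/rdbms/orcl/ORCL/trace -type f -ctime +7 -print0 |xargs -0 rm -f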
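Before deleting, it can help to estimate how much space a given age cutoff would actually reclaim. The sketch below is not from the original procedure and makes two assumptions: it relies on GNU find's -printf, and it swaps in -mtime (last modification time) for the -ctime (last inode change time) used in the commands above. The function name is hypothetical.

```shell
# Sum the sizes in bytes of regular files not modified in the last N days.
# Nothing is deleted; this is only a preview.
# (old_files_bytes is a hypothetical helper name; -printf needs GNU find.)
old_files_bytes() {
    find "$1" -type f -mtime +"$2" -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Example call:
# old_files_bytes /opt/oracle/diag/rdbms/orcl/ORCL/trace 7
```

A small result for a cutoff of 7 days would suggest that most of the space is held by recent files.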
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 8.9M 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/centos-root 96G 94G 1.6G 99% /
/dev/sda1 497M 143M 355M 29% /boot
tmpfs 783M 0 783M 0% /run/user/0
$ du -sh $ORACLE_BASE
88G /opt/oracle
$ for i in $(find $ORACLE_BASE -name trace); do du -sh $i; done
0 /opt/oracle/product/12.2.0.1.0/dbhome_1/network/trace
25G /opt/oracle/diag/rdbms/orcl/ORCL/trace
68M /opt/oracle/diag/tnslsnr/localhost/listener/trace
Although many files were deleted, only about 2G of disk space was freed, so the investigation continues:
$ ls -l /opt/oracle/diag/rdbms/orcl/ORCL/trace |sort -rn -k5 |more
-rwxrwxr-x 1 oracle oinstall 9256665088 Jan 16 20:21 alert_ORCL.log
-rwxr-xr-x 1 oracle oinstall 3844939776 Jan 10 21:21 ORCL_arc0_22267.trc
-rwxr-xr-x 1 oracle oinstall 3827126272 Jan 10 21:21 ORCL_arc1_22271.trc
-rwxr-xr-x 1 oracle oinstall 3710988288 Jan 10 21:21 ORCL_arc3_22275.trc
-rwxr-xr-x 1 oracle oinstall 3662327808 Jan 10 21:21 ORCL_arc2_22273.trc
-rwxr-xr-x 1 oracle oinstall 847063637 Jan 8 23:53 ORCL_arc1_22271.trm
-rwxr-xr-x 1 oracle oinstall 821104640 Jan 8 23:53 ORCL_arc3_22275.trm
-rwxr-xr-x 1 oracle oinstall 83066880 Jan 10 20:15 ORCL_gen0_22191.trc
-rw-rw---- 1 oracle oinstall 55296000 Jan 10 21:10 ORCL_mmon_29982.trc
-rw-r----- 1 oracle oinstall 50618368 Jan 10 20:41 ORCL_dia0_22215_base_2.trc
-rw-r----- 1 oracle oinstall 20414173 Jan 10 08:00 ORCL_dia0_22215_base_2.trm
-rw-rw---- 1 oracle oinstall 16408576 Jan 16 20:21 ORCL_arc0_588.trc
-rw-rw---- 1 oracle oinstall 16240640 Jan 16 20:21 ORCL_arc1_592.trc
-rw-rw---- 1 oracle oinstall 16138240 Jan 16 20:21 ORCL_arc2_594.trc
-rw-rw---- 1 oracle oinstall 15212544 Jan 16 20:21 ORCL_arc3_596.trc
-rw-rw---- 1 oracle oinstall 14813452 Jan 16 20:12 ORCL_arc0_2497.trc
-rw-rw---- 1 oracle oinstall 14166624 Jan 16 20:12 ORCL_arc3_2505.trc
-rw-rw---- 1 oracle oinstall 13591222 Jan 16 20:12 ORCL_arc2_2503.trc
-rw-rw---- 1 oracle oinstall 13370278 Jan 16 20:12 ORCL_arc1_2501.trc
-rwxr-xr-x 1 oracle oinstall 11591680 Jan 10 05:49 ORCL_dbrm_22207.trc
-rw-rw---- 1 oracle oinstall 1843307 Jan 16 20:04 ORCL_arc2_1895.trc
-rw-rw---- 1 oracle oinstall 1731044 Jan 16 20:04 ORCL_arc0_1889.trc
-rw-rw---- 1 oracle oinstall 1010843 Jan 16 20:04 ORCL_arc1_1893.trc
-rw-rw---- 1 oracle oinstall 879135 Jan 16 20:04 ORCL_arc3_1897.trc
-rw-rw---- 1 oracle oinstall 817953 Jan 16 19:47 ORCL_diag_9316_20210116194746.trc
-rw-rw---- 1 oracle oinstall 629874 Jan 16 20:17 ORCL_arc0_588.trm
-rw-rw---- 1 oracle oinstall 622592 Jan 16 20:17 ORCL_arc2_594.trm
alert_ORCL.log is very large, and several trace files from the last few days are also large.
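A size-ranked listing like this can also be produced without parsing ls output. A minimal sketch, assuming GNU find (for -printf); the function name is made up for illustration:

```shell
# List the N largest regular files under a directory as "bytes path",
# largest first. N defaults to 10.
# (largest_files is a hypothetical helper name; -printf needs GNU find.)
largest_files() {
    find "$1" -type f -printf '%s %p\n' | sort -rn | head -n "${2:-10}"
}

# Example call:
# largest_files /opt/oracle/diag/rdbms/orcl/ORCL/trace 10
```

Unlike ls -l on a single directory, this also descends into subdirectories.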
Keep only the last 3 days and delete everything older:
$ find /opt/oracle/diag/rdbms/orcl/ORCL/trace -ctime +3 -delete
$ du -sh /opt/oracle/diag/rdbms/orcl/ORCL/trace
8.8G /opt/oracle/diag/rdbms/orcl/ORCL/trace
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 8.9M 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/centos-root 96G 79G 18G 82% /
/dev/sda1 497M 143M 355M 29% /boot
tmpfs 783M 0 783M 0% /run/user/0
About 18G of disk space has been freed so far.
$ ls -l /opt/oracle/diag/rdbms/orcl/ORCL/trace |sort -rn -k5 |head -n 10
-rwxrwxr-x 1 oracle oinstall 9256665088 Jan 16 20:21 alert_ORCL.log
-rw-rw---- 1 oracle oinstall 16408576 Jan 16 20:21 ORCL_arc0_588.trc
-rw-rw---- 1 oracle oinstall 16240640 Jan 16 20:21 ORCL_arc1_592.trc
-rw-rw---- 1 oracle oinstall 16138240 Jan 16 20:21 ORCL_arc2_594.trc
-rw-rw---- 1 oracle oinstall 15212544 Jan 16 20:21 ORCL_arc3_596.trc
-rw-rw---- 1 oracle oinstall 14813452 Jan 16 20:12 ORCL_arc0_2497.trc
-rw-rw---- 1 oracle oinstall 14166624 Jan 16 20:12 ORCL_arc3_2505.trc
-rw-rw---- 1 oracle oinstall 13591222 Jan 16 20:12 ORCL_arc2_2503.trc
-rw-rw---- 1 oracle oinstall 13370278 Jan 16 20:12 ORCL_arc1_2501.trc
-rw-rw---- 1 oracle oinstall 1843307 Jan 16 20:04 ORCL_arc2_1895.trc
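alert_ORCL.log takes up more than 9G, so clean this file up separately. truncate cuts off everything past the given size, so this keeps only the first 1KB of the file (GNU truncate treats KB as 1000 bytes):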
$ truncate -s 1KB /opt/oracle/diag/rdbms/orcl/ORCL/trace/alert_ORCL.log
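Since truncate -s keeps the beginning of the file, what survives is the oldest part of the log. If the most recent entries are more useful, one alternative is to keep the tail instead. The following is only a sketch under stated assumptions: the function name and the default of 1000 lines are invented here, it needs some free space for a temporary file, and lines written to the log between the tail and the write-back would be lost.

```shell
# Keep only the last N lines of a log file, rewriting it in place.
# Writing back with ">" reuses the same inode, so a process that still
# has the file open keeps appending to the same file.
# (keep_tail is a hypothetical helper name; 1000 is an arbitrary default.)
keep_tail() {
    file=$1
    lines=${2:-1000}
    tmp=$(mktemp) || return 1
    tail -n "$lines" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example call:
# keep_tail /opt/oracle/diag/rdbms/orcl/ORCL/trace/alert_ORCL.log 1000
```

On a disk that is completely full, mktemp may fail, in which case freeing a little space by other means first would be necessary.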
Check the final size of the trace directory and the disk usage after cleanup:
$ du -sh /opt/oracle/diag/rdbms/orcl/ORCL/trace
128M /opt/oracle/diag/rdbms/orcl/ORCL/trace
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 8.9M 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/centos-root 96G 70G 26G 73% /
/dev/sda1 497M 143M 355M 29% /boot
tmpfs 783M 0 783M 0% /run/user/0
The cleanup so far has freed 26G of disk space.
Keep checking whether there is more space to reclaim:
$ cd /opt/oracle/diag/rdbms/orcl/ORCL && du -sh *
31G alert
0 cdump
0 hm
4.6M incident
0 incpkg
4.0K ir
4.0K lck
8.0K log
5.3M metadata
0 metadata_dgif
0 metadata_pv
12K stage
0 sweep
128M trace
The alert directory (where the alert logs are stored) turns out to occupy 31G.
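To make the largest directory stand out immediately, the per-directory sizes can be sorted. A minimal sketch; the function name is hypothetical, and sizes are reported in KiB so that they sort numerically:

```shell
# Print "KiB path" for every entry directly under a directory,
# largest first.
# (subdir_sizes is a hypothetical helper name.)
subdir_sizes() {
    du -sk "$1"/* | sort -rn
}

# Example call:
# subdir_sizes /opt/oracle/diag/rdbms/orcl/ORCL | head -n 5
```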
$ ls -l /opt/oracle/diag/rdbms/orcl/ORCL/alert |wc -l
3812
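In Oracle 11g and 12c the location of the alert log changed: it is no longer under the bdump directory (show parameter dump), because these releases use the Automatic Diagnostic Repository (ADR), a directory tree that stores the database's diagnostic logs and trace files. In an Oracle 12c environment, the ADR directory locations have to be looked up in the v$diag_info system view: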
select * from v$diag_info;
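In the query result:
Diag Trace: the directory containing the text-format alert log
Diag Alert: the directory containing the XML-format alert log (files named log_xxx.xml)
So the files under /opt/oracle/diag/rdbms/orcl/ORCL/alert can also be cleaned up to free space.
Clean up the alert logs, keeping the last 3 days:
$ find /opt/oracle/diag/rdbms/orcl/ORCL/alert -ctime +3 -delete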
$ du -sh /opt/oracle/diag/rdbms/orcl/ORCL/alert
48M /opt/oracle/diag/rdbms/orcl/ORCL/alert
The cleanup freed more than 30G of space.
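Oracle also ships the adrci utility, whose purge command removes old ADR contents by age. This is not what was used above. The sketch below only prints a command line rather than running it, and the exact adrci syntax should be checked against the documentation for the installed release. The function name is hypothetical; the one piece of real logic is converting days to the minutes that adrci's -age option expects.

```shell
# Print (do not run) an adrci purge command for one ADR home.
# $1: ADR home relative to the ADR base, e.g. diag/rdbms/orcl/ORCL
# $2: retention in days (adrci's -age option is given in minutes)
# $3: content type, e.g. alert or trace
# (adrci_purge_cmd is a hypothetical helper name.)
adrci_purge_cmd() {
    minutes=$(( $2 * 24 * 60 ))
    printf 'adrci exec="set homepath %s; purge -age %s -type %s"\n' \
        "$1" "$minutes" "$3"
}

# Example call, keeping 3 days of alert log entries:
# adrci_purge_cmd diag/rdbms/orcl/ORCL 3 alert
```

Printing the command first makes it possible to review it before running it as the oracle user.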
Check the disk again:
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 8.9M 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/centos-root 96G 39G 57G 41% /
/dev/sda1 497M 143M 355M 29% /boot
tmpfs 783M 0 783M 0% /run/user/0
$ du -sh $ORACLE_BASE
33G /opt/oracle
In total, cleaning up the trace and alert logs freed about 57G of disk space.
$ sqlplus / as sysdba
The connection now succeeds.