Oracle CHM使用空间过大,crfclust.bdb文件过大

今天发现我们的数据库服务器CRS安装目录突然增大,经过查找发现在crf目录中存在一个非常大的crf文件,通过MOS查找,发现与Bug 20186278相似。

解决方法:
Remove those large Berkeley database files to free up space by doing the following as root:

$GI_HOME/bin/crsctl stop res ora.crf -init 
cd $GI_HOME/crf/db/ 
rm *.bdb 
$GI_HOME/bin/crsctl start res ora.crf -init

先来简单介绍一下CHM。

What is the Cluster Health Monitor?

The Cluster Health Monitor collects OS statistics (system metrics) such as memory and swap space usage, processes, IO usage, and network related data. The Cluster Health Monitor collects information in real time and usually once a second. The Cluster Health Monitor collects OS statistics using OS API to gain performance and reduce the CPU usage overhead. The Cluster Health Monitor collects as much of system metrics and data as feasible that is restricted by the acceptable level of resource consumption by the tool.


Is the Cluster Health Monitor replacing OSWatcher?

The Cluster Health Monitor has many advantages over OSWatcher, and the most significant is that the Cluster Health Monitor runs in real time and usually once a second, so the Cluster Health Monitor will collect data even when OSWatcher cannot. However, there are some information such as top, traceroute, and netstat that the Cluster Health Monitor does not collect, so running the Cluster Health Monitor while running OSWatcher is ideal. Both tools complement each other rather than supplement.
On the other hand, if only one of the tools can be used, then Oracle recommends that the Cluster Health Monitor is used.


How much of overhead does the Cluster Health Monitor cause?

In today's server environment, the Cluster Health Monitor uses approximately less than 3% of the server's capacity for CPU. The overhead of using the Cluster Health Monitor is minimal.  However. CHM on the server with large number of disks or IO devices and more CPUs/memory would use more CPU than CHM on a server that does not have many disks and CPUs/memory.


What are the processes and components for the Cluster Health Monitor?

Cluster Logger Service (Ologgerd) – there is a master ologgerd that receives the data from other nodes and saves them in the repository (Berkeley database). It compresses the data before persisting to save the disk space. In an environment with multiple nodes, a replica ologgerd is also started on a node where the master ologgerd is not running. The master ologgerd will sync the data with replica ologgerd by sending the data to the replica ologgerd. The replica ologgerd takes over if the master ologgerd dies. A new replica ologgerd starts when the replica ologgerd dies. There is only one master ologgerd and one replica ologgerd per cluster. 
System Monitor Service (Sysmond) – the sysmond process collects the system statistics of the local node and sends the data to the master ologgerd. A sysmond process runs on every node and collects the system statistics including CPU, memory usage, platform info, disk info, nic info, process info, and filesystem info.
To find the master olggerd, one can use the following command:
oclumon manage -get master


What is definition of some of the files like *.bdb, _db.* , *.ldb , log.* files created by tool in the BDB (Berkeley Database) location directory ?

*.bdb & _db.* - These are files created for the berkeley db which stores the data collected.
log.* - These are berkeley bdb logfiles which preserve changes before making them to the db files. We have checkpointing setup and it reuses the log files.
*.ldb - This is the local logging file and MUST be present on all servers.
Do not delete above files except in case of trying to reduce the size of bdb file that get grow to a large size.  To reduce the size of bdb file, refer to the question "How can you reduce the size of bdb file that became big for any reason?" in this document.


参考文件:
Cluster Health Monitor (CHM) FAQ (Doc ID 1328466.1)
Oracle Cluster Health Monitor (CHM) using large amount of space (more than default) (Doc ID 1343105.1)

你可能感兴趣的:(Oracle,Oracle,RAC)