Oracle Grid Cluster Health Monitor (oclumon)

The Oracle Grid Cluster Health Monitor (CHM) stores operating system metrics in the CHM repository for all nodes in a RAC cluster, including CPU, memory, process, network and other OS data. This information can later be retrieved to troubleshoot and identify cluster-related issues. CHM is a default component of the 11gR2 Grid install. The data is stored in the master repository and also replicated to a standby repository on a different node.

Oracle Grid OCLUMON commands

The Oracle Grid OCLUMON utility is used to query and manage the repository. It can be used to identify threshold-exceeded events and can be scripted to raise alerts from this information.

Extract data

a. Over a specific time period, in this case 12 hours.
oclumon dumpnodeview -allnodes -last "12:00:00"
b. Over the last thirty minutes.
oclumon dumpnodeview -allnodes -last "00:30:00"
c. By alert event.
oclumon dumpnodeview -v -allnodes -alert -last "01:00:00"
d. Check network-related issues for the last 15 minutes.
oclumon showtrail -n node1 -nicid eth0 effectivebw errors -c "red" "yellow" "orange" -last "00:15:00"
e. System information (CPU, memory, I/O and open file descriptors) for the last 15 minutes.
oclumon showtrail -n node1 -sys usagepc cpuqlen cpunumprocess memfree numrt numofiosps openfds -c "red" "yellow" "orange" -last "00:15:00"
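Every example above passes the time window to -last as "HH:MM:SS". When these commands are scripted, it can help to build that string from a number of minutes. The following is a minimal sketch; the function names minutes_to_hms and dumpnodeview_cmd are made up for this illustration and are not part of oclumon. The script only prints the command so it can be reviewed before it is run on a cluster node.

```shell
#!/bin/sh
# Hypothetical helpers, not part of oclumon.

# Convert a whole number of minutes into the "HH:MM:SS" form used with -last.
minutes_to_hms() {
    minutes=$1
    printf '%02d:%02d:00\n' $((minutes / 60)) $((minutes % 60))
}

# Print (rather than run) the dumpnodeview command for the last N minutes.
dumpnodeview_cmd() {
    printf 'oclumon dumpnodeview -allnodes -last "%s"\n' "$(minutes_to_hms "$1")"
}

dumpnodeview_cmd 30     # same window as example b
dumpnodeview_cmd 720    # same window as example a
```

Printing the command instead of executing it keeps the sketch usable on a machine without Grid Infrastructure installed; piping the printed line to sh on a cluster node would run it.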
Information for a particular incident can also be collected and uploaded to Oracle support using the diagcollection script.
$GRID_HOME/bin/ -collect -chmos
-incidenttime inc_time -incidentduration duration

Repository Commands

The following commands report basic information about the repository.

a. What is the repository path?
oclumon manage -get reppath
CHM Repository Path = /opt/oracle/grid/AFNPOL
b. Which one is the master?
oclumon manage -get master
Master = AFNPOL1
c. What is the repository size?
oclumon manage -get repsize
CHM Repository Size = 93400
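The output lines above all follow a "Label = value" layout, which makes them easy to pick apart in a script. The sketch below assumes that layout holds for the installed version; chm_value is a hypothetical helper name, and the sample text is copied from the output shown above rather than produced by a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: read "Label = value" lines on stdin and print the
# value that belongs to the requested label.
chm_value() {
    awk -F' = ' -v label="$1" '$1 == label { print $2 }'
}

# Sample text copied from the example output above. On a cluster node it
# would come from the oclumon manage -get commands instead.
sample='CHM Repository Path = /opt/oracle/grid/AFNPOL
Master = AFNPOL1
CHM Repository Size = 93400'

printf '%s\n' "$sample" | chm_value 'CHM Repository Size'
printf '%s\n' "$sample" | chm_value 'Master'
```

Matching on the full label rather than a substring avoids picking up the wrong line if more keys are requested at once.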

OCLUMON Maintenance

The following commands change the repository location and the repository size, and with it the retention time.

a. Modify the repository location.
oclumon manage -repos reploc /nfs1/oltp/chm
b. Resize the repository.
oclumon manage -db resize 43200

Read how to recover a Local Oracle Registry in a Grid Cluster.
