APPLIES TO:
Oracle Database - Enterprise Edition
HP-UX PA-RISC (64-bit)
HP-UX Itanium
Oracle Solaris on x86-64 (64-bit)
Linux x86-64
IBM AIX on POWER Systems (64-bit)
IBM: Linux on System z
Linux Itanium
Oracle Solaris on SPARC (64-bit)
Linux x86
MAIN CONTENT
Purpose
Trace File Analyzer Collector (TFA) is a diagnostic collection utility that simplifies diagnostic data collection on Oracle Clusterware/Grid Infrastructure, RAC and Single Instance Database systems. TFA is similar to the diagcollection utility packaged with Oracle Clusterware in that it collects and packages diagnostic data; however, TFA is much more powerful than diagcollection in its ability to centralize and automate the collection of diagnostic information. TFA provides the following key benefits:
- Increased productivity through a reduction in the time and effort required for diagnostic collection
- Diagnostic collection is performed with a single command executed from a single node
- Diagnostics collected include log and trace files for Grid Infrastructure/Oracle Clusterware, ASM, RAC and the OS
- Output is consolidated to a single node for ease of upload
- Reduced diagnostic data uploads to Support by a factor of 10x or more in most cases
- Diagnostic files are isolated and trimmed to include only data for the incident time
- Reduced SR resolution times
- Proper diagnostic collection for incidents ensures complete diagnostic data, minimizing the need for Support to request additional data
How to Get Started with TFA
Note: For those new to TFA, be advised that in many cases it will take less time to install TFA (notwithstanding any change control procedures that you may have) and perform a TFA diagnostic collection than it would take to perform a traditional (manual) diagnostic collection.
- Download and Install TFA
Note: Starting with the 11.2.0.4 and 12.1.0.2 Grid Infrastructure patchsets, TFA is installed with Grid Infrastructure. However, TFA is written outside of the GI/RAC/RDBMS product lines and as such can be used for any trace data, and is version agnostic.
- Perform Diagnostic Collection - The command below is diagcollect in its simplest form. It will collect and trim log and trace files from the past 4 hours; these logs/traces include Clusterware/GI, ASM, Database (from all databases), OS, OSWatcher, CHMOS and Procwatcher. All diagnostic data will be trimmed to this 4-hour window and consolidated for upload on the node from which the diagcollect command was executed. There are multiple arguments to diagcollect that isolate time periods and product components; a full listing can be found in the Diagnostics Collection section below as well as in the TFA Users Guide, and a quick illustration follows the command below.
# ./tfactl diagcollect
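As a quick illustration (using only flags that are documented in the Diagnostics Collection section below), the collection can be narrowed to specific components and a specific time window, for example the last 8 hours of Clusterware and OS data:
# ./tfactl diagcollect -crs -os -since 8h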
Requirements
Supported Platforms
TFA is supported on the following platforms:
Note: BASH shell 3.2 and JRE 1.5 or higher are required on all platforms.
- Intel Linux(Enterprise Linux, RedHat Linux, SUSE Linux)
- Linux Itanium
- zLinux
- Oracle Solaris SPARC
- Oracle Solaris x86-64
- AIX
- HPUX Itanium
- HPUX PA-RISC
Note: The JRE supplied with Grid Infrastructure may be used to fulfill the JRE requirement.
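For example, assuming a Grid Infrastructure home of /u01/app/11.2.0/grid (the same path used in the installation example later in this note), the version of the bundled JRE can be confirmed with:
# /u01/app/11.2.0/grid/jdk/bin/java -version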
Supported Oracle Releases
As previously stated, TFA is written outside of the GI/RAC/RDBMS product lines and as such can be used for any trace data, and is version agnostic.
Download TFA Collector
The current version of TFA is 12.1.2.1.2. This version of TFA and the associated TFA Users Guide are available by clicking the relevant Download link below.
The checksum of TFALite_<version>.zip file can be verified using any checksum utility and should match the cksum/md5sum output below:
$ cksum TFALite_121212.zip
896609746 14931703 TFALite_121212.zip
Note: As mentioned in the TFA Requirements, JRE 1.5 or higher for the respective platform must be installed on ALL cluster nodes prior to installing TFA.
If you have Grid Infrastructure installed, the JRE included with GI should be used. If you are installing TFA on a system not running Grid Infrastructure, download a JRE (1.5 or higher) appropriate for your platform.
Note: TFA Collector can also be downloaded as part of the RAC and DB Support Tools Bundle (Doc ID 1594347.1)
Configuring
Please download and review the TFA Collector User Guide for detailed instructions on how to configure and run TFA. A Quick Start Guide is provided in the Instructions section of this Note.
What's new?
TFA v12.1.2.1.2
- Fix to address the CVE-2014-3566 SSL V3.0 POODLE vulnerability. CVE-2014-3566 affects SSL V3.0, so all components that utilize SSL should be configured to disable SSL V3.0. TFA uses SSL for communication between TFA daemons and their clients. From TFA version 12.1.2.1.2 onward, SSL V3.0 is disabled.
TFA v12.1.2.1.0
- TFA will now collect trace files even when they have not been allocated a type, provided their directory belongs to the right component and they have been updated within the requested time period
- There is now a tfactl shell that can be used to call commands. This is a beta feature that will be enhanced in upcoming releases to provide a flexible user interface
- TFA collections can now be stopped using ‘tfactl collection’
- Diagcollect will not return to the prompt until all nodes have finished their collections
- Critical Bug fixes
TFA v12.1.2.0.0
- Support for Oracle GI and RAC 12.1.0.2 trace files
- TFA Inventory performance improvements
- TFA Collection performance improvements
- New -oda and -odastorage flags for enhanced ODA data collections
- Collection of available oracheck and oratop reports
- More granular trace level adjustment
- Use of a 2048 bit certificate
- Critical Bug fixes
TFA v3.2
- Log Analyzer functionality
- Addition of -notrim for diagnostic collections
- Support for TFA on zLinux
- Collection of ExaWatcher data
- Sundiag collection
- Aging of TFA log files restricting maximum size
- Multiple performance enhancements
- Critical Bug fixes
TFA v3.1
- Non root user collections using TFA
- Collections from Exadata Storage Cells
- Collection of whole directories and bucket directories
- Collection of DBWLM directories and statistics
- Critical Bug fixes
TFA v2.5.1.6
- The TFA pre-collection inventory will only run for the components to be collected
- Some file types now have the expected timestamp date format defined in the file type XML, so TFA does not have to determine the format at runtime, which improves inventory performance
- Collection of system data during diagnostic collection is now done in parallel with trace file trimming and collection.
- When TFA 2.5.1.6 is patched as part of a Grid Infrastructure installation or patch, it will move TFA_HOME to the Grid home and the repository, database and logs to the Grid owner's Oracle Base
- The initial install will now run a rediscovery of resources (databases, etc.) 5 minutes after installation before switching to a six-hour cycle
- Added confirmation when patching or uninstalling to ensure only the expected nodes are affected
- Installation issues on Single Instance and 12.1 Grid Infrastructure have been corrected
- Bug Fixes
TFA v2.5.1.5
- Improved security to allow sudo running of tfactl
- The TFA JVM can now run on one of a number of available ports, so it will not fail if the preferred port is not available. The JVM can also run on different ports on each node as required
- All new releases and patches after 2.5.1.5 can be applied from a single node across the cluster without requiring ssh or local patch installs
- TFA can now be installed in a shared file system with the repository for diagnostic collections also supported on a shared file system
- Uninstall now does not remove the repository
- TFA will now collect and correctly classify compressed OSW files
- Additional file type support
- The tfactl syntax has been modified to provide a simpler interface, and some of the options have been removed from the help and documentation
- Diagcollect now collects the last 4 hours' worth of data from the time the diagnostic collection is run
- Diagnostic collection -nomonitor option, which allows diagcollect to run in the background
- Support added for local uninstall of TFA
- Installation supports deferred discovery in order to speed up installation process
- Installation supports the -javahome flag for silent and deferred installs
- TFA uses CRS to automatically add nodes from a GI cluster into TFA
- When installing TFA into the GI_HOME, the default repository, logs and database are placed under ORACLE_BASE
- Bug Fixes
TFA v2.5.1.3
Quick Start Guide
Note: For detailed instructions on how to install and run TFA please download and review the latest TFA Collector User Guide
Installing TFA
Note: Prior to installing TFA you MUST install JRE 1.5 or higher in the SAME location on ALL cluster nodes. If an existing installation of JRE 1.5 or higher is present in the same location on all cluster nodes, you may use that existing JRE installation. TFA can also use the JRE that is installed in the Grid Infrastructure home.
Note: The recommended installation location for TFA is $ORACLE_BASE for Grid Infrastructure. For pre-11.2.0.4 and 12.1.0.1 systems it is NOT recommended to install TFA in the Grid Infrastructure home. In this section we cover an interactive installation of TFA within $ORACLE_BASE. For other installation scenarios, please consult the User Guide.
- Log into the system as the ROOT user
- Determine the appropriate location for the TFA installation; this location must exist on all cluster nodes. A viable location would be $ORACLE_BASE (for pre-11.2.0.4 and 12.1.0.1 systems).
- Stage the installTFALite installer on node 1. After staging the installer you may query the available install flags via ./installTFALite -h:
# ./installTFALite -h
Usage for ./installTFALite
./installTFALite [-local][-deferdiscovery][-tfabase <install dir>][-javahome <jre home>][-silent][-debug]
-local - Only install on the local node
-deferdiscovery - Discover Oracle trace directories after installation completes
-tfabase - Install into the directory supplied
-javahome - Use this directory for the JRE
-silent - Do not ask any install questions
-debug - Print debug tracing and do not remove TFA_HOME on install failure
Note: Without parameters TFA will take you through an interview process for installation
/tfa will be appended to -tfabase if it is not already there.
Note: The JRE supplied with Grid Infrastructure (e.g., /u01/app/11.2.0/grid/jdk) may be used to fulfill the JRE requirement for -javahome, and is probably the best option as it is in a predictable location.
- The TFA installer may be launched without any flags, allowing for an interactive, interview-based installation. For this example we will specify the TFA base and the Java home:
Note: In this example we are installing TFA into $ORACLE_BASE and using the JDK (version 1.5) that ships with Grid Infrastructure. Note that TFA will be installed in a directory called tfa within the TFA base.
# ./installTFALite -tfabase /u01/app/oracle -javahome /u01/app/11.2.0/grid/jdk
Starting TFA installation
Using JAVA_HOME : /u01/app/11.2.0/grid/jdk
Running Auto Setup for TFA as user root...
Would you like to do a [L]ocal only or [C]lusterwide installation ? [L|l|C|c] [C] :
- Generally speaking, it is easiest to perform a Clusterwide installation; however, this does temporarily require SSH equivalency to be configured for the root user. The TFA installer will take care of the temporary SSH equivalency configuration and will remove that configuration once the install is complete. If for some reason you are unable to allow SSH equivalency as root (even temporarily), you may perform a local install of TFA on every node (please see the TFA Users Guide on how to do so).
Would you like to do a [L]ocal only or [C]lusterwide installation ? [L|l|C|c] [C] : C
The following installation requires temporary use of SSH.
If SSH is not configured already then we will remove SSH
when complete.
Do you wish to Continue ? [Y|y|N|n] [Y] Y
Installing TFA now...
Discovering Nodes and Oracle resources
Checking whether CRS is up and running
Getting list of nodes in cluster . . . . .
List of nodes in cluster
1. cetrain01
2. cetrain02
Checking ssh user equivalency settings on all nodes in cluster
Node cetrain02 is configured for ssh user equivalency for root user
- Follow the remaining prompts to complete the installation.
Searching for running databases . . . . .
.
List of running databases registered in OCR
1. ORCL
. .
Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
TFA Will be Installed on the Following Nodes:
++++++++++++++++++++++++++++++++++++++++++++
Install Nodes
=============
cetrain01
cetrain02
Do you wish to make changes to the Node List ? [Y/y/N/n] [N] N
TFA will scan the following Directories
++++++++++++++++++++++++++++++++++++++++++++
.-------------------------------------------------------------------------.
| cetrain02 |
+--------------------------------------------------------------+----------+
| Trace Directory | Resource |
+--------------------------------------------------------------+----------+
| /u01/app/11.2.0/grid/OPatch/crs/log | CRS |
| /u01/app/11.2.0/grid/cfgtoollogs | INSTALL |
| /u01/app/11.2.0/grid/crs/log | CRS |
....<snip>....
Installing TFA on cetrain01:
HOST: cetrain01 TFA_HOME: /u01/app/oracle/tfa/cetrain01/tfa_home
Installing TFA on cetrain02:
HOST: cetrain02 TFA_HOME: /u01/app/oracle/tfa/cetrain02/tfa_home
.-------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+-----------+---------------+-------+------+---------+--------------------+
| cetrain01 | RUNNING | 32242 | 5000 | 3.1 | 310020140205043544 |
| cetrain02 | RUNNING | 22098 | 5000 | 3.1 | 310020140205043544 |
'-----------+---------------+-------+------+---------+--------------------'
Running Inventory in All Nodes...
Enabling Access for Non-root Users on cetrain01...
Adding default users and groups to TFA Access list...
Summary of TFA Installation:
.--------------------------------------------------------------.
| cetrain01 |
+---------------------+----------------------------------------+
| Parameter | Value |
+---------------------+----------------------------------------+
| Install location | /u01/app/oracle/tfa/cetrain01/tfa_home |
| Repository location | /u01/app/oracle/tfa/repository |
| Repository usage | 4 MB out of 992 MB |
'---------------------+----------------------------------------'
.--------------------------------------------------------------.
| cetrain02 |
+---------------------+----------------------------------------+
| Parameter | Value |
+---------------------+----------------------------------------+
| Install location | /u01/app/oracle/tfa/cetrain02/tfa_home |
| Repository location | /u01/app/oracle/tfa/repository |
| Repository usage | 1 MB out of 992 MB |
'---------------------+----------------------------------------'
TFA is successfully installed...
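Once the installer reports success, the installation can be verified from any node with the print status verb (documented under "TFA commands at a glance" below), which prints the status of TFA across all nodes in the cluster:
# ./tfactl print status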
Starting and Stopping TFA
TFA runs out of init on Linux and UNIX platforms so that it is started automatically whenever the server is started. The name and location of the TFA init script (init.tfa) depend on the platform:
- Linux and Solaris: /etc/init.d/init.tfa
- AIX: /etc/init.tfa
- HP-UX: /sbin/init.d/init.tfa
Note: All examples below are for the Linux platform; the location of the init.tfa script for other platforms is shown above.
To start TFA execute the init.tfa script with the start argument as the root user:
# /etc/init.d/init.tfa start
To stop TFA execute the init.tfa script with the stop argument as the root user:
# /etc/init.d/init.tfa stop
To restart TFA (stop and restart in a single command) execute the init.tfa script with the restart argument as the root user:
# /etc/init.d/init.tfa restart
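The same start and stop operations are also available through tfactl itself (the start and stop verbs are listed under "TFA commands at a glance" below):
# ./tfactl stop
# ./tfactl start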
Diagnostics Collection
The diagnostic collection module is invoked by tfactl with the diagcollect verb. The diagcollect module can take a number of parameters that allow the user to control how large or detailed the required evidence set is. It is possible to provide a specific incident time or a time range for the data to be collected. It is also possible to collect whole files that contain relevant data, or to "trim" the files in the evidence set down to just a time slice of data.
Note: With no flags specified, tfactl diagcollect by default collects files from all nodes for all components where the file has been updated in the last 4 hours, and trims files it considers excessive. If an incident occurred prior to this period then the parameters documented below should be used to target the correct data collection.
The tfactl parameters and examples for data collection can be viewed by invoking "tfactl diagcollect -h":
# ./tfactl diagcollect -h
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect [-all | -database <all|d1,d2..> | -asm | -crs | -dbwlm | -acfs | -os | -install | -cfgtools | -chmos] [-node <all | local | n1,n2,..>] [-tag <description>] [-z <filename>] [-since <n><h|d> | -from <time> -to <time> | -for <time>] [-nocopy] [-notrim] [-nomonitor] [-collectalldirs] [-collectdir <dir1,dir2..>]
Options:
-all Collect all logs (If no time is given for collection then files
for the last 4 hours will be collected)
-crs Collect CRS logs
-dbwlm Collect DBWLM logs
-acfs Collect ACFS logs
-asm Collect ASM logs
-database Collect database logs from databases specified
-os Collect OS files such as /var/log/messages
-install Collect Oracle Installation related files
-cfgtools Collect CFGTOOLS logs
-chmos Collect CHMOS files (Note that this data can be large for
longer durations)
-nochmos Do not collect CHMOS data when it would normally have been
collected
-sundiag Collect sundiag logs
-node Specify comma separated list of host names for collection
-nocopy Does not copy back the zip files to initiating node from all nodes
-notrim Does not trim the files collected
-nomonitor This option is used to submit the diagcollection as a background
process
-collectalldirs Collect all files from a directory marked "Collect All"
flag to true
-collectdir Specify comma separated list of directories and collection will
include all files from these irrespective of type and time constraints
in addition to components specified
-since <n><h|d> Files from past 'n' [d]ays or 'n' [h]ours
-from "MMM/dd/yyyy hh:mm:ss" From <time>
-to "MMM/dd/yyyy hh:mm:ss" To <time>
-for "MMM/dd/yyyy" For <date>.
-tag <tagname> The files will be collected into tagname directory inside
repository
-z <zipname> The files will be collected into tagname directory with the
specified zipname
Examples:
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect
Trim and Zip all files updated in the last 4 hours as well as chmos/osw data
from across the cluster and collect at the initiating node
Note: This collection could be larger than required but is there as the
simplest way to capture diagnostics if an issue has recently occurred.
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -all -since 8h
Trim and Zip all files updated in the last 8 hours as well as chmos/osw data
from across the cluster and collect at the initiating node
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -database hrdb,fdb -since 1d -z foo
Trim and Zip all files from databases hrdb & fdb in the last 1 day and
collect at the initiating node
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -crs -os -node node1,node2 -since 6h
Trim and Zip all crs files, o/s logs and chmos/osw data from node1 & node2
updated in the last 6 hours and collect at the initiating node
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -asm -node node1 -from Mar/4/2013 -to "Mar/5/2013 21:0
Trim and Zip all ASM logs from node1 updated between from and to time and
collect at the initiating node
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -for "Mar/2/2013"
Trim and Zip all log files updated on "Mar/2/2013" and collect at the
initiating node
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -for "Mar/2/2013 21:00:00"
Trim and Zip all log files updated from 09:00 on March 2 to 09:00 on March 3
(i.e. 12 hours before and after the time given) and collect at the initiating node
/u01/app/11.2.0/grid/tfa/bin/tfactl diagcollect -crs -collectdir /tmp_dir1,/tmpdir_2
Trim and Zip all crs files updated in the last 4 hours
Also collect all files from /tmp_dir1 and /tmp_dir2 at the initiating node
In the following example, we use the -all, -from and -to switches of tfactl diagcollect, which tell TFA to collect diagnostic logs of all types from midnight on Feb 5th to 13:00 on Feb 5th. The command launches the specified diagnostic collection on ALL cluster nodes and neatly places a zip file for each node in the TFA repository on the TFA master node (node 1 in our case).
# ./tfactl diagcollect -all -from "Feb/05/2014" -to "Feb/05/2014 13:00:00"
Collecting data for all components using above parameters...
Collecting data for all nodes
Scanning files from Feb/05/2014 to Feb/05/2014 13:00:00
Repository Location in cetrain01 : /u01/app/oracle/tfa/repository
2014/02/07 13:47:45 EST : Running an inventory clusterwide ...
2014/02/07 13:47:45 EST : Collection Name : tfa_Fri_Feb_7_13_47_41_EST_2014.zip
2014/02/07 13:47:45 EST : Sending diagcollect request to host : cetrain02
2014/02/07 13:47:55 EST : Run inventory completed locally ...
2014/02/07 13:47:55 EST : Getting list of files satisfying time range [02/05/2014 00:00:00 EST, 02/05/2014 13:00:00 EST]
2014/02/07 13:47:55 EST : Collecting extra files...
2014/02/07 13:48:00 EST : cetrain01: Zipping File: /u01/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_asmb_2246.trc
2014/02/07 13:48:00 EST : cetrain01: Zipping File: /u01/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
...<snip>...
2014/02/07 13:48:16 EST : Collecting ADR incident files...
2014/02/07 13:49:22 EST : Total Number of Files checked : 3155
2014/02/07 13:49:22 EST : Total Size of all Files Checked : 2.4GB
2014/02/07 13:49:22 EST : Number of files containing required range : 74
2014/02/07 13:49:22 EST : Total Size of Files containing required range : 289MB
2014/02/07 13:49:22 EST : Number of files trimmed : 23
2014/02/07 13:49:22 EST : Total Size of data prior to zip : 74MB
2014/02/07 13:49:22 EST : Saved 216MB by trimming files
2014/02/07 13:49:22 EST : Zip file size : 2.9MB
2014/02/07 13:49:22 EST : Total time taken : 97s
2014/02/07 13:49:22 EST : Completed collection of zip files.
Logs are collected to:
/u01/app/oracle/tfa/repository/collection_Fri_Feb_7_13_47_41_EST_2014_node_all/cetrain01.tfa_Fri_Feb_7_13_47_41_EST_2014.zip
/u01/app/oracle/tfa/repository/collection_Fri_Feb_7_13_47_41_EST_2014_node_all/cetrain02.tfa_Fri_Feb_7_13_47_41_EST_2014.zip
By default, diagcollect displays the collection status in the foreground (as shown above); this behavior may be modified by specifying the -nomonitor flag:
# ./tfactl diagcollect -all -from "Feb/05/2014" -to "Feb/05/2014 13:00:00" -nomonitor
Please use "/u01/app/oracle/tfa/bin/tfactl print actions" to monitor the status of this run.
Collecting data for all components using above parameters...
Collecting data for all nodes
Scanning files from Feb/05/2014 to Feb/05/2014 13:00:00
Logs are collected to:
/u01/app/oracle/tfa/repository/collection_Fri_Feb_7_13_53_20_EST_2014_node_all/cetrain01.tfa_Fri_Feb_7_13_53_20_EST_2014.zip
/u01/app/oracle/tfa/repository/collection_Fri_Feb_7_13_53_20_EST_2014_node_all/cetrain02.tfa_Fri_Feb_7_13_53_20_EST_2014.zip
Note: When the -nomonitor flag has been specified, the status of diagcollect can be determined with "tfactl print actions":
# ./tfactl print actions
.-----------------------------------------------------------------------------.
| HOST | TIME | ACTION | STATUS | COMMENTS |
+-----------+--------------+----------------+-----------+---------------------+
| cetrain01 | Feb 07 13:53 | Run inventory | COMPLETE | -c:RDBMS |
| | | | | all:ASM:CRS:DBWLM:A |
| | | | | CFS:CRS:ASM:OS:INST |
| | | | | ALL:TNS:CHMOS |
+-----------+--------------+----------------+-----------+---------------------+
| cetrain01 | Feb 07 13:53 | Collect traces | RUNNING | Collection details: |
| | | & zip | | |
| | | | | Zip file: |
| | | | | tfa_Fri_Feb_7_13_53 |
| | | | | _20_EST_2014.zip |
| | | | | Tag: |
| | | | | collection_Fri_Feb_ |
| | | | | 7_13_53_20_EST_2014 |
| | | | | _node_all |
+-----------+--------------+----------------+-----------+---------------------+
| cetrain02 | Feb 07 13:53 | Collect traces | REQUESTED | Collection details: |
| | | & zip | | |
| | | | | Zip file: |
| | | | | tfa_Fri_Feb_7_13_53 |
| | | | | _20_EST_2014.zip |
| | | | | Tag: |
| | | | | collection_Fri_Feb_ |
| | | | | 7_13_53_20_EST_2014 |
| | | | | _node_all |
+-----------+--------------+----------------+-----------+---------------------+
| cetrain02 | Feb 07 13:53 | Run inventory | REQUESTED | RDBMS |
| | | | | all:ASM:CRS:DBWLM:A |
| | | | | CFS:CRS:ASM:OS:INST |
| | | | | ALL:TNS:CHMOS |
'-----------+--------------+----------------+-----------+---------------------'
Automatic Diagnostic Collection
TFA has the capability to automatically perform diagnostic collection when an incident is detected by TFA. This feature eases the diagnostic collection burden and covers situations where the relevant diagnostic data might otherwise have rotated out of the logs before it could be collected. After the initial inventory is completed by TFA, all files that are determined to be alert logs are monitored in real time so that TFA can take action when certain messages are seen. By default these logs are the RDBMS alert logs, ASM alert logs and CRS alert logs. When specific strings (pre-defined in TFA) are found in the logs, information on the strings is saved to the Berkeley Database and, when enabled, Automatic Diagnostic Collection is triggered. Exactly what is collected depends on the string match that was found, but potentially trimmed versions of all O/S, CRS, ASM and RDBMS logs could be collected for each string. Clearly this type of operation could be triggered many times a second if not controlled, so TFA implements flood control: a collection can never be generated at less than 5-minute intervals. Also note that if the TFA repository reaches its maximum size then Automatic Diagnostic Collection will be disabled until sufficient space is freed.
Managing Automatic Diagnostic Collection
To enable or disable Automatic Diagnostic Collection (disabled by default), the autodiagcollect TFA parameter simply needs to be set to ON or OFF. When set to OFF (the default), automatic diagnostic collection is disabled. If set to ON, diagnostics will be collected when the required search strings (see the User Guide) are detected while scanning the alert logs. To set automatic collection for all nodes of the TFA cluster, the -c flag must be used:
# ./tfactl set autodiagcollect=ON -c
Successfully set autodiagcollect=ON
.---------------------------------------------------.
| cetrain01 |
+-------------------------------------------+-------+
| Configuration Parameter | Value |
+-------------------------------------------+-------+
| TFA version | 3.1 |
| Automatic diagnostic collection | ON |
| Trimming of files during diagcollection | ON |
| Repository current size (MB) in cetrain01 | 15 |
| Repository maximum size (MB) in cetrain01 | 992 |
| Trace level | 1 |
'-------------------------------------------+-------'
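To turn the feature back off (the default), set the parameter to OFF, again using the -c flag to apply the change across all nodes:
# ./tfactl set autodiagcollect=OFF -c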
The trimming of files collected by Automatic Diagnostic Collection is controlled by the trimfiles TFA parameter. When ON (the default), files are trimmed to include only data around the time of the event; when OFF, any file that was written to at the time of the event (and that pertains to the event as determined by TFA) will be collected in its entirety. To set trimfiles for all nodes of the TFA cluster, the -c flag must be used:
# ./tfactl set trimfiles=ON -c
Successfully set trimfiles=ON
.---------------------------------------------------.
| cetrain01 |
+-------------------------------------------+-------+
| Configuration Parameter | Value |
+-------------------------------------------+-------+
| TFA version | 3.1 |
| Automatic diagnostic collection | ON |
| Trimming of files during diagcollection | ON |
| Repository current size (MB) in cetrain01 | 15 |
| Repository maximum size (MB) in cetrain01 | 992 |
| Trace level | 1 |
'-------------------------------------------+-------'
TFA Repository Space Management
By default, the maximum repository size is 10 GB or 50% of the size of the filesystem on which the TFA repository resides, whichever is greater. If the free space on the TFA repository filesystem drops to 1 GB, the TFA repository will be closed until space is freed. There is no built-in automatic way to manage TFA repository space at this point, so please make sure to monitor TFA repository space usage. The "tfactl print config" command shows the current and maximum size of the repository.
# ./tfactl print config
.---------------------------------------------------.
| cetrain01 |
+-------------------------------------------+-------+
| Configuration Parameter | Value |
+-------------------------------------------+-------+
| TFA version | 3.1 |
| Automatic diagnostic collection | ON |
| Trimming of files during diagcollection | ON |
| Repository current size (MB) in cetrain01 | 15 |
| Repository maximum size (MB) in cetrain01 | 992 |
| Trace level | 1 |
'-------------------------------------------+-------'
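The repository location and its contents can also be displayed with the print repository option, and the maximum repository size can be adjusted with the reposizeMB parameter (both documented under "TFA commands at a glance" below). For example, to allow the repository to grow to 20 GB:
# ./tfactl print repository
# ./tfactl set reposizeMB=20480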
To remove files from the TFA repository, use the "tfactl purge" command with the -older x[d|h] flag (d is days, h is hours). To remove diagnostic collections older than 30 days you would execute:
./tfactl purge -older 30d
Patching/Upgrading TFA Collector
Once TFA Collector 2.5.1.6 has been installed, patching is accomplished by running the install script for the new version on one node; the other nodes are patched automatically with no requirement for ssh root user equivalence.
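As a sketch only (assuming TFA was installed under /u01/app/oracle as in the installation example above; consult the User Guide for the exact steps for your version), patching would consist of staging the new installTFALite on one node and re-running it with the same options:
# ./installTFALite -tfabase /u01/app/oracle -javahome /u01/app/11.2.0/grid/jdk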
Please see the TFA Collector User Guide for more details and how to patch older versions of TFA Collector.
TFA commands at a glance
Note: For detailed information regarding the available TFA commands please refer to the TFA Collector User Guide, which is available in the Download section of this document.
Overall help for TFA can be displayed with:
# ./tfactl -h
Usage : /u01/app/11.2.0/grid/tfa/bin/tfactl <command> [options]
<command> =
start Starts TFA
stop Stops TFA
enable Enable TFA Auto restart
disable Disable TFA Auto restart
print Print requested details
access Add or Remove or List TFA Users and Groups
purge Delete collections from TFA repository
directory Add or Remove or Modify directory in TFA
host Add or Remove host in TFA
diagcollect Collect logs from across nodes in cluster
analyze List events summary and search strings in alert logs.
set Turn ON/OFF or Modify various TFA features
uninstall Uninstall TFA from this node
For help with a command: /u01/app/11.2.0/grid/tfa/bin/tfactl <command> -help
Print Details on the TFA configuration:
# ./tfactl print -h
Usage: /u01/app/oracle/tfa/bin/tfactl print [status|config|directories|hosts|actions|repository]
Prints requested details.
Options:
status Print status of TFA across all nodes in cluster
config Print current TFA config settings
directories Print all the directories to Inventory
hosts Print all the Hosts in the Configuration
actions Print all the Actions requested and their status
repository Print the zip file repository information
Add/Remove/Modify directories known to TFA:
# ./tfactl directory -h
Usage: /u01/app/oracle/tfa/bin/tfactl directory add <dir> [ -public ] [ -exclusions | -noexclusions | -collectall ] [ -node <all | n1,n2..> ]
Usage: /u01/app/oracle/tfa/bin/tfactl directory remove <dir> [ -node <all | n1,n2..> ]
Usage: /u01/app/oracle/tfa/bin/tfactl directory modify <dir> [ -private | -public ] [ -exclusions | -noexclusions | -collectall ]
Add or Remove or Modify directory in TFA
-private This flag implies that if the person executing this command
does not have privileges to see files in those underlying directories
then they will not be able to collect those files.
-public The Oracle directories are public as this is the purpose of the tool
to allow for collection of these files.
-exclusions This flag implies that files in this directory are eligible for
collection satisfying type and time range.
-noexclusions This flag implies that files in this directory are eligible for
collection satisfying time range.
-collectall This flag implies that all files in this directory are eligible for
collection irrespective of type and time range when "-collectalldirs"
flag is specified in diagcollection.
Examples:
/u01/app/oracle/tfa/bin/tfactl directory add /u01/app/grid/diag/asm/+ASM1/trace
/u01/app/oracle/tfa/bin/tfactl directory remove /u01/app/grid/diag/asm/+ASM1/trace
/u01/app/oracle/tfa/bin/tfactl directory remove /u01/app/grid/diag/asm/+ASM1/trace -node all
/u01/app/oracle/tfa/bin/tfactl directory modify /u01/app/grid/diag/asm/+ASM1/trace -private
/u01/app/oracle/tfa/bin/tfactl directory modify /u01/app/grid/diag/asm/+ASM1/trace -noexclusions
/u01/app/oracle/tfa/bin/tfactl directory modify /u01/app/grid/diag/asm/+ASM1/trace -private -noexclusions
/u01/app/oracle/tfa/bin/tfactl directory modify /tmp/for_support -private -collectall
Purge TFA Repository data:
# ./tfactl purge -h
Usage: /u01/app/oracle/tfa/bin/tfactl purge -older x[h|d]
Remove file(s) from repository that are older than the time specified.
Examples:
/u01/app/oracle/tfa/bin/tfactl purge -older 30d - To remove file(s) older than 30 days.
/u01/app/oracle/tfa/bin/tfactl purge -older 10h - To remove file(s) older than 10 hours.
Add/Remove Hosts within the TFA Configuration:
# ./tfactl host -h
Usage: /u01/app/oracle/tfa/bin/tfactl host [ add <hostname> | remove <hostname> ]
Add or Remove a host in TFA
Examples:
/u01/app/oracle/tfa/bin/tfactl host add myhost.domain.com
/u01/app/oracle/tfa/bin/tfactl host remove myhost.domain.com
Modify TFA Parameters:
# ./tfactl set -h
Usage: /u01/app/oracle/tfa/bin/tfactl set [ autodiagcollect=<ON | OFF> | trimfiles=<ON | OFF> | tracelevel=<1 | 2 | 3 | 4> | reposizeMB=<n> | repositorydir=<dir> ] [-c]
Turn ON/OFF or Modify various TFA features
autodiagcollect allow for automatic diagnostic collection when an event
is observed (default OFF)
trimfiles allow trimming of files during diagcollection (default ON)
tracelevel control the trace level of log files in /u01/app/oracle/tfa/cetrain33/tfa_home/log
(default 1)
repositorydir=<dir> set the diagcollection repository to <dir>
reposizeMB=<n> set the maximum size of diagcollection repository to <n>MB
-c set the value on all nodes (Does not apply to repository
settings)
Examples:
/u01/app/oracle/tfa/bin/tfactl set autodiagcollect=ON
/u01/app/oracle/tfa/bin/tfactl set tracelevel=4
/u01/app/oracle/tfa/bin/tfactl set reposizeMB=20480
Add and Remove TFA users/groups
# ./tfactl access -h
Usage: /u01/app/oracle/tfa/bin/tfactl access [ lsusers |
add -user <user_name> [ -group <group_name> ] [ -local ] |
remove -user <user_name> [ -group <group_name> ] [ -all ] [ -local ] |
block -user <user_name> [ -local ] |
allow -user <user_name> [ -local ] |
enable [ -local ] | disable [ -local ] |
reset [ -local ] | removeall [ -local ]
Add or Remove or List TFA Users and Groups
Options:
lsusers Print all the TFA Users and Groups
enable Enable TFA access for Non-root users
disable Disable TFA access for Non-root users
add Add user or a group
remove Remove user or a group
block Block TFA Access for Non-root User
allow Allow TFA Access for Non-root User who was blocked before
reset Reset to default TFA Users and Groups
removeall Remove All TFA Users and Groups
Examples:
/u01/app/oracle/tfa/bin/tfactl access add -user abc
User 'abc' is able to access TFA accross cluster
/u01/app/oracle/tfa/bin/tfactl access add -group xyz -local
All members of group 'xyz' will be able to access TFA on localhost
/u01/app/oracle/tfa/bin/tfactl access remove -user abc
User 'abc' will not be to access TFA
/u01/app/oracle/tfa/bin/tfactl access block -user xyz
Access to user 'xyz' will be blocked
/u01/app/oracle/tfa/bin/tfactl access remove -all
All TFA Users and groups will be removed
Uploading Files to Service Requests in MOS
Regardless of the diagnostic collection method used (manual or automatic), tfactl diagcollect stores the collected diagnostics in the repository. The repository location can be obtained using the "tfactl print repository" command. The files that reside in the repository directory are those that would be uploaded to MOS for a particular incident. If TFA is installed in the Grid Infrastructure home, the repository will be located in the ORACLE_BASE/tfa directory:
# ls -al /u01/app/oracle/tfa/repository
total 56
drwxrwxrwx. 7 root root 4096 Feb 7 13:54 .
drwxr-xr-x. 5 root root 4096 Feb 7 13:43 ..
drwxr-xr-x. 2 oracle oinstall 4096 Feb 7 12:08 collection_Fri_Feb_7_12_07_11_EST_2014_node_all
drwxr-xr-x. 2 root root 4096 Feb 7 13:47 collection_Fri_Feb_7_13_47_41_EST_2014_node_all
drwxr-xr-x. 2 root root 4096 Feb 7 13:53 collection_Fri_Feb_7_13_53_20_EST_2014_node_all
drwxr-xr-x. 2 root root 4096 Feb 6 11:15 collection_Thu_Feb_6_11_13_19_EST_2014_node_all
drwxr-xr-x. 2 root root 4096 Jan 14 17:01 collection_Tue_Jan_14_16_59_38_EST_2014_node_all
Known Issues
1. Upper-case TFA directory name leads to an error during install. Example:
Installing TFA on ao1
HOST: ao1 TFA_HOME: /u01/app/TFA/ao1/tfa_home
Can't open file /u01/app/TFA/ao1/bin/tfactl: Not a directory <---- here is the error
Removing TFA Setup files...
Fixed: version 2.5.1.6.
Work-around for earlier versions: change the directory name from TFA to tfa (lower-case)
2. Bug 18123522 : TFA SSL CERTIFICATE EXPIRY IN PUBLIC.JKS
Fixed: version 3.1