[Troubleshooting] Incorrect owner/group permissions prevent Grid log rotation and cleanup (Bug 9595783)

Solution:

1. Change the owner/group to root:root

cd $GRID_HOME/log/`hostname`/crsd ;

chown root:root *

cd $GRID_HOME/log/`hostname`/ohasd;

chown root:root *

cd $GRID_HOME/log/`hostname`/agent/crsd/orarootagent_root;

chown root:root *

cd $GRID_HOME/log/`hostname`/agent/ohasd/orarootagent_root;

chown root:root *
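Before changing ownership, it can help to see which files are actually affected. The following is a minimal read-only sketch, not part of the Oracle note. `short_host` and `list_misowned` are hypothetical helper names introduced here, and `-maxdepth` is a GNU/BSD `find` extension rather than POSIX. The host name is stripped of any domain suffix, because the log directory uses the short host name.

```shell
#!/bin/sh
# Sketch: report files in the four root-owned log directories whose owner
# is not the expected user. Makes no changes to the files.

# Print the host name without any domain suffix (node1.example.com -> node1).
short_host() {
    printf '%s\n' "${1%%.*}"
}

# list_misowned GRID_HOME HOST [OWNER]
# Lists regular files directly inside each log directory whose owner
# differs from OWNER (default: root).
list_misowned() {
    grid_home=$1
    host=$(short_host "$2")
    owner=${3:-root}
    for sub in crsd ohasd agent/crsd/orarootagent_root agent/ohasd/orarootagent_root; do
        dir="$grid_home/log/$host/$sub"
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type f ! -user "$owner"
    done
}

# Example:
#   list_misowned "$GRID_HOME" "$(hostname)"
```

An empty result suggests ownership in these directories is already as expected, so the chown commands above would change nothing.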

 

2. Restart the cluster so that log rotation takes effect

1).  Shutdown CRS on the node reporting the problem.

# crsctl stop crs

2).  Once CRS is down, proceed to manually delete the 'ocssd.l10' file, or copy the logfile to another location if you need to keep a backup.

# rm $GRID_HOME/log/<hostname>/cssd/ocssd.l10

3).  Startup Clusterware again

# crsctl start crs

 

If downtime is not possible, use the following command to free up space:

echo 0 > ocssd.l10     -- ocssd.l10 is the name of the file to be cleared

 

Reference documents:

GI ocssd.log rotation fails with error LFI-00142 and logfile grows to huge size (Doc ID 1900986.1)

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.2 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.

SYMPTOMS

GI logfile "ocssd.log"  grows beyond the default file size.

The logfile rotation fails with error LFI-00142: Unable to delete an existing file [ocssd][l10] not owned by Oracle.

This problem can cause file system space issues!


Example: in the listing below, ocssd.log located in $GRID_HOME/log/<hostname>/cssd/ has grown to more than 5 GB in size

-rw-r--r-- 1 grid oinstall 5517323953 Jun  6 07:57 ocssd.log                ---->  Here!
-rw-rw-r-- 1 grid oinstall     483269 Jun  5 15:37 cssdOUT.log
-rw------- 1 grid oinstall   74092544 Jan 31 10:36 core.9352
-rw-r--r-- 1 grid oinstall  158399110 Jan 29 14:35 ocssd.l01
-rw-r--r-- 1 grid oinstall  158423139 Jan 24 17:00 ocssd.l02
-rw-r--r-- 1 grid oinstall  158422898 Jan 18 06:26 ocssd.l03
-rw-r--r-- 1 grid oinstall  158402809 Jan 11 20:54 ocssd.l04
-rw-r--r-- 1 grid oinstall  158413241 Jan  5 13:07 ocssd.l05
-rw-r--r-- 1 grid oinstall  158413772 Dec 30 17:44 ocssd.l06
-rw-r--r-- 1 grid oinstall  158404163 Dec 24 15:57 ocssd.l07
-rw-r--r-- 1 grid oinstall  158391594 Dec 18 16:22 ocssd.l08
-rw-r--r-- 1 grid oinstall  158406347 Dec 12 22:09 ocssd.l09
-rw-r--r-- 1 grid oinstall  158420893 Dec  6 12:09 ocssd.l10

 

In the above example the expected file size should not exceed 150 MB (157286400 bytes). Query CSS as follows to confirm the logfile size limit:

% crsctl get css logfilesize
CRS-4676: Successful get logfilesize 157286400 for Cluster Synchronization Services.
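The check above can be scripted. This is a rough sketch under stated assumptions: `parse_logfilesize` and `exceeds_limit` are hypothetical helper names, and the parser assumes the CRS-4676 message has exactly the wording shown above.

```shell
#!/bin/sh
# Sketch: compare the size of ocssd.log with the CSS logfilesize limit.

# Reads "crsctl get css logfilesize" output on stdin and prints the limit
# in bytes. Assumes the CRS-4676 wording shown in the note.
parse_logfilesize() {
    sed -n 's/.*logfilesize \([0-9][0-9]*\) for.*/\1/p'
}

# exceeds_limit FILE LIMIT_BYTES
# Succeeds (exit 0) when FILE is larger than LIMIT_BYTES.
exceeds_limit() {
    [ -f "$1" ] || return 2
    size=$(wc -c < "$1")
    size=$((size))    # normalise any padding that wc adds on some systems
    [ "$size" -gt "$2" ]
}

# Example (replace <hostname> with the short host name of the node):
#   limit=$(crsctl get css logfilesize | parse_logfilesize)
#   exceeds_limit "$GRID_HOME/log/<hostname>/cssd/ocssd.log" "$limit" \
#       && echo "ocssd.log is larger than the configured limit"
```

A file that is slightly over the limit is not necessarily a problem, since the rotated files in the listing are also a little larger than 157286400 bytes. A file many times the limit, as in the example, points to failed rotation.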

  

CHANGES

 None

CAUSE

It is caused by unpublished Bug 18700935 - CLOUD:ACLDX0085 OCSSD LOG IS NOT ROTATED

At some point in time, the Clusterware alert log reports an attempted logfile rotation failure.

As a result, the last logfile 'ocssd.l10' is never deleted. This may be due to the logfile being open during logfile delete or a permissions issue on the file itself.

The ocssd.bin thread that performs log file rotation 'clsd_logThread' encounters the delete failure and this causes the logfile never to be deleted/rotated, resulting in ocssd.log continually growing in size.


Extract of the error in $GRID_HOME/log/<hostname>/alert<hostname>.log

[cssd(29355)]CRS-1713:CSSD daemon is started in clustered mode
2014-06-05 15:37:44.512:
[cssd(29355)]CRS-0009:log file "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.log" reopened
2014-06-05 15:37:44.512:
[cssd(29355)]CRS-0019:file rotation terminated. log file: "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.log"
[cssd(29355)]CRS-0014:An error occurred while attempting to delete file "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.l10" during log file rotation. Additional diagnostics:LFI-00142: Unable to delete an existing file [ocssd][l10] not owned by Oracle.

 -- An incorrect owner/group causes the log rotation cleanup to fail
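To check whether a node has hit this error, the alert log can be searched for the two message codes shown in the extract. A minimal sketch follows; `rotation_errors` is a hypothetical helper name, and the alert log path follows the layout given above.

```shell
#!/bin/sh
# rotation_errors ALERT_LOG
# Prints alert-log lines that record a failed delete during log rotation
# (CRS-0014 and/or LFI-00142, as in the extract above).
rotation_errors() {
    grep -E 'CRS-0014|LFI-00142' "$1"
}

# Example (replace <hostname> with the short host name of the node):
#   rotation_errors "$GRID_HOME/log/<hostname>/alert<hostname>.log"
```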

 

SOLUTION

The CSSD thread that encountered the LFI-00142 error needs to be restarted to ensure log rotation works again.

Manually deleting the logfile will not resolve the log rotation problem.


1).  Shutdown CRS on the node reporting the problem.

# crsctl stop crs

2).  Once CRS is down,  proceed to manually delete the 'ocssd.l10' file, or copy the logfile to another location if you need to keep a backup.

# rm  $GRID_HOME/log/<hostname>/cssd/ocssd.l10

3).  Startup Clusterware again

# crsctl start crs
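The three steps can be wrapped in a small script with a dry-run default. This is only a sketch: `run`, `restart_css_rotation`, and the `RUN` variable are hypothetical names introduced here. It uses `rm -f` instead of plain `rm` so that a missing file does not leave the stack down.

```shell
#!/bin/sh
# Sketch of the restart sequence. Commands are only printed unless RUN=1
# is set, in which case they are executed (as root, on the affected node).

# run CMD...: echo the command; execute it only when RUN=1.
run() {
    echo "+ $*"
    [ "${RUN:-0}" = "1" ] || return 0
    "$@"
}

# restart_css_rotation GRID_HOME HOST
restart_css_rotation() {
    old_log="$1/log/$2/cssd/ocssd.l10"
    # Stop here if CRS could not be shut down.
    run crsctl stop crs || return 1
    # The result of rm is ignored so that CRS is started again either way.
    run rm -f "$old_log"
    run crsctl start crs
}

# Example (dry run; replace <hostname> with the short host name):
#   restart_css_rotation "$GRID_HOME" "<hostname>"
```

Running it without `RUN=1` first shows exactly which file would be removed.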

 

If you are NOT able to schedule downtime and file size growth in the GRID Home is causing a space issue then copy the logs to another location and do the following

% echo 0 > ocssd.l10

  
Please note this does not resolve the log rotation problem but only allows you to free up some space.
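The copy-then-truncate step can be sketched as below. `backup_and_truncate` is a hypothetical helper name. The sketch uses `: > file`, which leaves the file at zero bytes, whereas `echo 0 > file` leaves the two bytes `0` and newline. Either way the space is freed.

```shell
#!/bin/sh
# backup_and_truncate FILE BACKUP_DIR
# Copies FILE into BACKUP_DIR, then empties FILE in place.
backup_and_truncate() {
    file=$1
    backup_dir=$2
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    # Truncate only if the copy succeeded, so the log content is not lost.
    cp -p "$file" "$backup_dir/" || return 1
    : > "$file"
}

# Example (the backup directory is an arbitrary choice):
#   cd "$GRID_HOME/log/<hostname>/cssd" && backup_and_truncate ocssd.l10 /tmp/cssd_backup
```

The backup directory should be on a different file system from the GRID Home, otherwise no space is gained.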


4). Bug 18700935 has been fixed in 11.2.0.4.5 PSU for Unix/Linux platform and 11.2.0.4.12 Bundle for Windows platform. Please apply the patch if required.

 

Bug 9595783  Owner and permission problems after applying a patch to an 11.2.0.1 Grid Infrastructure Home

 This note gives a brief overview of bug 9595783.
 The content was last updated on: 02-DEC-2013

Affects:

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions >= 11.2.0.1 but BELOW 11.2.0.2

Versions confirmed as being affected

  • 11.2.0.1

Platforms affected

Generic (all / most platforms affected)

Fixed:

The fix for 9595783 is first included in

  • 11.2.0.2 (Server Patch Set)


Interim patches may be available for earlier versions.

Symptoms:

Related To:

  • Install/Patching Is Not Performed Correctly
  • Wrong OS file/directory Permissions
  • Cluster Ready Services / Parallel Server Management

Description

While applying a patch to a Grid Infrastructure environment,
owners and permissions of several files and directories are
temporarily changed by a script called rootcrs.pl. Those files
and directories are changed back to their original owners/permissions
when patch application completes, but some files and
directories remain changed.

 

This has caused problems such as:

A: incorrect permissions set for oradism after patching

B: incorrect permissions for log files or log directories
   under $GI_HOME/log/, causing issues such as log files not being
   written at all or not being backed up to .l01, .l02, etc.

 

 

 

Workaround

 Manually change the file/directory owners and permissions to
 their original values.

 eg:

  For issue A above:

    chown root:oinstall $GRID_HOME/bin/oradism

    chmod 4750 $GRID_HOME/bin/oradism

 

  For issue B above:

   To ensure that the log files of processes like crsd, orarootagent
   are owned correctly you can use the chown command like below:

    cd $GRID_HOME/log/`hostname`/crsd ;

    chown root:root *

    cd $GRID_HOME/log/`hostname`/ohasd ;

    chown root:root *

    cd $GRID_HOME/log/`hostname`/agent/crsd/orarootagent_root ;

    chown root:root *

    cd $GRID_HOME/log/`hostname`/agent/ohasd/orarootagent_root ;

    chown root:root *

 

    (If the "hostname" command returns a string which also includes
     the domain suffix then just use the local hostname in the above
     commands without any domain portion)
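For issue A above, the permission bits of oradism can be checked before and after the chmod. A minimal sketch follows; `has_mode` is a hypothetical helper name. It checks only the mode, so the owner and group still need to be confirmed separately, for example with `ls -l`.

```shell
#!/bin/sh
# has_mode FILE MODE
# Succeeds when FILE exists and its permission bits are exactly MODE (octal).
has_mode() {
    [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]
}

# Example:
#   has_mode "$GRID_HOME/bin/oradism" 4750 || echo "oradism mode is not 4750"
```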

 

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

References

Bug:9595783 (This link will only work for PUBLISHED bugs)
Note:245840.1 Information on the sections in this article

 http://nervinformatica.com.br/Downloads/raccheck_nerv01_RAC01_092513_055913.html


Sharon

2015.05.12

----------------------------------------------------------------------------------------------

Please credit the source when reposting!

http://blog.csdn.net/sharqueen_wu/article/details/45668011
