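Roger's Database Recovery Column

Analyzing a node crash and hang on an 11.2.0.3 RAC (VCS)

Yesterday one node of a customer's two-node RAC crashed, and the other node eventually hung as well, leaving shutdown abort as the only option.
Even after the instance had been shut down with shutdown abort, some processes, LGWR and ARCH among them, could not be killed with kill -9.

Let's first look at the alert log of the node that crashed in the afternoon: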

 
    Tue Apr 22 17:16:04 2014
    Deleted Oracle managed file /oa1/arch/AUTHORCL/archivelog/2014_03_14/o1_mf_2_4878_9l51y1cc_.arc
    Deleted Oracle managed file /oa1/arch/AUTHORCL/archivelog/2014_03_14/o1_mf_2_4879_9l529hc6_.arc
    Archived Log entry 10847 added for thread 1 sequence 5314 ID 0xffffffffae21a60f dest 1:
    Tue Apr 22 17:25:05 2014
    IPC Send timeout detected. Sender: ospid 27573 [oracle@xhdb-server3 (LMON)]
    Receiver: inst 2 binc 95439 ospid 13752
    Communications reconfiguration: instance_number 2
    Tue Apr 22 17:26:49 2014
    LMON (ospid: 27573) has not called a wait for 89 secs.
    Errors in file /u01/app/oa_base/diag/rdbms/authorcl/authorcl1/trace/authorcl1_lmhb_27613.trc  (incident=14129):
    ORA-29770: global enqueue process LMON (OSID 27573) is hung for more than 70 seconds
    Incident details in: /u01/app/oa_base/diag/rdbms/authorcl/authorcl1/incident/incdir_14129/authorcl1_lmhb_27613_i14129.trc
    Tue Apr 22 17:26:58 2014
    Sweep [inc][14129]: completed
    Sweep [inc2][14129]: completed
    ERROR: Some process(s) is not making progress.
    LMHB (ospid: 27613) is terminating the instance.
    Please check LMHB trace file for more details.
    Please also check the CPU load, I/O load and other system properties for anomalous behavior
    ERROR: Some process(s) is not making progress.
    Tue Apr 22 17:26:58 2014
    System state dump requested by (instance=1, osid=27613 (LMHB)), summary=[abnormal instance termination].
    LMHB (ospid: 27613): terminating the instance due to error 29770
    System State dumped to trace file /u01/app/oa_base/diag/rdbms/authorcl/authorcl1/trace/authorcl1_diag_27561.trc
    Tue Apr 22 17:27:00 2014
    ORA-1092 : opitsk aborting process
    Tue Apr 22 17:27:01 2014
    License high water mark = 144
    Tue Apr 22 17:27:08 2014
    Termination issued to instance processes. Waiting for the processes to exit
    Instance termination failed to kill one or more processes
    Instance terminated by LMHB, pid = 27613
    Tue Apr 22 17:27:15 2014
    USER (ospid: 1378): terminating the instance
    Termination issued to instance processes. Waiting for the processes to exit
    Tue Apr 22 17:27:25 2014
    Instance termination failed to kill one or more processes
    Instance terminated by USER, pid = 1378
    Tue Apr 22 21:51:56 2014
    Adjusting the default value of parameter parallel_max_servers
    from 640 to 135 due to the value of parameter processes (150)
    Starting ORACLE instance (normal)

As we can see, the LMON "IPC Send timeout" error was first raised at Apr 22 17:25:05 2014.

Receiver: inst 2 binc 95439 ospid 13752. The receiver here is process 13752 on node 2, which is node 2's LMON process.
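When reading a long alert log it helps to group each message under the timestamp header that precedes it and then search for the first occurrence of an event. The following is a minimal sketch of that idea in Python. The function names `group_alert_entries` and `first_match` are made up for this illustration, and the sample lines are copied from the alert log above.

```python
import re
from datetime import datetime

# Alert-log timestamp header, e.g. "Tue Apr 22 17:25:05 2014"
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}$")

def group_alert_entries(lines):
    """Group alert-log lines under the timestamp header that precedes them."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if TS_RE.match(line):
            ts = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            entries.append((ts, []))
        elif entries:
            entries[-1][1].append(line)
    return entries

def first_match(entries, needle):
    """Return the timestamp of the first entry containing `needle`, or None."""
    for ts, msgs in entries:
        if any(needle in m for m in msgs):
            return ts
    return None

sample = """\
Tue Apr 22 17:16:04 2014
Archived Log entry 10847 added for thread 1 sequence 5314 ID 0xffffffffae21a60f dest 1:
Tue Apr 22 17:25:05 2014
IPC Send timeout detected. Sender: ospid 27573 [oracle@xhdb-server3 (LMON)]
Receiver: inst 2 binc 95439 ospid 13752
Tue Apr 22 17:26:49 2014
LMON (ospid: 27573) has not called a wait for 89 secs.
""".splitlines()

entries = group_alert_entries(sample)
print(first_match(entries, "IPC Send timeout"))  # 2014-04-22 17:25:05
```

The timestamp parsing assumes English day and month names, which is how they appear in this alert log.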

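LMON's main job is to monitor GES information for RAC, but its role is not limited to that. It also checks the health of every node in the cluster and, when a node fails, drives the reconfiguration and the recovery of the GRD (Global Resource Directory). As for RAC split-brain handling: if I/O fencing is done by Oracle itself, that means it is done by Clusterware. When LMON detects a split-brain at the instance level, it notifies Clusterware to resolve it, but it does not wait indefinitely for Clusterware's result. Once the wait exceeds a certain time, LMON triggers IMR (instance membership recovery) on its own, which is what we usually call instance membership reconfiguration.

LMON relies on two heartbeat mechanisms to judge the health of the cluster nodes:
1) Network heartbeat (mainly detected via ping)
2) Controlfile disk heartbeat, i.e. the CKPT process on each node updating the controlfile every 3 seconds.

So LMON needs to access the controlfile; otherwise it could not evaluate the second heartbeat.
Judging from the errors above, the instance was terminated by the LMHB process, so a few words on how LMHB works are in order.
LMHB was introduced in Oracle 11gR2. Its job is to monitor core processes such as LMD, LMON, LCK and LMS, and to guard against these critical background processes spinning or being blocked. It periodically writes what it observes to its trace file, which makes diagnosis easier and is one of the nice touches in 11gR2. When LMHB finds that one of the core processes is in trouble because it is being blocked, it tries to kill the blocker.
If the problem is still not resolved within a certain time, it triggers its protection and forcibly terminates the instance, in order to preserve the integrity and consistency of the data across the RAC nodes.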
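The threshold check that LMHB reports can be illustrated with a few lines of Python. This is only a toy model, not Oracle's implementation. The LMHB trace quoted later in this post prints "has not moved for 105 sec (1398158808.1398158703)" and "threshold 70 sec". Reading the two numbers in parentheses as epoch seconds is an assumption on my part, although their difference does match the reported 105 seconds.

```python
def seconds_without_heartbeat(now_epoch, last_epoch):
    """Elapsed seconds between the last observed heartbeat and now."""
    return now_epoch - last_epoch

def is_considered_hung(now_epoch, last_epoch, threshold_sec=70):
    """Toy check: treat the process as hung once the gap exceeds the threshold."""
    return seconds_without_heartbeat(now_epoch, last_epoch) > threshold_sec

# Values copied from the LMHB trace: "(1398158808.1398158703)"
now_epoch, last_epoch = 1398158808, 1398158703

print(seconds_without_heartbeat(now_epoch, last_epoch))  # 105
print(is_considered_hung(now_epoch, last_epoch))         # True
```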

Frustratingly, the diag dump does not seem to have been generated: /u01/app/oa_base/diag/rdbms/authorcl/authorcl1/trace/authorcl1_diag_27561.trc

Let's first look at the information about the LMON process on Node1:

 
    *** ACTION NAME:() 2014-04-22 17:26:56.052
    *** 2014-04-22 17:26:49.401
    ==============================
    LMON (ospid: 27573) has not moved for 105 sec (1398158808.1398158703)
    kjfmGCR_HBCheckAll: LMON (ospid: 27573) has status 6
    ==================================================
    === LMON (ospid: 27573) Heartbeat Report
    ==================================================
    LMON (ospid: 27573) has no heartbeats for 105 sec. (threshold 70 sec)
    : Not in wait; last wait ended 89 secs ago.              ------------- 89 seconds since the last wait ended
    : last wait_id 165313538 at 'enq: CF - contention'.
    ==============================
    Dumping PROCESS LMON (ospid: 27573) States
    ==============================
    ===[ System Load State ]===
    CPU Total 16 Core 16 Socket 16
    Load normal: Cur 988 Highmark 20480 (3.85 80.00)
    ===[ Latch State ]===
    Not in Latch Get
    ===[ Session State Object ]===
    ----------------------------------------
    SO: 0x52daba340, type: 4, owner: 0x52f5d8330, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
    proc=0x52f5d8330, name=session, file=ksu.h LINE:12624 ID:, pg=0
    (session) sid: 1057 ser: 1 trans: 0x0, creator: 0x52f5d8330
    flags: (0x51) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
    flags2: (0x409) -/-/INC
    DID: , short-term DID:
    txn branch: 0x0
    oct: 0, prv: 0, sql: 0x0, psql: 0x0, user: 0/SYS
    ksuxds FALSE at location: 0
    service name: SYS$BACKGROUND
    Current Wait Stack:
    Not in wait; last wait ended 1 min 29 sec ago
    There are 1 sessions blocked by this session.
    Dumping one waiter:
    inst: 1, sid: 297, ser: 6347
    wait event: 'name-service call wait'
    p1: 'waittime'=0x32
    p2: ''=0x0
    p3: ''=0x0
    row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
    min_blocked_time: 0 secs, waiter_cache_ver: 30272
    Wait State:
    fixed_waits=0 flags=0x20 boundary=0x0/-1
    Session Wait History:
    elapsed time of 1 min 29 sec since last wait  --- LMON's last wait was enq: CF - contention; 1 min 29 sec (89 seconds) have passed since
    0: waited for 'enq: CF - contention'
    name|mode=0x43460006, 0=0x0, operation=0x3
    wait_id=165313538 seq_num=35946 snap_id=1
    wait times: snap=1.027254 sec, exc=1.027254 sec, total=1.027254 sec
    wait times: max=1.000000 sec
    wait counts: calls=1 os=1
    occurred after 0.000109 sec of elapsed time
    ......

The resource usage of this process was as follows:

 
    *** 2014-04-22 17:26:57.229
    loadavg : 3.94 3.80 3.99
    swap info: free_mem = 36949.53M rsv = 24548.22M
    alloc = 23576.62M avail = 45643.61M swap_free = 46615.21M
    F S      UID   PID  PPID   C PRI NI     ADDR     SZ    WCHAN    STIME TTY         TIME CMD
    0 O       oa 27573     1   6  79 20        ? 799187            Jan 23 ?        1589:29 ora_lmon_authorcl1

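We can see that the system load was not high at that point in time and memory was plentiful.

This raises a question: why did LMON hang on this node? According to the logs, it could not obtain the CF enqueue (wait event enq: CF - contention).

We know that CKPT periodically updates the controlfile and needs to take this enqueue to do so. So my bold guess here was that CKPT was holding the CF enqueue without releasing it, which kept LMON from getting it. Searching MOS along these lines, I found a bug, but unfortunately it is documented as already fixed in 11.2.0.3.

Bug 10276173 LMON hang possible while trying to get access to CKPT progress record

The bug description says that during a reconfiguration LMON tries to read the CKPT progress record, waits on enq: CF - contention, and can hang.

Going by the document, this clearly does not match our situation.
Next, let's bring in the Node2 logs for a combined analysis: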

 
    Tue Apr 22 17:25:06 2014
    IPC Send timeout detected. Receiver ospid 13752 [oracle@xhdb-server4 (LMON)]
    Tue Apr 22 17:26:59 2014
    Dumping diagnostic data in directory=[cdmp_20140422172658], requested by (instance=1, osid=27613 (LMHB)), summary=[abnormal instance termination].
    Tue Apr 22 17:29:22 2014
    WARNING: aiowait timed out 1 times
    Tue Apr 22 17:29:53 2014
    Errors in file /u01/app/oa_base/diag/rdbms/authorcl/authorcl2/trace/authorcl2_dia0_13750.trc  (incident=3681):
    ORA-32701: Possible hangs up to hang ID=3 detected
    Incident details in: /u01/app/oa_base/diag/rdbms/authorcl/authorcl2/incident/incdir_3681/authorcl2_dia0_13750_i3681.trc
    Tue Apr 22 17:30:24 2014
    DIA0 terminating blocker (ospid: 16818 sid: 578 ser#: 39069) of hang with ID = 3
    requested by master DIA0 process on instance 2
    Hang Resolution Reason: Automatic hang resolution was performed to free a
    significant number of affected sessions.
    by terminating session sid: 578 ospid: 16818
    DIA0 successfully terminated session sid:578 ospid:16818 with status 31.
    DIA0 successfully resolved a LOCAL, HIGH confidence hang with ID=3.
    Tue Apr 22 17:30:33 2014
    LMS2 (ospid: 13764) has detected no messaging activity from instance 1
    LMS2 (ospid: 13764) issues an IMR to resolve the situation
    Please check LMS2 trace file for more detail.
    Tue Apr 22 17:31:48 2014
    LMD0 (ospid: 13754) has detected no messaging activity from instance 1
    LMD0 (ospid: 13754) issues an IMR to resolve the situation
    Please check LMD0 trace file for more detail.
    Tue Apr 22 17:32:03 2014
    Errors in file /u01/app/oa_base/diag/rdbms/authorcl/authorcl2/trace/authorcl2_dia0_13750.trc  (incident=3682):
    ORA-32701: Possible hangs up to hang ID=3 detected
    Incident details in: /u01/app/oa_base/diag/rdbms/authorcl/authorcl2/incident/incdir_3682/authorcl2_dia0_13750_i3682.trc
    Tue Apr 22 17:32:16 2014
    IPC Send timeout detected. Sender: ospid 23666 [oracle@xhdb-server4 (TNS V1-V3)]
    Receiver: inst 1 binc 380222 ospid 27575
    IPC Send timeout to 1.0 inc 20 for msg type 12 from opid 154
    IPC Send timeout: Terminating pid 154 osid 23666
    Tue Apr 22 17:32:20 2014
    IPC Send timeout detected. Sender: ospid 13746 [oracle@xhdb-server4 (PING)]
    Receiver: inst 1 binc 380164 ospid 27565
    Tue Apr 22 17:32:34 2014
    DIA0 terminating blocker (ospid: 16818 sid: 578 ser#: 39069) of hang with ID = 3
    requested by master DIA0 process on instance 2
    Hang Resolution Reason: Automatic hang resolution was performed to free a
    significant number of affected sessions.
    by terminating the process
    DIA0 successfully terminated process ospid:16818.
    DIA0 successfully resolved a LOCAL, HIGH confidence hang with ID=3.
    Tue Apr 22 17:32:35 2014
    LMS1 (ospid: 13760) has detected no messaging activity from instance 1
    LMS1 (ospid: 13760) issues an IMR to resolve the situation
    Please check LMS1 trace file for more detail.
    Tue Apr 22 17:32:44 2014
    IPC Send timeout detected. Sender: ospid 13754 [oracle@xhdb-server4 (LMD0)]
    Receiver: inst 1 binc 380222 ospid 27575
    IPC Send timeout to 1.0 inc 20 for msg type 65521 from opid 12
    Tue Apr 22 17:33:11 2014
    LMS0 (ospid: 13756) has detected no messaging activity from instance 1
    LMS0 (ospid: 13756) issues an IMR to resolve the situation
    Please check LMS0 trace file for more detail.
    Tue Apr 22 17:34:29 2014
    IPC Send timeout detected. Sender: ospid 13764 [oracle@xhdb-server4 (LMS2)]
    Receiver: inst 1 binc 380309 ospid 27585
    IPC Send timeout to 1.1 inc 20 for msg type 65522 from opid 15
    Tue Apr 22 17:36:31 2014

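We can see that at Apr 22 17:26:59 2014 Node2 recorded that LMHB on Node1 was terminating the instance. Hang messages follow after that,
but Oracle resolved the hung session automatically. Now let's look at the trace of the LMON process on Node2: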

 
    *** 2014-04-22 17:26:59.377
    Process diagnostic dump for oracle@xhdb-server4 (LMON), OS id=13752,
    pid: 11, proc_ser: 1, sid: 353, sess_ser: 1
    -------------------------------------------------------------------------------
    current sql:
    Current Wait Stack:
    0: waiting for 'control file sequential read'
    file#=0x0, block#=0x23, blocks=0x1
    wait_id=272969233 seq_num=24337 snap_id=1
    wait times: snap=7 min 42 sec, exc=7 min 42 sec, total=7 min 42 sec   --- already waiting for 7 min 42 sec
    wait times: max=infinite, heur=7 min 42 sec
    wait counts: calls=0 os=0
    in_wait=1 iflags=0x5a0
    There are 1 sessions blocked by this session.
    Dumping one waiter:
    inst: 2, sid: 1092, ser: 49369
    wait event: 'name-service call wait'
    p1: 'waittime'=0x32
    p2: ''=0x0
    p3: ''=0x0
    row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
    min_blocked_time: 0 secs, waiter_cache_ver: 6248
    Wait State:
    fixed_waits=0 flags=0x22 boundary=0x0/-1
    Session Wait History:

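The LMON trace shows that the process was waiting on control file sequential read and had already been waiting for 7 minutes 42 seconds.

Counting back 7 minutes 42 seconds from the time of the trace (17:26:59), the wait began at about 17:19:17.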
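The back-calculation is plain timestamp arithmetic and can be checked with a short Python sketch. The helper name `wait_start` is made up for this illustration. The dump time and the wait duration are taken from the LMON trace above.

```python
from datetime import datetime, timedelta

def wait_start(dump_time, minutes, seconds):
    """Subtract the reported wait duration from the time the trace was dumped."""
    return dump_time - timedelta(minutes=minutes, seconds=seconds)

# LMON trace on Node2: dumped at 17:26:59.377, waiting for 7 min 42 sec
lmon_dump = datetime(2014, 4, 22, 17, 26, 59, 377000)
lmon_start = wait_start(lmon_dump, 7, 42)

print(lmon_start)  # 2014-04-22 17:19:17.377000
```

The trace reports the wait time in whole seconds, so the result is only accurate to about a second.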

Since this is a controlfile wait, we need to see what the CKPT process on Node2 was doing. Here is the CKPT process information:

 
    Redo thread mounted by this instance: 2
    Oracle process number: 26
    Unix process pid: 13788, image: oracle@xhdb-server4 (CKPT)
    *** 2014-04-22 17:26:59.882
    *** SESSION ID:(833.1) 2014-04-22 17:26:59.882
    *** 2014-04-22 17:26:59.882
    Process diagnostic dump for oracle@xhdb-server4 (CKPT), OS id=13788,
    pid: 26, proc_ser: 1, sid: 833, sess_ser: 1
    -------------------------------------------------------------------------------
    current sql:
    Current Wait Stack:
    0: waiting for 'control file sequential read'
    file#=0x0, block#=0x1, blocks=0x1
    wait_id=14858985 seq_num=48092 snap_id=1
    wait times: snap=7 min 40 sec, exc=7 min 40 sec, total=7 min 40 sec    ---- waiting for 7 min 40 sec
    wait times: max=infinite, heur=7 min 40 sec
    wait counts: calls=0 os=0
    in_wait=1 iflags=0x5a0
    There are 2 sessions blocked by this session.
    Dumping one waiter:
    inst: 2, sid: 291, ser: 59157
    wait event: 'DFS lock handle'
    p1: 'type|mode'=0x43490005
    p2: 'id1'=0xa
    p3: 'id2'=0x2
    row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
    min_blocked_time: 352 secs, waiter_cache_ver: 6248
    Wait State:
    fixed_waits=0 flags=0x22 boundary=0x0/-1

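We can see that CKPT on Node2 was waiting on control file sequential read, for 7 minutes 40 seconds. Note also that CKPT
was blocking 2 sessions, that is, it had 2 waiters, one of which was sid: 291, ser: 59157.
That waiter's wait event was DFS lock handle, of all things, which is a rather dangerous event. At this point we cannot yet tell what this waiter is,
nor why CKPT had been waiting for so long.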
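The p1 values in these traces (type|mode=0x43490005 for DFS lock handle, name|mode=0x43460006 for enq: CF - contention) pack the two-letter lock type into the top two bytes as ASCII and the requested mode into the low 16 bits. A minimal Python sketch of the decoding follows. The function name `decode_lock_p1` is made up for this illustration.

```python
def decode_lock_p1(p1):
    """Split a 'type|mode' / 'name|mode' p1 value into (lock type, mode)."""
    lock_type = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    mode = p1 & 0xFFFF
    return lock_type, mode

print(decode_lock_p1(0x43490005))  # ('CI', 5)  DFS lock handle waiter (M000)
print(decode_lock_p1(0x43460006))  # ('CF', 6)  LMON's last wait on Node1
```

So the M000 waiter was requesting a CI lock in mode 5, and LMON's last wait on Node1 was a request for the CF enqueue in mode 6.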

As you know, 11g introduced automatic hang resolution, so let's look at the DIA0 diagnostic trace on Node1:

 
    Unix process pid: 27571, image: oracle@xhdb-server3 (DIA0)
    *** 2014-04-22 17:22:01.536
    *** SESSION ID:(961.1) 2014-04-22 17:22:01.536
    *** CLIENT ID:() 2014-04-22 17:22:01.536
    *** SERVICE NAME:(SYS$BACKGROUND) 2014-04-22 17:22:01.536
    *** MODULE NAME:() 2014-04-22 17:22:01.536
    *** ACTION NAME:() 2014-04-22 17:22:01.536
    One or more possible hangs have been detected on your system.
    These could be genuine hangs in which no further progress will
    be made without intervention, or it may be very slow progress
    in the system due to high load.
    Previously detected and output hangs are not displayed again.
    Instead, the 'Victim Information' section will indicate that
    the victim is from an 'Existing Hang' under the 'Previous Hang'
    column.
    'Verified Hangs' below indicate one or more hangs that were found
    and identify the final blocking session and instance on which
    they occurred. Since the current hang resolution state is 'PROCESS',
    any hangs requiring session or process termination will be
    automatically resolved.
    Any hangs with a 'Hang Resolution Action' of 'Unresolvable'
    will be ignored. These types of hangs will either be resolved
    by another layer in the RDBMS or cannot be resolved as they may
    require external intervention.
    Deadlocks (also named cycles) are currently NOT resolved even if
    hang resolution is enabled.  The 'Hang Type' of DLCK in the
    'Verified Hangs' output identifies these hangs.
    Below that are the complete hang chains from the time the hang
    was detected.
    The following information will assist Oracle Support Services
    in further analysis of the root cause of the hang.
    *** 2014-04-22 17:22:01.537
    Verified Hangs in the System
                         Root       Chain Total               Hang
      Hang Hang          Inst Root  #hung #hung  Hang   Hang  Resolution
        ID Type Status   Num  Sess   Sess  Sess  Conf   Span  Action
     ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
         2 HANG VICSELTD    2   833     2     2   HIGH  LOCAL IGNRD:InstKillNotA
    Hang Ignored Reason: Since instance termination is not allowed, automatic
    hang resolution cannot be performed.
      inst# SessId  Ser#     OSPID PrcNm Event
      ----- ------ ----- --------- ----- -----
          2    291 59157     10646  M000 DFS lock handle                   ---- note the sid, ser# and PrcNm here
          2    833     1     13788  CKPT control file sequential read

M000 shows up here. As you probably know, this process is related to AWR snapshots, and it was in fact blocked by CKPT. We can look at that process
as well:

 
    *** 2014-04-22 17:27:00.778
    Process diagnostic dump for oracle@xhdb-server4 (M000), OS id=10646,
    pid: 57, proc_ser: 143, sid: 291, sess_ser: 59157
    -------------------------------------------------------------------------------
    current sql:
    select tablespace_id, rfno, allocated_space, file_size, file_maxsize, changescn_base, changescn_wrap, flag,
    inst_id from sys.ts$, GV$FILESPACE_USAGE where ts# = tablespace_id and online$ != 3 and (changescn_wrap > PITRSCNWRP
    or (changescn_wrap = PITRSCNWRP and changescn_base >= PITRSCNBAS)) and inst_id != :inst and (changescn_wrap > :w
    or (changescn_wrap = :w and changescn_base >= :b))
    Current Wait Stack:
    0: waiting for 'DFS lock handle'
    type|mode=0x43490005, id1=0xa, id2=0x2
    wait_id=6 seq_num=7 snap_id=1
    wait times: snap=6 min 12 sec, exc=6 min 12 sec, total=6 min 12 sec
    wait times: max=infinite, heur=6 min 12 sec
    wait counts: calls=818 os=818
    in_wait=1 iflags=0x15a2
    There is at least one session blocking this session.
    Dumping 2 direct blocker(s):
    inst: 2, sid: 833, ser: 1
    inst: 1, sid: 482, ser: 1
    Dumping final blocker:
    inst: 2, sid: 833, ser: 1           ----- the final blocker is sid 833, i.e. the CKPT process on Node2
    There are 1 sessions blocked by this session.
    Dumping one waiter:
    inst: 1, sid: 581, ser: 36139
    wait event: 'DFS lock handle'
    p1: 'type|mode'=0x43490005
    p2: 'id1'=0xa
    p3: 'id2'=0x5

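From this, the root session is indeed 833, the CKPT process on Node2. At this point some might say the cause
is clear: the CKPT problem on Node2 made LMON on Node2 misbehave, and because that LMON has to communicate with LMON on Node1,
LMON on Node1 ran into the IPC send timeout.

Not quite. From start to finish we never worked out why CKPT waited for so long.

At this point I had to suspect I/O. Going back over the DIA0 trace on Node1, I found something interesting: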

 
    *** 2014-04-22 17:24:08.363
    Verified Hangs in the System
                         Root       Chain Total               Hang
      Hang Hang          Inst Root  #hung #hung  Hang   Hang  Resolution
        ID Type Status   Num  Sess   Sess  Sess  Conf   Span  Action
     ----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
         7 HANG VICSELTD    2   801     2     3   HIGH  LOCAL IGNRD:InstKillNotA
    Hang Ignored Reason: Since instance termination is not allowed, automatic
    hang resolution cannot be performed.
      inst# SessId  Ser#     OSPID PrcNm Event
      ----- ------ ----- --------- ----- -----
          2    996 39917      6598    FG log file sync
          2    801     1     13786  LGWR log file parallel write
    ..... (part of the output omitted)
    -------------------------------------------------------------------------------
    Chain 2:
    -------------------------------------------------------------------------------
    Oracle session identified by:
    {
    instance: 1 (authorcl.authorcl1)
    os id: 22323
    process id: 64, oracle@xhdb-server3
    session id: 14
    session serial #: 29257
    }
    is waiting for 'log file sync' with wait info:
    {
    p1: 'buffer#'=0x2ab
    p2: 'sync scn'=0x1ed4d8be
    time in wait: 4 min 18 sec
    timeout after: never
    wait id: 13287
    blocking: 0 sessions
    wait history:
    * time between current wait and wait #1: 0.000388 sec
    1.       event: 'SQL*Net message from client'
    time waited: 0.000486 sec
    wait id: 13286           p1: 'driver id'=0x54435000
    p2: '#bytes'=0x1
    * time between wait #1 and #2: 0.000027 sec
    2.       event: 'SQL*Net message to client'
    time waited: 0.000003 sec
    wait id: 13285           p1: 'driver id'=0x54435000
    p2: '#bytes'=0x1
    * time between wait #2 and #3: 0.001494 sec
    3.       event: 'SQL*Net message from client'
    time waited: 0.002089 sec
    wait id: 13284           p1: 'driver id'=0x54435000
    p2: '#bytes'=0x1
    }
    and is blocked by
    => Oracle session identified by:
    {
    instance: 1 (authorcl.authorcl1)
    os id: 27634
    process id: 20, oracle@xhdb-server3 (LGWR)
    session id: 386
    session serial #: 1
    }
    which is waiting for 'log file parallel write' with wait info:
    {
    p1: 'files'=0x1
    p2: 'blocks'=0x2
    p3: 'requests'=0x1
    time in wait: 4 min 32 sec            ----- waiting for 4 min 32 sec
    timeout after: never
    wait id: 51742574
    blocking: 17 sessions            ------ blocking 17 sessions
    wait history:
    * time between current wait and wait #1: 0.000176 sec
    1.       event: 'rdbms ipc message'
    time waited: 0.047194 sec
    wait id: 51742573        p1: 'timeout'=0x22
    * time between wait #1 and #2: 0.000399 sec
    2.       event: 'log file parallel write'
    time waited: 0.004006 sec
    wait id: 51742572        p1: 'files'=0x1
    p2: 'blocks'=0x2
    p3: 'requests'=0x1
    * time between wait #2 and #3: 0.000318 sec
    3.       event: 'rdbms ipc message'
    time waited: 2.654606 sec
    wait id: 51742571        p1: 'timeout'=0x12c
    }
    Chain 2 Signature: 'log file parallel write'<='log file sync'

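We can see that on Node1 LGWR was blocking 17 sessions while waiting on log file parallel write, and had been doing so for 4 minutes 32 seconds.

Counting back 4 minutes 32 seconds from 2014-04-22 17:24:08 gives 2014-04-22 17:19:36.

Now check the operating system log on Node2 and you find something interesting: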

 
    Apr 22 17:19:16 xhdb-server4 vxio: [ID 442312 kern.notice] NOTICE: VxVM VVR vxio V-5-0-209 Log overflow protection on rlink oa_zs_bj triggered DCM protection
    Apr 22 17:19:16 xhdb-server4 vxio: [ID 636438 kern.warning] WARNING: VxVM VVR vxio V-5-0-1905 Replication stopped for RVG rvg_oa due to SRL overflow, DCM protection is triggered. To start replication, perform DCM resynchronization using "vradmin resync" command.
    Apr 22 19:41:19 xhdb-server4 su: [ID 810491 auth.crit] 'su root' failed for oa on /dev/pts/5
    Apr 22 21:31:38 xhdb-server4 su: [ID 810491 auth.crit] 'su grid' failed for root on /dev/pts/14
    Apr 22 21:36:03 xhdb-server4 AgentFramework[5814]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(4) Agent is calling clean for resource(cssd_oaora) because the resource became OFFLINE unexpectedly, on its own.
    Apr 22 21:36:03 xhdb-server4 Had[5704]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (xhdb-server4) Agent is calling clean for resource(cssd_oaora) because the resource became OFFLINE unexpectedly, on its own.
    Apr 22 21:36:06 xhdb-server4 AgentFramework[5814]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(4) Resource(cssd_oaora) - clean completed successfully.
    Apr 22 21:40:51 xhdb-server4 reboot: [ID 662345 auth.crit] rebooted by root
    Apr 22 21:41:14 xhdb-server4 amf: [ID 451996 kern.notice] AMF NOTICE V-292-1-67 Signal received while waiting for event on reaper 'CFSMount'. Returning.
    Apr 22 21:41:14 xhdb-server4 amf: [ID 451996 kern.notice] AMF NOTICE V-292-1-67 Signal received while waiting for event on reaper 'CFSfsckd'. Returning.
    Apr 23 00:31:59 xhdb-server4 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_147440-25 64-bit

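As we can see, Veritas VVR ran into trouble at 17:19:16. That explains why, in the earlier analysis, Node 2 started waiting on control file sequential read at 17:19:18. The VCS log itself, however, reveals nothing useful for that time window: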

 
    
          2014/04/22 14:39:12 VCS INFO V-16-1-53504 VCS Engine Alive message!! 
         
          2014/04/22 18:39:13 VCS INFO V-16-1-53504 VCS Engine Alive message!! 
         
          2014/04/22 21:36:03 VCS ERROR V-16-2-13067 (xhdb-server4) Agent is calling clean for resource(cssd_oaora) because the resource became OFFLINE unexpectedly, on its own.
         
          2014/04/22 21:36:06 VCS INFO V-16-2-13068 (xhdb-server4) Resource(cssd_oaora) - clean completed successfully. 
         
          2014/04/22 21:36:09 VCS INFO V-16-1-10307 Resource cssd_oaora (Owner: Unspecified, Group: sg_oaora) is offline on xhdb-server4 (Not initiated by VCS) 
         
          2014/04/22 21:36:09 VCS NOTICE V-16-1-10446 Group sg_oaora is offline on system xhdb-server4
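The timeline correlation used above (a VVR event at 17:19:16 followed by the Node 2 wait at 17:19:18) can be checked mechanically by merging both logs into one timeline. The following is only a minimal sketch: the function names and the 5-minute window are my own assumptions, the year 2014 is borrowed from the VCS log because the syslog lines carry no year, and the sample lines are copied from the excerpts in this post.

```python
from datetime import datetime, timedelta

ASSUMED_YEAR = 2014  # assumption: syslog lines have no year, so reuse the VCS log's year


def parse_syslog(line, year=ASSUMED_YEAR):
    """Parse 'Apr 22 17:19:16 host tag: message' into (datetime, source, message)."""
    stamp, rest = line[:15], line[16:]
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, "syslog", rest


def parse_vcs(line):
    """Parse '2014/04/22 21:36:03 VCS ERROR ...' into (datetime, source, message)."""
    when = datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
    return when, "vcs", line[20:]


def events_near(events, reference, window):
    """Return the events within `window` of `reference`, oldest first."""
    return sorted(e for e in events if abs(e[0] - reference) <= window)


syslog_lines = [
    "Apr 22 17:19:16 xhdb-server4 vxio: [ID 442312 kern.notice] NOTICE: VxVM VVR vxio "
    "V-5-0-209 Log overflow protection on rlink oa_zs_bj triggered DCM protection",
    "Apr 22 21:40:51 xhdb-server4 reboot: [ID 662345 auth.crit] rebooted by root",
]
vcs_lines = [
    "2014/04/22 14:39:12 VCS INFO V-16-1-53504 VCS Engine Alive message!!",
    "2014/04/22 18:39:13 VCS INFO V-16-1-53504 VCS Engine Alive message!!",
]

events = [parse_syslog(l) for l in syslog_lines] + [parse_vcs(l) for l in vcs_lines]

# Start of the 'control file sequential read' wait on Node 2, per the earlier analysis.
wait_start = datetime(2014, 4, 22, 17, 19, 18)
nearby = events_near(events, wait_start, timedelta(minutes=5))

for when, source, message in nearby:
    print(when, source, message)
```

With these sample lines, only the vxio "Log overflow protection" entry falls inside the window, and nothing from the VCS engine log does, which is the same picture as described above. Feeding in the complete /var/adm/messages and engine log would be the obvious next step.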

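So in the end my feeling is that the root cause lies on the VCS side, even though the Veritas engineers kept explaining that the Log overflow protection seen here has little to do with the problem.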

Discussion of this issue is welcome.

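Addendum: Oracle does have quite a few bugs in which ORA-29770 leads to an instance crash, but so far I have not found one that matches this case. Below are the search results from Oracle MOS: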

Bug 11890804: LMHB crashes instance with ORA-29770 after long "control file sequential read" waits
Bug 8888434: LMHB crashes the instance with LMON waiting on controlfile read
Bug 11890804: LMHB TERMINATE INSTANCE WHEN LMON WAIT CHANGE FROM CF READ AFTER 60 SEC
Bug 13467673: CSS MISSCOUNT AND ALL ASM DOWN WITH ORA-29770 BY LMHB
Bug 13390052: KJFMGCR_HBCHECKALL MESSAGES ARE CONTINUOUSLY LOGGED IN LMHB TRACE FILE.
Bug 13322797: LMHB TERMINATES THE INSTANCE DUE TO ERROR 29770
Bug 13061883: LMHB IS TERMINATING THE INSTANCE DURING SHUTDOWN IMMEDIATE
Bug 12886605: ESSC: LMHB TERMINATE INSTANCE DUE TO 29770 – LMON WAIT ENQ: AM – DISK OFFLINE
Bug 12757321: LMHB TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10296263: LMHB (OSPID: 15872): TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10431752: SINGLE NODE RAC: LMHB TERMINATES INSTANCE DUE TO 29770
Bug 11656856: LMHB (OSPID: 27701): TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10411143: INSTANCE CRASHES WITH IPC SEND TIMEOUT AND LMHB TERMINATES WITH ORA-29770
Bug 11704041: DATABASE INSTANCE CRASH BY LMHB PROCESS
Bug 10412545: ORA-29770 LMHB TERMINATE INSTANCE DUE TO VARIOUS LONG CSS WAIT
Bug 10147827: INSTANCE TERMINATED BY LMHB WITH ERROR ORA-29770
Bug 10016974: ORA-29770 LMD IS HUNG FOR MORE THAN 70 SECONDS AND LMHB TERMINATE INSTANCE
Bug 9376100: LMHB TERMINATING INSTANCE DUE ERROR 29770
