WebLogic Server hang on AIX: patch and root cause description

WLS server hang on AIX

Patch:
@        PATCH REPOSITORY INFORMATION
@ ------------------------------------------
@  WLS Version  | Patch ID | Passcode
@ --------------+----------+----------------
@  9.2          | T4DV     | 7C7PYV9B
@  9.2mp1       | HZHQ     | PTUYCCSI
@  9.2mp2       | WJD2     | GU1CW2AB
@  9.2mp3       | GNLT     | 8J9L6Q4Y
@  10.0         | PMAJ     | 9UQ69LLT
@  10.0mp1      | ITVL     | K8RBHQQ2
@  10.3         | 9YT5     | I1DB5QSV


Root cause description:

@ Hi Tushar,
@         As promised, here is a possible solution for the issue seen at
@ Mattel and elsewhere, that can be used in WLS. This builds on what you were
@ trying to do initially, and in the limited testing I did, the problem was
@ resolved.
@       
@         To recap quickly what we discussed yesterday: making the timeout
@ value passed to poll() a configurable option may have a performance impact,
@ and a good value for it may be tough to find. You had described the option
@ of using a local "loopback" socket, which allows the poll() call to break
@ out. Even in that case, we could have a race condition between the
@ close() and poll(), and if the close() thread is starved for CPU for some
@ reason, we could end up with the same hang again.
@
@         I was first thinking in terms of using mutexes or condition
@ variables, but they carry the risk that if the thread owning the
@ mutex dies before releasing it, we could get into a nasty situation.
@ So I did not pursue that line of thought further.
@
@         Instead, a simple and elegant solution has been proposed by a
@ colleague of mine, and I have implemented it as the attached testcase to
@ illustrate how these problems could be avoided. This assumes that the
@ loopback socket is used only for waking up from poll().
@
@         So, in poll(), we add an additional fd, the loopback socket, whose
@ sole purpose is to listen for notifications from other threads before any
@ socket close is attempted. Before another thread tries to close, say, fd1,
@ it will write the numeric value of fd1 to the loopback socket.
@
@         When poll() wakes up (due to incoming data on loopback socket), it
@ will read the value of the fd being closed (fd1 in above example) and REMOVE
@ it from the poll() array. So, even if poll() is called before close() gets a
@ chance to execute, there will not be any race condition.
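
The notification scheme described in the email could be sketched roughly as follows. This is not the attached testcase (which is not reproduced here) and not the WLS native code. The function names announce_close, remove_fd and poll_once are hypothetical, slot 0 of the array is assumed to hold the read side of the loopback channel, and the fd number is assumed to be written as a raw int.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: a closing thread writes the numeric fd value to the
   loopback channel BEFORE it calls close(fd). Returns 0 on success. */
int announce_close(int wakeup_wr, int fd) {
    return write(wakeup_wr, &fd, sizeof fd) == (ssize_t)sizeof fd ? 0 : -1;
}

/* Drop fd from the poll array by moving the last entry into its slot.
   Slot 0 is reserved for the loopback channel. Returns the new count. */
nfds_t remove_fd(struct pollfd *fds, nfds_t n, int fd) {
    for (nfds_t i = 1; i < n; i++) {
        if (fds[i].fd == fd) {
            fds[i] = fds[n - 1];
            return n - 1;
        }
    }
    return n; /* fd was not in the array */
}

/* One poll() pass. If the loopback slot is readable, read the announced fd
   and remove it so the next poll() no longer waits on it.
   Returns whatever poll() returned. */
int poll_once(struct pollfd *fds, nfds_t *n, int timeout_ms) {
    assert(*n >= 1); /* slot 0 must exist */
    int ready = poll(fds, *n, timeout_ms);
    if (ready <= 0)
        return ready;
    if (fds[0].revents & POLLIN) {
        int closing;
        if (read(fds[0].fd, &closing, sizeof closing) == (ssize_t)sizeof closing)
            *n = remove_fd(fds, *n, closing);
    }
    return ready;
}
```

A caller would loop on poll_once() and hand the remaining ready entries to its normal read path. The sketch does not show how the closing thread learns that the removal has taken place, and any kind of readable fd (a pipe, a socketpair, or a TCP loopback socket as in the testcase) can sit in slot 0.
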
@
@         In the testcase, 10 sockets are created and provided to the native
@ "callPoll" method for waiting. The first one, on port 56789, is the loopback
@ socket. Then, in another thread, we close these sockets one by one. As each
@ socket is closed, the callPoll method removes it from the poll() fd array,
@ ensuring that we do not encounter the hang again.
@
@         It is important to first check for input on loopback, so that the
@ penalty associated with this change is restricted to only one "test". The
@ testcase I attached is still walking through the entire array, but I would
@ prefer checking the loopback fd as a special case, to minimize any kind of
@ impact on performance.
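
Checking the loopback fd as a special case might look like the sketch below. It assumes, as a hypothetical layout, that fds[0] is the loopback socket and fds[1..n-1] are the data sockets; collect_ready is an illustrative name, not a function from the testcase.

```c
#include <assert.h>
#include <poll.h>

/* Called right after poll() has returned `ready` (> 0).
   Tests the loopback slot first, then scans the data slots only until all
   remaining ready entries have been found. Writes the indices of ready data
   sockets into out[] and returns how many were written. *wakeup is set to 1
   when the loopback socket has input pending, else 0. */
int collect_ready(const struct pollfd *fds, nfds_t n, int ready,
                  int *wakeup, nfds_t *out) {
    assert(n >= 1);
    int count = 0;

    /* The single extra test that the loopback scheme adds per wakeup. */
    *wakeup = (fds[0].revents & POLLIN) != 0;
    if (*wakeup)
        ready--; /* the loopback entry is already accounted for */

    /* Stop as soon as every remaining ready entry has been seen. */
    for (nfds_t i = 1; i < n && count < ready; i++) {
        if (fds[i].revents != 0)
            out[count++] = i;
    }
    return count;
}
```

When only the loopback socket fired, the loop body never runs, so a close notification costs one comparison rather than a walk over the whole array.
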
@
@         I hope this is useful. Please let me know if you have any questions.
@
@         The doMakeAIX.sh script should be executed with Java 1.4 on the
@ PATH and the current directory included in LIBPATH. Then you can run
@ "java PollChecker" to see the testcase in action.
@
