This short article describes the most common problem patterns behind Java Threads hanging at SocketInputStream.socketRead0.
For more detail and troubleshooting approaches for this type of problem, please visit my original post on this subject.
Problem overview
Any communication protocol such as HTTP / HTTPS, JDBC, RMI etc. ultimately relies on the JDK java.net layer to perform lower-level TCP-IP / Socket operations. The creation of a java.net.Socket is required in order for your application & JVM to connect to an external source (Web Service, Oracle database etc.) and to send and receive data.
SocketInputStream.socketRead0 is the actual native blocking IO operation executed by the JDK in order to read and receive the data from the remote source. This is why Thread hanging problems on this operation are so common in the Java EE world.
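A Thread Dump remains the usual way to spot Threads stuck in this frame. As a complementary, minimal sketch, the same check can be done in-process by scanning stack traces for the socketRead0 frame. The class and method names below (`SocketReadScanner`, `findThreadsInSocketRead`) are hypothetical and not part of the JDK. It also assumes a JDK where blocking socket reads still go through java.net.SocketInputStream.socketRead0; on newer JDKs with the rewritten socket implementation the frame may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the JDK): lists the names of Threads whose
// stack trace contains the java.net.SocketInputStream.socketRead0 frame.
class SocketReadScanner {

    static List<String> findThreadsInSocketRead(Map<Thread, StackTraceElement[]> stacks) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                // Assumption: the blocking read appears under this class and method name.
                if ("java.net.SocketInputStream".equals(frame.getClassName())
                        && "socketRead0".equals(frame.getMethodName())) {
                    names.add(entry.getKey().getName());
                    break; // one match per Thread is enough
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Thread.getAllStackTraces() returns a point-in-time snapshot of all live Threads.
        List<String> blocked = findThreadsInSocketRead(Thread.getAllStackTraces());
        System.out.println("Threads currently in socketRead0: " + blocked);
    }
}
```

A single snapshot only shows where a Thread is at that instant; taking several snapshots a few seconds apart helps separate a truly hung Thread from one that is simply reading a response.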
java.net.SocketInputStream.socketRead0() – why is it hanging?
There are a few common scenarios which can cause your application and Java EE server Threads to hang for some time, or even forever, at java.net.SocketInputStream.socketRead0.
# Problem pattern #1
Slowdown or instability of a remote service provider invoked by your application such as:
- A Web Service provider (via HTTP/HTTPS)
- An RDBMS (Oracle database)
- An RMI server
- Other remote service providers (FTP, pure TCP-IP etc.)
This is by far the most common problem pattern (~90%+). See below an example of a hung Thread from a Thread Dump data extract, caused by instability of a remote Web Service provider:
# Problem pattern #2
Functional problem causing long running transaction(s) from your remote service provider
This is quite similar to problem pattern #1 but the difference is that the remote service provider is healthy but taking more time to process certain requests from your application due to a bad functional behaviour.
A good example is a long running Oracle database SQL query (lack of indexes, execution plan issue etc.) that will show up in the Thread Dump as per below:
# Problem pattern #3
Intermittent or consistent network slowness or latency.
Severe network latency will cause the data transfer between the client and server to slow down, causing the Socket write() and read() operations to take more time to complete and the Thread to hang at SocketInputStream.socketRead0 until the bytes are received from the network.
java.net.SocketInputStream.socketRead0() – what is the solution?
# Problem pattern #1
The solution for this problem pattern is to contact the support team of the affected remote service provider and share your observations from your application, Threads etc. so they can investigate and resolve their global system problem.
# Problem pattern #2
The solution for this problem pattern will depend on the technology involved. A root cause analysis must be performed in order to identify and fix the culprit (missing table indexes, too much data returned from the Web Service etc.).
# Problem pattern #3
The solution for this problem pattern will require the engagement of your network support team so they can conduct a “sniffing” activity of the TCP-IP packets between your application server(s) and your affected remote service provider(s).
You should also attempt to replicate the problem using OS commands such as ping and traceroute to provide some guidance to your network support team.
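Since ping relies on ICMP, which some networks filter, a TCP-level check run from the same JVM host can complement it. The following is a minimal sketch that times a TCP connect; `ConnectTimer` and `connectMillis` are hypothetical names, and the host, port and timeout are whatever your own environment uses.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: measures how long a TCP connect to host:port takes.
class ConnectTimer {

    static long connectMillis(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            long start = System.nanoTime();
            // Throws SocketTimeoutException if the connect exceeds timeoutMs.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return (System.nanoTime() - start) / 1_000_000L;
        }
    }
}
```

Calling this a few times in a row and comparing the values gives a rough view of connect latency only; it says nothing about how long the remote service takes to answer once connected.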
Final recommendation – timeout implementation is critical!
Proper timeouts should be implemented, when possible, in order to prevent a domino effect on your Java EE application server. The timeout value should be low enough to prevent your Threads from hanging for too long, and it should be tested properly (negative testing).
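At the plain java.net.Socket level, a connect timeout and a read timeout can be sketched as follows. `TimedSocketRead` and `readFirstByte` are hypothetical names used for illustration, and the timeout values are placeholders to be tuned for your own service.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: connects with a timeout and reads one byte with a read timeout.
class TimedSocketRead {

    // Returns the first byte received, or -1 on end of stream.
    // Throws java.net.SocketTimeoutException if the connect or the read takes too long.
    static int readFirstByte(String host, int port, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            // Without SO_TIMEOUT the read below could block indefinitely.
            socket.setSoTimeout(readTimeoutMs);
            InputStream in = socket.getInputStream();
            return in.read();
        }
    }
}
```

A server that accepts the connection but never sends data is a simple way to run the negative test: the read should fail with a SocketTimeoutException after roughly the configured read timeout instead of hanging.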
Socket timeouts (connect, read & write operations) for Web Services via HTTP / HTTPS are quite easy to implement and can be achieved by following the documentation of the API you are using (JAX-WS, Weblogic, WAS, JBoss WS, Apache AXIS etc.).
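Each of the stacks listed above has its own configuration properties, which are not reproduced here. As a stand-in, the sketch below uses the JDK's own HttpURLConnection to show the general idea of setting connect and read timeouts on an HTTP call. `HttpTimeoutExample` and `fetchStatus` are hypothetical names, and the URL is supplied by the caller.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: performs an HTTP GET with explicit connect and read timeouts.
class HttpTimeoutExample {

    // Returns the HTTP status code.
    // Throws java.net.SocketTimeoutException if a timeout expires first.
    static int fetchStatus(String url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // limit on establishing the connection
        conn.setReadTimeout(readTimeoutMs);       // limit on waiting for response data
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

For JAX-WS and the vendor stacks, the equivalent settings are usually exposed as request context properties or configuration entries, so the vendor documentation is the place to confirm the exact names.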