What Are HTTP Persistent Connections?
HTTP persistent connections, also called HTTP keep-alive or HTTP connection reuse, use the same TCP connection to send and receive multiple HTTP requests and responses, instead of opening a new connection for every request/response pair. Using persistent connections is very important for improving HTTP performance.
Persistent connections have several advantages, including:
Network friendliness. Fewer TCP connections are set up and torn down, so there is less network traffic.
What does the current JDK do for Keep-Alive?
The JDK supports both HTTP/1.1 and HTTP/1.0 persistent connections.
When the application finishes reading the response body or when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK's HTTP protocol handler will try to clean up the connection and if successful, put the connection into a connection cache for reuse by future HTTP requests.
Support for HTTP keep-alive is transparent to the application. However, it can be controlled by the system properties http.keepAlive and http.maxConnections, as well as by the request and response headers specified by HTTP/1.1.
The system properties that control the behavior of Keep-Alive are:
http.keepAlive=<boolean>
default: true
Indicates if keep alive (persistent) connections should be supported.
http.maxConnections=<int>
default: 5
Indicates the maximum number of connections per destination to be kept alive at any given time.
The HTTP header that influences connection persistence is:
Connection: close
If the "Connection" header is specified with the value "close" in either the request or the response header fields, it indicates that the connection should not be considered 'persistent' after the current request/response is complete.
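The properties and header above could be applied from code roughly as follows. This is a minimal sketch, not a definitive recipe: the class name KeepAliveConfig, its helper methods, and the URL http://example.com/ are assumptions made for illustration. It is also safest to assume the handler reads the system properties only once, so set them at startup (or pass them with -D on the command line) before any HTTP request is made.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class KeepAliveConfig {

    /** Sets the two keep-alive system properties described above. */
    static void configure(boolean keepAlive, int maxConnections) {
        System.setProperty("http.keepAlive", Boolean.toString(keepAlive));
        System.setProperty("http.maxConnections", Integer.toString(maxConnections));
    }

    /**
     * Creates a request that asks for the connection to be closed after this
     * exchange. Nothing is sent yet; openConnection() only creates the object.
     */
    static HttpURLConnection openNonPersistent(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Connection", "close");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical values: keep-alive on, up to 10 idle connections per destination.
        configure(true, 10);

        // Hypothetical URL; this single request opts out of persistence.
        HttpURLConnection conn = openNonPersistent("http://example.com/");
        System.out.println("Status: " + conn.getResponseCode());
    }
}
```

The same properties can be given on the command line, for example -Dhttp.keepAlive=true -Dhttp.maxConnections=10, which avoids any dependence on initialization order.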
The current implementation doesn't buffer the response body. This means that the application has to finish reading the response body, or call close() to abandon the rest of it, in order for the connection to be reused. Furthermore, the current implementation will not do blocking reads when cleaning up the connection, so if the whole response body is not yet available, the connection will not be reused.
What's new in Tiger (JDK 1.5)?
When the application encounters an HTTP response with a status code of 400 or above (a 4xx or 5xx response), it may ignore the IOException and then issue another HTTP request. In this case, the underlying TCP connection won't be kept alive: the response body is still waiting to be consumed, so the socket connection is not cleaned up and is therefore not available for reuse. What the application needs to do is call HttpURLConnection.getErrorStream() after catching the IOException, read the response body, and then close the stream. However, some existing applications do not do this and, as a result, do not benefit from persistent connections. To address this problem, we have introduced a workaround.
The workaround involves buffering the response body if the response code is >=400, up to a certain amount and within a time limit, thus freeing up the underlying socket connection for reuse. The rationale is that when the server responds with a >=400 error (a client error or server error, for example "404: File Not Found"), it usually sends only a small response body explaining whom to contact and what to do to recover.
Several new Sun implementation-specific properties are introduced to help clean up connections after an error response from the server.
The major one is:
sun.net.http.errorstream.enableBuffering=<boolean>
default: false
With the above system property set to true (the default is false), when the response code is >=400, the HTTP handler will try to buffer the response body, thus freeing up the underlying socket connection for reuse. So even if the application doesn't call getErrorStream(), read the response body, and then call close(), the underlying socket connection may still be kept alive and reused.
The following two system properties provide further control over the error stream buffering behavior:
sun.net.http.errorstream.timeout=<int> in milliseconds
default: 300 milliseconds
sun.net.http.errorstream.bufferSize=<int> in bytes
default: 4096 bytes
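As a rough illustration, the three error-stream properties could be set together at application startup. This is a sketch under stated assumptions: the class name ErrorStreamBufferingConfig and its enable method are hypothetical, the values passed in main are arbitrary, and the properties are specific to the Sun implementation, so other JDKs may ignore them. It is safest to assume they are read only once, before the first HTTP request.

```java
final class ErrorStreamBufferingConfig {

    /**
     * Turns on buffering of >=400 response bodies and sets its limits.
     *
     * @param timeoutMillis   value for sun.net.http.errorstream.timeout
     * @param bufferSizeBytes value for sun.net.http.errorstream.bufferSize
     */
    static void enable(int timeoutMillis, int bufferSizeBytes) {
        System.setProperty("sun.net.http.errorstream.enableBuffering", "true");
        System.setProperty("sun.net.http.errorstream.timeout",
                Integer.toString(timeoutMillis));
        System.setProperty("sun.net.http.errorstream.bufferSize",
                Integer.toString(bufferSizeBytes));
    }

    public static void main(String[] args) {
        // Hypothetical limits: wait up to 500 ms and buffer up to 8 KiB.
        enable(500, 8192);
        System.out.println(
                System.getProperty("sun.net.http.errorstream.enableBuffering"));
    }
}
```

Buffering is a safety net for code that ignores error bodies; reading and closing the error stream explicitly remains the more dependable approach.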
What can you do to help with Keep-Alive?
Do not abandon a connection by ignoring the response body. Doing so may result in idle TCP connections, which then need to be garbage collected once they are no longer referenced.
If getInputStream() successfully returns, read the entire response body.
When calling getInputStream() from HttpURLConnection, if an IOException occurs, catch the exception and call getErrorStream() to get the response body (if there is any).
Reading the response body cleans up the connection even if you are not interested in the response content itself. If the response body is long and you are not interested in the rest of it after seeing the beginning, you can close the InputStream. Be aware, however, that more data could still be on its way, in which case the connection may not be cleaned up for reuse.
Here's a code example that follows the above recommendations:
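The following is a minimal sketch of that pattern rather than a definitive implementation. The class name DrainResponse, the helpers drain and fetch, and the URL http://example.com/ are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

final class DrainResponse {

    /** Reads the stream to the end and returns the number of bytes consumed. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n; // a real application would process buf[0..n) here
        }
        return total;
    }

    /**
     * Fetches the URL and consumes the whole body, whether the request
     * succeeded or failed, so the connection can be cleaned up for reuse.
     * Returns the HTTP status code.
     */
    static int fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try (InputStream in = conn.getInputStream()) {
            drain(in);
        } catch (IOException e) {
            // For >=400 responses the body is exposed through the error stream.
            InputStream err = conn.getErrorStream();
            if (err == null) {
                throw e; // no error body available (e.g. the connection failed)
            }
            try (InputStream es = err) {
                drain(es);
            }
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default URL, used when no argument is given.
        String url = args.length > 0 ? args[0] : "http://example.com/";
        System.out.println("Status: " + fetch(url));
    }
}
```

Both streams are closed by try-with-resources only after they have been read to the end, which is the condition the text gives for the connection to be reusable.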
If you know ahead of time that you won't be interested in the response body, you should issue a HEAD request instead of a GET request, for example when you are only interested in the meta information of the web resource or when testing its validity, accessibility, and recent modification. Here's a code snippet:
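A minimal sketch of such a HEAD request is shown below. The class name HeadRequest, the helper newHeadRequest, and the URL http://example.com/ are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class HeadRequest {

    /**
     * Builds a HEAD request for the given URL. Nothing is sent until a
     * response field (status code, header, and so on) is first read.
     */
    static HttpURLConnection newHeadRequest(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default URL, used when no argument is given.
        String url = args.length > 0 ? args[0] : "http://example.com/";
        HttpURLConnection conn = newHeadRequest(url);

        // Only meta information is requested; the server sends no body.
        System.out.println("Status: " + conn.getResponseCode());
        System.out.println("Content-Type: " + conn.getContentType());
        System.out.println("Content-Length: " + conn.getContentLengthLong());
        System.out.println("Last-Modified: " + conn.getLastModified());
    }
}
```

Because a HEAD response carries no body, there is nothing left unread on the connection afterwards.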