http://today.java.net/article/2010/03/31/html5-server-push-technologies-part-1
The upcoming HTML5 specification includes a lot of powerful and exiting features which turn web browsers into a fully capable rich internet application (RIA) client platform. This article takes a deeper look into two new HTML5 communication standards, Server-Sent Events and WebSockets. These new standards have the potential to become the dominant Server-push technologies for helping developers to write real-time web applications.
Since its beginning in the 1990s, the World Wide Web has grown rapidly and became more and more dynamic. While the first generation of web pages were static documents, later generations of web sites have been enriched by dynamic elements, followed ultimately by the development of highly-interactive browser-based rich internet applications.
The Hypertext Transfer Protocol (HTTP) was designed in the early days of the internet to transport information entities such as web pages or multimedia files between clients and servers. HTTP became a fundamental part of the World Wide Web initiative. The design of the HTTP protocol has been driven by the idea of a distributed, collaborative, hypermedia information system "to give universal access to a large universe of documents." The main design goal of the HTTP protocol was to minimize latency and network communication on the one hand, and to maximize scalability and independence of the components on the other hand.
The HTTP protocol implements a strict request-response model, which means that the client sends a HTTP request and waits until the HTTP response is received. The protocol has no support wherein the server initiates interaction with the client.
Listing 1. HTTP request on the wire
GET /html/rfc2616.html HTTP/1.1
Host: tools.ietf.org
User-Agent: xLightweb/2.11.3
For instance, a browser or other user agent will request a specific web page by performing an HTTP GET
request such as shown in Listing 1. The server returns the requested information entity by a HTTP response such as shown in Listing 2. The response consists of the HTTP response header, and the HTTP response body, which are separated by a blank line. The GET
request, in contrast, includes a HTTP header only.
Listing 2. HTTP response on the wire
HTTP/1.1 200 OK
Server: Apache/2.2.14 (Debian)
Date: Tue, 09 Feb 2010 05:00:01 GMT
Content-Length: 510825
Content-Type: text/html; charset=UTF-8
Last-Modified: Fri, 25 Dec 2009 05:49:54 GMT
ETag: "302e72-7cb69-47b871ff82480"
Accept-Ranges: bytes
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
<meta name="robots" content="index,follow" />
<meta name="creator" content="rfcmarkup version 1.53" />
<link rel="icon" href="/images/rfc.png" type="image/png" />
<link rel="shortcut icon" href="/images/rfc.png" type="image/png" />
<title>RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1</title>
[...]
</small></small></span>
</body></html>
To create more interactive web applications, the AJAX approach has been established as a popular solution for dynamically pulling data requests from the server. By using AJAX the browser application performs HTTP requests based on the XMLHttpRequest
API. The XMLHttpRequest
API enables performing the HTTP request in the background in an asynchronous way without blocking the user interface. AJAX does not define new kinds of HTTP requests or anything else. It just performs a HTTP request in the background.
In the case of a Web mail application, for instance, the web application could periodically perform an "AJAX request" to ask the server if the mailbox content is changed. This polling approach causes an event latency which depends on the polling period. Increasing the polling rate reduces event latency. The downside of this is that frequent polling wastes system resources and scales poorly. Most polling calls return empty, with no new events having occurred.
While Ajax is a popular solution for dynamically pulling data requests from the server, it does nothing to help to push data to the client. Sure, a server push channel could be emulated by an AJAX polling approach as described above, but this would waste resources. Comet, also known as reverse Ajax, enhances the Ajax communication pattern by defining architecture for pushing data from the server to the client. For instance, the Comet pattern would allow pushing a 'new mail available' event from the mail server to the WebMail client immediately.
Like AJAX, Comet is build on the top of the existing HTTP protocol without modifying it. This is a little bit tricky because the HTTP protocol is not designed to send unrequested responses from the server to the client. A HTTP response always requires a previous HTTP request initiated by the client. The Comet approach breaks this limitation by maintaining long-lived HTTP connections and "faking" notifications.
In practice, two popular strategies has been established. With long-polling, the client sends a HTTP request, waiting for a server event. If an event occurs on the server-side, the server sends the response including the event data. After receiving the response containing the event data, the client will send a request again, waiting for the next event. There is always a pending request which allows the server to send a response at any time. With Http Streaming, the server keeps the response message open. In contrast to long polling the HTTP response message (body) will not be closed after sending an event to the client. If an event occurs on the server-side, the server will write this event to the open response message body. The HTTP response message body represents a unidirectional event stream to the client.
The Comet approach allows implementing a real-time web. In contrast of the beginning of the web, many of today's web applications have to receive information as soon as it's been published on the server-side. For instance a web-based chat application will be successful only if the underlying architecture supports real-time communication.
The upcoming HTML 5 extends the HTML language to better support highly interactive web applications. HTML 5 turns web browsers into a fully capable rich internet application client platform, and is a direct competitor of other client platform environments such as Adobe Flash and Microsoft Silverlight.
For instance the HTML 5 standard includes technologies such as the WebWorkers API which allows running scripts in the background independently, the WebStorage API to store structured data on the client side or the new canvas element which supports dynamic scriptable rendering of 2D bitmap images.
HTML5 also applies the Comet communication pattern by defining Server-Sent Events (SSE), in effect standardizing Comet for all standards-compliant web browsers. The Server-Sent Events specification "defines an API for opening an HTTP connection for receiving push notifications from a server." Server-Sent Events includes the new HTML element EventSource
as well as a new mime type text/event-stream
which defines an event framing format.
Listing 3. Example JavaScript using the EventSource interface
<html>
<head>
<script type='text/javascript'>
var source = new EventSource('Events');
source.onmessage = function (event) {
ev = document.getElementById('events');
ev.innerHTML += "<br>[in] " + event.data;
};
</script>
</head>
<body>
<div id="events"></div>
</body>
</html>
The EventSource
represents the client-side end point to receive events. The client opens an event stream by creating an EventSource
, which takes an event source URL as its constructor argument. The onmessage
event handler will be called each time new data is received.
In general, browsers limit the connections per server. Under some circumstances, loading multiple pages that include an EventSource from the same domain can result in each EventSource
creating a dedicated connection. Often the maximum number of connections is quickly exceeded in such situations. To handle the per-server connection limitation, a shared WebWorker
, which shares a single EventSource object, can be used. Furthermore, by definition the browser-specific EventSource
implementation is free to reuse an existing connection if the event source absolute URL is equal to the required one. In this case, sharing connections will be managed by the browser-specific EventSource
implementation.
Figure 1 shows the HTTP request which will be sent by a browser if an event stream is opened. The Accept
header indicates the required format, text/event-stream
. Although the new mime type text/event
is defined by the Server-Sent Events spec, the specification also allows using other formats for event framing. However, a valid Server-Sent Events implementation has to support the mime type text/event-stream
at minimum.
REQUEST:
GET /Events HTTP/1.1
Host: myServer:8875
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de-DE)
AppleWebKit/532+ (KHTML, like Gecko) Version/4.0.4
Safari/531.21.10
Accept-Encoding: gzip, deflate
Referer: http://myServer:8875/
Accept: text/event-stream
Last-Event-Id: 6
Accept-Language: de-DE
Cache-Control: no-cache
Connection: keep-alive
RESPONSE:
HTTP/1.1 200 OK
Server: xLightweb/2.12-HTML5Preview6
Content-Type: text/event-stream
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Connection: close
: time stream
retry: 5000
id: 7
data: Thu Mar 11 07:31:30 CET 2010
id: 8
data: Thu Mar 11 07:31:35 CET 2010
[...]
Figure 1. An example event-stream
According to the mime type text/event-stream
, an event consists of one or more comment lines and/or field lines on the wire level. The event is delimited by a blank line. A comment always starts with a colon (:). Fields, on the other hand, consist of a field name and field value separated by a colon.
Figure 1 also shows an example response. To avoid caching, the response header includes cache directives which disable caching of the response. Event streams should not be cached by definition.
The example response includes 3 events. The first event includes a comment and a retry field; the second and third event includes an id field and event data. The data
field holds the event data, which is the current time in the example above. The second and third event also includes an id
field to track progress through the event stream. The example server application writes a new event on the wire every 5 seconds. If the EventSource
receives the event, the onmmessage
handler will be called.
In contrast to the second and third event, the first event will not trigger the onmmessage
handler. The first event does not contain data. It includes an comment
field and a retry
field for reconnecting purposes. The retry
field defines the reconnect time in milliseconds. If such a field is received, the EventSource
will update its associated reconnection time property with the received one. The reconnect time plays an important role in improving reliability in cases where network errors arise. If the event source instance detects that the connection is dropped, the connection will be re-established automatically after a delay equal to the reconnection time.
As shown in Figure 1, the HTTP request to establish the connection can be enriched by the Last-Event-Id
header. This header will be set if the EventSource
's last event id property is set with a non-empty string. The EventSource
's last event id property will be updated each time an event is received that contains an id
, or it will be updated an empty string if no id
is present. Because of this, an event stream can be re-established without repeating or missing any events. This guarantees message delivery if lastEventId
handling is implemented on the server side.
Listing 4 shows an example HttpServer
based on the Java HTTP library xLightweb (with HTML5 preview extension), which is maintained by the author.
Listing 4. Example Java-based server delivering text/event-stream
class ServerHandler implements IHttpRequestHandler {
private final Timer timer = new Timer(false);
public void onRequest(IHttpExchange exchange) throws IOException {
String requestURI = exchange.getRequest().getRequestURI();
if (requestURI.equals("/ServerSentEventExample")) {
sendServerSendPage(exchange, requestURI);
} else if (requestURI.equals("/Events")) {
sendEventStream(exchange);
} else {
exchange.sendError(404);
}
}
private void sendServerSendPage(IHttpExchange exchange,
String uri) throws IOException {
String page = "<html>/r/n " +
" <head>/r/n" +
" <script type='text/javascript'>/r/n" +
" var source = new EventSource('Events');/r/n" +
" source.onmessage = function (event) {/r/n" +
" ev = document.getElementById('events');/r/n" +
" ev.innerHTML += /"<br>[in] /" + event.data;/r/n"+
" };/r/n" +
" </script>/r/n" +
" </head>/r/n" +
" <body>/r/n" +
" <div id=/"events/"></div>/r/n" +
" </body>/r/n" +
"</html>/r/n ";
exchange.send(new HttpResponse(200, "text/html", page));
}
private void sendEventStream(final IHttpExchange exchange)
throws IOException {
// get the last id string
final String idString = exchange.getRequest().getHeader(
"Last-Event-Id", "0");
// sending the response header
final BodyDataSink sink = exchange.send(new
HttpResponseHeader(200, "text/event-stream"));
TimerTask tt = new TimerTask() {
private int id = Integer.parseInt(idString);
public void run() {
try {
Event event = new Event(new Date().toString(), ++id);
sink.write(event.toString());
} catch (IOException ioe) {
cancel();
sink.destroy();
}
};
};
Event event = new Event();
event.setRetryMillis(5 * 1000);
event.setComment("time stream");
sink.write(event.toString());
timer.schedule(tt, 3000, 3000);
}
}
XHttpServer server = new XHttpServer(8875, new ServerHandler());
server.start();
The Server-Sent Events specification recommends sending a "keep-alive" comment event periodically, if no other data event is send. Proxy servers are known which drop an HTTP connection after a short period of inactivity. Such proxy servers close idling connections to avoid wasting connections to unresponsive HTTP servers. Sending a comment event deactivates this behaviour. Even though the EventSource will re-establish the connection automatically, sending comment events periodically avoids unnecessary reconnects.
Server-Sent Events are based on HTTP streaming. As described above, the response stays open and event data are written as they occur on the server side. Theoretically, HTTP streaming will cause trouble if network intermediaries such as HTTP proxies do not forward partial responses immediately. The current HTTP RFC (RFC 2616 Hypertext Transfer Protocol - HTTP/1.1) does not require that partial responses have to be forwarded immediately. However, a lot of popular, well-working web applications exists which are built on the HTTP Streaming approach. Furthermore, production-level intermediaries always avoid buffering large amounts of data to minimize their memory usage.
In contrast to other popular Comet protocols such as Bayeux or BOSH, Server-Sent Events support a unidirectional server-to-client channel only. The Bayeux protocol on the other side supports a bidirectional communication channel. Furthermore, Bayeux can use HTTP streaming as well as long polling. Like Bayeux, the BOSH protocol is a bidirectional protocol. BOSH is based on the long polling approach.
Although Server-Sent Events do have less functionality than Bayeux or BOSH, Server-Sent Events have the potential to be become the dominant protocol for use cases where a unidirectional server push channel is required only (which is the case in many instances). The Sever-Sent Events protocol is much simpler than Bayeux or BOSH. For instance, you are able to test the event stream by using telnet. No handshake protocols have to be implemented. Just send the HTTP GET
request and get the event stream. Furthermore Server-Sent Events will be supported natively by all HTML5-compatible browsers. In contrast, Bayeux and BOSH protocol are implemented on the top of the browser language environment.
Part 2 of this article series will cover the WebSocket API and protocol, and present concluding remarks.