HTML5 Case Study: Building the noVNC Client with WebSockets, Canvas and JavaScript
noVNC is a VNC client, implemented using HTML5 WebSockets, Canvas and JavaScript. InfoQ had a small Q&A with Joel Martin about noVNC and his experience in developing an HTML5 application:
InfoQ: Joel, would you like to give us an architectural overview of noVNC and how its various components fit together?
Joel: The noVNC architecture is made up of 6 main components:
- Core VNC/RFB implementation: This component encapsulates all RFB protocol knowledge and is the main state machine that drives everything else.
- Canvas abstraction: This component provides an abstraction of HTML5 canvas APIs. It also does Canvas feature detection and works around browsers that don't have the full HTML5 canvas spec or have broken implementations.
- User interface: all HTML DOM interaction (except canvas) is encapsulated here. This component renders the page controls such as the connect/disconnect button, settings, and status feedback. One of my design goals for noVNC is that it is easy to drop into existing sites, so this component is optional.
- Utilities: This contains miscellaneous generic routines and extensions used by noVNC including: extensions to Javascript arrays to make them more useful as queues, cross-browser event handling, debug/logging. I also include some Javascript libraries from other sources to do base64 encode/decode and DES encryption (for VNC authentication).
- WebSockets fallback: most browsers in the wild don't have native WebSockets support so I include a Flash (Flex) emulator for those browsers. I extended the original emulator project with WebSockets encryption support.
- WebSockets to TCP proxy: The WebSockets standard is not a pure TCP socket implementation. There is an HTTP-like handshake to establish the initial connection and then every frame after that begins with a 0 (zero) byte and ends with a 255 byte. Until VNC servers implement WebSockets support (something I would like to see happen) the proxy is required to translate between WebSockets and standard TCP sockets. I have implemented this as a generic proxy (in both python and C) which might be useful to other developers who are working with WebSockets.
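The framing Joel describes for the proxy (a 0 byte, the payload, then a 255 byte) can be illustrated with a minimal sketch. This is not noVNC's or the proxy's actual code: the names `frameMessage` and `createUnframer` are hypothetical, and the sketch assumes the payload never contains a 255 byte, which holds when the payload is text such as base64-encoded data.

```javascript
// Minimal sketch of the draft-era WebSockets framing described above:
// every frame is a 0x00 byte, the payload bytes, then a 0xFF byte.
// Function names are hypothetical, chosen for this illustration only.

// Wrap a payload (Uint8Array) in the 0x00 ... 0xFF delimiters.
function frameMessage(payloadBytes) {
  if (payloadBytes.indexOf(0xff) !== -1) {
    // 0xFF marks the end of a frame, so it cannot appear inside the payload.
    throw new Error("payload must not contain the 0xFF terminator byte");
  }
  const framed = new Uint8Array(payloadBytes.length + 2);
  framed[0] = 0x00;
  framed.set(payloadBytes, 1);
  framed[framed.length - 1] = 0xff;
  return framed;
}

// Create an incremental parser. TCP delivers arbitrary chunks, so a frame
// may be split across several calls; state is kept between calls.
function createUnframer() {
  let current = null; // bytes of the frame being read, or null between frames
  return function feed(chunk) {
    const complete = [];
    for (const byte of chunk) {
      if (current === null) {
        if (byte !== 0x00) {
          throw new Error("expected frame start byte 0x00");
        }
        current = [];
      } else if (byte === 0xff) {
        complete.push(Uint8Array.from(current));
        current = null;
      } else {
        current.push(byte);
      }
    }
    return complete; // payloads of all frames completed by this chunk
  };
}
```

A proxy built along these lines would strip the delimiters from browser traffic before forwarding the payload to the VNC server, and add them in the opposite direction.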
InfoQ: What were your main challenges developing an HTML5 application? What are the main pitfalls that developers should look out for?
Joel: The main challenges have been related to fall-back support for browsers that lack HTML5 features or have HTML5 features that are limited, perform poorly or are broken. For example, while Chrome 5 and Safari 5 have native WebSockets support, the current versions of Firefox and Opera do not. Some older versions of these browsers don't have the more recent canvas pixel manipulation APIs (or worse, they are there and broken in the case of Arora 0.5). No released versions of Internet Explorer have built-in WebSockets or even the most basic built-in canvas support (the IE 9 Preview has preliminary canvas support). Another challenge is performance optimization across multiple browsers. Each browser has different performance characteristics and these also change between different releases (and can be different between the same browser on different operating systems). This is one of the VERY few areas where I think browser detection can be appropriate.
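The kind of feature detection Joel contrasts with browser detection might look like the following sketch. It is an illustration only, not noVNC's detection code: `detectFeatures` is a hypothetical helper, and the global object and document are passed in as parameters so that the function can also be exercised outside a browser. Note that the presence of a method does not prove it works (Joel mentions APIs that exist but are broken), so a real check would also need to exercise the API.

```javascript
// Sketch of feature detection for the HTML5 features discussed above.
// detectFeatures is a hypothetical helper, not part of noVNC.
function detectFeatures(globalObj, doc) {
  const features = {
    webSockets: false,   // native WebSocket constructor available
    canvas: false,       // a 2D canvas context can be created
    canvasPixels: false  // pixel manipulation methods are present
  };

  features.webSockets = Boolean(globalObj && globalObj.WebSocket);

  if (doc && typeof doc.createElement === "function") {
    const el = doc.createElement("canvas");
    const ctx = el && typeof el.getContext === "function"
      ? el.getContext("2d")
      : null;
    features.canvas = Boolean(ctx);
    // Presence check only: this cannot tell a working implementation
    // from a broken one.
    features.canvasPixels = Boolean(ctx) &&
      typeof ctx.createImageData === "function" &&
      typeof ctx.putImageData === "function";
  }
  return features;
}
```

In a browser this would be called as `detectFeatures(window, document)`, and the result could decide whether to load a fallback such as the Flash WebSockets emulator.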
InfoQ: What kind of tooling did you use? Do you find the current development tools powerful enough for building HTML5 applications? What kind of new tools would you like to see?
Joel: My development environment is pretty minimal. I use vim (with lots of extensions) on Linux to edit code. I make heavy use of Firebug in Firefox and the built-in developer tools in Chrome for debugging and profiling. I also make liberal use of Crockford's JSLint to keep my Javascript code sane.
I wish that the profiling tools in Firebug and Chrome were able to give finer granularity feedback than at the function level. I also wish the profilers gave more insight into what parts of the code are contributing the most to garbage collection. The noVNC code is now optimized to the point where I'm beginning to run into garbage collection as one of the main performance bottlenecks.
A new tool that I would love to see is a code analyzer (in the same vein as JSLint) that would scan a Javascript code base and generate a nice browser support table. The output I'm envisioning would be a list of features used in the code along the top, and major browsers and version on the left. Each cell would then report if the way the code is using that feature is supported for the given browser/version. It wouldn't eliminate the need to test the code on many browsers, but it would sure help during the development process to know if you are on track or not. Ideally the scanner should detect if the Javascript is doing proper detection/workarounds for that feature too.
InfoQ: What were the main limitations of the current specs and implementations that you had to overcome?
Joel: Fortunately, this is an area where the specs are pretty good.
The RFB (VNC) protocol is well documented here: http://tigervnc.org/cgi-bin/rfbproto.
This site provides a great reference for Javascript (including which browser versions support each feature): http://www.hunlock.com/
The best site I've found which documents which browsers support which features (covering HTML5 and much more) is http://caniuse.com/. The site http://quirksmode.org is invaluable for the finer-grained detail of which browsers support which APIs and how to work around those limitations.
The biggest browser limitation I've had to overcome is the lack of native cross-browser WebSockets support. It's a recent standard that is still changing, but it's quickly being adopted. It's now in WebKit, so Chrome 5 and Safari 5 have it and the iPhone should have it soon. It will probably land in Firefox 4. Opera will certainly implement it sooner or later. The biggest question is whether the IE 9 team will decide to add support.
As I mentioned above, I use a Flash WebSockets emulator to support browsers without native support. Extending, fixing and working around bugs in the Flash (ActionScript) code has been a big task. Bridging Javascript and Flash is a major headache. Adobe provides FABridge (which the emulator uses) but the bridge is slow, bulky and difficult to debug.
While I'm thrilled by the news that IE 9 will have full (and fast) canvas support, I would still like to overcome the lack of any native canvas support in older versions of IE since they are so prevalent. The two options I'm looking at are explorercanvas and fxcanvas. Explorercanvas is a Javascript library which creates a canvas API on top of IE's VML support and fxcanvas is a Flash implementation of canvas. However, both options have major issues. Explorercanvas doesn't support pixel manipulation (due to VML being vector rather than raster), and fxcanvas only provides a canvas-like API and has some non-trivial asynchronous processing issues. Unfortunately, the Javascript engines in IE 6, 7 and 8 are so slow compared to other current browsers that doing canvas emulation may very well prove unworkable and I may just end up pointing people to Chrome Frame.
InfoQ: What are your future plans for the noVNC project?
Joel: The thing I'm currently the most excited about is working with some QEMU/KVM developers (including a Google Summer of Code student) to design a new VNC encoding that is more optimal for browser rendering. This new VNC/RFB encoding transfers image data in the PNG format. In addition to good compression for lossless image data (comparable to the tight encoding), the PNG data stream can be easily rendered in the browser with very little decode work (unlike the tight encoding).
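To illustrate why PNG data needs so little decode work in the browser, here is a rough sketch of one possible approach: base64-encode the PNG bytes into a `data:` URI and let the browser's own image decoder do the rest. This is an assumption about how such rendering could work, not the encoding's specification or noVNC's implementation; `base64Encode`, `pngDataUri` and `drawPngRect` are hypothetical names, and the image constructor is passed in as a parameter (in a browser it would be `Image`).

```javascript
// Sketch: render a PNG-encoded rectangle by handing it to the browser's
// image decoder. All function names here are hypothetical.
const B64_CHARS =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 encoding of an array of byte values, with "=" padding.
function base64Encode(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasB1 = i + 1 < bytes.length;
    const hasB2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasB1 ? bytes[i + 1] : 0;
    const b2 = hasB2 ? bytes[i + 2] : 0;
    out += B64_CHARS[b0 >> 2];
    out += B64_CHARS[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += hasB1 ? B64_CHARS[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += hasB2 ? B64_CHARS[b2 & 0x3f] : "=";
  }
  return out;
}

// Build a data URI that an image element can load directly.
function pngDataUri(pngBytes) {
  return "data:image/png;base64," + base64Encode(pngBytes);
}

// Draw the PNG at (x, y) on a 2D canvas context once it has been decoded.
// ImageCtor is assumed to behave like the browser's Image constructor.
function drawPngRect(ctx, x, y, pngBytes, ImageCtor) {
  const img = new ImageCtor();
  img.onload = function () {
    ctx.drawImage(img, x, y);
  };
  img.src = pngDataUri(pngBytes);
  return img;
}
```

Because image loading is asynchronous, a real client would also have to make sure rectangles are drawn in the order the server sent them; the sketch leaves that out.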
The requirement for the WebSockets to TCP sockets proxy is a barrier to wider use of noVNC. I would like to see WebSockets support added to VNC servers. I will personally focus on libvncserver (which is used to build several different VNC servers) and QEMU/KVM. But I would love to help and encourage other VNC server developers to add support. Adding WebSockets support to other VNC clients would also be useful because the WebSockets protocol is designed to be easily supported and proxied by Web servers (thus the initial HTTP compatible handshake). This could help with one of VNC's historical problems of dealing with firewalls.
One of the reasons I named the project "noVNC" is because I would also like to see the implementation of other "virtual network computing" protocols such as RDP, NX and Red Hat's Spice protocol.
If and when the iPhone adds native WebSockets support (obviously the Flash fallback is out of the question), then I would love to get noVNC running on the iPhone. One feature I would like to add that would be generally useful, but critical for smartphone support is viewport and/or scaling support. And Google, if you're listening, a free Android phone would be a great inspiration for getting noVNC to work on Android. :-)
Project Guacamole, another HTML5 VNC viewer, takes a similar approach and uses a server-side proxy written in Java. The current version is claimed to be almost as responsive as native VNC and to work in any browser supporting the HTML5 canvas tag.
You can find more information about HTML5 and Rich Internet Applications, right here on InfoQ!
screenshots
http://kanaka.github.com/noVNC/screenshots.html
from http://www.infoq.com/news/2010/07/html5-novnc