
Introduction
For several years now, developers have been producing full-fledged interactive experiences that run, more or less, right in the browser. Such sites usually require a browser plug-in such as Flash. With the advent of smartphones and tablets, interactive experiences seemed a perfect fit for the new gadgets. Because of the limited processing power of mobile devices, however, browser plug-ins were no longer a viable platform for development.
HTML5 has added a huge palette of in-browser tools that require no extra plug-ins. The HTML5 specification from the W3C is still under development, but browsers are providing support as the spec evolves.
HTML5 audio is a powerful advancement for embedding sound in the browser, especially on mobile devices such as iOS's mobile Safari browser. Though HTML5 audio is a new feature, it has support in iOS. According to developers of the popular mobile application Instapaper, 98.8% of its iOS users in November 2011 were using at least iOS 4 (see Resources). Because HTML5 audio was introduced to mobile Safari in iOS 3, you can be assured there is almost universal support for HTML5 audio on the iOS platform.
In this article, learn about the HTML5 limitations for the desktop and in mobile Safari, and try some solutions for creating interactive sound effects. Also covered are: unsupported events, audio sprites, and how to use directCanvas and multiSound to accelerate HTML5 game performance.
It is important to note that with iOS 6, Apple has added support for the Web Audio API (discussed below), which removes the need for a lot of the workarounds discussed in this article. However, iOS 6 has only been out for a few weeks, so iOS 5 still has the majority of the market. The issues discussed and the workarounds provided in this article are still valid and should be considered when developing audio for mobile Safari.
You can download the source code for the examples used in this article.
Limitations of HTML5 audio
Before discussing the limitations in mobile Safari, it's important to understand the limitations of HTML5 audio on the desktop. HTML5 audio is both robust and limiting, depending largely on its implementation. It works well for music players (such as a jukebox player) or simple sound effects, but leaves much to be desired for sound-intensive applications such as games.
Format support
Unfortunately, not all browsers support the same audio file format. As shown in Table 1, there are currently four major formats: MP3, OGG, WAV, and AAC.
Table 1. HTML5 audio format support
Browser | MP3 | Ogg Vorbis | WAV PCM | AAC |
---|---|---|---|---|
Internet Explorer 9 | X | | | X |
Firefox | | X | X | |
Chrome/Safari/mobile Safari | X | | X | X |
To cover all browsers, it's best to have all audio streams as both Ogg Vorbis and AAC.
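When the source is chosen from script rather than markup, the same Ogg Vorbis and AAC pairing can be tested with the media element's canPlayType() method. The following is a minimal sketch, not the article's own code: the function name pickSupportedSource, the file names, and the codec strings are assumptions made for illustration.

```javascript
// Returns the first candidate whose MIME type the audio element reports it
// can play, or null if none qualify. canPlayType() answers with '' (cannot
// play), 'maybe', or 'probably'.
function pickSupportedSource(audio, candidates) {
  for (var i = 0; i < candidates.length; i++) {
    var answer = audio.canPlayType(candidates[i].type);
    if (answer === 'maybe' || answer === 'probably') {
      return candidates[i];
    }
  }
  return null;
}

// Hypothetical file names, listed in order of preference.
var candidates = [
  { src: 'sound.ogg', type: 'audio/ogg; codecs="vorbis"' },
  { src: 'sound.m4a', type: 'audio/mp4; codecs="mp4a.40.2"' }
];
```

In a browser, the function could be called with document.createElement('audio') as the first argument, and the src of the returned candidate assigned to the element that will play the sound.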
Why isn't MP3 included? MP3 comes with hefty royalty payments when distributed commercially. The license requirements for MP3 will claim a distribution fee of 2% of all revenue over $100K (see Resources). For this reason, I prefer AAC over MP3. AAC is not royalty-free, but it has a much more relaxed license that allows free distribution. AAC also provides better compression, allowing for smaller file sizes—a blessing in the web world (see Resources).
Ogg Vorbis wins my vote overwhelmingly because it is open-source, patent-free, and royalty-free. However, only Firefox supports it.
Listing 1 shows what cross-browser compatible HTML markup should look like.
Listing 1. HTML markup for the audio element
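The markup for this listing is not reproduced here. As a stand-in, the sketch below builds equivalent markup as a string: one audio element with one source child per format, so that the browser can pick the first source it supports. The helper name buildAudioMarkup, the element ID, and the file names are hypothetical, and the values are assumed to be trusted (no attribute escaping is done).

```javascript
// Builds markup for an audio element with one <source> child per format.
// The browser walks the sources in order and plays the first it supports.
function buildAudioMarkup(id, sources) {
  var html = '<audio id="' + id + '" controls>\n';
  for (var i = 0; i < sources.length; i++) {
    html += '  <source src="' + sources[i].src +
            '" type="' + sources[i].type + '">\n';
  }
  html += '</audio>';
  return html;
}

// Hypothetical files: an Ogg Vorbis version and an AAC version.
var markup = buildAudioMarkup('audio', [
  { src: 'sound.ogg', type: 'audio/ogg' },
  { src: 'sound.m4a', type: 'audio/mp4' }
]);
```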
Manipulation and effects
When dealing with audio, a powerful feature is the ability to manipulate the sound. Whether it's synthesizing sound on-the-fly, processing sound effects, applying environmental effects, or even doing basic stereo panning, HTML5 audio lacks all manipulation abilities. The audio you load is the audio that is played.
The Web Audio API (Chrome) and Audio Data API (Firefox) help address the missing features by giving you the ability to synthesize and process audio on-the-fly without any browser plug-ins (see Resources). Both APIs are currently still under development and are only supported in Chrome 14+ and Firefox 4+. Unfortunately, they are also quite different in implementation. There are great libraries to help normalize support, including audiolibjs (see Resources). Chrome's Web Audio API is the standard being pushed through the W3C.
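Because support is uneven, code that wants the Web Audio API usually checks for it first. A minimal sketch of such a check follows; the function name is an assumption, and the vendor-prefixed webkitAudioContext name reflects how early WebKit-based implementations exposed the constructor.

```javascript
// Returns the Web Audio context constructor found on the given global
// object, or null when the API is unavailable. Early implementations
// exposed it under the vendor-prefixed name webkitAudioContext.
function getAudioContextConstructor(globalObject) {
  return globalObject.AudioContext ||
         globalObject.webkitAudioContext ||
         null;
}
```

In a browser, pass window as the argument. If the result is not null, a context can be created with new; otherwise fall back to plain HTML5 audio.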
Single sound layering (Polyphonic)
To play the same sound over itself, you must instantiate a separate audio object of that same sound. There is a 1:1 correspondence between the markup and the audio that can be played. No layering is achievable with the current state of HTML5 audio. Other platforms, such as Flash, let you layer a single audio object without having to create a new one.
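On the desktop, a common way to live with this restriction is to keep a small pool of audio objects that all point at the same file and hand them out in turn. The sketch below assumes a caller-supplied factory (createAudio) so that the pool itself has no dependency on the browser; the names and the pool size are illustrative only.

```javascript
// Creates a pool of audio objects that share one source file so the sound
// can overlap itself. createAudio is a caller-supplied factory, for example
// function () { return new Audio(); } in a browser.
function createSoundPool(createAudio, src, size) {
  var voices = [];
  var next = 0;
  for (var i = 0; i < size; i++) {
    var voice = createAudio();
    voice.src = src;
    voices.push(voice);
  }
  return {
    // Plays the next voice in round-robin order and returns it.
    play: function () {
      var voice = voices[next];
      next = (next + 1) % voices.length;
      voice.play();
      return voice;
    }
  };
}
```

If the pool wraps around to a voice that is still playing, that call has no audible effect, so the pool size should match the most overlaps you expect. This approach does not help on mobile Safari, where only one stream can play at a time.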
iOS, mobile Safari, and HTML5 audio limitations
HTML5 audio is already somewhat limited, and mobile Safari adds another layer of limitations to the most basic uses of HTML5 audio.
Single audio stream
One of the biggest limitations imposed by mobile Safari is that only a single audio stream can be played at one time. HTML5 media elements in mobile Safari are singletons, so only a single HTML5 audio (and HTML5 video) stream can be playing at one time. Apple has offered no explanation for this limitation, but one can assume it is to reduce data charges (as is the reason for most other iOS HTML5 limitations).
iOS provides mobile Safari with only a single HTML5 media (audio and video) container. If you play an audio stream while another is currently playing, the previous audio stream will be removed from the container and the new stream will be instantiated in its place.
Listing 2 shows how calling play() while another stream is playing will stop the previous stream, in this case audio1.
Listing 2. Single audio stream
var audio1 = document.getElementById('audio1');
var audio2 = document.getElementById('audio2');
audio1.play(); // this stream will immediately stop when the next line is run
audio2.play(); // this will stop audio1
It's important to keep in mind that audio and video are interchangeable. If an audio file is played while a video is playing, the video will stop. Only one audio or video stream can be playing at a time, as shown in Listing 3.
Listing 3. Interchangeable audio video stream
var audio = document.getElementById('audio');
var video = document.getElementById('video');
video.play();
// at a later time
audio.play(); // this will stop video
Autoplay
Audio files cannot be auto-played on page load in mobile Safari. Audio files can only be loaded from a user-triggered touch (click) event. If the autoplay attribute is used in the HTML markup, mobile Safari will ignore the attribute and not play the file on page load.
The Safari Developer Guide has details on the matter (see Resources).
Loading audio
Audio streams cannot be loaded unless triggered by a user touch event such as onmousedown, onmouseup, onclick, or ontouchstart. Figure 1 shows an example.
Figure 1. Workflow to load audio in mobile Safari

If the code in Listing 4 is run on page load, the audio stream will not be loaded, or even downloaded, in mobile Safari.
Listing 4. Playing an audio stream on page load will silently fail
var audio = document.getElementById('audio');
audio.play();
Even if the preload attribute is used in the HTML markup, mobile Safari ignores the attribute and will not load the file until triggered by a user touch event, as shown in Listing 5.
Listing 5. preload attribute not supported in mobile Safari
On desktop Safari, the code in Listing 5 will download the audio file on page load. However, on mobile Safari, the attribute will be ignored and the audio file will not be downloaded.
Other quirks
There are a few additional quirks to consider when using HTML5 audio in mobile Safari.
There is a delay of a few seconds when initializing a new audio stream because iOS has to instantiate a new audio object. Listing 6 shows how the delay arises.
Listing 6. HTML5 audio delay when switching between audio objects
var audio1 = document.getElementById('audio1');
var audio2 = document.getElementById('audio2');
audio1.play();
// at a later time
audio2.play(); // there will be a few-seconds delay as iOS is instantiating a new audio object.
// at an even later time
audio1.play(); // there will also be a few-seconds delay, as the audio object
               // for audio1 in iOS was destroyed when we played audio2.
It's important to ensure your logic does not assume the audio streams are loaded on page load. While calling play() will fail silently, trying to set the currentTime on a yet-to-be-loaded audio stream that hasn't had its metadata loaded will throw a fatal error, as shown in Listing 7.
Listing 7. Setting currentTime on audio stream that hasn't had metadata loaded
// run on page load
var audio = document.getElementById('audio');
audio.play(); // This will silently fail
audio.currentTime = 2; // This will throw a fatal error because the metadata
                       // for the audio does not exist
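One way to avoid this error is to defer the seek until the element reports that its metadata is available. The sketch below is an illustration under stated assumptions rather than the article's own code: the helper name seekWhenReady is hypothetical, and it relies on the standard readyState property and loadedmetadata event of media elements.

```javascript
// Value of HTMLMediaElement.HAVE_METADATA: duration is known and seeking
// is possible from this state onward.
var HAVE_METADATA = 1;

// Sets currentTime immediately if metadata is already loaded; otherwise
// waits for the loadedmetadata event and seeks then.
function seekWhenReady(audio, time) {
  if (audio.readyState >= HAVE_METADATA) {
    audio.currentTime = time;
    return;
  }
  var onLoadedMetadata = function () {
    audio.removeEventListener('loadedmetadata', onLoadedMetadata, false);
    audio.currentTime = time;
  };
  audio.addEventListener('loadedmetadata', onLoadedMetadata, false);
}
```

On mobile Safari the metadata will still not load until a user touch event has triggered loading, so the deferred seek simply waits until that happens.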
Audio files cannot be cached in a mobile manifest on iOS. This is only applicable when using a manifest for an offline web application. If an audio file is included in the manifest, iOS will ignore it and not cache the file. Every time the web application needs access to the audio file it will need to access the file from the network.
Mobile Safari does not respect the volume and playbackRate properties when they are set programmatically with JavaScript. Changing the properties will not actually adjust the values. Volume is always under user control, and playbackRate is not supported in mobile Safari. While volume always stays set at 1, playbackRate will report the new value you assign to it, but the actual rate of playback for the audio stream will not change. This creates some complications with the ratechange event, which is discussed in Unsupported events.
Before iOS 5, the loop attribute was not supported. To work around the lack of support, add a listener for the ended event and, in that function, call play(). Listing 8 shows an example.
Listing 8. Looping audio workaround for iOS < 5
var audio = document.getElementById('audio');
audio.play();
var onEnded = function() {
    this.play();
};
audio.addEventListener('ended', onEnded, false);
Solutions
Solutions for mobile Safari's HTML5 audio shortcomings all depend on the usage. If you only want to play a single audio file or a playlist of audio files, not much will need to change. However, if interactive sound effects are needed, things can get a bit tricky.
Single audio streams
One solution to the single audio stream limitation is to simply swap out the source file with the audio needed, as shown in Listing 9. This is not an ideal solution because you need to wait for the new audio stream to load before you can play it.
Listing 9. Swapping out an audio object's source
var audio = document.getElementById('audio');
audio.play();
// at some later point in your script (does not need to be from a touch event)
audio.src = 'newfile.m4a';
audio.play(); // there will be a slight delay while the new audio file loads
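Because of that loading delay, the rest of the application often needs to know when the swapped-in file actually starts to play. The following sketch wraps the same src and play() calls and reports back through the playing event, which mobile Safari supports. The function name swapAndPlay and the callback parameter are assumptions made for illustration.

```javascript
// Swaps the source of an existing audio object, starts playback, and calls
// onStarted once the playing event reports that sound is actually running.
function swapAndPlay(audio, src, onStarted) {
  var onPlaying = function () {
    audio.removeEventListener('playing', onPlaying, false);
    onStarted();
  };
  audio.addEventListener('playing', onPlaying, false);
  audio.src = src;
  audio.play();
}
```

The callback is a convenient place to start an animation or other effect that has to line up with the sound.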
A better way to solve the single audio stream limitation is to use an audio sprite. In short, you combine all your audio into a single audio stream and then play portions of the stream. The Audio sprites section has more detail.
Autoplay
There is no workaround for the autoplay limitation. As mentioned, audio streams can only be loaded from a user-touch event. When developing for mobile Safari, it's important to adjust your workflow as necessary to accommodate this limitation. (From experience, I know that a lot of refactoring will happen if this isn't taken into consideration from the start.)
Before iOS 4.2.1, you could load an audio file from the callback of a synchronous Ajax call, as in the example in Listing 10.
Listing 10. Loading an audio stream in the callback of an Ajax call before iOS 4.2.1
// run on page load
var audio = document.getElementById('audio');
jQuery.ajax({
    url: 'ajax.js',
    async: false,
    success: function() {
        audio.play(); // audio will play in iOS before 4.2.1
    }
});
There's an issue with the method in Listing 10: It's a synchronous Ajax call, so the browser is locked until the call is complete. In mobile Safari, locked doesn't mean just the page is locked—the entire application is locked. If an error occurs and mobile Safari gets stuck in a locked state (not terribly likely), the only way to exit is to click the home button and force-close the application.
Apple patched this workaround in iOS 4.2.1, so the workaround does not work in any version of iOS 4.2.1 and later.
Loading audio
Audio streams cannot be loaded unless triggered by a user event. As shown in Listing 11, onmousedown, onmouseup, onclick, and ontouchstart are valid events that will successfully load an audio stream when called within a callback. Note that this is only for loading an audio file; calling play() on a file that has already loaded will work as expected.
Listing 11. Using a user-triggered event to load an audio stream
// run on page load
var button = document.getElementById('button');
var audio = document.getElementById('audio');
var onClick = function() {
    audio.play(); // audio will load and then play
};
button.addEventListener('click', onClick, false);
At first glance, Listing 11 may seem like an annoying workaround. However, it's a best practice to give your game or interactive experience a splash screen, as in Figure 2, that requires the user to click a button to start. When the user clicks the start button, you can use that event to load the audio in your project.
Figure 2. Cut the Rope HTML5 splash screen

Unsupported events
Though HTML5 audio in mobile Safari supports all media events from the desktop, note that some events will never fire because of a few unsupported properties mentioned previously. There are also a few quirks to be aware of.
Table 2 lists all the event callbacks for the audio element and their compatibility on desktop and mobile Safari. The results are based on an HTML5 audio event debugger set up by the author, which you can play around with if you choose.
Table 2. Desktop versus mobile Safari support for media events
Event | Description | Desktop | Mobile Safari |
---|---|---|---|
abort | The browser stops fetching the media before the media was completely downloaded. | X | X |
canplay | The browser can resume playback of the media data, but estimates that if playback has started, the media resource could not be rendered at the current playback rate up to its end without having to stop for further buffering of content. | X | X |
canplaythrough | The browser estimates that if playback is started now, the media resource could be rendered at the current playback rate all the way to its end without having to stop for further buffering. | X | X |
durationchange | The duration property changes. | X | X |
emptied | The media element network state changes to the NETWORK_EMPTY state. | X | X |
ended | Playback has stopped at the end of the media resource and the ended property is set to true. | X | X |
error | An error occurs while fetching the media data. Use the error property to get the current error. | X | X |
loadeddata | The browser can render the media data at the current playback position for the first time. | X | X |
loadedmetadata | The browser knows the duration and dimensions of the media resource. | X | X |
loadstart | The browser begins loading the media data. | X | X |
pause | Playback pauses after the pause method returns. | X | X |
play | Playback starts after the play method returns. | X | X |
playing | Playback starts. | X | X |
progress | The browser is fetching the media data. | X | X |
ratechange | Either the defaultPlaybackRate or the playbackRate property changes. | X | X (shouldn't) |
seeking | The seeking property is set to true and there is time to send this event. | X | X* |
seeked | The seeking property is set to false. | X | X* |
stalled | The browser is fetching media data but it has stopped arriving. | X | X |
suspend | The browser suspends loading the media data and does not have the entire media resource downloaded. | X | X |
timeupdate | The currentTime property changes as part of normal playback or because of some other condition. | X | X |
volumechange | Either the volume property or the muted property changes. | X | |
waiting | The browser stops playback because it is waiting for the next frame. | X | X |
The following list provides some notes on a few of the event callbacks.
- ratechange: The ratechange event is fired whenever the playbackRate is changed. As mentioned, changing the playback rate of an audio stream (as well as video) is not supported in mobile Safari, so the ratechange event should never fire. However, as of iOS 5.1.1, HTML5 audio will still fire the ratechange event even though the actual playback rate hasn't changed.
- volumechange: Volume cannot be set using JavaScript, so the volumechange event will never be fired. Even if the user changes the volume on their device while mobile Safari is open, this event will not fire.
- seeking/seeked: Mobile Safari only supports the seeking and seeked events when the seeking is done through JavaScript, as shown in Listing 12. If the built-in controls are displayed and the user seeks using the progress bar, seeking and seeked do not fire as expected.
Listing 12. Setting currentTime will trigger seeking and seeked events
var audio = document.getElementById('audio');
audio.currentTime = 60; // seeking and seeked will be fired
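To check which of the events in Table 2 actually fire on a particular device, it can help to attach one logging listener per event. The sketch below is a simplified stand-in for that kind of debugger, not the author's tool; the names MEDIA_EVENTS and attachMediaEventLogger are hypothetical, and the log argument is any function that accepts the event name.

```javascript
// The media events listed in Table 2.
var MEDIA_EVENTS = [
  'abort', 'canplay', 'canplaythrough', 'durationchange', 'emptied',
  'ended', 'error', 'loadeddata', 'loadedmetadata', 'loadstart', 'pause',
  'play', 'playing', 'progress', 'ratechange', 'seeking', 'seeked',
  'stalled', 'suspend', 'timeupdate', 'volumechange', 'waiting'
];

// Registers one listener per media event and passes the name of each
// event that fires to the supplied log function.
function attachMediaEventLogger(audio, log) {
  var handler = function (event) {
    log(event.type);
  };
  for (var i = 0; i < MEDIA_EVENTS.length; i++) {
    audio.addEventListener(MEDIA_EVENTS[i], handler, false);
  }
}
```

In a browser, passing a function that appends each name to a visible list on the page is usually more practical on a phone than relying on a console.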
Audio sprites
Using an audio sprite is one of the best solutions to overcome the need for multiple sounds in mobile Safari. Much like a CSS image sprite, an audio sprite combines all your audio into a single stream, as shown in Figure 3.
Figure 3. Audio sprite

The principle is straightforward. You will need to store the data for each sprite: starting position, ending position or length, and an ID. When you want to play a particular sprite, you set the currentTime of the audio stream to the start position and call play(). Listing 13 shows an example.
Listing 13. Simple audio sprite implementation
// audioSprite has already been loaded using a user touch event
var audioSprite = document.getElementById('audio');
var spriteData = {
    meow1: { start: 0, length: 1.1 },
    meow2: { start: 1.3, length: 1.1 },
    whine: { start: 2.7, length: 0.8 },
    purr: { start: 5, length: 5 }
};
// play meow2 sprite
audioSprite.currentTime = spriteData.meow2.start;
audioSprite.play();
Listing 13 will play the meow2 sprite and, because there isn't logic implemented to stop when the sprite is complete, it will also play the whine and purr sprites. By adding a listener for the timeupdate event, as in Listing 14, you can watch the currentTime and stop the audio when the sprite reaches its end.
Listing 14. Adding logic to stop the stream when it reaches the end of a sprite
var handler = function() {
    if (this.currentTime >= spriteData.meow2.start + spriteData.meow2.length) {
        this.pause();
    }
};
audioSprite.addEventListener('timeupdate', handler, false);
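Listings 13 and 14 are tied to the meow2 sprite. The two pieces can be combined into a small reusable player that works for any sprite ID, sketched below under the assumption that the audio stream has already been loaded from a user touch event. The name createSpritePlayer is hypothetical, and the sprite data has the same shape as in Listing 13.

```javascript
// Wraps an already-loaded audio stream and a table of sprite data (start
// and length in seconds) in an object that plays one sprite at a time.
function createSpritePlayer(audio, spriteData) {
  var stopAt = null; // time at which the current sprite ends, if any

  // Pause the stream once playback reaches the end of the current sprite.
  audio.addEventListener('timeupdate', function () {
    if (stopAt !== null && audio.currentTime >= stopAt) {
      audio.pause();
      stopAt = null;
    }
  }, false);

  return {
    // Seeks to the start of the named sprite and plays it.
    // Returns false if the ID is unknown.
    play: function (id) {
      var sprite = spriteData[id];
      if (!sprite) {
        return false;
      }
      stopAt = sprite.start + sprite.length;
      audio.currentTime = sprite.start;
      audio.play();
      return true;
    }
  };
}
```

Because timeupdate fires only periodically, the stream may run slightly past the end of a sprite before it is paused, which is one more reason to leave silence between sprites.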
A great advantage of using an audio sprite is that, assuming the entire audio sprite is loaded, there is no delay when switching between sprites, unlike when switching between audio streams. Having all streams in one file also cuts down on HTTP requests.
Be aware that changing currentTime isn't 100% accurate. Setting the currentTime to 6.5 can actually seek to 6.7, or 6.2. A small amount of space is needed between each sprite to avoid seeking to the end of another sprite. Adding this space can add a slight delay if the stream seeks to 6.4 when the sprite starts at 6.8 seconds.
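When the sprite file is assembled by a script, the start offsets can be computed with the gap included. The sketch below only calculates offsets; it assumes the audio file itself is built with the same amount of silence after each clip. The function name layoutSprites and the gap value in the example are illustrative assumptions.

```javascript
// Computes start offsets for a list of clips placed back to back with a
// fixed gap of silence (in seconds) after each one. Returns sprite data
// in the same shape as Listing 13.
function layoutSprites(clips, gap) {
  var spriteData = {};
  var position = 0;
  for (var i = 0; i < clips.length; i++) {
    spriteData[clips[i].id] = {
      start: position,
      length: clips[i].length
    };
    position += clips[i].length + gap;
  }
  return spriteData;
}

// Hypothetical clips with half a second of silence between them.
var layout = layoutSprites([
  { id: 'meow', length: 1 },
  { id: 'purr', length: 2 },
  { id: 'hiss', length: 1 }
], 0.5);
```

A larger gap tolerates larger seek errors but makes the file longer and can increase the delay described above, so the value is worth tuning on real devices.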
Ensure that the entire audio stream is loaded before accessing any sprites. This is important because if the audio stream isn't completely loaded, and an attempt is made to access a portion of the stream that isn't loaded yet, the stream will need to be buffered and a delay will occur while the stream is loading.
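One way to check this from script is to compare the element's buffered time ranges with its duration. The sketch below is an assumption-laden illustration: the function name isFullyBuffered is hypothetical, and the default tolerance of 0.1 seconds is an arbitrary allowance for rounding at the end of the file.

```javascript
// Returns true when a single buffered range covers the stream from the
// start to (approximately) its full duration.
function isFullyBuffered(audio, tolerance) {
  var slack = typeof tolerance === 'number' ? tolerance : 0.1;
  var duration = audio.duration;
  // duration is NaN until metadata has loaded and Infinity for live streams.
  if (!(duration > 0) || duration === Infinity) {
    return false;
  }
  var ranges = audio.buffered;
  for (var i = 0; i < ranges.length; i++) {
    if (ranges.start(i) <= 0 && ranges.end(i) >= duration - slack) {
      return true;
    }
  }
  return false;
}
```

A check like this could run in a progress listener, with sprite playback enabled only once it returns true.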
Full-featured example
See and hear an example of an audio sprite framework. The example takes into consideration the topics covered in this article.
How directCanvas and multiSound accelerate HTML5 game performance
AppMobi has developed an interesting solution to overcome the various HTML5 limitations on mobile devices with directCanvas and multiSound (see Resources). directCanvas and multiSound use the native capabilities of a device within a standard HTML5 browser application. Slow graphical performance, and the limitations discussed in this article, are no longer an issue; you get the full performance benefits of a native application.
When a user navigates to a site that makes use of directCanvas, the page will prompt the user to download the MobiUs application from the App Store. If the user already has the application installed on their device, then the page will be opened in the MobiUs application.
AppMobi has videos on their site that show side-by-side comparisons of games running in their MobiUs application and games running in mobile Safari. The results are quite amazing, offering a 10X performance boost, as shown in Figure 4.
Figure 4. Average HTML5 performance improvement from mobile Safari to MobiUs app using directCanvas

AppMobi's API site has great documentation, so you can jump right in. The SDK is free to download, and there is also a handy Google Chrome extension that lets you develop in your desktop browser.
Though it's not ideal to require users to install an application on their device, AppMobi has an interesting solution that warrants consideration. Currently, the MobiUs application is not available in the App Store, but AppMobi says it will be back soon.
Conclusion
Despite the limitations discussed in this article, HTML5 audio is a welcome addition to mobile Safari and you should take advantage of it. In this article, you learned about the limitations on both desktop and mobile Safari, walked through solutions to the limitations, and explored the advantages of using audio sprites in mobile Safari. Being aware of the mobile Safari limitations can increase its usability for you.
As a developing specification, HTML5 audio is sure to evolve, but there is no reason to wait until the spec is final in (supposedly) 2014. With near universal HTML5 audio compatibility for all iOS users, there is no reason not to use it.
Download
Description | Name | Size |
---|---|---|
Article source code | html5audio.article.source.zip | 4073KB |
Resources
Learn
- Develop and deploy your next app on the IBM Bluemix cloud platform.
- More iOS device and OS version stats from Instapaper: See recent data and trends from people using the Instapaper application.
- License requirements for MP3: Get mp3, mp3HD, and mp3surround patent and licensing information.
- AAC licensing requirements: Learn about licensed products (other than PC Software) standard rates.
- HTML5 Audio: Read more about HTML5 Audio on Wikipedia.
- HTML5 audio lacks the ability to manipulate sound. Read how the Web Audio API (Chrome) and Audio Data API (Firefox) help address the missing features and give you the ability to synthesize and process audio on-the-fly without any browser plug-ins.
- How can I autoplay media in iOS >= 4.2.1 Mobile Safari?: Get the answer to the question.
- appMobi: Learn how appMobi has solved HTML5 shortcomings with directCanvas and multiSound.
- iOS Specific Considerations: Learn the considerations when embedding audio and video using HTML5.
- HTMLMediaElement Class Reference: Learn more from the Safari Extensions Development Guide.
- Getting Started with Web Audio API: Read this article on the HTML5 Rocks website.
- Audio Data API: Learn more about the Audio Data API on the MozillaWiki.
- Web Audio API: Check out the official specification from W3C.
- WHATWG: Explore this community of developers working with the W3C to fine-tune HTML5.
- developerWorks Web development zone: Find articles covering various web-based solutions. See the Web development technical library for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks.
- The developerWorks community: Personalize your developerWorks experience.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- developerWorks on Twitter: Join today to follow developerWorks tweets.
- developerWorks on-demand demos: Watch demos ranging from product installation and setup for beginners to advanced functionality for experienced developers.
Get products and technologies
- developerWorks Premium: Provides an all-access pass to powerful tools, curated technical library from Safari Books Online, conference discounts and proceedings, SoftLayer and Bluemix credits, and more.
- audiolib.js: Install the powerful toolkit for audio written in JS.
- directCanvas: Download the directCanvas SDK, a collection of HTML5 game acceleration technologies, to solve several HTML5 shortcomings.
- IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.