After starting with a small set of fairly basic media APIs in iPhone OS 2.0, the media APIs and the features they provide have expanded dramatically in the past 2 years, presenting a rapidly moving target for developers trying to remain current. In this post, I'll try to summarize all of the different APIs in iOS 4.3 for playing media, when they arrived, what their purposes are, what their limitations are and what it's been like to remain up-to-date and support new features.
Introduction
This post has two purposes:
1. To detail the different media APIs in iOS and to explain the scenarios to which they are best suited.
2. To show how many updates have been made to the media APIs and what that has meant to any iOS developer attempting to keep their media applications compiling successfully against the latest SDKs and up-to-date with the latest media features in iOS.
Note: I'll be limiting discussion to time-based media in this post, i.e. audio and video APIs. I realize that still photos are "media" but since photos are generally handled as basic graphics, they are treated very differently from audio and video, which use specialized hardware processing and handling in iOS.
I was inspired to write this post while working on StreamToMe version 3.5.2 — an update to one of my applications to improve the experience of users running iOS 4.3. Nominally, iOS 4.3 only added logging features to some media classes and added an "allowsAirPlay" property to the MPMoviePlayerController. Despite these seemingly limited changes to the APIs, StreamToMe still required some significant changes to work smoothly and deliver the features that users expect in iOS 4.3.
But I'm getting ahead of myself.
iPhone OS 2.0
Playback APIs
The first version of iPhone OS available to developers arrived with 5 media playing APIs:
AudioUnits/AUGraphs
AudioQueues
MPMoviePlayerController
AudioServicesPlaySystemSound
UIWebView
AudioUnits/AUGraphs are the "low-level" API in both Mac OS and iOS. If you want to process audio in any way, mix more than one source of audio, want to generate your own samples or otherwise access the raw Linear-PCM values, these have always been the best option — in many cases, close to the only option.
I've previously written a post showing what is probably the simplest possible AudioUnit program:
an iOS Tone Generator. Of course, most people require considerably more complexity than this. A good next step if you're trying to learn about lower level audio APIs is the MixerHost sample project you'll find in the iOS documentation. Apple tend to favor C++ wrappers around these C APIs so you may also want to be familiar with the classes in AUPublic folder — you can start to see how these are used by looking at the very similar iPhoneMultichannelMixerTest.
AudioQueues are for playing or recording buffers of data. AudioQueueNewInput remains a common means of capturing microphone input and AudioQueueNewOutput is a common way to play to the speaker. The AudioQueue API is, like AudioUnits, a pure C API and still requires a fairly meticulous setup. Where AudioUnits require that you push PCM samples into buffers yourself, AudioQueues let you push the buffers and not worry about the sample format. In fact, AudioQueues generally deal with buffers of still-compressed MP3 or AAC data.
I've written a series of posts on using AudioQueues (in conjunction with AudioFileStream) to play from an HTTP stream, starting with Streaming and playing an MP3 stream and ending with Streaming MP3/AAC audio again.

AudioServicesPlaySystemSound will play sound segments of up to 30 seconds. Its purpose is really for brief UI or notification sounds played asynchronously. You create the sound using AudioServicesCreateSystemSoundID and then play with AudioServicesPlaySystemSound. Not much more to say than that.
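For completeness, here's a minimal sketch of the whole process (the "ping.caf" resource name is just a placeholder for a short sound in your application bundle):

```objc
#import <AudioToolbox/AudioToolbox.h>

// Create a SystemSoundID from a short sound file in the application bundle
NSString *path = [[NSBundle mainBundle] pathForResource:@"ping" ofType:@"caf"];
NSURL *soundURL = [NSURL fileURLWithPath:path];

SystemSoundID soundID;
OSStatus error = AudioServicesCreateSystemSoundID((CFURLRef)soundURL, &soundID);
if (error == noErr)
{
    // Plays asynchronously; dispose with AudioServicesDisposeSystemSoundID when done
    AudioServicesPlaySystemSound(soundID);
}
```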
Living out on its own in iPhone OS 2.0, was MPMoviePlayerController — the only Objective-C class for media playback in iPhone OS 2.0. It offered no programmatic control other than play, no options to configure the UI or movie and no feedback about state. You gave it a URL (either file or HTTP) and it presented the interface, handled the entire experience and posted a notification when it was done. The canonical code example used to be the MoviePlayer sample project but this has not been updated since iOS 3.0 and since iOS 4.0 broke backwards compatibility with this class, you'll need to ensure that the MPMoviePlayerController's view is inserted into the view hierarchy before this project will work.
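For reference, a typical iPhone OS 2.0-era usage looked something like this sketch (the movieURL value and the notification handler name are illustrative only):

```objc
#import <MediaPlayer/MediaPlayer.h>

// "Set and forget": hand the player a URL and wait for the finish notification
MPMoviePlayerController *moviePlayer =
    [[MPMoviePlayerController alloc] initWithContentURL:movieURL];

[[NSNotificationCenter defaultCenter]
    addObserver:self
    selector:@selector(moviePlaybackDidFinish:)
    name:MPMoviePlayerPlaybackDidFinishNotification
    object:moviePlayer];

// Presents the fullscreen interface and handles the entire experience itself
[moviePlayer play];
```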
UIWebView offered an experience similar to MPMoviePlayerController but had an added advantage: it was the only way to output over the TV out dock cables until iOS 3.2 (MPMoviePlayerController, despite being implemented by the same internal private classes, has this functionality disabled). While playing a movie through UIWebView didn't break in iOS 4 like MPMoviePlayerController did, the ability to play to the TV went away without explanation.
Media support APIs
AudioFile
AudioFileStream
AudioSession
OpenAL
MPVolumeSettingsAlertShow
MPVolumeView
AudioFile offers a fairly rich set of metadata and parsing functions for files that are fully saved to disk. AudioFileStream offers a limited subset of the AudioFile functionality but has the advantage that the file doesn't need to be fully saved or downloaded — it can be a continuous source or progressive source.
AudioSession is mostly for handling audio routing (is the audio going to the headphones or the speaker) and for determining how your application's audio is blended with audio that other applications may be playing. If you need to handle interruptions (like when an iPhone rings) this API will help you.
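A minimal sketch of the C API — setting a playback category and registering an interruption listener (the listener body here is only illustrative):

```c
#include <AudioToolbox/AudioToolbox.h>

// Invoked when another app (e.g. an incoming phone call) interrupts our audio
static void MyInterruptionListener(void *inClientData, UInt32 inInterruptionState)
{
    if (inInterruptionState == kAudioSessionBeginInterruption)
    {
        // pause playback and deactivate audio here
    }
    else if (inInterruptionState == kAudioSessionEndInterruption)
    {
        // reactivate the session and resume playback here
    }
}

void SetUpAudioSession()
{
    AudioSessionInitialize(NULL, NULL, MyInterruptionListener, NULL);

    // MediaPlayback keeps playing when the ring/silent switch is flipped
    UInt32 category = kAudioSessionCategory_MediaPlayback;
    AudioSessionSetProperty(
        kAudioSessionProperty_AudioCategory, sizeof(category), &category);

    AudioSessionSetActive(true);
}
```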
OpenAL is an audio standard for controlling the positioning of audio in 3D — mostly used for games. You can look at the oalTouch sample project for an example of how to set this up in iOS.
The MPVolumeSettingsAlertShow and related functions show a dialog so the user can change the volume. The MPVolumeView is a slider so that the user can change the volume.
Code maintenance considerations
Code written using AudioUnits, AudioQueues, AudioSessions and AudioServicesPlaySystemSound for iPhone OS 2.0 will generally continue to work in the latest version of iOS (iOS 4.3). Despite additions to these APIs, backwards compatibility remains high. However, many newer classes like AVAudioPlayer, AVPlayer, AVAudioRecorder, AVAudioSession and AVCaptureSession provide alternative ways of doing similar things, so you may want to weigh these alternatives against the earlier APIs.
As I mentioned, MPMoviePlayerController code written for iPhone OS 2.0 but linked against iOS 4.3 SDKs will likely not work since this code requires a view be inserted into the hierarchy starting with iOS 3.2.
UIWebView stopped outputting over TV out in iOS 3.2 so there's no longer a real reason to use a web view instead of a real movie view.
I rarely use the AudioFile APIs anymore. It's not due to compatibility issues but instead because I feel they've been superseded: AudioFileStream (rather than AudioFile) is required for streaming or progressive downloads, AVAudioPlayer (iOS 2.2) is easier for playing files stored on the device (apparently it uses AudioFile/AudioQueue internally) and ExtAudioFile (iOS 2.1) can convert between media formats using the hardware and hence can plug into an AUGraph better.
In my experience, the MPVolumeView slider is more commonly used than the MPVolumeSettingsAlertShow dialog — and with MPVolumeView supporting AirPlay audio in iOS 4.2 and later, the MPVolumeView has become even more compelling.
It used to infuriate me that in the simulator, the MPVolumeView simply didn't appear — it worked fine on the device but didn't draw itself in the simulator (many hours were lost wondering if its absence was a bug). The MPVolumeView still doesn't appear in the simulator (for no reason I can understand) but at least it now draws a label saying "No volume available".
iPhone OS 2.1
Arriving just 2 months after iPhone OS 2, iPhone OS 2.1 brought audio conversion as the main addition to the SDK. The AudioConverter functions introduced various forms of PCM conversions and conversions to and from compressed audio formats (MP3 and AAC).
The ability to convert MP3/AAC was important since it could take advantage of the audio hardware (previously this required software decoding, which consumes much more battery power).
Since the primary purpose for audio conversion is to allow a file — like an MP3 — to be opened and fed into a processing pipeline like an AUGraph, the ExtAudioFile functions were also added to streamline this process.
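A rough sketch of the common pattern — opening a compressed file with ExtAudioFile and asking it to deliver Linear PCM in a client format you specify (fileURL is assumed to point to an MP3 or AAC file; the 16-bit stereo format below is just one reasonable choice):

```c
#include <AudioToolbox/AudioToolbox.h>

// Open the compressed file (MP3, AAC, etc.)
ExtAudioFileRef audioFile = NULL;
ExtAudioFileOpenURL((CFURLRef)fileURL, &audioFile);

// Ask for decoded output as 16-bit, stereo, interleaved Linear PCM
AudioStreamBasicDescription clientFormat = {0};
clientFormat.mSampleRate = 44100.0;
clientFormat.mFormatID = kAudioFormatLinearPCM;
clientFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
clientFormat.mChannelsPerFrame = 2;
clientFormat.mBitsPerChannel = 16;
clientFormat.mFramesPerPacket = 1;
clientFormat.mBytesPerFrame = 4;
clientFormat.mBytesPerPacket = 4;
ExtAudioFileSetProperty(audioFile,
    kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);

// Subsequent ExtAudioFileRead calls now return decoded PCM frames,
// ready to feed into an AUGraph or AudioQueue
```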
Code maintenance considerations
If you had code that decompressed audio in software or performed PCM conversion in anything less than an optimal manner, it was now a waste of CPU cycles relative to newer code that used these APIs.
iPhone OS 2.2
Arriving just 2 months after iPhone OS 2.1 (now just 4 months after iPhone OS 2) the iPhone OS 2.2 update introduced the AVAudioPlayer — the first Objective-C API for dedicated audio playback in iPhone OS. The AVAudioPlayer requires that the file be fully saved on your iOS device (so it isn't suitable for continuous streams, network streams or progressive downloads).
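Compared to an AudioFile/AudioQueue pipeline, basic playback with AVAudioPlayer is just a few lines. A sketch, assuming fileURL points to a file already on the device and this code lives in a class that acts as the delegate:

```objc
#import <AVFoundation/AVFoundation.h>

NSError *error = nil;
AVAudioPlayer *audioPlayer =
    [[AVAudioPlayer alloc] initWithContentsOfURL:fileURL error:&error];
if (audioPlayer)
{
    audioPlayer.delegate = self; // for audioPlayerDidFinishPlaying:successfully:
    [audioPlayer prepareToPlay];
    [audioPlayer play];
}
```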
Code maintenance considerations
If you had code that used AudioFile and AudioQueue, chances are that it would have been much easier to write your program using AVAudioPlayer instead — however, AudioFile and AudioQueue continue to work, so there was no need to update to AVAudioPlayer. Later on, AVPlayer superseded almost all of AVAudioPlayer's functionality (with the exception of audio metering and playing from a non-URL buffer) so you need to consider if this is still the class you want to use.
iPhone OS 3.0
Arriving approximately 1 year after iPhone OS 2.0, iPhone OS 3.0 brought the following media APIs:
AVAudioRecorder
AVAudioSession
MPMediaQuery, MPMediaPickerController and MPMusicPlayerController classes
AVAudioRecorder provided the first Objective-C approach for recording sound. It offers a simple way to record sound to a file but doesn't allow processing of the sound on-the-fly (for that, AudioQueueNewInput is still required).
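A minimal recording sketch (the IMA4 format and the recordingURL destination are just example choices):

```objc
#import <AVFoundation/AVFoundation.h>

NSDictionary *settings = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSNumber numberWithInt:kAudioFormatAppleIMA4], AVFormatIDKey,
    [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
    [NSNumber numberWithInt:1], AVNumberOfChannelsKey,
    nil];

NSError *error = nil;
AVAudioRecorder *recorder =
    [[AVAudioRecorder alloc] initWithURL:recordingURL settings:settings error:&error];
if ([recorder prepareToRecord])
{
    // Samples go straight to the file at recordingURL; no on-the-fly processing
    [recorder record];
}
```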
AVAudioSession provided an Objective-C approach for managing the application's audio session but bizarrely, it still lacks any facility for handling routing changes (i.e. a switch from the headphones to the speaker or to the dock connector). For this reason, I still generally avoid this class — the AudioSession C functions are clean and simple enough that sacrificing functionality for the improved simplicity of AVAudioSession doesn't seem like a great tradeoff.
The MPMediaQuery, MPMediaPickerController and MPMusicPlayerController classes added the ability to browse, control or play music from the user's iTunes library on the device. This allows you to offer basic library browsing and playing capability. In iPhone OS 3, there was no way to apply different processing to the files — you had to use MPMusicPlayerController.
Arguably though, the 2 biggest media additions in iPhone OS 3 didn't require a new API: video recording was added to the UIImagePickerController and the MPMoviePlayerController added handling of HTTP live streaming.
While MPMoviePlayerController has always supported opening an MP4 file over HTTP, this has three major disadvantages:
It is not really optimized for streaming (so the many HTTP byte range requests required can end up being slow).
An MP4 file can't be generated on-the-fly (so it's not suitable for continuous sources, live remuxed sources or live transcoded sources).
You can't dynamically change bitrate on an MP4 file (you can't handle 3G and WiFi bitrates in a single URL).
All of which were addressed by Apple's HTTP live streaming.
Code maintenance considerations
HTTP live streaming did bring with it the following additional problems:
As a new protocol, the segmented MPEG-TS and M3U8 files required completely new software to generate them.
It was initially only supported by MPMoviePlayerController (no other interface could be used except UIWebView which was just a different way of presenting the same interface).
You don't have any access to the transport layer — all communication is handled by Apple's internal libraries making careful control of network access difficult or impossible.
The MPMusicPlayerController's remote controlling of the iPod application is still relevant but since iOS 4.0 introduced the ability to get the URL and play the music in AVAudioPlayer or AVPlayer instead, MPMusicPlayerController's playback capabilities seem limited.

Despite adding video to UIImagePickerController, you still were not able to get a live image from the camera or programmatically take a picture. Still image capture didn't arrive until iPhone OS 3.1. Actual movie capture didn't arrive until iOS 4.
In iPhone OS 3, you couldn't get a URL for MPMediaQuery results, meaning that you could play files from the user's iTunes library but couldn't do anything interesting. It wasn't until iOS 4 that you could finally get a URL (a weird "ipod-library" URL) that could be used to open the file in lower-level audio APIs to actually perform processing, mixing or other more interesting effects to music.
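Once iOS 4 arrived, the query-to-URL path looked something like this sketch (taking the first song in the library purely as an example; the URL is nil for DRM-protected tracks):

```objc
#import <MediaPlayer/MediaPlayer.h>
#import <AVFoundation/AVFoundation.h>

MPMediaQuery *query = [MPMediaQuery songsQuery];
MPMediaItem *item = [[query items] objectAtIndex:0];

// An "ipod-library://..." URL, or nil if the track can't be accessed this way
NSURL *assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
if (assetURL)
{
    // The asset can now be fed into lower-level APIs for processing or mixing
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:assetURL options:nil];
    AVPlayerItem *playerItem = [AVPlayerItem playerItemWithAsset:asset];
    AVPlayer *player = [AVPlayer playerWithPlayerItem:playerItem];
    [player play];
}
```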
With HTTP live streaming in place, Apple introduced bitrate restrictions for media into the App Store submission guidelines. This meant that you needed to update your code to throttle streaming audio connections over 3G yourself (a tricky thing to do since NSURLConnection won't generally do this and you need to resort to CFNetwork's CFHTTPStream functions), and all HTTP live streams over 3G needed to have a 64kbps fallback variant. If you've ever tried to squeeze video into 64kbps, you'll know how tight a restriction that is.
AVAudioSession's inability to handle routing changes prevented it from properly superseding the older AudioSession functions, so the two APIs have to coexist.
iPhone OS 3.1
UIVideoEditorController was the only significant media addition in iPhone OS 3.1. It allowed you to present the trimming/re-encoding interface for videos stored in the user's Photo Library.
iOS 3.2
The first iPad release and the first release to be named "iOS" made two changes that were significant for media playback: the addition of multiple screen support and a radical overhaul of the MPMoviePlayerController.
Prior to iOS 3.2, the only App Store legal way to output via the dock connector to a TV was to load a movie in a UIWebView and let the movie player in the web view connect to the TV screen and output via the dock connector. With the iPad, you could finally use UIScreen to find additional screens and place your views on that screen instead of the main screen.

MPMoviePlayerController was finally overhauled to provide a lot of the features it sorely needed:
Inline (non-fullscreen) playback if desired, with smooth switching between fullscreen and non-fullscreen
Ability to programmatically seek and get the current playback point
Ability to set the control style (including disabling the standard user-interface entirely)
Provided a location to actually insert a background image if desired
The "set and forget" movie player was reborn as MPMoviePlayerViewController, a
UIViewController
that handles all display and handling automatically and which handles all communication with its internal
MPMoviePlayerController
automatically.
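A minimal sketch of the post-3.2 usage from inside a view controller (movieURL is a placeholder) — the key difference from iPhone OS 2.0 being that the player's view must be placed in the view hierarchy yourself:

```objc
MPMoviePlayerController *moviePlayer =
    [[MPMoviePlayerController alloc] initWithContentURL:movieURL];

// Since iOS 3.2, the view must be added to the hierarchy (or fullscreen set to YES)
moviePlayer.controlStyle = MPMovieControlStyleEmbedded;
moviePlayer.view.frame = self.view.bounds;
[self.view addSubview:moviePlayer.view];
[moviePlayer play];

// Or, for the old fullscreen "set and forget" behavior:
//   MPMoviePlayerViewController *viewController =
//       [[MPMoviePlayerViewController alloc] initWithContentURL:movieURL];
//   [self presentMoviePlayerViewControllerAnimated:viewController];
```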
Code maintenance considerations
While older MPMoviePlayerController code linked against previous SDKs would continue to work, if you ever linked the code against an iOS 3.2 SDK or newer, it would now fail since the new MPMoviePlayerController requires its view be inserted into the view hierarchy or that fullscreen be set to YES.
Remember: Apple rarely allow you to link against anything except the newest SDK, so any attempt to recompile old projects with MPMoviePlayerController code will result in no video being shown unless you update the code. For this reason, Apple's MoviePlayer sample project continues to not work (it hasn't been updated since iPhone OS 3.0).
Given the size of the iPad screen, users now expect a non-fullscreen view to be possible.
The "Done" button of the
MPMoviePlayerController
(visible in fullscreen) no longer ends the movie. It just pauses it and shrinks it to the inline (non-fullscreen) view. This creates another new trait of the
MPMoviePlayerController
that you must adapt to handle.
iOS 4.0
The biggest update since iPhone OS 2.0, iOS 4 brought a huge number of changes to media APIs.
ALAsset (and related classes)
AVCaptureSession (and related classes)
AVComposition (and related classes)
AVPlayer, AVPlayerItem, AVAsset (and related classes)
The ability to get the URL for an MPMediaItem
startVideoCapture and stopVideoCapture in UIImagePickerController
UIScreen and MPMoviePlayerController changes from iOS 3.2 brought to non-iPad devices
Background audio
beginReceivingRemoteControlEvents and endReceivingRemoteControlEvents
The huge additions to AVFoundation.framework — particularly the AVPlayer and AVComposition class hierarchies — reflect Apple providing APIs that replace what QuickTime's API used to provide on the Mac: sophisticated media handling that could be used to implement a complete music or movie editing program if required. Ultimately, since QuickTime 7 is deprecated in favor of QuickTime X on the Mac, I expect that these APIs will probably appear in a future version of Mac OS X and represent the future of multi-track mixing, editing and composition in Cocoa.
AVPlayer in iOS 4.0 ultimately didn't offer any advantages over MPMoviePlayerController for playing regular media. AVPlayer is required for playing AVCompositions but for regular files, it was largely the same as MPMoviePlayerController with the user interface disabled (made possible since iOS 3.2).
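For anyone who hasn't used it, the basic setup looks like this sketch (movieURL is a placeholder; assumed to run inside a view controller) — note that AVPlayer draws nothing but the video itself:

```objc
#import <AVFoundation/AVFoundation.h>

AVPlayer *player = [AVPlayer playerWithURL:movieURL];

// AVPlayerLayer only renders the video frames; there are no controls at all
AVPlayerLayer *playerLayer = [AVPlayerLayer playerLayerWithPlayer:player];
playerLayer.frame = self.view.layer.bounds;
[self.view.layer addSublayer:playerLayer];

[player play];
```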
The ALAsset classes finally provided a way to search through the photo and video media without using the UIImagePickerController. They also provided a better way to handle reading and writing photo and video media to the user's photo library.
AVCaptureSession and the other AVCapture classes finally provided the ability to capture video data without the UIImagePickerController interface and perform realtime processing of video data. The classes also included the ability to handle audio capture too, providing an alternative to the AudioQueueNewInput function for processing audio while it is recording (remember AVAudioRecorder will still let you record audio direct to a file without processing).
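A rough sketch of a capture pipeline delivering video frames to a delegate callback (error handling omitted; the queue name is arbitrary and the delegate is assumed to be self):

```objc
#import <AVFoundation/AVFoundation.h>

AVCaptureSession *session = [[AVCaptureSession alloc] init];

NSError *error = nil;
AVCaptureDevice *camera =
    [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
AVCaptureDeviceInput *input =
    [AVCaptureDeviceInput deviceInputWithDevice:camera error:&error];
[session addInput:input];

AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
dispatch_queue_t queue = dispatch_queue_create("videoCaptureQueue", NULL);
// The delegate receives each frame in captureOutput:didOutputSampleBuffer:fromConnection:
[output setSampleBufferDelegate:self queue:queue];
[session addOutput:output];

[session startRunning];
```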
Background audio was largely painless — just a setting in your Info.plist — although trying to get videos to continue playing their audio in the background is a near impossibility (you need to disable the video track or if you're using HTTP live streaming, you need to restart the stream without video or iOS will forcibly pause playback when you hit the background).
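Enabling it is essentially a two-part exercise — the Info.plist key plus (optionally) remote control events so lock-screen and headphone controls reach your application. A sketch, assumed to live in a view controller whose canBecomeFirstResponder returns YES:

```objc
// 1. In Info.plist, add "audio" to the UIBackgroundModes array so that
//    audio continues playing when the application enters the background.

// 2. To receive play/pause events from the lock screen and headphone remote:
- (void)viewDidAppear:(BOOL)animated
{
    [super viewDidAppear:animated];
    [[UIApplication sharedApplication] beginReceivingRemoteControlEvents];
    [self becomeFirstResponder];
}

- (void)remoteControlReceivedWithEvent:(UIEvent *)event
{
    if (event.type == UIEventTypeRemoteControl &&
        event.subtype == UIEventSubtypeRemoteControlTogglePlayPause)
    {
        // toggle playback here
    }
}
```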
Code maintenance considerations
iOS 4.0 required updating of all MPMoviePlayerController code for non-iPad devices in the same way that iOS 3.2 required updating for the iPad.
AVPlayer has no built-in interface. You must entirely create it yourself. This remains a problem for anyone who needs to use AVPlayer instead of the standard MPMoviePlayerController because implementing video playback controls can take a long time and requires a lot of subtle features.
UIWebView stopped playing to the TV in iOS 4. No idea why but this functionality has not returned.
The inline (non-fullscreen) iPhone/iPod version of the MPMoviePlayerController user interface offers no button to return to fullscreen when playing audio. This creates an annoying difference between the iPhone/iPod and iPad versions of the MPMoviePlayerController which you need to handle.
iOS 4.1
The biggest update in this version was the AVQueuePlayer. The iOS 4.0 headers actually hinted at being able to queue multiple items for an AVPlayer but obviously this functionality was held over.

AVQueuePlayer is an important class as it is the only player in iOS that will attempt to cache subsequent items for play to allow nearly gapless playback between items in a list. Like AVPlayer though, it has no user interface so if you want to use this player, you need to write your own interface completely.
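Queueing items is straightforward; as with AVPlayer, you still have to present the video via an AVPlayerLayer and build any controls yourself. A sketch (firstURL and secondURL are placeholders):

```objc
#import <AVFoundation/AVFoundation.h>

AVPlayerItem *firstItem = [AVPlayerItem playerItemWithURL:firstURL];
AVPlayerItem *secondItem = [AVPlayerItem playerItemWithURL:secondURL];

// The queue player pre-buffers the next item for near gapless transitions
AVQueuePlayer *queuePlayer = [AVQueuePlayer queuePlayerWithItems:
    [NSArray arrayWithObjects:firstItem, secondItem, nil]];

AVPlayerLayer *playerLayer = [AVPlayerLayer playerLayerWithPlayer:queuePlayer];
playerLayer.frame = self.view.layer.bounds;
[self.view.layer addSublayer:playerLayer];

[queuePlayer play];
```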
Code maintenance considerations
AVQueuePlayer would be the unambiguously best player in iOS if it:
could provide an inbuilt UI if requested
could use AirPlay video
Until these features are brought to AVQueuePlayer, there are still reasons why you would need to use MPMoviePlayerController instead.
iOS 4.2
The first version of iOS to merge iPad and iPhone/iPod lines. For media APIs, it added the CoreMIDI framework and AirPlay audio support.
AirPlay audio ended up being very simple: any existing MPVolumeView would allow you to select an AirPlay destination for your application's audio. Many applications required zero code changes if they already featured an MPVolumeView.
I have no experience with this framework but from the look of it, CoreMIDI appears to be for controlling MIDI devices over the network, not for actually playing/synthesizing on the iPhone/iPod/iPad so it is perhaps only tangentially related to media on an iOS device.
Code maintenance considerations
If any MPVolumeViews in your program are too small, they won't be able to show the AirPlay controls, so a new minimum width requirement is effectively established.
iOS 4.3
The biggest addition in iOS 4.3 was AirPlay video. In essence, this only required you to set the allowsAirPlay flag to YES on the MPMoviePlayerController.
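In code it really is that small a change — with a runtime check, since the property only exists from iOS 4.3 onwards (a sketch; movieURL is a placeholder):

```objc
MPMoviePlayerController *moviePlayer =
    [[MPMoviePlayerController alloc] initWithContentURL:movieURL];

// allowsAirPlay is only available in iOS 4.3 and later
if ([moviePlayer respondsToSelector:@selector(setAllowsAirPlay:)])
{
    moviePlayer.allowsAirPlay = YES;
}

moviePlayer.view.frame = self.view.bounds;
[self.view addSubview:moviePlayer.view];
[moviePlayer play];
```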
Additionally, a large set of logging, error tracking and statistics gathering APIs were added to the AV media classes (AVPlayerItemAccessLog, AVPlayerItemErrorLog) and the MPMoviePlayerController (MPMovieAccessLog, MPMovieErrorLog).
Code maintenance considerations
The allowsAirPlay flag on the MPMoviePlayerController carries with it an implicit requirement: that you're actually using MPMoviePlayerController. If you've been playing media with a different API, then you'll need to switch to MPMoviePlayerController to take advantage of AirPlay video. This was the biggest change that StreamToMe required for iOS 4.3 — since StreamToMe uses the AVQueuePlayer by default (for its superior track transitions and more detailed track and asset control), it needed to allow switching to the MPMoviePlayerController in the case where AirPlay video is desired. For a program as focussed on media playback as StreamToMe, allowing a runtime switch between two interfaces at the core of the program was a big effort. Fortunately, StreamToMe has always had MPMoviePlayerController code to support iOS versions prior to 4.1, but this was the first time a dynamic switch between interfaces had been needed.
The second change, much less expected since it wasn't really documented, is that iOS 4.3 no longer lets you observe the playerItem.asset.tracks key path of an AVQueuePlayer; instead you must now observe the playerItem.tracks.assetTrack key path to get the same value. Technically, while linked against iOS 4.2, you can still observe the old key path even when running on iOS 4.3 but it suddenly incurs a dramatic performance hit. Finding the exact cause of this issue was time consuming — as I said, it wasn't documented in any change notes I could find.
The final point that made compatibility difficult: if you have an MPMoviePlayerController with allowsAirPlay set to YES and useApplicationAudioSession set to NO, and the MPMoviePlayerController wants to launch straight to the Apple TV without displaying on the local device first, then the entire movie player interface disappears, never to return. This is undoubtedly a temporary bug but it provided another unexpected reason to make maintenance updates to StreamToMe.
Conclusion
This has been a lot of classes and functions to summarize. I hope I haven't missed anything important.
Obviously, I'm closer to the media APIs than to some other areas of iOS (so I might have a skewed perspective on their prominence) but I think the media APIs are among the most, if not the most, frequently updated areas of iOS. Attempting to keep media applications up-to-date with the latest media features available remains a busy task.
Of course despite the huge amount of work (on the part of both Apple and the 3rd party application developers) these additions have certainly improved the media experience in iOS. The original iPhone OS felt hugely limiting at the time and users were certainly crying out for the additions that have appeared. The idea that the only movie player interface used to be fullscreen, the only audio playback API was AudioQueue or raw AudioUnits, there was no programmatic camera access and no access to the iPod library in the original iPhone OS highlights how many more options are now available.
Of course, the constant changes to the API also leave me feeling embarrassed when they trip me up or otherwise get ahead of my release schedules. The StreamToMe 3.5.2 update is coming soon, I promise!
All content in this post is © Matt Gallagher (CocoaWithLove.com), all rights reserved. Code samples may be freely used in any programming project, commercial or otherwise, at your risk.