Capturing Audio & Video in HTML5

Original URL: http://www.html5rocks.com/en/tutorials/getusermedia/intro/

Introduction

Audio/Video capture has been the "Holy Grail" of web development for a long time. For many years we've had to rely on browser plugins (Flash or Silverlight) to get the job done. Come on!

HTML5 to the rescue. It might not be apparent, but the rise of HTML5 has brought a surge of access to device hardware. Geolocation (GPS), the Orientation API (accelerometer), WebGL (GPU), and the Web Audio API (audio hardware) are perfect examples. These features are ridiculously powerful, exposing high level JavaScript APIs that sit on top of the system's underlying hardware capabilities.

This tutorial introduces a new API, navigator.getUserMedia(), which allows web apps to access a user's camera and microphone.

The road to getUserMedia()

If you're not aware of its history, the way we arrived at the getUserMedia() API is an interesting tale.

Several variants of "Media Capture APIs" have evolved over the past few years. Many folks recognized the need to be able to access native devices on the web, but that led everyone and their mom to put together a new spec. Things got so messy that the W3C finally decided to form a working group. Their sole purpose? Make sense of the madness! The Device APIs Policy (DAP) Working Group has been tasked to consolidate and standardize the plethora of proposals.

I'll try to summarize what happened in 2011...

Round 1: HTML Media Capture

HTML Media Capture was the DAP's first go at standardizing media capture on the web. It works by overloading the <input type="file"> and adding new values for the accept parameter.

If you wanted to let users take a snapshot of themselves with the webcam, that's possible with capture=camera:

<input type="file" accept="image/*;capture=camera">

Recording a video or audio is similar:

<input type="file" accept="video/*;capture=camcorder">
<input type="file" accept="audio/*;capture=microphone">

Kinda nice, right? I particularly like that it reuses a file input. Semantically, it makes a lot of sense. Where this particular "API" falls short is the ability to do realtime effects (e.g. render live webcam data to a <canvas> and apply WebGL filters). HTML Media Capture only allows you to record a media file or take a snapshot in time.
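
To make the "snapshot in time" point concrete, here is a minimal sketch of what a page can do once the user has picked or captured a file through such an input. The helper name previewCapturedFile and the urlApi parameter are hypothetical, introduced only for illustration; urlApi is assumed to behave like window.URL.

```javascript
// Sketch: preview the file chosen (or captured) through <input type="file">.
// previewCapturedFile and urlApi are illustrative names, not part of any spec.
function previewCapturedFile(input, img, urlApi) {
  var file = input.files && input.files[0];
  if (!file) {
    return null; // The user cancelled the picker or capture UI.
  }
  var url = urlApi.createObjectURL(file); // Blob URL pointing at the finished file
  img.src = url;
  return url;
}

// Browser usage (assumes the page has one file input and one <img>):
if (typeof document !== 'undefined') {
  var input = document.querySelector('input[type="file"]');
  var img = document.querySelector('img');
  if (input && img) {
    input.addEventListener('change', function() {
      previewCapturedFile(input, img, window.URL || window.webkitURL);
    }, false);
  }
}
```

The page only ever sees the finished file, which is why live effects are out of reach with this approach.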

Support:

  • Android 3.0 browser: one of the first implementations. Check out this video to see it in action.
  • Chrome for Android (0.16)
  • Firefox Mobile 10.0
  • iOS6 Safari and Chrome (partial support)

Round 2: device element

Many thought HTML Media Capture was too limiting, so a new spec emerged that supported any type of (future) device. Not surprisingly, the design called for a new element, the <device> element, which became the predecessor to getUserMedia().

Opera was among the first browsers to create initial implementations of video capture based on the <device> element. Soon after (the same day to be precise), the WHATWG decided to scrap the <device> tag in favor of another up and comer, this time a JavaScript API called navigator.getUserMedia(). A week later, Opera put out new builds that included support for the updated getUserMedia() spec. Later that year, Microsoft joined the party by releasing a Lab for IE9 supporting the new spec.

Here's what <device> would have looked like:

<device type="media" onchange="update(this.data)"></device>
<video autoplay></video>
<script>
  function update(stream) {
    document.querySelector('video').src = stream.url;
  }
</script>

Support:

Unfortunately, no released browser ever included <device>. One less API to worry about I guess :) <device> did have two great things going for it though: 1.) it was semantic, and 2.) it was easily extensible to support more than just audio/video devices.

Take a breath. This stuff moves fast!

Round 3: WebRTC

The <device> element eventually went the way of the Dodo.

The pace to find a suitable capture API accelerated in recent months thanks to a larger effort called WebRTC (Web Real Time Communications). The spec is overseen by the W3C WebRTC Working Group. Google, Opera, Mozilla, and a few others are currently working on bringing implementations to their browsers.

getUserMedia() is related to WebRTC because it's the gateway into that set of APIs. It provides the means to access the user's local camera/microphone stream.

Support:

In Chrome 21, this feature will be on by default. The API is also supported in Opera 12 and Firefox 17.

Getting started

With navigator.getUserMedia(), we can finally tap into webcam and microphone input without a plugin. Camera access is now a call away, not an install away. It's baked directly into the browser. Excited yet?

Feature detection

Feature detecting is a simple check for the existence of navigator.getUserMedia:

function hasGetUserMedia() {
  // Note: Opera is unprefixed.
  return !!(navigator.getUserMedia || navigator.webkitGetUserMedia ||
            navigator.mozGetUserMedia || navigator.msGetUserMedia);
}

if (hasGetUserMedia()) {
  // Good to go!
} else {
  alert('getUserMedia() is not supported in your browser');
}
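
Browsers released after this tutorial was written also expose a promise-based navigator.mediaDevices.getUserMedia(). A slightly more defensive check, sketched below, looks for that form first and then for the prefixed callback forms. The function name detectGetUserMedia and its return labels are made up for this example.

```javascript
// Returns a label for the first getUserMedia() implementation found on a
// navigator-like object, or null if there is none.
// detectGetUserMedia is an illustrative helper, not a browser API.
function detectGetUserMedia(nav) {
  if (!nav) {
    return null;
  }
  if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
    return 'mediaDevices.getUserMedia'; // promise-based form
  }
  var legacyNames = ['getUserMedia', 'webkitGetUserMedia',
                     'mozGetUserMedia', 'msGetUserMedia'];
  for (var i = 0; i < legacyNames.length; i++) {
    if (typeof nav[legacyNames[i]] === 'function') {
      return legacyNames[i]; // callback-based form
    }
  }
  return null;
}

// In a browser: detectGetUserMedia(navigator)
```

Passing the navigator object in as a parameter keeps the check easy to exercise outside a browser.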

Gaining access to an input device

To use the webcam or microphone, we need to request permission. The first parameter to getUserMedia() is an object specifying the type of media you want to access. For example, if you want to access the webcam, the first parameter should be {video: true}. To use both the microphone and camera, pass {video: true, audio: true}:

<video autoplay></video>

<script>
  var onFailSoHard = function(e) {
    console.log('Reeeejected!', e);
  };

  // Not showing vendor prefixes.
  navigator.getUserMedia({video: true, audio: true}, function(localMediaStream) {
    var video = document.querySelector('video');
    video.src = window.URL.createObjectURL(localMediaStream);

    // Note: onloadedmetadata doesn't fire in Chrome when using it with getUserMedia.
    // See crbug.com/110938.
    video.onloadedmetadata = function(e) {
      // Ready to go. Do some stuff.
    };
  }, onFailSoHard);
</script>

OK. So what's going on here? Media capture is a perfect example of new HTML5 APIs working together. It works in conjunction with our other HTML5 buddies, <audio> and <video>. Notice that we're not setting a src attribute or including <source> elements on the <video> element. Instead of feeding the video a URL to a media file, we're feeding it a Blob URL obtained from a LocalMediaStream object representing the webcam.

I'm also telling the <video> to autoplay, otherwise it would be frozen on the first frame. Adding controls also works as you'd expect.
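
Newer browsers accept the stream directly through the video element's srcObject property, and some no longer accept a stream in createObjectURL() at all. The sketch below prefers srcObject and falls back to the Blob URL approach described above. The helper name attachStream and the urlApi parameter are assumptions made for this example.

```javascript
// Attaches a media stream to a <video>-like element.
// attachStream is an illustrative helper; urlApi is only used on the
// Blob URL path and is assumed to behave like window.URL.
function attachStream(video, stream, urlApi) {
  if ('srcObject' in video) {
    video.srcObject = stream; // direct assignment, no URL needed
    return 'srcObject';
  }
  video.src = urlApi.createObjectURL(stream); // Blob URL approach from this tutorial
  return 'src';
}

// Browser usage, inside a getUserMedia() success callback:
// attachStream(document.querySelector('video'), localMediaStream,
//              window.URL || window.webkitURL);
```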

If you want something that works cross-browser, try this:

window.URL = window.URL || window.webkitURL;
navigator.getUserMedia  = navigator.getUserMedia || navigator.webkitGetUserMedia ||
                          navigator.mozGetUserMedia || navigator.msGetUserMedia;

var video = document.querySelector('video');

if (navigator.getUserMedia) {
  navigator.getUserMedia({audio: true, video: true}, function(stream) {
    video.src = window.URL.createObjectURL(stream);
  }, onFailSoHard);
} else {
  video.src = 'somevideo.webm'; // fallback.
}
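
If you prefer promises to callbacks, one possible wrapper is sketched below. It uses navigator.mediaDevices.getUserMedia() where a browser provides it and otherwise wraps the callback-based forms shown above. The name requestMedia is hypothetical, and the navigator object is passed in as a parameter so the function can be tried outside a browser.

```javascript
// Wraps whichever getUserMedia() a navigator-like object offers in a Promise.
// requestMedia is an illustrative helper, not a browser API.
function requestMedia(nav, constraints) {
  if (nav.mediaDevices && nav.mediaDevices.getUserMedia) {
    return nav.mediaDevices.getUserMedia(constraints); // already returns a Promise
  }
  var legacy = nav.getUserMedia || nav.webkitGetUserMedia ||
               nav.mozGetUserMedia || nav.msGetUserMedia;
  if (!legacy) {
    return Promise.reject(new Error('getUserMedia() is not supported'));
  }
  return new Promise(function(resolve, reject) {
    // Legacy signature: (constraints, successCallback, errorCallback).
    legacy.call(nav, constraints, resolve, reject);
  });
}

// Browser usage:
// requestMedia(navigator, {audio: true, video: true}).then(function(stream) {
//   video.src = window.URL.createObjectURL(stream);
// }, onFailSoHard);
```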

Security

Some browsers throw up an infobar upon calling getUserMedia(), which gives users the option to grant or deny access to their camera/mic. The spec unfortunately is very quiet when it comes to security. For example, here is Chrome's permission dialog:

Permission dialog in Chrome

If your app is running from SSL (https://), this permission will be persistent. That is, users won't have to grant/deny access every time.
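
Because the user can always say no, the error callback is worth more than a console.log(). The sketch below maps a failure to a short message. The helper name describeMediaError is hypothetical. NotAllowedError and NotFoundError are the names used by the current spec; PermissionDeniedError and DevicesNotFoundError are included on the assumption that some older builds reported those names instead.

```javascript
// Maps a getUserMedia() failure to a short, user-facing message.
// describeMediaError is an illustrative helper, not a browser API.
function describeMediaError(err) {
  var name = (err && err.name) || '';
  switch (name) {
    case 'NotAllowedError':        // current spec name for a denied request
    case 'PermissionDeniedError':  // assumed legacy name in some older builds
      return 'Camera/microphone access was denied.';
    case 'NotFoundError':          // current spec name when no device matches
    case 'DevicesNotFoundError':   // assumed legacy name in some older builds
      return 'No camera or microphone was found.';
    default:
      return 'Could not start capture' + (name ? ' (' + name + ').' : '.');
  }
}

// A drop-in error callback in the spirit of the one used in this tutorial:
var onFailSoHard = function(e) {
  console.log('Reeeejected!', describeMediaError(e));
};
```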

Providing fallback

For users whose browsers don't support getUserMedia(), one option is to fall back to an existing video file when the API is missing or the call fails for some reason:

// Not showing vendor prefixes or code that works cross-browser:

function fallback(e) {
  video.src = 'fallbackvideo.webm';
}

function success(stream) {
  video.src = window.URL.createObjectURL(stream);
}

if (!navigator.getUserMedia) {
  fallback();
} else {
  navigator.getUserMedia({video: true}, success, fallback);
}

Basic demo

Taking screenshots

The <canvas> API's ctx.drawImage(video, 0, 0) method makes it trivial to draw <video> frames to <canvas>. Of course, now that we have video input via getUserMedia(), it's just as easy to create a photo booth application with realtime video:

<video autoplay></video>
<img src="">
<canvas style="display:none;"></canvas>

var video = document.querySelector('video');
var canvas = document.querySelector('canvas');
var ctx = canvas.getContext('2d');
var localMediaStream = null;

function snapshot() {
  if (localMediaStream) {
    ctx.drawImage(video, 0, 0);
    // "image/webp" works in Chrome 18. In other browsers, this will fall back to image/png.
    document.querySelector('img').src = canvas.toDataURL('image/webp');
  }
}

video.addEventListener('click', snapshot, false);

// Not showing vendor prefixes or code that works cross-browser.
navigator.getUserMedia({video: true}, function(stream) {
  video.src = window.URL.createObjectURL(stream);
  localMediaStream = stream;
}, onFailSoHard);
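
One detail to watch: a <canvas> defaults to 300x150 pixels, so drawing a larger frame with drawImage(video, 0, 0) crops it. A possible variation, sketched below, sizes the canvas from the video's videoWidth and videoHeight before drawing. The helper name captureFrame is made up for this example.

```javascript
// Draws the current video frame into the canvas at the video's native size
// and returns it as a data URL, or null if no frame is available yet.
// captureFrame is an illustrative helper, not a browser API.
function captureFrame(video, canvas, mimeType) {
  if (!video.videoWidth || !video.videoHeight) {
    return null; // Metadata not loaded yet, so there is no frame to draw.
  }
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);
  return canvas.toDataURL(mimeType || 'image/png');
}

// Browser usage, e.g. inside snapshot():
// document.querySelector('img').src = captureFrame(video, canvas, 'image/webp');
```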

Applying Effects

CSS Filters

CSS filters are currently supported in WebKit nightlies and Chrome 18+.

Using CSS Filters, we can apply some gnarly effects to the <video> as it is captured:

<style>
video {
  width: 307px;
  height: 250px;
  background: rgba(255,255,255,0.5);
  border: 1px solid #ccc;
}
.grayscale {
  +filter: grayscale(1);
}
.sepia {
  +filter: sepia(1);
}
.blur {
  /* "+filter" is a Sass mixin include: vendor prefixes are required (try Compass/Sass). */
  +filter: blur(3px);
}
...
</style>

<video autoplay></video>

<script>
var idx = 0;
var filters = ['grayscale', 'sepia', 'blur', 'brightness', 'contrast', 'hue-rotate',
               'hue-rotate2', 'hue-rotate3', 'saturate', 'invert', ''];

function changeFilter(e) {
  var el = e.target;
  el.className = '';
  var effect = filters[idx++ % filters.length]; // loop through filters.
  if (effect) {
    el.classList.add(effect);
  }
}

document.querySelector('video').addEventListener('click', changeFilter, false);
</script>

Click the video to cycle through CSS filters

WebGL Textures

One amazing use case for video capture is to render live input as a WebGL texture. Since I know absolutely nothing about WebGL (other than it's sweet), I'm going to suggest you give Jerome Etienne's tutorial and demo a look. It talks about how to use getUserMedia() and Three.js to render live video into WebGL.

Using getUserMedia with the Web Audio API

One of my dreams is to build AutoTune in the browser with nothing more than open web technology! We're actually not too far from that reality.

As of Chrome 24, you can enable the "Web Audio Input" flag in about:flags to experiment with getUserMedia() + the Web Audio API for realtime effects. Integrating the two is still a work in progress (crbug.com/112404), but the current implementation works pretty well.

Piping microphone input to the Web Audio API looks like this:

var context = new window.webkitAudioContext();

navigator.webkitGetUserMedia({audio: true}, function(stream) {
  var microphone = context.createMediaStreamSource(stream);
  var filter = context.createBiquadFilter();

  // microphone -> filter -> destination.
  microphone.connect(filter);
  filter.connect(context.destination);
}, onFailSoHard);
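
The filter above is left at its default settings. As a rough sketch, the same chain can be wrapped in a function that also configures the filter as a low-pass with a chosen cutoff. The name connectMicrophone and the 1000 Hz default are assumptions for illustration. Current browsers take the filter type as a string such as 'lowpass'; early prefixed builds may have expected a numeric constant instead.

```javascript
// Builds microphone -> low-pass filter -> destination on an AudioContext-like
// object and returns the created nodes.
// connectMicrophone and the default cutoff are illustrative choices.
function connectMicrophone(context, stream, cutoffHz) {
  var microphone = context.createMediaStreamSource(stream);
  var filter = context.createBiquadFilter();

  filter.type = 'lowpass';                  // string form used by current browsers
  filter.frequency.value = cutoffHz || 1000; // cutoff frequency in Hz

  microphone.connect(filter);
  filter.connect(context.destination);
  return {microphone: microphone, filter: filter};
}

// Browser usage, inside a getUserMedia({audio: true}, ...) success callback:
// var AudioCtx = window.AudioContext || window.webkitAudioContext;
// connectMicrophone(new AudioCtx(), stream, 800);
```

Returning the nodes makes it possible to tweak filter.frequency.value later, for example from a slider.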

Demos:

  • Live Input Visualizer
  • Audio Recorder
  • Pitch Detector

For more information, see Chris Wilson's post.

Conclusion

In general, device access on the web has been a tough cookie to crack. Many people have tried, few have succeeded. Most of the early ideas have never taken hold outside of a proprietary environment, nor have they gained widespread adoption.

The real problem is that the web's security model is very different from the native world. For example, I probably don't want every Joe Shmoe web site to have random access to my video camera. It's a tough problem to get right.

Bridging frameworks like PhoneGap have helped push the boundary, but they're only a start and a temporary solution to an underlying problem. To make web apps competitive with their desktop counterparts, we need access to native devices.

getUserMedia() is but the first wave of access to new types of devices. I hope we'll continue to see more in the very near future!

Additional resources

  • W3C specification
  • Bruce Lawson's HTML5Doctor article
  • Bruce Lawson's dev.opera.com article

Demos

  • Live Photo booth
  • Instant Camera
  • Exploding Video
  • Paul Neave's WebGL Camera Effects
  • Snapster
  • Live video in WebGL
  • Play Xylophone with your hands
