Webcams have become nearly ubiquitous peripherals for today’s PCs. They’re inexpensive and produce video imagery of varying quality—often dependent on how much the webcam cost or how fast your computer runs. Furthermore, you don’t have to do much to set up a webcam: just plug it into an available port, and you’re ready to go. Windows XP provides desktop-level access to webcams (and all digital cameras attached to the computer) through Windows Explorer; here you can take single frames of video and save them to disk as photographs.
Webcams are used everywhere; in this example, the webcam’s video is captured and saved to a file.
A webcam appears as a video input source filter in DirectShow. Like the audio input source filters we covered in the last chapter, the webcam produces a stream of data that can then be put through a DirectShow filter graph. Some webcams—particularly older models—capture only video, relying on a separate microphone to capture audio input. Many recent versions of webcams, including the Logitech webcam I have on my own system, bundle an onboard microphone into the webcam, and this microphone constitutes its own audio capture source filter.
In DirectShow, a webcam appears as a video input source filter that produces a data stream fed into a DirectShow filter graph. Older webcams capture only video, while most current models capture both audio and video.
For this reason, DirectShow applications will frequently treat webcams as if they are two independent devices with entirely separate video and audio capture components. That treatment is significantly different from a digital camcorder, which provides a single stream of multiplexed audio and video data; that single stream of digicam data is demultiplexed into separate streams by DirectShow filters. Although this treatment makes webcams a tiny bit more complicated to work with, their ubiquity more than makes up for any programming hoops you’ll have to jump through.
For this reason, DirectShow applications usually treat a webcam as two independent devices, one video capture component and one audio capture component. This differs from a digital camcorder, which delivers a single multiplexed stream of audio and video that DirectShow filters must demultiplex. The approach makes webcams slightly more complicated to work with, but their ubiquity makes up for it.
Introducing DSWebcamCap
The DirectShow application DSWebcamCap captures separate video and audio inputs from a Logitech webcam to a file. Although a specific brand of webcam is explicitly specified in the code of DSWebcamCap, it is very easy to modify a few string constants and have the program work with any version of webcam. The application takes the combined streams and puts them into a filter known as the WM ASF Writer. This filter compresses audio and video streams using the powerful Windows Media Video codecs. The encoded content is put into an ASF file, which, according to the Microsoft guidelines, should have either a .WMA or .WMV file extension. (In this case, we’re creating a file with the .ASF extension, which stands for Advanced Systems Format. This file will be functionally identical to a file with a .WMV extension. ASF is a container format, a file type that can contain any of a number of stream formats, including both WMV and WMA, so it’s well suited to this task.) The WM ASF Writer allows you to select a variety of compression levels for the output file, so you can create very small files from a rich video source.
The DirectShow application DSWebcamCap captures separate video and audio inputs from a Logitech webcam to a file; changing a few string constants adapts it to other webcams. The combined streams are fed to the WM ASF Writer filter, which compresses the audio and video with the Windows Media codecs and writes the encoded data to an ASF file, normally given a .WMA or .WMV extension. The WM ASF Writer lets you choose among several compression levels for the output file.
Examining main
As in our earlier examples, nearly all the application’s work is performed inside its main function, as the following code shows:
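In outline, and with error handling omitted, main looks something like the sketch below. The output file name, the “Logitech” match strings, and the exact signatures of the helper routines GetAudioInputFilter, GetVideoInputFilter, and ShowFilterPropertyPages (all discussed later in this chapter) are assumptions made for this sketch.

#include <dshow.h>   // DirectShow; link with strmiids.lib and ole32.lib
#include <stdio.h>

// Helper routines discussed below; their signatures are assumed here.
IBaseFilter *GetAudioInputFilter(const wchar_t *friendlyName);
IBaseFilter *GetVideoInputFilter(const wchar_t *friendlyName);
HRESULT ShowFilterPropertyPages(IBaseFilter *pFilter);

int main()
{
    CoInitialize(NULL);

    // Create the Capture Graph Builder rather than the Filter Graph Manager.
    ICaptureGraphBuilder2 *pCaptureGraph = NULL;
    CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC_SERVER,
                     IID_ICaptureGraphBuilder2, (void **)&pCaptureGraph);

    // Create the WM ASF Writer and point it at the output file.
    IBaseFilter *pASFWriter = NULL;
    IFileSinkFilter *pSink = NULL;
    pCaptureGraph->SetOutputFileName(&MEDIASUBTYPE_Asf, L"webcam.asf",
                                     &pASFWriter, &pSink);

    // Let the user adjust the WM ASF Writer's compression settings.
    ShowFilterPropertyPages(pASFWriter);

    // Get the Filter Graph Manager the builder created, and its IMediaControl.
    IGraphBuilder *pGraph = NULL;
    pCaptureGraph->GetFilterGraph(&pGraph);
    IMediaControl *pControl = NULL;
    pGraph->QueryInterface(IID_IMediaControl, (void **)&pControl);

    // Add the webcam's audio and video capture source filters to the graph.
    IBaseFilter *pAudioInput = GetAudioInputFilter(L"Logitech");
    pGraph->AddFilter(pAudioInput, L"Webcam Audio Capture");
    IBaseFilter *pVideoInput = GetVideoInputFilter(L"Logitech");
    pGraph->AddFilter(pVideoInput, L"Webcam Video Capture");

    // Render the preview stream to an on-screen monitor window...
    pCaptureGraph->RenderStream(&PIN_CATEGORY_PREVIEW, &MEDIATYPE_Video,
                                pVideoInput, NULL, NULL);
    // ...and the capture streams to the WM ASF Writer.
    pCaptureGraph->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,
                                pVideoInput, NULL, pASFWriter);
    pCaptureGraph->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Audio,
                                pAudioInput, NULL, pASFWriter);

    // Run the graph until the user presses Enter, then stop.
    pControl->Run();
    printf("Capturing... press Enter to stop.\n");
    getchar();
    pControl->Stop();

    // Release everything and shut down COM.
    pControl->Release();
    pGraph->Release();
    pVideoInput->Release();
    pAudioInput->Release();
    pSink->Release();
    pASFWriter->Release();
    pCaptureGraph->Release();
    CoUninitialize();
    return 0;
}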
After COM has been initialized, we find ourselves in unfamiliar territory. Instead of instantiating a Filter Graph Manager object, we use CoCreateInstance to create an instance of the Capture Graph Builder object. This object is specifically designed to assist in the construction of filter graphs that capture audio and video input to files, but it can also be used to construct other types of graphs. As soon as the object has been created, we invoke its ICaptureGraphBuilder2::SetOutputFileName method. This method takes a pointer to a media type—in this case, MEDIASUBTYPE_Asf, indicating an ASF file—and creates an output file with the name passed as the next parameter.
After COM is initialized, instead of instantiating a Filter Graph Manager object as usual, we call CoCreateInstance to create a Capture Graph Builder object. This object is designed to help build filter graphs that capture audio and video to a file, though it can build other kinds of graphs as well. Once it has been created, we call its ICaptureGraphBuilder2::SetOutputFileName method, which takes a pointer to a media type (MEDIASUBTYPE_Asf here, indicating an ASF file) and creates an output file with the name passed in the next parameter.
The call to SetOutputFileName creates an instance of a WM ASF Writer filter. A pointer to the filter’s IBaseFilter interface is returned in pASFWriter, and a pointer to the filter’s IFileSinkFilter interface is returned in pSink. We’ll need a pointer to the WM ASF Writer filter, and you might need the IFileSinkFilter interface. For example, if you want to change the name of the output file to some value other than that passed in the call to SetOutputFileName, you need IFileSinkFilter.
The call to SetOutputFileName creates an instance of the WM ASF Writer filter; pASFWriter receives the filter’s IBaseFilter interface and pSink receives its IFileSinkFilter interface. The WM ASF Writer pointer is required, and IFileSinkFilter can also be useful: for example, to change the output file to a name other than the one passed to SetOutputFileName, you use IFileSinkFilter.
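For instance, a minimal sketch of that rename through the pSink pointer returned above; the path shown is only a placeholder:

// Point the WM ASF Writer at a different output file than the one given to
// SetOutputFileName. Passing NULL for the media type keeps the current type.
HRESULT hr = pSink->SetFileName(L"C:\\Captures\\webcam2.asf", NULL);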
If we’d wanted to create an AVI file—as we will in the next chapter when we’re dealing with digital video from a camcorder—we would have passed a pointer to MEDIASUBTYPE_Avi as the first parameter. In that case, instead of returning a pointer to a WM ASF Writer filter, the call would have returned a pointer to an AVI Mux filter, which combines the audio and video streams into an AVI-formatted stream. You’d still need to write that stream to a file by using the File Writer renderer filter, whereas the WM ASF Writer combines both multiplexing and file writing features in a single renderer filter.
To create an AVI file instead, as we will in the next chapter when working with digital video from a camcorder, we would pass a pointer to MEDIASUBTYPE_Avi as the first parameter. The call then returns a pointer to an AVI Mux filter, which combines the audio and video streams into an AVI-formatted stream; that stream must still be written to disk with the File Writer renderer filter, whereas the WM ASF Writer combines multiplexing and file writing in a single renderer filter.
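A sketch of that alternative, reusing the names from the main sketch above; SetOutputFileName now returns the AVI Mux, and the Capture Graph Builder connects a File Writer behind it:

// Create an AVI output file instead of an ASF file.
IBaseFilter *pAviMux = NULL;
IFileSinkFilter *pAviSink = NULL;
HRESULT hr = pCaptureGraph->SetOutputFileName(&MEDIASUBTYPE_Avi, L"webcam.avi",
                                              &pAviMux, &pAviSink);

// With no compressor in the middle, the capture stream is written uncompressed.
hr = pCaptureGraph->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,
                                 pVideoInput, NULL, pAviMux);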
When Should You Compress?
When capturing video, in some scenarios it is preferable to simply save the data without compressing it. Most webcams output some sort of YUV video, although some output RGB, which requires significantly more bandwidth and hard drive space. Depending on the type of video, the size of the video frames, and the speed of your CPU, a compressor might not be able to keep up with the incoming video stream. If you save the uncompressed data in an AVI file by using the AVI Mux and the File Writer, you can come back and compress it later. When the graph’s source data is coming from a file, the compressor (in this case the WM ASF Writer) controls the speed at which the data is processed. In some cases, this might be faster than real time, and in some cases, it might be slower. Either way, no frames will be dropped because the source filter is reading from a local file.
When capturing video, it is sometimes better to store the data uncompressed. Most webcams output some form of YUV video (some output RGB, which needs far more bandwidth and disk space); whether to compress during capture depends on the video format, the frame size, and the speed of the CPU, and uncompressed AVI data can always be compressed later without dropping frames.
Examining and Changing a Filter’s Property Pages
The WM ASF Writer created by the call to SetOutputFileName has a wide range of options that control how it uses the Windows Media compression algorithms. A file can be highly compressed, which saves space but sacrifices quality, or it can be relatively expansive, consuming valuable hard disk resources but requiring fewer CPU cycles to process. These values can be set internally (which we’ll cover in Chapter 15), or they can be presented to the user in a property page. Many DirectShow filters implement property pages, which are dialog boxes that expose the inner settings of a filter. The ShowFilterPropertyPages function presents the property pages of a filter.
The WM ASF Writer created by SetOutputFileName has many options that control how the Windows Media compression algorithms are applied. These values can be set programmatically (covered in Chapter 15) or presented to the user in a property page; many DirectShow filters implement property pages, and the ShowFilterPropertyPages function displays them.
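A sketch of ShowFilterPropertyPages, following the standard DirectShow pattern described in the next paragraph; it needs <olectl.h> in addition to <dshow.h> and links with oleaut32.lib for OleCreatePropertyFrame:

HRESULT ShowFilterPropertyPages(IBaseFilter *pFilter)
{
    ISpecifyPropertyPages *pProp = NULL;
    HRESULT hr = pFilter->QueryInterface(IID_ISpecifyPropertyPages,
                                         (void **)&pProp);
    if (SUCCEEDED(hr))
    {
        // Get the filter's name and an IUnknown pointer for it.
        FILTER_INFO FilterInfo;
        pFilter->QueryFilterInfo(&FilterInfo);
        IUnknown *pFilterUnk = NULL;
        pFilter->QueryInterface(IID_IUnknown, (void **)&pFilterUnk);

        // Ask the filter for its property page CLSIDs and show the dialog box.
        CAUUID caGUID;
        pProp->GetPages(&caGUID);
        pProp->Release();
        OleCreatePropertyFrame(
            NULL,                // Parent window
            0, 0,                // Reserved
            FilterInfo.achName,  // Dialog box caption: the filter's name
            1,                   // Number of objects (just this filter)
            &pFilterUnk,         // Array of object pointers
            caGUID.cElems,       // Number of property pages
            caGUID.pElems,       // Array of property page CLSIDs
            0,                   // Locale identifier
            0, NULL);            // Reserved

        // Clean up.
        pFilterUnk->Release();
        if (FilterInfo.pGraph) FilterInfo.pGraph->Release();
        CoTaskMemFree(caGUID.pElems);
    }
    return hr;
}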
The ShowFilterPropertyPages function queries for the ISpecifyPropertyPages interface of the IBaseFilter passed by the caller and then gets the name of the filter. It also creates an interface to an IUnknown object, which exposes all the interfaces of a given object. That information is passed in a call to OleCreatePropertyFrame, which manages the property pages dialog box for user input. When this function is executed, you should see a property page dialog box that looks something like the one shown in Figure 5-1.
The ShowFilterPropertyPages function queries the ISpecifyPropertyPages interface of the IBaseFilter passed in, then gets the filter’s name and an IUnknown pointer for the filter. This information is passed to OleCreatePropertyFrame, which manages the property page dialog box for user input; when the function runs, a property page like the one in Figure 5-1 appears.
Figure 5-1. Property pages dialog box for the WM ASF Writer
The operating system handles the details of user input, and when the user closes the dialog box, control returns to the function, which cleans itself up and exits.
The operating system handles the details of user input; when the user closes the dialog box, control returns to the function, which cleans up and exits.
Some, but not all, DirectShow filters have a property page associated with them, but it’s not a good idea to show the property page to the user. These property pages are designed for testing by the programmer, so they don’t meet Microsoft’s user interface guidelines or accessibility requirements; they could potentially confuse the user. In this case, however, the property page serves a useful purpose for testing: it allows you to adjust file compression parameters to suit your needs.
Property pages like these are designed for the programmer’s testing; they can confuse end users, so they are generally not something to show in a finished application.
It’s not necessary to use property pages to change the settings on the WM ASF Writer, and, in fact, you cannot use the filter’s default property page to enable the latest Windows Media 9 Series Audio and Video codecs. But there is a set of COM interfaces that allow you to adjust all the features available to the Windows Media encoder and to make your own property page for end users; these interfaces will be covered in detail in Chapter 15.
Using a property page to change the WM ASF Writer’s settings is not required, and in fact the filter’s default property page cannot enable the newest Windows Media 9 Series audio and video codecs; a set of COM interfaces, covered in Chapter 15, gives access to all the encoder’s features.
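As a taste of that approach, here is a minimal sketch that configures the WM ASF Writer through its IConfigAsfWriter interface (declared in dshowasf.h). The helper name ConfigureAsfWriter and the particular Windows Media system profile chosen are illustrative assumptions only:

#include <initguid.h>   // Define (not just declare) the profile GUIDs below
#include <dshowasf.h>   // IConfigAsfWriter
#include <wmsysprf.h>   // Windows Media system profile GUIDs

// Configure the WM ASF Writer programmatically instead of via a property page.
HRESULT ConfigureAsfWriter(IBaseFilter *pASFWriter)
{
    IConfigAsfWriter *pConfig = NULL;
    HRESULT hr = pASFWriter->QueryInterface(IID_IConfigAsfWriter,
                                            (void **)&pConfig);
    if (SUCCEEDED(hr))
    {
        // Pick a stock Windows Media profile (an example choice, not a requirement).
        hr = pConfig->ConfigureFilterUsingProfileGuid(WMProfile_V80_256Video);
        pConfig->Release();
    }
    return hr;
}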
Working with the Filter Graph Manager of a Capture Graph
After using the Capture Graph Builder to create the filter graph, we must obtain the IMediaControl interface on the Filter Graph Manager to run, stop, and pause the graph. By calling ICaptureGraphBuilder2::GetFilterGraph, we obtain a pointer to the Filter Graph Manager’s IGraphBuilder interface. We can then use that interface to obtain the IMediaControl interface.
After using the Capture Graph Builder to create the filter graph, we need the Filter Graph Manager’s IMediaControl interface to run, stop, and pause the graph. Calling ICaptureGraphBuilder2::GetFilterGraph returns a pointer to the Filter Graph Manager’s IGraphBuilder interface, and from that interface we obtain IMediaControl.
The EnumerateAudioInputFilters function used in DSAudioCap has been reworked slightly to create GetAudioInputFilter, which returns an audio input filter with a specific FriendlyName, if it exists. The function is invoked with a request for a filter with the “Logitech” label—a microphone built into the webcam. Changing the value of this string returns a different audio input filter if a match can be found. You’d change this string if you were using another brand of webcam in your own DirectShow programs. (Ideally, you’d write a program that wouldn’t be tied to any particular brand of hardware, allowing the user to select from a list of hardware that had been recognized by DirectShow as being attached to the computer.) Once the appropriate filter is located, it’s added to the filter graph using the IGraphBuilder method AddFilter.
The EnumerateAudioInputFilters function from DSAudioCap has been reworked into GetAudioInputFilter, which returns the audio input filter whose FriendlyName matches a given string, here “Logitech”, the webcam’s built-in microphone; the matching filter is then added to the graph with IGraphBuilder::AddFilter.
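A sketch of what GetAudioInputFilter might look like, built on the standard system device enumerator pattern; matching the FriendlyName by substring is an assumption of this sketch, and error handling is trimmed:

// Return the first audio input device whose FriendlyName contains matchName
// (for example L"Logitech"), or NULL if no device matches.
// Needs <dshow.h> and <wchar.h> (for wcsstr).
IBaseFilter *GetAudioInputFilter(const wchar_t *matchName)
{
    IBaseFilter *pFilter = NULL;
    ICreateDevEnum *pDevEnum = NULL;
    IEnumMoniker *pEnum = NULL;

    CoCreateInstance(CLSID_SystemDeviceEnum, NULL, CLSCTX_INPROC_SERVER,
                     IID_ICreateDevEnum, (void **)&pDevEnum);
    pDevEnum->CreateClassEnumerator(CLSID_AudioInputDeviceCategory, &pEnum, 0);
    if (pEnum == NULL)      // No devices registered in this category.
    {
        pDevEnum->Release();
        return NULL;
    }

    IMoniker *pMoniker = NULL;
    while (pFilter == NULL && pEnum->Next(1, &pMoniker, NULL) == S_OK)
    {
        IPropertyBag *pBag = NULL;
        if (SUCCEEDED(pMoniker->BindToStorage(0, 0, IID_IPropertyBag,
                                              (void **)&pBag)))
        {
            VARIANT var;
            VariantInit(&var);
            if (SUCCEEDED(pBag->Read(L"FriendlyName", &var, 0)) &&
                wcsstr(var.bstrVal, matchName) != NULL)
            {
                // Found a match: instantiate the capture filter from its moniker.
                pMoniker->BindToObject(0, 0, IID_IBaseFilter, (void **)&pFilter);
            }
            VariantClear(&var);
            pBag->Release();
        }
        pMoniker->Release();
    }
    pEnum->Release();
    pDevEnum->Release();
    return pFilter;
}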
Next we do much the same thing with video input devices. We’ve created the GetVideoInputFilter routine, which is nearly a carbon copy of GetAudioInputFilter, except that the GUID for the class enumerator provided in the call to CreateClassEnumerator is CLSID_VideoInputDeviceCategory rather than CLSID_AudioInputDeviceCategory. Once again, the function walks through the list of available capture devices, looking for one that has a FriendlyName that will match the one I’m looking for, which is labeled “Logitech.” An instance of the matching video input filter is returned to the caller, and it’s then added to the filter graph. (Monikers are a fast, cheap way to find a particular filter because DirectShow doesn’t need to create instances of each filter object to determine its FriendlyName or other properties.)
GetVideoInputFilter works the same way, except that the category GUID passed to CreateClassEnumerator is CLSID_VideoInputDeviceCategory; it walks the available capture devices, finds the one with the desired FriendlyName, and the matching video input filter is added to the filter graph.
Building the DSWebcamCap Filter Graph
Next we need to add a video renderer to the filter graph so that we can watch the video at the same time we are saving it to disk. To do so, invoke the ICaptureGraphBuilder2 method RenderStream to connect the video renderer to the appropriate pins in the filter graph. RenderStream handles all the filter creation and connection necessary to produce a path from an input stream to a particular renderer. In this case, we want to render a preview (monitor) to the display, so we pass a pointer to PIN_CATEGORY_PREVIEW as the first value, followed by MEDIATYPE_Video, indicating that we’re interested in a video stream. Then we pass two NULL values. The first NULL value tells RenderStream that no intermediate filter needs to be connected in the filter graph. If there is a filter you want to place into the path of the renderer (such as the Smart Tee filter, an encoder, or some other transform filter), supply a pointer to that filter as this argument. The second NULL value allows the RenderStream method to create a monitor window for the video stream, so we can see the video capture on screen while it is being written to file. RenderStream will determine the best path between the input filter and the renderer filter.
To watch the video while it is being written to a file, we add a video renderer to the filter graph by calling ICaptureGraphBuilder2::RenderStream, which creates and connects all the filters needed to route an input stream to a renderer. For the on-screen preview, we pass PIN_CATEGORY_PREVIEW first, then MEDIATYPE_Video to indicate the video stream, then two NULL values: the first means no intermediate filter is to be inserted (supply a filter pointer here if you want one in the path), and the second lets RenderStream create a monitor window so the video can be watched while it is captured. RenderStream determines the best path between the input filter and the renderer.
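The preview call, annotated parameter by parameter (variable names taken from the main sketch earlier):

// Render the webcam's preview stream to an on-screen video window.
HRESULT hr = pCaptureGraph->RenderStream(
        &PIN_CATEGORY_PREVIEW,  // Pin category: preview, not capture
        &MEDIATYPE_Video,       // Major type: we want the video stream
        pVideoInput,            // Source: the webcam's video capture filter
        NULL,                   // No intermediate (compressor/transform) filter
        NULL);                  // No sink given: DirectShow creates a video renderer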
Frequently, calls to RenderStream will result in the addition of transform filters known as Smart Tees to the filter graph. A Smart Tee takes a single stream and converts it into two identical, synchronized streams. One of these streams is designated as the capture stream and presented on the filter’s Capture pin, and the other stream is designated as a preview stream and presented on the filter’s Preview pin.
Calls to RenderStream often cause a Smart Tee transform filter to be added to the graph. A Smart Tee takes one stream and turns it into two identical, synchronized streams: the capture stream, presented on its Capture pin, and the preview stream, presented on its Preview pin.
What makes the Smart Tee so smart? The Smart Tee understands priorities, and it will never cause the capture stream to lose frames. If the CPU starts getting overloaded with video processing, the Smart Tee will detect the overloading and drop frames from the preview display, sacrificing the preview in favor of capture quality. Using the Smart Tee, you might see a jerky preview, but the captured file should be perfect.
What makes the Smart Tee smart is that it understands priorities and never drops frames from the capture stream. If the CPU becomes overloaded with video processing, the Smart Tee detects it and drops preview frames instead, sacrificing the preview in favor of capture quality; the preview may look jerky, but the captured file stays intact.
To complete the construction of the filter graph, we make two more calls to RenderStream. The first call connects the video capture filter to the WM ASF Writer that’s been created by the ICaptureGraphBuilder2 object. In this case, PIN_CATEGORY_CAPTURE is passed as the first parameter of RenderStream; this parameter tells RenderStream that it should connect the capture stream rather than the preview stream. The Smart Tee filter will be added if needed. After the first call to RenderStream is completed, another call is made to RenderStream, with parameters to specify that the audio stream should be connected to the WM ASF Writer.
Two more calls to RenderStream complete the graph. The first connects the video capture filter to the WM ASF Writer created by the ICaptureGraphBuilder2 object, passing PIN_CATEGORY_CAPTURE as the first parameter so that the capture stream, not the preview stream, is connected (a Smart Tee filter is added if needed). The second call connects the audio stream to the WM ASF Writer.
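The two capture-side calls, again using the names from the main sketch:

// Route the video capture stream into the WM ASF Writer...
HRESULT hr = pCaptureGraph->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,
                                         pVideoInput, NULL, pASFWriter);
// ...and do the same for the webcam's audio capture stream.
hr = pCaptureGraph->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Audio,
                                 pAudioInput, NULL, pASFWriter);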
That concludes the construction of the filter graph. Because we’ve used the Capture Graph Builder to handle the details of wiring the filters together, it’s taken only a few lines of code to create a functional filter graph. This filter graph will capture the webcam’s video and audio streams to a file while providing a video monitor window on the display. When the application invokes the Filter Graph Manager’s Run method, a small preview window will open on the display, showing the real-time output of the webcam, even as the webcam’s video stream is written to a disk file.
Summary
In this chapter, we covered video capture with webcams—inexpensive, low-resolution video cameras, which have become nearly ubiquitous PC peripherals. Despite their low-quality imagery, webcams provide an important portal between the real world and the PC. They can turn any PC into a moviemaking workstation and they can send your image around the world in just a few clicks—or a few lines of DirectShow code.
This chapter covered video capture with webcams; although their image quality is modest, they are used everywhere, and DirectShow needs only a few lines of code to capture their output.
Although DSWebcamCap seems very simple, it is the foundation of everything you’d need to add video capture and video e-mail capability to any Internet-based application. All the essentials—video and audio capture, video monitoring, and data compression—are included in this demonstration program. Although the program lacks the nicety of a user interface, line for line it’s an incredibly powerful multimedia authoring application, and it could easily be integrated within any application, including e-mail, a Web browser, a peer-to-peer file sharing program, and so on. DSWebcamCap is a good example of the power that DirectShow brings to the application programmer; many tens of thousands of lines of code are “under the hood,” hidden away in DirectShow filters that we need know nothing about to use them.
Although DSWebcamCap looks simple, it is the foundation for any application that needs video capture, covering the essentials of video and audio capture, video monitoring, and data compression.
Webcams are the low-tech end in the world of digital video. They’re commonplace but not very powerful. Digital video cameras are the high-end hardware of this world, and it’s now time to examine how to use DirectShow to control these increasingly popular devices.