The next four chapters of this book deal with Microsoft DirectShow applications that capture media streams for manipulation by DirectShow. In most of the examples to follow, the streams will simply be written to disk for later playback or manipulation by another DirectShow application. The filter graphs we’ll build will be relatively simple, consisting of a capture source filter (a special class of source filter that talks directly to capture hardware on the computer), one or two transform filters, and a renderer filter, which will write the stream to a file.
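The next four chapters discuss applications that use DirectShow to capture media streams. To keep the filter graphs relatively simple, the sample programs in these chapters all write the stream to a file for later playback or for processing by other DirectShow applications. Each filter graph consists only of a capture source filter, one or two transform filters, and a renderer filter (which writes the stream to a file).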
Although building filter graphs to capture streams is a straightforward affair, at the code level a number of issues need to be addressed to successfully create a filter graph. Starting with a very simple filter graph, which captures an incoming audio source as an AVI file (containing sound but no video), we’ll progress through all the programming tricks you’ll need to employ to successfully build DirectShow applications capable of capturing any media stream coming into your computer.
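Although creating a filter graph that captures a stream is a simple task, some issues must be resolved at the code level before such a graph can be built successfully. We start with a very simple filter graph that captures an incoming audio source and writes it to an AVI file (audio only, no video). The programming details are worked through one by one below.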
Capturing Audio with DSAudioCap
DSAudioCap is a console-based application that will record audio to a file (MyAVIFile.AVI) for as long as it runs. As a console application, it sends diagnostic information to the text-mode window it opens on the display, which is useful for debugging. This window also prompts the user to press the Enter key to stop recording, at which point the filter graph is stopped and the application terminates.
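DSAudioCap is a console application that records audio to a file (MyAVIFile.AVI) while it runs. As a console application, it sends diagnostic information to a text-mode window, which helps with debugging. The same window accepts the user’s keypress to stop recording, at which point the filter graph stops and the application exits.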
Examining the main Function
Let’s begin our analysis of DSAudioCap by taking a look at the main function. As in the earlier examples of DSBuild and DSRender, main opens with the instantiating of the Filter Graph Manager object. This object is followed by the acquisition of the IMediaControl interface to the Filter Graph Manager, which controls execution of the filter graph.
The main function creates an instance of the Filter Graph Manager object and then requests an interface of type IMediaControl, pControl, which it uses to control execution of the filter graph.
Enumerating System Devices in DirectShow
At this point in the function, a call is made to EnumerateAudioInputFilters, which is local to the application. This function is required because before a capture source filter can be added to the filter graph, it must be identified and selected. Unlike other DirectShow filters (which can be identified by a GUID), capture source filters and other filters that are tied to hardware devices are identified as a class of device. Using DirectShow calls, you can walk through the list of all filters in this class and select the one you want to use for your capture application.
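Within main, a call is made to EnumerateAudioInputFilters, a function local to the application that identifies and selects the input device before the capture source filter is added to the filter graph. Unlike other DirectShow filters (all of which can be identified by a GUID), capture source filters and other hardware-related filters are identified by device class. Using DirectShow calls, a program can walk the list of all filters in the class and pick one to use.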
Being able to enumerate capture filters gives you ultimate flexibility in the design of your DirectShow application because all capture resources are available to the application, not just those that have been designated as default devices by the user or the system. This functionality also means that DirectShow remains open to new hardware devices as they’re introduced by manufacturers—you won’t need a revision of DirectShow to deal with every new card or interface that comes down the path. Some WDM drivers installed in conjunction with the installation of multimedia devices are recognized by DirectShow and added to its enumerated lists of available filters. Figure 4-1 shows the Insert Filters dialog box listing audio capture devices.
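Because capture filters can be enumerated, every capture source is available to the application rather than only the default devices chosen by the user or the system, which gives DirectShow application design a great deal of flexibility. It also means that DirectShow remains open to new hardware devices as manufacturers introduce them, with no need to revise DirectShow for every new capture card or interface. Some devices with WDM drivers installed are likewise recognized by DirectShow and added to its enumerated lists. Figure 4-1 shows the Insert Filters dialog box listing all audio capture devices.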
On the other hand, now is a good time to ask yourself an important question: are you relying on specific hardware components in your DirectShow application? You could specify which capture source filter will be used in your application, but if that source filter isn’t on the user’s system (because it relies on hardware the user hasn’t installed), your application will fail. It’s best to design a DirectShow application for the widest possible range of devices to ensure that a user won’t encounter any unexpected failures when he or she runs your code.
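There is also a key question to answer now: does your DirectShow application depend on a specific hardware component? You can specify which capture source filter the application uses, but if that source filter is not on the user’s system (because it depends on hardware the user has not installed), the application will fail. The best design supports the widest possible range of devices, so that users do not run into unexpected failures.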
Figure 4-1. GraphEdit enumerating audio capture devices in the Insert Filters dialog box
Here’s the source code for EnumerateAudioInputFilters:
This function begins by creating an instance of a COM object known as the System Device Enumerator. This object will enumerate all the hardware devices of a specified type once its CreateClassEnumerator method is invoked with a GUID indicating the class of devices to be enumerated. In this case, the GUID CLSID_AudioInputDeviceCategory is requesting an enumeration of all the audio capture source filters on the system, but it could also be CLSID_VideoInputDeviceCategory for video capture source filters, or even CLSID_AudioCompressorCategory for all audio compression filters. (Although audio compression isn’t generally performed in hardware, each compressor is considered a system device and is treated as if it were hardware rather than software.) The resulting list of devices is placed into an object of the IEnumMoniker class. An enumerated class (which generally includes the letters Enum in its class name) can be examined, element by element, by invoking its Next method. In this case, invoking Next will return an IMoniker object, which is a lightweight COM object used to obtain information about other objects without having to instantiate them. The IMoniker object returns references to a ‘bag’ of data related to the enumerated DirectShow filter, which is then placed into an IPropertyBag object with an invocation of the IMoniker method BindToStorage.
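The function begins by creating an instance of a COM object, the System Device Enumerator. Once its CreateClassEnumerator method has been called, the object enumerates all hardware devices in the given class. Here, the GUID CLSID_AudioInputDeviceCategory requests an enumeration of all audio capture source filters on the system; CLSID_VideoInputDeviceCategory would enumerate video capture source filters, and CLSID_AudioCompressorCategory would enumerate all audio compression filters. (Although audio compression is not usually implemented in hardware, each audio compressor is regarded as a system device and treated as if it were hardware.) The resulting device list is placed in pEnumCat, an object of the IEnumMoniker class. An enumerated class (which usually has Enum in its name) can be examined element by element through its Next method. In this example, calling Next returns an IMoniker object, pMoniker, a lightweight COM object used to obtain information about other objects without instantiating them. The IMoniker object returns a reference to a ‘bag’ of data about the enumerated filter, which is placed in an IPropertyBag object.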
Once the IPropertyBag object has been instantiated, it can be queried using its Read method for string data labeled as its FriendlyName, which is to say the English-readable name (rather than the GUID) for the filter. That string is then printed to the console. The string identifies the first filter in the enumerated list, which is also the default audio capture device, established by the user through his or her preferences in the Sound control panel. (It’s a bad idea to count on any particular device showing up first on the list of audio capture devices because, at any point, the user might go into the Sound control panel and muck things up.)
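Once the IPropertyBag object has been instantiated, it can be queried with its Read method (passing the string FriendlyName and receiving the result in varName), which returns the readable name of the filter. That string is then printed to the console. It identifies the first filter in the enumerated list, which is the default audio capture device (set by the user through the Sound control panel).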
Once the name of the device has been printed to the console, a call to the IMoniker method BindToObject creates an instance of the filter object, and this object is returned to the calling function. It will later need to be destroyed with a call to its Release method.
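Once the device name has been printed to the console, the IMoniker method BindToObject is called to create an instance of the filter object. The object is handed back to the caller through the gottaFilter parameter of EnumerateAudioInputFilters(void** gottaFilter) and must be released later.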
Although this function returns an object representing the first filter in the enumerated list—the default selection as specified in the control panel—with a few modifications, the function could return a list of objects and names. These objects and names could then be used to build menus or other GUI features that would allow the user to have complete control over which of possibly several audio capture sources would be used within a DirectShow application.
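Although this function returns an object representing the first filter in the enumerated list, with a few small changes it could return a list of objects and names. Those could be used to build menus or other GUI features that give the user full control over the choice.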
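The original listing for EnumerateAudioInputFilters is not reproduced here, so the following is only a minimal sketch of the enumeration steps described above, not the book’s code. The names FindDeviceFilter and NameMatches are hypothetical. The sketch assumes the Windows SDK headers (dshow.h), linking against strmiids.lib, ole32.lib, and oleaut32.lib, and that the caller has already initialized COM. An empty wanted name selects the first device in the category, which matches the behavior described for EnumerateAudioInputFilters.

```cpp
#include <string>
#ifdef _WIN32
#include <dshow.h>
#include <cwchar>
#endif

// Hypothetical helper: an empty `wanted` name means "accept the first device".
bool NameMatches(const std::wstring& friendlyName, const std::wstring& wanted)
{
    return wanted.empty() || friendlyName == wanted;
}

#ifdef _WIN32
// Hypothetical sketch: returns the first filter in `category` whose
// FriendlyName satisfies NameMatches. The caller must Release() *ppFilter.
HRESULT FindDeviceFilter(REFCLSID category, const std::wstring& wanted,
                         IBaseFilter** ppFilter)
{
    if (!ppFilter) return E_POINTER;
    *ppFilter = nullptr;

    // Create the System Device Enumerator.
    ICreateDevEnum* pSysDevEnum = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, nullptr,
                                  CLSCTX_INPROC_SERVER, IID_ICreateDevEnum,
                                  reinterpret_cast<void**>(&pSysDevEnum));
    if (FAILED(hr)) return hr;

    // Ask for an enumerator over one device category.
    IEnumMoniker* pEnumCat = nullptr;
    hr = pSysDevEnum->CreateClassEnumerator(category, &pEnumCat, 0);
    pSysDevEnum->Release();
    if (hr != S_OK) return E_FAIL;   // S_FALSE means the category is empty

    hr = E_FAIL;
    IMoniker* pMoniker = nullptr;
    while (hr != S_OK && pEnumCat->Next(1, &pMoniker, nullptr) == S_OK) {
        IPropertyBag* pPropBag = nullptr;
        if (SUCCEEDED(pMoniker->BindToStorage(nullptr, nullptr, IID_IPropertyBag,
                                              reinterpret_cast<void**>(&pPropBag)))) {
            VARIANT varName;
            VariantInit(&varName);
            if (SUCCEEDED(pPropBag->Read(L"FriendlyName", &varName, nullptr)) &&
                varName.vt == VT_BSTR && varName.bstrVal != nullptr &&
                NameMatches(varName.bstrVal, wanted)) {
                wprintf(L"Selected device: %ls\n", varName.bstrVal);
                // Only now is the filter itself instantiated.
                hr = pMoniker->BindToObject(nullptr, nullptr, IID_IBaseFilter,
                                            reinterpret_cast<void**>(ppFilter));
            }
            VariantClear(&varName);
            pPropBag->Release();
        }
        pMoniker->Release();
    }
    pEnumCat->Release();
    return hr;
}
#endif
```

Under these assumptions, FindDeviceFilter(CLSID_AudioInputDeviceCategory, L"", &pFilter) would return the default audio capture device, and passing a non-empty name together with CLSID_AudioCompressorCategory would select a compressor by its FriendlyName.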
Enumerating Input Pins on an Audio Capture Device
Immediately upon return to the main function, another call is made to a different local function, EnumerateAudioInputPins. This function examines all the input pins on the selected audio input filter object; we’ll need to do this to determine which input pin is active. If you examine the Volume control panel (generally accessible through the system tray), you can see that several different sources of audio input usually exist on a PC. Most PCs have some sort of CD audio, line audio, microphone, and auxiliary inputs. Other sources will capture the sound data passing through the system, as though the PC had an internal microphone, capturing its own sound-making capabilities in real time. Figure 4-2 shows various audio input pins on an audio capture filter.
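On return to main, another local function, EnumerateAudioInputPins, is called. It examines all the input pins of the selected input filter object, because we need to determine which input pin is active. The Volume control panel shows that a PC usually has several different audio input sources: most PCs have CD audio, line audio, microphone, and auxiliary inputs. Other sources capture the audio passing through the system itself.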
Figure 4-2. GraphEdit showing the various audio input pins on an audio capture filter
The local function EnumerateAudioInputPins allows us to examine the audio input filter’s pins, as shown here:
The EnumerateAudioInputPins function examines the pins of the audio input filter.
The EnumerateAudioInputPins function was adapted from the GetPin function used in the previous chapter. A pointer to a DirectShow IBaseFilter object is passed by the caller, and its pins are enumerated within an IEnumPins object by a call to the IBaseFilter method EnumPins. Using the Next method on the IEnumPins object, an IPin object is instantiated for every pin within the enumerated list. The IPin object is then tested with a call to the QueryDirection method: is the pin an input pin? If it is, the QueryPinInfo method of IPin is invoked. This method returns a PIN_INFO data structure, containing (in English) the name of the pin, which is then printed on the console.
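The EnumerateAudioInputPins function is adapted from the GetPin function of the previous chapter. The caller passes in pFilter, a pointer to a DirectShow IBaseFilter object. Calling the IBaseFilter method EnumPins enumerates all the pins of pFilter into pEnum, an IEnumPins object. Calling Next on the IEnumPins object yields an IPin object for each pin in the list. The QueryDirection method of IPin tests whether a pin is an input pin. If it is, the QueryPinInfo method of IPin is called; it returns a PIN_INFO data structure containing the pin’s name, which is printed to the console.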
Now comes a bit of DirectShow magic. Each pin on an audio input filter is an object in its own right and presents an IAMAudioInputMixer interface as one of its properties. This interface allows you to control various parameters of the pin, such as its volume level. The IAMAudioInputMixer interface for each IPin object is retrieved with a QueryInterface call. (This call will fail if you’re not acting on an IPin that is part of an audio input filter.) Once this object is instanced, one of its properties is examined with a call to get_Enable. If TRUE is returned, the pin is enabled—that pin is the active input pin for the filter, and the corresponding audio input is enabled. (The default active input pin is set through the Volume control panel; it’s the enabled item in the list of recording inputs.) Figure 4-3 shows pin properties of the audio input pins.
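Now for a bit of DirectShow magic. Every pin on an audio input filter is an object in its own right and has an IAMAudioInputMixer interface. This interface controls various parameters of the pin, such as its volume. The IAMAudioInputMixer interface of each IPin object is obtained through QueryInterface. Once the interface has been obtained, its enabled property is read with the get_Enable method. If TRUE is returned, the pin is enabled: it is the active input pin for the filter, and the corresponding audio input is enabled. (The default active input pin is set through the Volume control panel; it is the enabled item in the list of recording inputs.)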
Figure 4-3. GraphEdit showing pin properties for the audio input pins
Although this routine only reads whether a pin is enabled, it is possible, through a corresponding call to the put_Enable method, to change the value on the pin, thereby selecting or removing an audio input. (You should be very careful that you have only one input enabled at a time, unless you know that your audio hardware can handle mixing multiple inputs.) This function indicates only the currently enabled input, but it’s very easy to modify it so that it selects an alternative input. And, once again, this function could easily be rewritten to return an array of IPin objects to the caller. This list could then be used to build a GUI so that users could easily pick the audio pin themselves.
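Although this example only reads whether a pin is enabled, the value on the pin can be changed through a corresponding call to put_Enable. (Take care to enable only one input at a time unless you know your audio hardware can mix multiple inputs.) The function reports only the currently enabled input, but it is easy to modify so that it selects a different input. It could also easily be rewritten to return an array of IPin objects to the caller, which could then be used to build a GUI in which users pick the audio input pin.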
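As a rough illustration of the pin walk and the IAMAudioInputMixer query described in this section, here is a minimal sketch. It is not the original EnumerateAudioInputPins listing; ReportInputPins and DescribePin are hypothetical names, and the output format is an assumption. It assumes the Windows SDK headers and strmiids.lib.

```cpp
#include <string>
#ifdef _WIN32
#include <dshow.h>
#include <cwchar>
#endif

// Hypothetical helper: formats one console line for a pin.
std::wstring DescribePin(const std::wstring& pinName, bool enabled)
{
    return L"Input pin: " + pinName + (enabled ? L" [ENABLED]" : L" [disabled]");
}

#ifdef _WIN32
// Hypothetical sketch: prints every input pin of pFilter and whether its
// IAMAudioInputMixer interface reports the pin as enabled.
HRESULT ReportInputPins(IBaseFilter* pFilter)
{
    if (!pFilter) return E_POINTER;

    IEnumPins* pEnum = nullptr;
    HRESULT hr = pFilter->EnumPins(&pEnum);
    if (FAILED(hr)) return hr;

    IPin* pPin = nullptr;
    while (pEnum->Next(1, &pPin, nullptr) == S_OK) {
        PIN_DIRECTION dir = PINDIR_OUTPUT;
        PIN_INFO info;
        if (SUCCEEDED(pPin->QueryDirection(&dir)) && dir == PINDIR_INPUT &&
            SUCCEEDED(pPin->QueryPinInfo(&info))) {
            // QueryPinInfo AddRefs the owning filter; give that reference back.
            if (info.pFilter) info.pFilter->Release();

            // The query fails on pins that are not part of an audio input
            // filter, in which case `enabled` simply stays FALSE.
            BOOL enabled = FALSE;
            IAMAudioInputMixer* pMixer = nullptr;
            if (SUCCEEDED(pPin->QueryInterface(IID_IAMAudioInputMixer,
                                               reinterpret_cast<void**>(&pMixer)))) {
                pMixer->get_Enable(&enabled);
                pMixer->Release();
            }
            wprintf(L"%ls\n", DescribePin(info.achName, enabled != FALSE).c_str());
        }
        pPin->Release();
    }
    pEnum->Release();
    return S_OK;
}
#endif
```

Selecting a different input would use put_Enable(TRUE) on the chosen pin’s mixer interface in the same loop; whether other pins must be explicitly disabled first depends on the audio hardware, as noted above.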
Connecting DirectShow Filters
DSAudioCap adds a convenience function, ConnectFilters, which can be used by the programmer to handle all the nitty-gritty of connecting two filters together. It’s actually three separate functions, as shown here:
DSAudioCap adds a convenience function, ConnectFilters, which handles the connection between two filters. It actually consists of three functions, as shown here:
Two versions of ConnectFilters are included as local functions in DSAudioCap, which is possible because the two have different calling parameters and C++ can handle them differently through its capability with overloaded functions. The bottom version of ConnectFilters is the one invoked by main, and it calls GetUnconnectedPin, which is a function very much like GetPin, except that it returns the first unconnected output pin on the source filter. The results are passed along to the other version of ConnectFilters, which calls GetUnconnectedPin again, this time searching for an input pin on the destination filter. Once everything’s been discovered, a call is made to the Filter Graph Manager method Connect, which connects the two pins.
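DSAudioCap has two versions of ConnectFilters, which is possible because they take different parameters and C++ resolves them through overloading. The later version of ConnectFilters is the one called by main. It calls GetUnconnectedPin, which is much like GetPin except that it returns the first unconnected output pin on the source filter. The result is passed to the other version of ConnectFilters, which calls GetUnconnectedPin again, this time searching for an input pin on the destination filter. Once both pins have been found, the Filter Graph Manager method Connect is called to connect them.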
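The three functions can be sketched roughly as follows. This is a reconstruction from the description above rather than the original listing, so parameter names and error codes are assumptions; IsCandidatePin is a hypothetical helper added to make the selection rule explicit. It assumes the Windows SDK headers and strmiids.lib.

```cpp
#ifdef _WIN32
#include <dshow.h>
#endif

// Hypothetical predicate: a pin qualifies when it points in the wanted
// direction and is not already connected to another pin.
bool IsCandidatePin(bool directionMatches, bool alreadyConnected)
{
    return directionMatches && !alreadyConnected;
}

#ifdef _WIN32
// Returns the first unconnected pin with the wanted direction.
// The caller must Release() *ppPin.
HRESULT GetUnconnectedPin(IBaseFilter* pFilter, PIN_DIRECTION wanted, IPin** ppPin)
{
    if (!pFilter || !ppPin) return E_POINTER;
    *ppPin = nullptr;

    IEnumPins* pEnum = nullptr;
    HRESULT hr = pFilter->EnumPins(&pEnum);
    if (FAILED(hr)) return hr;

    IPin* pPin = nullptr;
    while (pEnum->Next(1, &pPin, nullptr) == S_OK) {
        PIN_DIRECTION dir = PINDIR_INPUT;
        bool dirOk = SUCCEEDED(pPin->QueryDirection(&dir)) && dir == wanted;

        // ConnectedTo succeeds only when the pin already has a peer.
        IPin* pPeer = nullptr;
        bool connected = SUCCEEDED(pPin->ConnectedTo(&pPeer));
        if (pPeer) pPeer->Release();

        if (IsCandidatePin(dirOk, connected)) {
            pEnum->Release();
            *ppPin = pPin;          // ownership passes to the caller
            return S_OK;
        }
        pPin->Release();
    }
    pEnum->Release();
    return E_FAIL;
}

// Version 1: connect a known output pin to the first free input pin of pDest.
HRESULT ConnectFilters(IGraphBuilder* pGraph, IPin* pOut, IBaseFilter* pDest)
{
    if (!pGraph || !pOut || !pDest) return E_POINTER;
    IPin* pIn = nullptr;
    HRESULT hr = GetUnconnectedPin(pDest, PINDIR_INPUT, &pIn);
    if (FAILED(hr)) return hr;
    hr = pGraph->Connect(pOut, pIn);
    pIn->Release();
    return hr;
}

// Version 2 (the one main would call): connect two filters.
HRESULT ConnectFilters(IGraphBuilder* pGraph, IBaseFilter* pSrc, IBaseFilter* pDest)
{
    if (!pGraph || !pSrc || !pDest) return E_POINTER;
    IPin* pOut = nullptr;
    HRESULT hr = GetUnconnectedPin(pSrc, PINDIR_OUTPUT, &pOut);
    if (FAILED(hr)) return hr;
    hr = ConnectFilters(pGraph, pOut, pDest);
    pOut->Release();
    return hr;
}
#endif
```

The two ConnectFilters overloads differ only in their second parameter (IPin* versus IBaseFilter*), which is what lets the compiler tell them apart.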
Adding a Filter by Its Class ID
One more convenience function exists in DSAudioCap, AddFilterByCLSID, which adds a filter to the graph by its unique class ID. It will be used frequently in future examples.
The AddFilterByCLSID function adds a filter to the graph.
Although AddFilterByCLSID doesn’t do anything spectacular, it does save some time in coding because it encapsulates the object creation and filter addition actions into a single function call. You’ll probably want to use it in your own DirectShow applications.
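Although AddFilterByCLSID does nothing dramatic, it is used repeatedly because it wraps object creation and filter addition in a single function call; it is equally useful in other DirectShow applications.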
Using the Audio Capture Filter Graph
The filter graph for DSAudioCap is composed of three filters. First is an audio input filter, as explored earlier. Next comes an AVI multiplexer (or AVI mux). This filter takes a number of media streams and multiplexes them into a single stream formatted as an AVI file. These streams can be video, audio, or a combination of the two. An AVI mux is one of the ways to create a movie with synchronized video and audio portions; as streams arrive at the multiplexer, they’re combined into a single, synchronized stream. The final component in the filter graph is a File Writer filter, which writes the AVI stream to disk.
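The filter graph for DSAudioCap consists of three filters. The first is the audio input filter. Next is the AVI mux, which combines multiple streams into a single AVI file stream; the streams can be video, audio, or a combination of the two. The AVI mux is also one way to create a movie with synchronized audio and video: as streams arrive at the mux, they are combined into a single, synchronized stream. The last component is the File Writer filter, which writes the AVI stream to disk.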
Adding a File Writer Filter
Nearly all capture applications write their captured streams to a disk file. Several objects can serve as file “sinks,” meaning that they write stream data. The most common of these is the DirectShow File Writer filter object. In DSAudioCap, the File Writer filter is instantiated and added to the filter graph, and then its IFileSinkFilter interface is obtained through a call to QueryInterface. This interface exposes methods that allow you to programmatically set the name and path of the file to be written to disk, through a call to its SetFileName method. Figure 4-4 shows the filter graph created by DSAudioCap.
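Nearly all capture applications write their captured streams to a file. Several objects can act as file ‘sinks’, meaning that they write stream data. The most common is the DirectShow File Writer filter object. In DSAudioCap, the File Writer filter is instantiated and added to the filter graph, and its IFileSinkFilter interface is obtained through QueryInterface. The file name and path to save to are set through its SetFileName method.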
Executing the DSAudioCap Filter Graph
Once the file name has been set and all the filters have been connected together, the Run method is sent to the filter graph’s control object. From this point, the filter graph captures audio and writes it to an ever-growing AVI file until the Enter key is pressed. Watch out: this file can get very big very quickly. (We’re using an old but venerable call to getchar to watch for keyboard input.) Once the Enter key has been pressed, the filter graph control object makes a call to its Stop method, which terminates execution of the filter graph.
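Once the file name has been set and all the filters are connected, the Run method is called on the filter graph’s control object. From then on, the filter graph captures audio and writes it to the AVI file until the Enter key is pressed.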
Figure 4-5 shows DSAudioCap running.
We’ve kept the call to SaveGraphFile in this program (and will continue to do so), which means that after the program terminates its execution, you can use GraphEdit to examine the filter graph created by DSAudioCap.
This program still keeps the call to SaveGraphFile, which saves the filter graph to a file.
After you execute this application, there should be a file named MyAVIFile.AVI on disk. Double-clicking the file (or opening it from Windows Media Player) will allow you to hear the sound that was captured by DSAudioCap.
After the application finishes, the file MyAVIFile.AVI will have been created and can be played back in Windows Media Player.
Adding Audio Compression with DSAudioCapCom
The sound that’s captured by DSAudioCap is uncompressed, so file sizes build very quickly, at just about the same rate they do on a standard audio CD—about 700 MB per hour. However, numerous transform filters are available to DirectShow that can compress audio to a fraction of its uncompressed size. We’ve already covered everything you need to know to add a compressor filter to the filter graph. Like audio input devices, audio compressor filters must be enumerated and selected from a list.
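The audio from DSAudioCap is uncompressed, so the file grows quickly. However, many transform filters are available that can reduce the size of the audio data. Here we add one such compression filter.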
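To put a number on how fast an uncompressed capture grows, the data rate of PCM audio can be computed directly from the sample format. The sketch below assumes CD-quality audio (44.1 kHz, 16-bit, stereo); the format DSAudioCap actually negotiates is not stated in the text, and PcmBytesPerHour is a hypothetical helper.

```cpp
#include <cstdint>

// Bytes written per hour for uncompressed PCM audio:
// samples per second * channels * bytes per sample * 3600 seconds.
constexpr std::uint64_t PcmBytesPerHour(std::uint32_t sampleRateHz,
                                        std::uint32_t channels,
                                        std::uint32_t bitsPerSample)
{
    return static_cast<std::uint64_t>(sampleRateHz) * channels *
           (bitsPerSample / 8) * 3600;
}

// CD quality: 44,100 * 2 * 2 * 3600 = 635,040,000 bytes per hour.
constexpr std::uint64_t kCdBytesPerHour = PcmBytesPerHour(44100, 2, 16);
```

At CD quality this works out to about 635 MB per hour, the same order of magnitude as the rough figure quoted above.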
The only major difference between DSAudioCap and DSAudioCapCom (the version with compression) is the addition of one function, EnumerateAudioCompressorFilters, as shown in the following code:
The main difference between DSAudioCap and DSAudioCapCom lies in the EnumerateAudioCompressorFilters function, shown in the following code:
To add an audio compressor to this filter graph, begin by enumerating all of the CLSID_AudioCompressorCategory filters with a call to CreateClassEnumerator. You do this because DirectShow treats the audio compressors as individual objects with their own class IDs. However, we have no way of knowing the class ID of a given audio compressor, which is something we’ll need to instantiate a DirectShow filter. In this case, we use the call to CreateClassEnumerator as a way of learning the class IDs of all audio compressors available to DirectShow.
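To add an audio compression filter to the filter graph, start by calling CreateClassEnumerator to enumerate all CLSID_AudioCompressorCategory filters. This is necessary because DirectShow treats audio compressors as individual objects, each with its own class ID. We have no way of knowing in advance the class ID of a given audio compressor, which we need in order to instantiate a DirectShow filter, but we can learn the class IDs of the audio compressors by calling CreateClassEnumerator.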
Once we have the enumerated list, we step through it using the Next method, until we find an audio compressor whose FriendlyName property matches the name we’re looking for. In this program, we’re matching the name Windows Media Audio V2, a codec that should be included on your system. (If it’s not, use GraphEdit to list the audio compressors installed on your system, as in Figure 4-6, and change the string to match one of those entries.)
With the enumerated list in hand, we step through it with the Next method, looking for a matching FriendlyName. In this example the name we want is Windows Media Audio V2, which should be present on the system by default. (If it is not, use GraphEdit to check which compressors are installed.)
When the matching object is found, it’s instantiated and returned to the caller. This function is nearly identical to EnumerateAudioInputFilters, so you can begin to see how these functions can easily be repurposed for your own application needs.
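When a matching object is found, the function returns an instance of it to the caller. The function is very similar to EnumerateAudioInputFilters, so both can be adapted for use in your own applications.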
The only major difference in this version of the main function is that the returned audio compressor filter is added to the filter graph using AddFilter and ConnectFilters, as shown here:
The main difference in this version of main is that it calls AddFilter and ConnectFilters to add the returned audio compression filter to the filter graph. The code is shown here:
We’ve also added a bit of substance to the File Writer filter. The File Writer filter exposes a number of interfaces, including IFileSinkFilter, which can be used to set the name of the file, and IFileSinkFilter2, which can be used to set the mode parameter for the File Writer filter. If the mode is AM_FILE_OVERWRITE, the file will be deleted and re-created every time the filter graph begins execution. That’s the behavior we want, so we’ve included a few extra lines of code:
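This example also adds code for the File Writer filter. The File Writer filter exposes several interfaces, including IFileSinkFilter, which sets the file name, and IFileSinkFilter2, which sets the File Writer filter’s mode parameter. If the mode is AM_FILE_OVERWRITE, the original file is deleted and a new one is created each time the filter graph runs. That is the behavior wanted in this example, so the extra code was added: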
After you’ve run DSAudioCapCom, you’ll see that its file sizes increase much more slowly than those of DSAudioCap because the files are compressed. As you become familiar with the different types of audio compression (some are better for voice, others for music), you might consider modifying the compression filters used in your DirectShow programs on a task-dependent basis: different compression for different sounds.
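After running DSAudioCapCom, you can see that its file size grows far more slowly than with DSAudioCap, because the file is compressed. Once you are familiar with the different kinds of audio compression (some suit voice better, others music), you can consider choosing the compression filter best suited to the task.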
Summary
In this chapter, we’ve covered how to discover and enumerate the various audio capture devices connected to a PC. With the list of audio capture devices available to DirectShow in hand, we were able to build capture filter graphs and write the audio streams to disk files. DirectShow can be used as the basis of a powerful audio capture and editing application. (Editing is covered in Chapter 8.) Alternatively, you can use DirectShow for audio playback, although there are easier ways to do that using DirectSound.
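This chapter covered how to discover and enumerate the audio capture devices connected to the PC. Once those devices are available to DirectShow, they can be used to build capture filter graphs and write audio streams to files. DirectShow can serve as the basis of a powerful audio capture or editing application. (Editing is discussed in Chapter 8.) DirectShow can also be used for audio playback, although DirectSound makes that simpler.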
Now we’ll move along to video capture with webcams. Here compression isn’t just an option; it’s a necessity because video files can easily fill your entire hard disk in just a few minutes.
The next chapter turns to capturing video from webcams, where compression becomes a necessity.