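This article on using NAudio was published by the NAudio author at https://channel9.msdn.com/coding4fun/articles/NET-Voice-Recorder. It is reproduced here with a Chinese translation (produced with Youdao Dictionary) that may be inaccurate in many places, so the Chinese and English are given side by side. Reader comments and questions from the original page are appended at the end.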
.NET 音频录制
.NET Voice Recorder
2009年10月08日
October 08, 2009 at 11:29 AM
作者:Mark Heath
在这篇文章中,我演示了如何在。net中从麦克风录制,支持设置录制级别,从开始到结束裁剪噪音,在WPF中可视化波形和转换为MP3。
In this article I demonstrate how to record from the microphone in .NET, with support for setting the recording level, trimming noise from the start and end, visualizing the waveform in WPF and converting to MP3.
.NET中的音频录制
Audio Recording in .NET
. net框架没有提供任何对录制音频的直接支持,因此我将使用开放源码的NAudio项目,该项目包括许多Windows音频录制api的包装器。
The .NET framework does not provide any direct support for recording audio, so I will make use of the open source NAudio project, which includes wrappers for a number of Windows audio recording APIs.
注意:需要指出的是,. net不是一个适合于高采样率和低延迟音频录制的选择,例如在录音棚中使用的数字音频工作站软件。这是因为。net垃圾收集器可以在任何时候中断进程。然而,为了从麦克风中录制语音,. net框架的功能远远不止于此。默认情况下,NAudio要求声卡每100ms提供一次数据,这给了垃圾收集器和我们自己的代码运行足够的时间。
Note: It is important to point out that .NET is not an appropriate choice for high sample rate and low latency audio recording, such as that found in Digital Audio Workstation software used in recording studios. This is because the .NET garbage collector can interrupt the process at any point. However, for purposes of recording speech from the microphone, the .NET framework is more than capable. By default, NAudio asks the soundcard to give us data every 100ms, which gives plenty of time for the garbage collector to run as well as our own code.
我们将使用waveIn API的包装器,因为它们是最普遍支持的,并且允许我们自由选择采样率。我们将以8kHz的16位单声道录制,这对于语音来说已经足够好了,而且不会让处理器负担过重,因为我们想要可视化波形,这一点很重要。
We will make use of the wrappers for the waveIn APIs, as these are the most universally supported and allow us to choose the sample rate freely. We will record in mono, 16 bit at 8kHz, which is more than good enough for speech and will not overly tax the processor. That matters because we also want to visualize the waveform.
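To make the numbers concrete, here is a small sketch of the arithmetic behind the chosen format and the 100ms buffer interval mentioned in the note above. The class and method names are hypothetical and are not part of NAudio.

```csharp
using System;

// Hypothetical helper, not part of NAudio: PCM buffer arithmetic only.
public static class RecordingMath
{
    // Bytes produced per second of uncompressed PCM audio.
    public static int BytesPerSecond(int sampleRate, int channels, int bitsPerSample)
    {
        return sampleRate * channels * (bitsPerSample / 8);
    }

    // Bytes delivered in one capture buffer of the given duration.
    public static int BytesPerBuffer(int sampleRate, int channels,
        int bitsPerSample, int bufferMilliseconds)
    {
        long perSecond = BytesPerSecond(sampleRate, channels, bitsPerSample);
        return (int)(perSecond * bufferMilliseconds / 1000);
    }
}
```

For 8kHz, 16 bit mono this gives 16,000 bytes per second, so a 100ms buffer holds 1,600 bytes, i.e. 800 samples.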
选择捕获设备
Choosing a Capture Device
通常情况下,你可以毫无困难地使用默认的音频捕获设备,但如果你需要为用户提供选择,NAudio会允许你这么做。你可以使用WaveIn。DeviceCount WaveIn。GetDeviceCapabilities查找存在多少录音设备,并查询它们的名称和支持的通道数量。
Normally, you will be able to use the default audio capture device without any difficulties, but should you need to offer the user a choice, NAudio will allow you to do so. You can use WaveIn.DeviceCount and WaveIn.GetCapabilities to find out how many recording devices are present, and to query their names and numbers of supported channels.
在我的计算机上,我有一个单一的waveIn设备(麦克风阵列),直到我插入我的耳机,这时,一个新的设备出现并成为默认(设备0总是默认)。
On my computer, I have a single waveIn device (Microphone Array) until I plug my headset in, at which point, a new device appears and becomes the default (device 0 is always the default).
int waveInDevices = WaveIn.DeviceCount;
for (int waveInDevice = 0; waveInDevice < waveInDevices; waveInDevice++)
{
    WaveInCapabilities deviceInfo = WaveIn.GetCapabilities(waveInDevice);
    Console.WriteLine("Device {0}: {1}, {2} channels",
        waveInDevice, deviceInfo.ProductName, deviceInfo.Channels);
}
这将在我的计算机上产生以下输出:
This produces the following output on my computer:
设备0:麦克风/线路插入(SigmaTel, 2通道
Device 0: Microphone / Line In (SigmaTel , 2 channels
设备1:麦克风阵列(SigmaTel高,2通道
Device 1: Microphone Array (SigmaTel High, 2 channels
不幸的是,这些设备名被截断,因为WAVEINCAPS结构只支持31个字符。有一种方法可以获得完整的设备名,但它相当复杂。
Unfortunately these device names are truncated because the WAVEINCAPS structure only supports 31 characters. There is a way of getting the full device name, but it is rather convoluted.
通常,你会选择设备0(默认值),但是如果你想选择一个不同的输入设备,只需在你的WaveIn对象上设置DeviceNumber属性为你想要的数字。
Normally, you will choose Device 0 (the default), but if you wish to select a different input device, simply set the DeviceNumber property on your WaveIn object to the desired number.
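Since device numbers shift when hardware is plugged in, and the reported names are cut to 31 characters, an application that remembers the user's preferred device may want to look it up by name each time. The following is only a sketch of one way to do that. DeviceNameMatcher and FindDeviceNumber are hypothetical names, and the list of product names is assumed to have been collected with the enumeration loop shown earlier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of NAudio.
public static class DeviceNameMatcher
{
    // Returns the index of the first device whose reported (possibly
    // truncated) product name is a prefix of the full name we want,
    // or -1 if no device matches.
    public static int FindDeviceNumber(IList<string> productNames, string fullName)
    {
        for (int n = 0; n < productNames.Count; n++)
        {
            string reported = productNames[n];
            if (reported.Length > 0 &&
                fullName.StartsWith(reported, StringComparison.OrdinalIgnoreCase))
            {
                return n;
            }
        }
        return -1;
    }
}
```

The returned index can then be assigned to DeviceNumber. If nothing matches, falling back to device 0 (the default) seems a reasonable choice.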
Checking the Recording Level
录音的第一步通常是帮助用户确定他们的麦克风是否工作。如果用户的声卡上有多个输入,这一点尤其重要。我们实现这一点的方式很简单,通过开始记录和显示音频水平检测到用户与音量计。waveIn api不会把任何东西写到磁盘上,所以在这一点上没有音频被“录制”,我们只是检查输入电平,然后扔掉捕获的音频样本。
The first step in recording is usually to help the user determine whether their microphone is working. This is especially important if the user has more than one input on their soundcard. We achieve this simply by starting recording and showing the detected audio level to the user with a volume meter. The waveIn APIs do not write anything to disk, so no audio is actually being ‘recorded’ at this point; we are simply examining the input level and then throwing the captured audio samples away.
为了开始从声卡捕获音频,我们使用了NAudio中的WaveIn类。在调用StartRecording之前,我们将其配置为我们想要记录的波形格式(在我们的例子中是8kHz单声道),以开始从设备捕获音频。
To begin capturing audio from the soundcard, we use the WaveIn class in NAudio. We configure it with the WaveFormat in which we would like to record (in our case 8kHz mono), before calling StartRecording, to start capturing audio from the device.
waveIn = new WaveIn();
waveIn.DeviceNumber = selectedDevice;
waveIn.DataAvailable += waveIn_DataAvailable;
int sampleRate = 8000; // 8 kHz
int channels = 1; // mono
waveIn.WaveFormat = new WaveFormat(sampleRate, channels);
waveIn.StartRecording();
当音频缓冲区从声卡返回给我们时,DataAvailable事件处理程序将通知我们。数据以字节数组的形式返回,表示PCM示例数据。如果我们计划将音频直接写入磁盘,这没问题,但如果我们希望查看音频数据本身呢?每个音频样本都是16位,也就是两个字节,这意味着我们需要将字节对转换成短字节,以便能够理解数据。
The DataAvailable event handler will notify us whenever a buffer of audio has been returned to us from the sound card. The data comes back as an array of bytes, representing PCM sample data. This is fine if we are planning to write the audio directly to disk, but what if we wish to have a look at the audio data itself? Each audio sample is 16 bits, i.e. two bytes, meaning that we will need to convert pairs of bytes into shorts to be able to make sense of the data.
注意:如果我们用立体声记录,16位的样本会成对出现,首先是左样本,然后是右样本。
Note: if we were recording in stereo, the 16 bit samples would themselves come in pairs, first the left sample, then the right sample.
下面的代码展示了我们如何处理DataAvailable事件中的原始字节,并将单个音频样本读取出来。注意,我们使用的是BytesRecorded字段,而不是缓冲区的Length属性。此外,我选择将样本转换为32位浮点格式,并缩放它们,使最大体积为1.0f。这使得通过效果处理和可视化它们变得更加容易。
The following code shows how we might process the raw bytes in the DataAvailable event, and read the individual audio samples out. Notice that we use the BytesRecorded field, not the buffer’s Length property. Also, I have chosen to convert the samples to 32 bit floating point format and scaled them so the maximum volume is 1.0f. This makes processing them through effects and visualizing them much easier.
void waveIn_DataAvailable(object sender, WaveInEventArgs e)
{
    for (int index = 0; index < e.BytesRecorded; index += 2)
    {
        short sample = (short)((e.Buffer[index + 1] << 8) |
                               e.Buffer[index + 0]);
        float sample32 = sample / 32768f;
        ProcessSample(sample32);
    }
}
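For the stereo case mentioned in the note above, the same byte-pair conversion applies, but samples alternate left, right, left, right. The following is a minimal sketch, assuming 16 bit little-endian PCM. PcmConverter and Deinterleave are hypothetical names, not NAudio APIs.

```csharp
using System;

// Hypothetical helper, not part of NAudio.
public static class PcmConverter
{
    // Splits interleaved 16 bit stereo PCM into left and right channels,
    // scaled to floats in the range [-1.0, 1.0).
    public static void Deinterleave(byte[] buffer, int bytesRecorded,
        out float[] left, out float[] right)
    {
        const int bytesPerFrame = 4; // 2 channels x 2 bytes per sample
        int frames = bytesRecorded / bytesPerFrame;
        left = new float[frames];
        right = new float[frames];
        for (int frame = 0; frame < frames; frame++)
        {
            int index = frame * bytesPerFrame;
            left[frame] = ReadSample(buffer, index);      // left comes first
            right[frame] = ReadSample(buffer, index + 2); // then right
        }
    }

    // Combines two bytes (low byte first) into one scaled sample.
    private static float ReadSample(byte[] buffer, int index)
    {
        short sample = (short)((buffer[index + 1] << 8) | buffer[index]);
        return sample / 32768f;
    }
}
```

As in the mono handler, only the first bytesRecorded bytes are read, because the buffer itself may be longer than the data it currently holds.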
注意:使用waveIn和waveOut api的一个复杂之处是决定回调机制。NAudio提供了三种选择。首先是函数回调。这意味着waveIn API被赋予了一个(固定的)函数指针,并被它调用。这意味着DataAvailable回调将在后台线程中进入。在某种程度上,这是最干净的方法,但你需要小心使用函数回调时可能挂起对waveOutReset调用的恶意声卡驱动程序(在许多笔记本电脑上发现的SoundMAX芯片组特别容易出现这个问题)。
Note: One complication of using the waveIn and waveOut APIs is deciding on a callback mechanism. NAudio offers three options. First is function callbacks. This means that the waveIn API is given a (pinned) function pointer which it calls back onto. This means that your DataAvailable callback will come in on a background thread. In some ways this is the cleanest approach, but you need to beware of rogue soundcard drivers that can hang in calls to waveOutReset when using function callbacks (the SoundMAX chipset found on a lot of laptops is particularly prone to this problem).
第二种方法是提供窗口句柄。waveIn api会在窗口句柄的消息队列中返回一个待处理的消息。这种方法往往是最可靠和最常用的。需要注意的一个问题是,如果您停止录制并立即重新启动,来自旧录制会话的消息可能会在新会话中处理,从而导致严重的异常。
The second is to supply a window handle. The waveIn APIs will post a message back to be handled on the message queue of that window handle. This method tends to be the most reliable and most commonly used. One gotcha to watch out for is that if you stop recording and immediately restart, a message from the old recording session could get handled in the new session resulting in a nasty exception.
第三是让NAudio创建自己的新窗口并向其发布消息。这避免了来自一个录音会话的消息与另一个会话混淆的危险。如果你调用默认的WaveIn构造函数,NAudio会默认使用这个回调方法。但不要在后台线程或控制台应用程序中使用这个,或者NAudio创建的新窗口实际上不会去处理它的消息队列。
The third is to let NAudio create its own new window and post messages to that. This gets round any danger of messages from one recording session getting muddled up with another. This is the callback method that NAudio will use by default if you call the default WaveIn constructor. But don’t use this from a background thread or from a console application, or the new window that NAudio creates won’t actually get round to processing its message queue.
可视化的记录电平
Visualizing the Recording Level
我们已经看到了如何开始从声卡捕获音频,以检查录音级别。现在我们需要给用户一些视觉反馈。我们将使用WPF作为示例记录应用程序。我们可用来以图形方式显示单个数值的最简单控件是ProgressBar。因为它是WPF,所以我们可以完全自定义进度条的图形化外观,让它看起来更像一个音量计。我用了一个从绿色到红色的渐变来显示当前的音量等级。你可以在这里阅读更多关于我如何创建这个ProgressBar模板的内容。
We have seen how we can begin to capture audio from the soundcard for the purposes of checking the recording level. Now we need to give the user some visual feedback. We will use WPF for our sample recording application. The simplest control we have available to display a single numeric value graphically is the ProgressBar. And because it is WPF, we can fully customize the graphical appearance of the progress bar to look a little more like a volume meter. I have used a gradient going from green to red to show the current volume level. You can read more about how I created this ProgressBar template here.
图1-显示当前麦克风音量级别的进度条
Figure 1 - A Progress Bar Showing the Current Microphone Volume Level
为了帮助提供要显示的音量级别,我创建了一个SampleAggregator类。它会传递给我们接收到的每个音频采样值,并跟踪最大值和最小值。然后,在指定数量的样本之后,它会引发一个事件,允许GUI组件响应。我们需要小心,不要提出太多的这些事件或性能将受到严重影响。我每800个样本就增加一个,这意味着我们每秒钟会得到10个屏幕更新。因为我使用的是数据绑定,当其中一个更新触发时,我必须在DataContext对象(在MVVM模式中也称为“ViewModel”)上引发PropertyChangedEvent。以下是绑定到我的CurrentInputLevel属性的XAML语法:
To help provide the volume level to display, I have created a SampleAggregator class. This is passed every audio sample value we receive and keeps track of the maximum and minimum values. Then, after a specified number of samples, it raises an event allowing the GUI components to respond. We need to be careful not to raise too many of these events or performance will be badly affected. I am raising one every 800 samples, meaning we get 10 updates per second to the screen.
Because I am using data binding, when one of these updates fires, I must raise the PropertyChanged event on my DataContext object (also known as the “ViewModel” in the MVVM pattern). Here’s the XAML syntax for binding to my CurrentInputLevel property:
<ProgressBar Orientation="Horizontal"
Value="{Binding CurrentInputLevel, Mode=OneWay}"
Height="20" />
下面是ViewModel中的代码,确保每当我们计算一个新的最大输入级别时,GUI都会更新:
And here’s the code in the ViewModel that ensures that the GUI updates whenever we calculate a new maximum input level:
private float lastPeak;

void recorder_MaximumCalculated(object sender, MaxSampleEventArgs e)
{
    lastPeak = Math.Max(e.MaxSample, Math.Abs(e.MinSample));
    RaisePropertyChangedEvent("CurrentInputLevel");
}

// multiply by 100 because the Progress bar's default maximum value is 100
public float CurrentInputLevel { get { return lastPeak * 100; } }
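The SampleAggregator class itself is not listed in the article. The following is a minimal sketch of how such a class might look, based only on the description above. The Add method, the NotificationCount property and the constructor signatures are assumptions. Only the MaximumCalculated, MaxSample and MinSample names appear in the article's code.

```csharp
using System;

public class MaxSampleEventArgs : EventArgs
{
    public MaxSampleEventArgs(float minSample, float maxSample)
    {
        MinSample = minSample;
        MaxSample = maxSample;
    }

    public float MinSample { get; private set; }
    public float MaxSample { get; private set; }
}

public class SampleAggregator
{
    // Both peaks start at zero, so MinSample <= 0 <= MaxSample always holds.
    private float minValue;
    private float maxValue;
    private int count;

    public SampleAggregator(int notificationCount)
    {
        if (notificationCount <= 0)
            throw new ArgumentOutOfRangeException("notificationCount");
        NotificationCount = notificationCount;
    }

    // Number of samples to collect before raising MaximumCalculated.
    public int NotificationCount { get; private set; }

    public event EventHandler<MaxSampleEventArgs> MaximumCalculated;

    // Called once for every sample received from the soundcard.
    public void Add(float sample)
    {
        minValue = Math.Min(minValue, sample);
        maxValue = Math.Max(maxValue, sample);
        count++;
        if (count >= NotificationCount)
        {
            var handler = MaximumCalculated;
            if (handler != null)
                handler(this, new MaxSampleEventArgs(minValue, maxValue));
            minValue = maxValue = 0;
            count = 0;
        }
    }
}
```

With a notification count of 800 and a sample rate of 8kHz, this raises the event 10 times per second, matching the update rate described above.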
注意:模型视图视图模型(MVVM)是一种在WPF和Silverlight开发人员中越来越流行的模式。其基本思想是,视图(即xaml标记文件)背后没有任何代码,只是通过数据绑定的方式指定与业务逻辑的所有通信。ViewModel充当适配器,以简化数据绑定的过程。这种方法很好地分离了外观和行为。在大多数情况下,这个模式工作得很好,但是有一些棘手的领域,您需要在后面编写几行代码,或者使用一些巧妙的技巧,如附加依赖属性或自定义触发器。有几个优秀的开源助手库可以帮助您启动和运行MVVM应用程序。这里有一个全面的列表。
Note: Model View ViewModel (MVVM) is a pattern that is growing in popularity amongst WPF and Silverlight developers. The basic idea is that you have no code behind whatsoever on your View (i.e. your xaml markup file), and simply specify all communications with your business logic by means of data binding. The ViewModel serves as an adapter to ease the process of data binding. This approach gives very good separation of appearance and behavior. For the most part, this pattern works very well, but there are a few tricky areas, for which you will need to either write a few lines of code behind, or make use of some cunning tricks such as attached dependency properties or custom triggers. There are several excellent open source helper libraries that can take some of the work out of getting an MVVM application up and running. Have a look here for a comprehensive list.
假设当前输入电平过高或过低。我们希望能够支持修改记录级别。同样,我们希望使用数据绑定来实现这一点,因此我们将在XAML中添加一个音量滑块:
Adjusting the Recording Level
Suppose the current input level is too high or too low. We would like to be able to support modifying the recording level. Again, we would like to use data binding to do so, so we will add a volume slider to our XAML:
<Slider Orientation="Horizontal"
Value="{Binding MicrophoneLevel, Mode=TwoWay}"
Maximum="100"
Margin="5" />
现在我们必须掌握MixerLine,它将允许我们访问我们的waveIn设备的输入音量控制。这要求我们使用Windows mixer api,它在NAudio中也有包装器。获得这个音量控制并不总是像你可能希望的那样简单(并且可能需要不同的方法为XP和Vista),但以下是代码,似乎在大多数系统上工作:
Now we have to get hold of the MixerLine that will allow us to access the input volume control for our waveIn device. This requires us to make use of the Windows mixer APIs, which also have wrappers in NAudio. Getting hold of this volume control is not always as straightforward as you might hope (and can require different approaches for XP and Vista), but the following is code that seems to work on most systems:
private void TryGetVolumeControl()
{
    int waveInDeviceNumber = 0;
    var mixerLine = new MixerLine((IntPtr)waveInDeviceNumber,
                                  0, MixerFlags.WaveIn);
    foreach (var control in mixerLine.Controls)
    {
        if (control.ControlType == MixerControlType.Volume)
        {
            volumeControl = control as UnsignedMixerControl;
            break;
        }
    }
}
现在我们可以使用UnsignedMixerControl的Percent属性将volume设置为0到100之间的任何一个值。
Now we can use the Percent property on the UnsignedMixerControl to set volume to a value anywhere between 0 and 100.
Starting Recording
现在我们已经正确地设置了我们的记录级别,我们已经准备好真正开始记录了。但是由于我们已经打开了waveIn设备,我们所需要做的就是开始将我们接收到的数据写入一个文件。
NAudio有一个名为WaveFileWriter的类,它允许我们将记录的数据写入文件。现在,我们将它以PCM格式写入一个临时文件,然后将其转换为更好的压缩格式,如MP3。下面的代码创建了一个新的WAV文件:
Now we have got our recording levels set up correctly, we are ready to actually start recording. But since we have already opened our waveIn device, all we need to do is start writing the data we have received into a file.
NAudio has a class called WaveFileWriter which will allow us to write our recorded data to a file. For now, we will write it to a temporary file in PCM format, and convert it later into a better compressed format such as MP3. The following code creates a new WAV file:
writer = new WaveFileWriter(waveFileName, recordingFormat);
Now we can write to the file as we receive notifications from the waveIn device:
void waveIn_DataAvailable(object sender, WaveInEventArgs e)
{
    if (recordingState == RecordingState.Recording)
        writer.WriteData(e.Buffer, 0, e.BytesRecorded);
    // ...
}
注意:有三个主要选择如何存储音频,而它是被记录。首先,您可以将它写入一个内存流。这减少了处理临时文件的不便,但是您需要小心,不要耗尽内存。而且,如果你的录音程序在进行到一半的时候崩溃了,你就失去了一切。以我们在演示中使用的采样速率,一分钟的音频只需要不到1mb的内存,但如果您使用44.1kHz的立体声(音乐的标准)录制,那么您每分钟大约需要10mb内存。
Note: There are three main options for how to store audio while it is being recorded. First, you can write it to a MemoryStream. This saves the inconvenience of dealing with a temporary file, but you need to be careful not to run out of memory. Also, if your recording program crashes half way through, you have lost everything. At the sample rate we are using for this demo, one minute of audio takes just under 1 MB of memory, but if you were recording at 44.1kHz stereo (the standard for music), you would need about 10 MB per minute.
其次,您可以写入一个临时WAV文件,以便稍后转换成另一种格式,就像我们在这里所做的那样。虽然这不是一种有效节省磁盘空间的格式,但它非常容易使用,如果你计划在录制后应用任何效果或编辑音频,它特别有用。
Second, you can write to a temporary WAV file to be converted to another format later, as we are doing here. While this is not a disk space efficient format, it is very easy to work with, and particularly useful if you are planning to apply any effects or edit the audio in any way after recording.
第三,在录音时,你可以直接将音频传送到编码器(如WMA或MP3)。这可能是最好的选择,如果你正在制作一个较长的录音,并且不需要在录音后编辑它。
Third, you can pass the audio directly to an encoder (such as WMA or MP3) as it is being recorded. This might be the best option if you are making a longer recording, and have no need to edit it after recording.
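The memory figures quoted for the first option can be checked with a little arithmetic. This is a small sketch assuming uncompressed 16 bit PCM in both cases. StorageEstimate is a hypothetical name.

```csharp
using System;

// Hypothetical helper: storage needed for uncompressed PCM audio.
public static class StorageEstimate
{
    // Bytes needed to hold one minute of PCM audio in the given format.
    public static long BytesPerMinute(int sampleRate, int channels, int bitsPerSample)
    {
        return (long)sampleRate * channels * (bitsPerSample / 8) * 60;
    }
}
```

BytesPerMinute(8000, 1, 16) is 960,000 bytes, just under 1 MB, and BytesPerMinute(44100, 2, 16) is 10,584,000 bytes, about 10 MB, in line with the note above.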
停止记录
显然,当用户单击stop recording按钮时,我们将停止,但我们也可能希望设置一个最大的记录持续时间,以阻止用户无意中填满他们的硬盘。在这个例子中,我们将允许一分钟的录音。
Stopping Recording
Obviously we will stop when the user clicks the stop recording button, but we might also want to set a maximum recording duration to stop the user inadvertently filling up their hard disk. For this example, we will allow one minute of recording.
long maxFileLength = this.recordingFormat.AverageBytesPerSecond * 60;
// write only as many bytes as still fit under the one-minute limit
int toWrite = (int)Math.Min(maxFileLength - writer.Length, bytesRecorded);
if (toWrite > 0)
    writer.WriteData(buffer, 0, toWrite);
else
    Stop();
注意:可以稍微令人困惑的是,当用户使用与WaveIn窗口回调,音频你记录的最后一点要求录音停止后,所以确保你不要关闭该文件保存直到你得到所有的音频。WaveIn对象上的FinishedRecording事件将帮助您确定何时关闭wave文件写入器并清理您的资源是安全的。
Note: Something that can be slightly confusing is that when using window callbacks with WaveIn, the last bit of recorded audio arrives after you have asked recording to stop, so make sure you don’t close the file you are saving to until you have got all the audio back. The RecordingStopped event on the WaveIn object will help you determine when it is safe to close the WaveFileWriter and clean up your resources.
可视化波形
通常希望将音频波形显示给用户。当你在录音时显示波形有时被称为“置信度录音”,因为它允许你看到音频被录制的预期和水平仍然是正确的。
Visualizing the Wave Form
It is often desirable to display the audio waveform to the user. Displaying the waveform while you are recording is sometimes called “confidence recording”, because it allows you to see that audio is being recorded as expected and the levels are still right.
有多种可能的方法来绘制音频波形。最简单的方法是在每次聚合器触发时画一条垂直线显示最小值和最大值:
There are a variety of possible approaches for drawing audio waveforms. The simplest is to draw a vertical line showing the minimum and maximum values every time our sample aggregator fires:
图2 -使用垂直线的音频波形
乍一看,在WPF中实现这一点似乎是微不足道的,但这确实有消耗过多资源的危险。例如,每次计算一个新的最大样本时,简单地在画布上添加一个新行,执行起来非常糟糕,所以最好使用固定数量的垂直线,并动态地调整它们的大小。
Figure 2 - Audio Waveform using vertical lines
At first glance it may seem that this would be trivial to implement in WPF, but there is a real danger of consuming too many resources. For example, simply adding a new line to a Canvas every time a new maximum sample is calculated performs very badly, so it is better to have a fixed number of vertical lines and resize them dynamically.
另一种方法是创建一个多边形。这要求我们每次收到一个新样本时,都要在一个多边形的点集合中添加两个点。诀窍是在点集合的中间添加这些点,而不是在最后,这样最终的结果就是一个单一的形状。这意味着我们的波形可以有不同的轮廓颜色和填充颜色。为了防止边缘出现锯齿状,我们在X轴上绘制两个单位的点。
Another approach is to create a polygon. This requires us to add two points to a Polygon’s Points collection every time we receive a new sample. The trick is to add these points in the middle of the Points collection, rather than at the end, so that the end result is a single shape. This means our waveform can have a different outline color and fill color. To stop the edges from appearing too jagged, we plot points two units apart along the X axis.
图3 -使用多边形渲染的音频波形
与麦克风音量计一样,波形绘制控制需要每秒接收几个由SampleAggregator接收到的最大和最小采样值的通知。当接收到每个样本值时,我们要么在多边形中插入新的点,要么(如果整个屏幕已经满了)返回到左边的边缘,并从那里继续绘图。
Figure 3 - Audio Waveform rendered using a Polygon
Like the microphone volume meter, the waveform drawing control needs to receive several notifications a second of the maximum and minimum sample values received by the SampleAggregator. When each sample value is received, we either insert new points into our polygon, or, if the whole screen is full, we go back to the left-hand edge and continue drawing from there.
对于置信度记录显示,我使用了Polygon方法,它在一个名为PolygonWaveFormControl的类中。下面是当我们收到一个新的最大样本时计算新点或更新点位置的代码:
For the confidence recording display I have used the Polygon method, which is in a class called PolygonWaveFormControl. Here’s the code which calculates the new points or updated point locations as we receive a new maximum sample:
public void AddValue(float maxValue, float minValue)
{
    int visiblePixels = (int)(ActualWidth / xScale);
    if (visiblePixels > 0)
    {
        CreatePoint(maxValue, minValue);

        if (renderPosition > visiblePixels)
        {
            renderPosition = 0;
        }
        int erasePosition = (renderPosition + blankZone) % visiblePixels;
        if (erasePosition < Points)
        {
            double yPos = SampleToYPosition(0);
            waveForm.Points[erasePosition] =
                new Point(erasePosition * xScale, yPos);
            waveForm.Points[BottomPointIndex(erasePosition)] =
                new Point(erasePosition * xScale, yPos);
        }
    }
}

private void CreatePoint(float topValue, float bottomValue)
{
    double topYPos = SampleToYPosition(topValue);
    double bottomYPos = SampleToYPosition(bottomValue);
    double xPos = renderPosition * xScale;
    if (renderPosition >= Points)
    {
        int insertPos = Points;
        waveForm.Points.Insert(insertPos, new Point(xPos, topYPos));
        waveForm.Points.Insert(insertPos + 1, new Point(xPos, bottomYPos));
    }
    else
    {
        waveForm.Points[renderPosition] = new Point(xPos, topYPos);
        waveForm.Points[BottomPointIndex(renderPosition)] =
            new Point(xPos, bottomYPos);
    }
    renderPosition++;
}
erase position的计算是为了将之前的一些示例值清空,以便在我们绕完一次后,新数据出现的地方更加明显:
The erase position calculation is to blank out some previous sample values to make it obvious where the new data is appearing after we have wrapped around once:
图4 polygon波形控制的“空白区域”
Figure 4 - PolygonWaveFormControl’s “blank zone”
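The listing above relies on three members that the article does not show: Points, BottomPointIndex and SampleToYPosition. The following sketch shows one plausible way they could work, using a plain list instead of a WPF Polygon so it can run on its own. The bodies of these members are assumptions, and OutlinePoint, WaveformOutline and AppendColumn are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public struct OutlinePoint
{
    public readonly double X;
    public readonly double Y;
    public OutlinePoint(double x, double y) { X = x; Y = y; }
}

public class WaveformOutline
{
    private readonly List<OutlinePoint> points = new List<OutlinePoint>();
    private readonly double yTranslate;
    private readonly double yScale;

    public WaveformOutline(double yTranslate, double yScale)
    {
        this.yTranslate = yTranslate;
        this.yScale = yScale;
    }

    public IList<OutlinePoint> AllPoints { get { return points; } }

    // Columns drawn so far: each column owns one top and one bottom point.
    public int Points { get { return points.Count / 2; } }

    // The bottom point of a column mirrors its top point from the end of the list.
    public int BottomPointIndex(int position)
    {
        return points.Count - position - 1;
    }

    public double SampleToYPosition(float value)
    {
        return yTranslate + value * yScale;
    }

    // Inserts the two new points in the middle of the list, so the top edge
    // runs left to right and the bottom edge runs back right to left.
    public void AppendColumn(double x, float topValue, float bottomValue)
    {
        int insertPos = Points;
        points.Insert(insertPos, new OutlinePoint(x, SampleToYPosition(topValue)));
        points.Insert(insertPos + 1, new OutlinePoint(x, SampleToYPosition(bottomValue)));
    }
}
```

After three columns at X = 0, 2 and 4, the list holds the X values 0, 2, 4, 4, 2, 0. That is a single closed outline, which is why the shape can be filled and stroked as one polygon.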
注意:在WPF中有更快的渲染方法。一种选择是使用WriteableBitmap类并直接在其上绘图。如果您使用的是垂直线渲染方法,那么这可能是一种很好的方法。第二种是使用DrawingVisual对象,这是一种轻量级的绘图对象,比使用Shape派生的类提供更好的性能。缺点是失去了数据绑定和在XAML中全面描述图像的能力等特性,但对于波形绘制来说,这并不是一个缺点。我在这个应用程序的音频保存部分使用了DrawingVisual方法。
Note: There are faster ways to perform rendering in WPF. One option is to use the WriteableBitmap class and draw directly onto it. This could be a good approach if you were using the vertical lines method of rendering. The second is to use DrawingVisual objects, which are lightweight drawing objects offering better performance than using classes derived from Shape. The down-side is the loss of features such as DataBinding and the ability to fully describe the picture in XAML, but for WaveForm drawing this is not really a drawback. I use the DrawingVisual method in the Save Audio part of this application.
另一个挑战是波形绘制控制如何接收通知,因为我使用MVVM,所以我没有直接访问SampleAggregator。一个简单的方法是在PolygonWaveFormControl上创建一个依赖属性:
Another challenge was how the waveform drawing control could receive notifications: because I am using MVVM, it has no direct access to the SampleAggregator. A simple way around this was to create a Dependency Property on PolygonWaveFormControl:
public static readonly DependencyProperty SampleAggregatorProperty =
    DependencyProperty.Register(
        "SampleAggregator",
        typeof(SampleAggregator),
        typeof(PolygonWaveFormControl),
        new PropertyMetadata(null, OnSampleAggregatorChanged));

public SampleAggregator SampleAggregator
{
    get { return (SampleAggregator)this.GetValue(SampleAggregatorProperty); }
    set { this.SetValue(SampleAggregatorProperty, value); }
}

private static void OnSampleAggregatorChanged(object sender, DependencyPropertyChangedEventArgs e)
{
    PolygonWaveFormControl control = (PolygonWaveFormControl)sender;
    control.Subscribe();
}
这允许我们将PolygonWaveFormControl绑定到DataContext上公开的SampleAggregator上:
This allows us to bind the PolygonWaveFormControl to the SampleAggregator made public on our DataContext:
<my:PolygonWaveFormControl
Height="40"
SampleAggregator="{Binding SampleAggregator}" />
Trimming the Audio
我们已经创建了一个临时的WAV文件,但是在用户将它保存到他们所选择的文件之前,我们希望允许他们从录音的开始和结束删去任何不需要的部分。为了做到这一点,我想显示整个记录波形,与选择矩形叠加在上面,以允许子范围被选择。
We have created a temporary WAV file, but before the user saves it to a file of their choosing, we want to allow them to trim off any unwanted parts from the start and end of the recording. To do this I would like to display the entire recorded waveform, with a selection rectangle superimposed on top to allow a sub-range to be selected.
图5 - GUI允许选择录制音频的一部分
Figure 5 - GUI to allow selection of a portion of the recorded audio
要完成这种接口,我们需要三个组件。第一个是滚动查看器。ScrollViewer允许我们在波形太大而不适合屏幕的情况下左右滚动,这很可能是在录制超过几秒钟的音频时发生的。
To accomplish this kind of interface we need three components. The first is a ScrollViewer. The ScrollViewer allows us to scroll left and right through the WaveForm if it is too big to fit onto a screen, which is likely if you record more than a few seconds of audio.
第二种是一种新型波形渲染器,它将渲染整个文件,而不是我的PolygonWaveFormControl,它在屏幕填满时从左边开始。为此,我创建了WaveFormVisual,它使用DrawingVisual objects来绘制整个波形。显然,如果我们想要长时间记录,这种方法就需要进行优化,因为它创建的多边形将拥有数千个点,但对于短记录来说,它的效果很好。
The second is a new type of WaveForm renderer that will render an entire file, rather than my PolygonWaveFormControl which started again at the left when the screen filled up. For this I created WaveFormVisual which uses DrawingVisual objects to draw the entire WaveForm. Obviously if we wanted to record for a long period, this approach would need to be optimised as the polygon it creates would have thousands of points, but for short recordings, it works fine.
The third piece was the hardest to get right – the selection rectangle to support mouse dragging selection of the waveform. For this I created the RangeSelectionControl.
RangeSelectionControl
只是一个带有实体轮廓和半透明填充的蓝色矩形,放置在画布上。魔术发生在鼠标处理程序中。我们需要检测用户何时将鼠标悬停在矩形的左边缘或右边缘,并将光标设置为显示水平大小调整图标。这可以在MouseMove事件中完成,检查X坐标,然后设置光标属性:
The RangeSelectionControl is simply a blue rectangle with a solid outline and semi-transparent fill sitting on a Canvas. The magic occurs in the mouse handler. We need to detect when the user hovers over the left or right edge of the rectangle, and set the cursor to show a horizontal resizing icon. This can be done in the MouseMove event, checking the X coordinate and then setting the Cursor property:
Cursor = Cursors.SizeWE;
当用户在边缘上点击左键时,我们开始拖动。关键是调用canvas。capturemouse。如果我们不这样做,当您试图将矩形拖大时,鼠标移动事件将丢失给下面的其他控件。
When the user clicks the left-button while over the edge, we begin to drag. Key to this is calling Canvas.CaptureMouse. If we don’t do this, as soon as you try to drag the rectangle bigger, the mouse move events are lost to other controls underneath.
void RangeSelectionControl_MouseDown(object sender, MouseButtonEventArgs e)
{
    if (e.LeftButton == MouseButtonState.Pressed)
    {
        Point position = e.GetPosition(this);
        Edge edge = EdgeAtPosition(position.X);
        DragEdge = edge;
        if (DragEdge != Edge.None)
        {
            mainCanvas.CaptureMouse();
        }
    }
}
现在在MouseMove方法中,我们可以改变画布。属性来调整矩形的大小。
Now in the MouseMove methods, we can change the Canvas.Left and Width properties of the rectangle to resize it.
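The EdgeAtPosition helper used in the mouse handler is not shown in the article. A minimal sketch of the hit test might look like the following. The Left and Right members of Edge, the extra parameters and the tolerance value are all assumptions. Only Edge.None and the EdgeAtPosition name appear in the article's code.

```csharp
using System;

public enum Edge { None, Left, Right }

// Hypothetical stand-alone version of the hit test.
public static class EdgeHitTest
{
    // Reports which edge of the selection rectangle, if any, lies within
    // `tolerance` units of the mouse X coordinate.
    public static Edge EdgeAtPosition(double x, double left, double width, double tolerance)
    {
        if (Math.Abs(x - left) <= tolerance) return Edge.Left;
        if (Math.Abs(x - (left + width)) <= tolerance) return Edge.Right;
        return Edge.None;
    }
}
```

The same test can serve both purposes described above: choosing the resize cursor while hovering, and deciding which edge to drag when the button goes down.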
ScrollViewer非常容易使用,但必须记住将CanContentScroll属性设置为true,并正确设置ScrollViewer中项目的大小。
The ScrollViewer is quite straightforward to use, but you must remember to set the CanContentScroll property to true, and also to set the size of the items within the ScrollViewer correctly.
<ScrollViewer CanContentScroll="True"
HorizontalScrollBarVisibility="Visible"
VerticalScrollBarVisibility="Hidden">
<Grid>
<my:WaveFormVisual Height="100"
HorizontalAlignment="Left"
x:Name="waveFormRenderer"/>
<my:RangeSelectionControl
HorizontalAlignment="Left"
x:Name="rangeSelection" />
</Grid>
</ScrollViewer>
我们根据波形中绘制的点的总数来设置波形visual和RangeSelectionControl的适当宽度。
We set the appropriate Width of the WaveFormVisual and RangeSelectionControl based on the total number of points we have drawn in the waveform.
Saving the Audio
我们终于准备好保存音频了。我们将为用户提供两种保存格式的选择。第一个方法是简单地保存为WAV文件。如果用户选择了整个录音,我们只需要将音频复制到他们想要的位置。然而,如果用户选择了一个子范围,那么我们需要修剪WAV文件。这可以通过使用TrimWavFile实用功能快速完成,该功能从WAV文件阅读器复制到WAV文件写入器,从开始到结束跳过一定数量的字节。
So we are finally ready to save the audio. We will offer the user two choices of format to save in. The first is simply to save as a WAV file. If the user has selected the entire recording, we only need to copy the audio across to their desired location. If, however, the user has selected a sub-range, then we need to trim the WAV file. This can be quickly accomplished using a TrimWavFile utility function that copies from a WAV file reader to a WAV file writer, skipping over a certain number of bytes from the beginning and end.
public static void TrimWavFile(string inPath, string outPath,
    TimeSpan cutFromStart, TimeSpan cutFromEnd)
{
    using (WaveFileReader reader = new WaveFileReader(inPath))
    {
        using (WaveFileWriter writer =
            new WaveFileWriter(outPath, reader.WaveFormat))
        {
            int bytesPerMillisecond =
                reader.WaveFormat.AverageBytesPerSecond / 1000;

            // round both positions down to a whole sample block
            int startPos = (int)cutFromStart.TotalMilliseconds *
                bytesPerMillisecond;
            startPos = startPos - startPos % reader.WaveFormat.BlockAlign;

            int endBytes = (int)cutFromEnd.TotalMilliseconds *
                bytesPerMillisecond;
            endBytes = endBytes - endBytes % reader.WaveFormat.BlockAlign;
            int endPos = (int)reader.Length - endBytes;

            TrimWavFile(reader, writer, startPos, endPos);
        }
    }
}

private static void TrimWavFile(WaveFileReader reader,
    WaveFileWriter writer, int startPos, int endPos)
{
    reader.Position = startPos;
    byte[] buffer = new byte[1024];
    while (reader.Position < endPos)
    {
        int bytesRequired = (int)(endPos - reader.Position);
        int bytesToRead = Math.Min(bytesRequired, buffer.Length);
        int bytesRead = reader.Read(buffer, 0, bytesToRead);
        if (bytesRead <= 0)
        {
            break; // no more data; stop rather than loop forever
        }
        writer.WriteData(buffer, 0, bytesRead);
    }
}
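One detail in the listing above is that AverageBytesPerSecond / 1000 is an integer division, which is exact at 8kHz mono (16 bytes per millisecond) but loses a fraction at rates such as 44.1kHz stereo (176.4 becomes 176). The following sketch shows an alternative calculation that works from ticks instead. TrimMath and AlignedByteOffset are hypothetical names, and this is not the article's code.

```csharp
using System;

// Hypothetical helper: converts a duration to a block-aligned byte offset.
public static class TrimMath
{
    public static long AlignedByteOffset(TimeSpan duration,
        int averageBytesPerSecond, int blockAlign)
    {
        // Multiply before dividing so no fraction is lost early.
        long bytes = duration.Ticks * averageBytesPerSecond / TimeSpan.TicksPerSecond;
        // Round down to a whole block so a sample is never split.
        return bytes - bytes % blockAlign;
    }
}
```

For 1.5 seconds at 16,000 bytes per second with a block size of 2, this gives 24,000, the same as the article's method. For 1,001 ms of 44.1kHz 16 bit stereo it gives 176,576 bytes, where the per-millisecond method gives 176,176.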
我们还想提供保存为MP3的功能。创建MP3文件最简单的方法是使用开源的LAME MP3编码器(如果你还没有这个应用程序,可以在web上搜索LAME .exe来获得它)。我们的应用程序将在当前目录中查找,并提示用户找到lame.exe(如果它不存在),因为我们没有将它包含在应用程序下载中。假设您确实提供了一个有效的路径,那么我们就可以通过使用适当的参数调用lame.exe将我们的(修改过的)WAV文件转换为MP3。
We also want to offer the ability to save as MP3. The easiest way to create MP3 files is to use the open source LAME MP3 encoder (do a web search for lame.exe to get hold of this application if you haven’t already got it). Our application will look in the current directory, and prompt the user to find lame.exe if it is not present, as we do not include it in the application download. Assuming you do provide a valid path, we can then convert our (trimmed) WAV file to MP3 by simply calling lame.exe with the appropriate parameters.
public static void ConvertToMp3(string lameExePath,
    string waveFile, string mp3File)
{
    Process converter = Process.Start(lameExePath, "-V2 \"" + waveFile
        + "\" \"" + mp3File + "\"");
    converter.WaitForExit();
}
最后,我们得到了一个很好的MP3文件,其中包含选定的麦克风录音部分。
We end up with a nice compact MP3 file containing the selected portion of our microphone recording.
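If the console window that lame.exe opens is unwanted, the launch can be described with a ProcessStartInfo instead of a bare command line. This is a sketch only. LameCommand, Build and Quote are hypothetical names, and the -V2 option is simply carried over from the listing above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper that prepares the lame.exe launch settings.
public static class LameCommand
{
    // Wraps a path in double quotes so that spaces survive.
    public static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    public static ProcessStartInfo Build(string lameExePath,
        string waveFile, string mp3File)
    {
        var info = new ProcessStartInfo(lameExePath,
            "-V2 " + Quote(waveFile) + " " + Quote(mp3File));
        info.UseShellExecute = false; // start the executable directly
        info.CreateNoWindow = true;   // do not show a console window
        return info;
    }
}
```

The result can be passed to Process.Start, followed by WaitForExit as before. Checking ExitCode afterwards is one way to notice a failed conversion.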
探索示例代码解决方案
Exploring the Sample Code Solution
主要的WPF示例应用程序可以在VoiceRecorder项目中找到。它包含了主窗口以及三个视图及其关联的视图模型。VoiceRecorder。Core包含一些WPF helper类和用户控件,以帮助处理应用程序的管道和GUI,而VoiceRecorder。Audio包含实际执行音频录制、编辑和转换的类。
The main WPF sample application is found in the VoiceRecorder project. This contains the main window along with the three views and their associated ViewModels. VoiceRecorder.Core contains some WPF helper classes and user controls to help with the plumbing and GUI of the application, while VoiceRecorder.Audio contains the classes that actually perform the recording, editing and converting of audio.
About the Author
Mark Heath是一名软件开发者,目前在英国南安普顿的NICE CTI系统公司工作。他专门从事。net开发,特别关注客户端技术和音频播放。他在http://mark-dot-net.blogspot.com上有关于音频、WPF、Silverlight和软件工程最佳实践的博客。他是CodePlex上几个开源项目的作者,其中包括NAudio,一个底层的。net音频工具包(http://www.codeplex.com/naudio)。
Mark Heath is a software developer currently working for NICE CTI Systems in Southampton, UK. He specializes in .NET development with a particular focus on client side technologies and audio playback. He blogs about audio, WPF, Silverlight and software engineering best practices at http://mark-dot-net.blogspot.com. He is the author of several open source projects hosted at CodePlex, including NAudio, a low-level .NET audio toolkit (http://www.codeplex.com/naudio).
Tags: Audio, Model-View-ViewModel, MVVM, WPF
我喜欢你不掩饰架构的事实。有人可能会认为,因为它是Coding4Fun,所以它的设计将是扁平和单一的。您有单独的程序集来公开不同级别的功能。您还拥有各种实现或一些通用设计模式的开端。如IoC、命令中介、助手/服务、MVVP等。除了在MVVP上的一个小宣传外,这是没有说太多关于它的事情。伟大的工作!
I like the fact that you don’t gloss over architecture. One might think that because it is Coding4Fun, the design would be flat and monolithic. You have separate assemblies exposing different levels of functionality. You also have various implementations, or the start, of some common design patterns, like IoC, Command Mediator, Helper/Services, MVVM, etc. This is being done without saying too much about it other than a small blurb on MVVM. Great Job!
Last modified: October 08, 2009 at 9:00 PM
内特·格林伍德,我们可能会也可能不会就此发表文章
@Nate Greenwood, we may or may not have an article in the works for that
Last modified: October 08, 2009 at 9:00 PM
太棒了。这是一篇很棒的文章,正好赶上我正在考虑一个项目来学习如何利用我内置的监控摄像头和麦克风。
Awesome. Great article, and just in time as I was pondering a project to learn to take advantage of my built-in monitor webcam and microphone.
Last modified: October 08, 2009 at 9:00 PM
Adrian
震惊
amazing!!
你做了一份不可思议的工作,我是你的粉丝!!
You did an incredible job, I'm your fan!!
Last modified: October 08, 2009 at 9:00 PM
感谢你朋友,我们努力使文章既有用又展示做事情的有用方法。也许不总是成功,但我们努力了。
@SomeONe Thanks man, we try to make the articles both useful and show useful ways of doing stuff. May not always be successful but we try.
Last modified: October 09, 2009 at 9:00 PM
Robson Felix
这可以在XAML浏览器应用程序(XBAP)中使用吗?
谢谢,
Can this be used inside an XAML Browser Application (XBAP)?
Thanks,
Last modified: March 16, 2010 at 9:00 PM
如何将录音时间修改为大于60秒???
How can I change the recording time to a value bigger than 60 seconds???
Last modified: January 12, 2011 at 9:00 PM
你做的最好的教程…这是很好的解释…恭喜奥特
You do the best tutorials… that was really nicely explained… congrats to the authors
Last modified: June 04, 2011 at 7:03 AM
对于c#程序员来说,这是关于音频最有用的一篇文章。非常感谢!它肯定会帮助我开发我的程序。
This has to be the most useful article on audio there has come to exist for C# programmers. Thanks a LOT for it! It’ll help me develop my program for sure.
Last modified: July 23, 2011 at 3:15 PM
Bram Osterhout
我想保存一系列的音符/声音,我有一个数组,存储每个音符的频率,振幅和持续时间。实现这一目标的程序是什么?
I want to save a series of notes/sounds which I had in an array which stored each note’s frequency, amplitude, and duration. What would be the procedure for accomplishing this?
Thanks!
Last modified: September 03, 2011 at 5:35 PM
Microsoft.DirectX
Last modified: September 10, 2011 at 5:44 AM
重播的人问如何删除限制60秒。
Reply to the person asking how to remove the limit of 60 seconds.
在项目的AudioRecorder.cs中,将writetofile函数更改为following。
In AudioRecorder.cs of the project, change the WriteToFile function to the following.
private void WriteToFile(byte[] buffer, int bytesRecorded)
{
    if (recordingState == RecordingState.Recording
        || recordingState == RecordingState.RequestedStop)
    {
        writer.WriteData(buffer, 0, bytesRecorded);
    }
}
REBUILD THE SOLUTION !!!
进入您的解决方案并删除voicerecorder。音频和核心dll。
Go to your solution and delete voicerecorder.audio and core DLLs.
添加引用并浏览到构建的voicerecorder项目的bin位置,并导入这些DLL。看到voicerecorder项目是如何在我的桌面dll位置是。
Add references, browse to the bin location of the built voicerecorder project and import those DLLs. Since the voicerecorder project was on my desktop, the DLL location is:
C:\Users—\Desktop\voicerecorder_8c1b2512cf5a\bin\Debug
瞧,极限完全被去掉了。
Voila, the limit is totally removed.
Last modified: October 08, 2011 at 8:52 AM
谢谢你!这正是我所需要的!
Thank you! It is exactly what I’ve needed!
Last modified: October 13, 2011 at 12:21 AM
So what happened to: waveIn.DeviceNumber = selectedDevice;
我一直在想怎么选设备。我以为NAudio会让捕捉麦克风音频变得容易。不幸的是,似乎所有的示例都是在对DLL进行更改之前编写的。
I’ve been trying to figure out how to select a device. I thought NAudio would make capturing mic audio easy. Unfortunately, it appears that all the examples were written before the changes to the DLL.
好的,现在我该如何选择声卡呢?
OK, how do I select my sound card now?
Last modified: January 03, 2012 at 12:33 PM
@John -如果你注意到,在第一部分他给出了一个例子,如何看当前可用的设备,之后,您可以使用他们的“id”返回waveInDevice方法,对于第一个示例,在大多数情况下是0(默认设备)所以你需要设置waveIn。DeviceNumber = 0;在您的代码中。
@John - if you look at the first part, he gives an example of how to list the devices currently available. After that you can use their IDs, i.e. the waveInDevice index from the for loop at the beginning of the example. In most cases this will be 0 (the default device), so you just need to set waveIn.DeviceNumber = 0; in your code.
Last modified: January 18, 2012 at 5:56 AM
可能是一个非常无知的问题,但这是我第一次在这个网站上,有人能告诉我什么是wavin,它从哪里来的?非常感谢。
Probably a really really ignorant question but this is the first time I’ve been on this site, could someone please tell me what wavein is and where it has come from? Would be much appreciated.
Last modified: January 29, 2012 at 8:20 AM