Discussion about AEC in Speex

http://lists.xiph.org/pipermail/speex-dev/2010-March/007694.html


On 2010-03-03 02:40, brant wrote:

Hi,

The notch filter in AEC is only used to remove the DC signal, and the convergence time is not important, right? If so, I think the preset value of notch_radius is too small, and it causes noticeable distortion (frequencies below 200 Hz are cut).

There is a picture in the attachment showing the signals under different radii in the time domain.

By setting notch_radius to 0.999 for all sampling rates, I got better voice quality (less distortion), while AEC still works fine.
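To see how the radius moves the cutoff, here is a minimal sketch using a generic first-order DC-blocking filter, y[n] = g*(x[n] - x[n-1]) + r*y[n-1]. It is a simplified stand-in chosen for illustration, not the filter that Speex implements; the function names, the 8 kHz sampling rate and the radius of 0.9 used for comparison are all assumptions.

```python
import cmath
import math


def dc_blocker_gain(freq_hz, radius, sample_rate):
    """Magnitude of H(z) = g * (1 - z^-1) / (1 - radius * z^-1) at freq_hz."""
    g = (1.0 + radius) / 2.0  # normalizes the gain at Nyquist to 1
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    return abs(g * (1.0 - z_inv) / (1.0 - radius * z_inv))


def dc_blocker(samples, radius):
    """Time-domain form of the same (hypothetical) filter."""
    g = (1.0 + radius) / 2.0
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = g * (x - prev_x) + radius * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


if __name__ == "__main__":
    sample_rate = 8000  # assumed narrowband rate
    for radius in (0.9, 0.999):
        gains = [dc_blocker_gain(f, radius, sample_rate) for f in (50, 100, 200, 400)]
        print(radius, [round(g, 3) for g in gains])
```

For this simple filter the cutoff is roughly (1 - r) * fs / (2 * pi), so at 8 kHz a radius of 0.9 attenuates up to on the order of 100 Hz, while 0.999 only touches the region around 1 Hz. A radius closer to 1 also means a longer settling time after a DC step.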

 

 

 

On 03/03/2010 07:39 PM, Jean-Marc Valin wrote:

The notch filter is specifically designed to cut below 200 Hz when working in narrowband. In wideband, the cutoff is more around 50 Hz.

The reason is that in narrowband operation (irrespective of the codec), you're not really supposed to have anything below ~200 Hz, but a lot of people forget that.

       Jean-Marc

 

 

Hi Jean-Marc,

You make that sound like it's just a matter of meeting some arbitrary spec. Let's be more specific.....

If you use narrowband voice down to deep bass frequencies:

     - 16 bit linear audio sounds good

     - alaw or ulaw sounds muddy

     - low bit rate codecs, like speex or G.729, sound awful.

I assume QinBin only listened to some uncompressed audio in his evaluation.

 

 

 

Quoting Steve Underwood:

> You make that sound like it's just a matter of meeting some arbitrary spec. Let's be more specific.....

 

Well, the corresponding spec is G.712, though the reason for doing that has more to do with quality (as you point out) than just meeting the spec.

 

> If you use narrowband voice down to deep bass frequencies:

>      - 16 bit linear audio sounds good

 

Even there I tend to disagree because when you chop all the high frequencies, keeping the deep bass makes the sound even more "muffled" IMO.

 

>      - alaw or ulaw sounds muddy

>      - low bit rate codecs, like speex or G.729, sound awful.

 

Definitely agree here. It's amazing to see how many systems don't do this filtering before encoding and end up with poor quality speech.

 

   Jean-Marc

 

 

 

Hi,

But in fact, it really affects the voice quality. One of my testers said, "Is your mouth far away from the mic?"

Could you explain why we should cut below 200 Hz?

 

 

 

On 03/03/2010 10:22 PM, QianBin wrote:

We already said it affects the quality when the voice is compressed. Are you asking why that should be?

 

Even with a simple form of lightly lossy compression like alaw/ulaw, a big low frequency waveform can push the coder into using coarse steps, so the more interesting energy in the middle of the voice band is coded with those coarse steps - i.e. poorly.
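The coarse-step effect can be sketched numerically. The example below is an illustration under stated assumptions, not a G.711 implementation: it uses the continuous mu-law curve (mu = 255) followed by a uniform quantizer instead of the standard's segmented tables, and the signal amplitudes (0.8 for a 50 Hz bass tone, 0.02 for a 1 kHz mid-band tone) and all function names are made up for the demonstration.

```python
import math

MU = 255.0


def mulaw_roundtrip(x, levels=127):
    """Compress with the continuous mu-law curve, quantize uniformly, expand."""
    sign = -1.0 if x < 0 else 1.0
    compressed = math.log1p(MU * abs(x)) / math.log1p(MU)
    quantized = round(compressed * levels) / levels
    return sign * math.expm1(quantized * math.log1p(MU)) / MU


def noise_power(signal):
    """Mean squared error introduced by the companding round trip."""
    errors = [mulaw_roundtrip(s) - s for s in signal]
    return sum(e * e for e in errors) / len(errors)


def make_signal(bass_amplitude, n=1600, sample_rate=8000):
    """A quiet 1 kHz tone, optionally riding on a loud 50 Hz tone."""
    return [
        bass_amplitude * math.sin(2 * math.pi * 50 * i / sample_rate)
        + 0.02 * math.sin(2 * math.pi * 1000 * i / sample_rate)
        for i in range(n)
    ]


if __name__ == "__main__":
    without_bass = noise_power(make_signal(0.0))
    with_bass = noise_power(make_signal(0.8))
    print("noise power without bass:", without_bass)
    print("noise power with bass:   ", with_bass)
```

The mid-band tone is identical in both cases, but with the bass present the samples sit high on the companding curve most of the time, where the quantizer steps are much larger, so the same tone is buried in far more quantization noise.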

 

Low bit rate codecs, like G.729 or speex, usually apply their own fast rolloff filter for low frequencies - e.g. G.729 rolls off its input hard, with a turnover at 140 Hz - their whole coding methodology tends to break down without that.

 

The telephone network normally rolls off hard below about 300 Hz, so this is actually perfectly normal voice for a phone call.

 

A second reason to remove the bass frequencies is that if you don't, you can get nasty echoes if the audio is sent to the PSTN. Analogue PSTN interfaces don't always remove the bass frequencies, and digital interfaces (BRI/E1/T1) never do. If you send those frequencies through an analogue segment of the network, the hybrids can do horrible non-linear things that the echo cancellers cannot handle. To protect their own stability, echo cancellers may insert their own bass filter when they switch on, but it's poor practice to rely on that. It's better not to throw the bass frequencies at them.

