Voice and video capture, encoding/decoding, and rendering: a roundup of open-source technologies


1. MediaStreamer2:   http://www.linphone.org/ 

http://hi.baidu.com/zui_chu_de_meng_xiang/blog/item/71d2d952ebaad6561038c286.html

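Mediastreamer2 is a lightweight, cross-platform streaming engine aimed mainly at voice and video telephony applications. It handles the sending and receiving of multimedia streams in linphone, including voice and video capture, encoding and decoding, and rendering.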

Main features:

  • Read from and write to an ALSA device, an OSS device, or a Windows waveapi device
  • Send and receive RTP/SRTP packets
  • DTMF (telephone tones) support using SIP INFO or RFC2833
  • Encode and decode the following formats: speex, G711, G722, GSM, H263, theora, iLBC, MPEG4, H264, and VP8 (WebM).
  • Read and write from/to a wav file
  • Read YUV pictures from a webcam (provided that it has video4linux v1 or v2 driver)
  • Display YUV pictures (using SDL library or native apis on windows)
  • Dual tones generation
  • Echo cancellation, using the extraordinary echo canceller algorithm from the speex library
  • Audio conferencing
  • Audio parametric equalizer using a FIR filter
  • Volume control, automatic gain control
  • Nat friendly: guesses NAT address for SIP messages, uses STUN for RTP streams
  • Sound backends: 
      • Linux: ALSA, OSS, PulseAudio
      • Windows: waveapi
      • MacOSX: HAL Audio Unit
      • iPhone: VoiceProcessing AudioUnit with built-in echo cancellation
      • Android sound system
      • JSR135 on BlackBerry
  • Efficient bandwidth management: bandwidth limitations are signaled using SDP (b=AS...), so audio and video sessions are established with bitrates that fit the user's network capabilities.

Mediastreamer2 can be extended with plugins; H264 and iLBC codec plugins are currently available.

Some definitions.

Filter: A filter is a Mediastreamer2 component that processes data. A filter has zero or more INPUT pins and zero or more OUTPUT pins. Possible uses of filters include:

   capture audio or video data.

   play audio or display video data.

   send or receive RTP data.

   encode or decode audio or video data.

   mix audio/video data.

Graph: A graph is a manager of filters connected together. It will transfer data from OUTPUT pins to INPUT pins and will be responsible for running filters.

How do I use Mediastreamer2?

Mediastreamer2 can be used for many different purposes. Its primary use is managing RTP audio and video sessions. You use the API to build filters and link them together in a graph; the ticker API then lets you start and stop the graph.

Basic graph sample:

AUDIO CAPTURE   -->   ENCODE -->     RTP

      FILTER      -->   FILTER -->    FILTER

The above graph is composed of three filters. The first one has no input: it captures audio data directly from the drivers and provides it on its OUTPUT pin. This data is sent to the INPUT pin of the encoder, which encodes the data and sends it to its own OUTPUT pin. That pin is connected to the INPUT pin of a filter that builds and sends RTP packets.

The modular design lets you encode in many different formats just by replacing the "ENCODE FILTER" with another one. Mediastreamer2 contains internal support for g711u, g711a, speex and gsm. You can add new encoding formats by implementing new filters, which can then be dynamically loaded.
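The three-filter graph described above can be sketched in plain C. This is only an illustration under stated assumptions: the `Filter`, `Buffer`, `SinkStats` and `graph_tick` names are hypothetical and are not the Mediastreamer2 API, the capture filter just produces silence, and the final filter counts payload bytes instead of sending RTP. The encoder stage is a standard G.711 mu-law compander, standing in for the "g711u" format mentioned in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SAMPLES 160   /* 20 ms of audio at 8 kHz */

/* Data handed from one filter's OUTPUT pin to the next filter's INPUT pin. */
typedef struct {
    uint8_t bytes[FRAME_SAMPLES * 2];
    size_t  len;
} Buffer;

/* Hypothetical filter type: `in` is NULL for a source, `out` is ignored by a sink. */
typedef struct Filter {
    const char *name;
    void (*process)(struct Filter *self, const Buffer *in, Buffer *out);
    void *state;   /* private data of the filter, may be NULL */
} Filter;

/* G.711 mu-law companding of one 16-bit linear PCM sample. */
uint8_t linear_to_ulaw(int16_t pcm)
{
    int sign = (pcm < 0) ? 0x80 : 0x00;
    int sample = (pcm < 0) ? -(int)pcm : (int)pcm;
    if (sample > 32635) sample = 32635;   /* clip */
    sample += 0x84;                       /* bias */
    int exponent = 7;
    for (int mask = 0x4000; exponent > 0 && (sample & mask) == 0; mask >>= 1)
        exponent--;
    int mantissa = (sample >> (exponent + 3)) & 0x0F;
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}

/* Source filter: no input pin; stands in for a sound-card capture. */
void capture_process(Filter *self, const Buffer *in, Buffer *out)
{
    (void)self; (void)in;
    int16_t samples[FRAME_SAMPLES] = {0};   /* one frame of silence */
    memcpy(out->bytes, samples, sizeof samples);
    out->len = sizeof samples;
}

/* Encoder filter: 16-bit PCM in, one mu-law byte per sample out. */
void ulaw_process(Filter *self, const Buffer *in, Buffer *out)
{
    (void)self;
    size_t n = in->len / sizeof(int16_t);
    for (size_t i = 0; i < n; i++) {
        int16_t s;
        memcpy(&s, in->bytes + i * sizeof s, sizeof s);
        out->bytes[i] = linear_to_ulaw(s);
    }
    out->len = n;
}

typedef struct { size_t packets, payload_bytes; } SinkStats;

/* Sink filter: stands in for the RTP sender, only counts what it receives. */
void sink_process(Filter *self, const Buffer *in, Buffer *out)
{
    (void)out;
    SinkStats *st = self->state;
    st->packets++;
    st->payload_bytes += in->len;
}

/* Graph: run every filter once, moving data from OUTPUT pin to INPUT pin. */
void graph_tick(Filter *chain, size_t count)
{
    assert(chain != NULL && count > 0);
    Buffer a = { .len = 0 }, b = { .len = 0 };
    Buffer *in = NULL, *out = &a;
    for (size_t i = 0; i < count; i++) {
        chain[i].process(&chain[i], in, out);
        in = out;                       /* this output feeds the next input */
        out = (out == &a) ? &b : &a;    /* ping-pong between two buffers */
    }
}
```

To try it, build an array of three `Filter` values (`capture_process`, `ulaw_process`, `sink_process` with a `SinkStats` as state) and call `graph_tick` on it once per frame. Swapping the encoder means replacing only the middle entry, which is the point of the modular design.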

List of existing filters.

Mediastreamer2 already provides a large set of filters. Here is a complete list of built-in filters.

All supported platforms:

   RTP receiver

   RTP sender

Audio Filters:

   audio capture

   audio playback

   several audio encoders/decoders: PCMU, PCMA, speex, gsm

   wav file reader.

   wav file recorder.

   resampler.

   conference bridge.

   volume analyser.

   acoustic echo canceller.

   dtmf generation filter.

Video Filters:

   video capture

   video display

   several video encoders/decoders: H263-1998, MP4V-ES, theora

   image resizer.

   format converter. (RGB24, I420...)

Plugin Filters:

   iLBC decoder/encoder.

Refer to : http://download.savannah.gnu.org/releases-noredirect/linphone/mediastreamer/doc/group__mediastreamer2.html

Mediastreamer2 - the multimedia streaming engine

Mediastreamer2 is a powerful and lightweight streaming engine specialized for voice/video telephony applications.

It is the library that is responsible for all the receiving and sending of multimedia streams in linphone, including voice/video capture, encoding and decoding, and rendering.

Features

★   Read from and write to an ALSA device, an OSS device, or a Windows waveapi device

★   Send and receive RTP packets

★   Encode and decode the following formats: speex, G711, GSM, H263, theora, iLBC, MPEG4, and H264.

★   Read and write from/to a wav file

★   Dual tones generation

★   Echo cancellation, using the extraordinary echo canceller algorithm from the speex library

★   Audio conferencing

★   Audio parametric equalizer using a FIR filter

★   Volume control, automatic gain control

★   Works on linux, windows XP

Mediastreamer2 can be extended with dynamic plugins; H264 and iLBC codec plugins are currently available.

Design and principles

Each processing entity is contained within a MSFilter object. MSFilter(s) have inputs and/or outputs that can be used to connect from and to other MSFilters.

A trivial example to understand:

☆   MSRtpRecv is a MSFilter that receives RTP packets from the network, depacketizes them and posts them on its only output.

☆   MSSpeexDec is a MSFilter that takes everything on its input, assuming these are speex-encoded packets, decodes them and puts the result on its output.

☆   MSFileRec is a MSFilter that takes everything on its input and writes it to a wav file (assuming the input is 16-bit linear PCM).

MSFilters can be connected together to form a filter chain. If we assemble the three examples above, we obtain a processing chain that receives RTP packets, decodes them and writes the uncompressed result into a wav file.

The execution of the media processing work is scheduled by a MSTicker object, a thread that wakes up every 10 ms to process data in all the MSFilter chains it manages. Several MSTickers can be used simultaneously, for example one for audio filters, one for video filters, or one on each processor of the machine where it runs.
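The ticker idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: `Ticker`, `ticker_attach`, `ticker_run` and `AudioChain` are hypothetical names, not the MSTicker API, and the loop advances a simulated clock instead of sleeping in a real thread. What it does show is the scheduling pattern from the paragraph above: one loop, a fixed 10 ms period, and every attached chain processed on each tick.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TICK_INTERVAL_MS 10   /* tick period stated in the text */
#define MAX_CHAINS 8          /* arbitrary limit for this sketch */

typedef void (*ChainProcess)(void *chain, uint64_t now_ms);

typedef struct {
    ChainProcess process[MAX_CHAINS];
    void *chain[MAX_CHAINS];
    size_t count;
    uint64_t ticks;           /* number of ticks executed so far */
} Ticker;

void ticker_init(Ticker *t)
{
    t->count = 0;
    t->ticks = 0;
}

/* Attach a chain to the ticker; returns -1 when the ticker is full. */
int ticker_attach(Ticker *t, ChainProcess fn, void *chain)
{
    if (t->count == MAX_CHAINS) return -1;
    t->process[t->count] = fn;
    t->chain[t->count] = chain;
    t->count++;
    return 0;
}

/* Run the ticker over the simulated interval [start_ms, end_ms).
 * Each deadline is computed from the start time, so time spent processing
 * one tick does not accumulate as drift in later ticks. */
void ticker_run(Ticker *t, uint64_t start_ms, uint64_t end_ms)
{
    assert(t != NULL);
    for (uint64_t n = 0; ; n++) {
        uint64_t deadline = start_ms + n * TICK_INTERVAL_MS;
        if (deadline >= end_ms) break;
        /* A real ticker thread would sleep here until `deadline`. */
        for (size_t i = 0; i < t->count; i++)
            t->process[i](t->chain[i], deadline);
        t->ticks++;
    }
}

/* Example chain: pretends to consume 8 kHz audio, i.e. 80 samples per tick. */
typedef struct {
    uint64_t samples;
    uint64_t last_tick_ms;
} AudioChain;

void audio_chain_process(void *chain, uint64_t now_ms)
{
    AudioChain *c = chain;
    c->samples += 8000 * TICK_INTERVAL_MS / 1000;
    c->last_tick_ms = now_ms;
}
```

Running two `AudioChain` instances on one ticker for 100 ms of simulated time gives ten ticks and 800 samples per chain. Using separate tickers for audio and video, as the text suggests, would simply mean two such loops on two threads.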

Mediastreamer2 is easy to use

If your intent is simply to create audio and video streams, a simple API for doing so is defined in audiostream.h and videostream.h.

It can be as simple as:

AudioStream *as = audio_stream_start(.../*list of parameters*/);
sleep(10);
audio_stream_stop(as);

If your intent is to add new functionality to mediastreamer2, you'll be glad to know that implementing a mediastreamer2 filter is very straightforward: no complex declarations, inheritance or the like.

As an example, have a look at the 'MSVolume' MSFilter, whose goal is to measure and control the loudness of an audio stream.

Thanks to this lightweight framework, developers can concentrate on what matters: the implementation of the algorithm!

Mediastreamer2 is also suitable for embedded systems

★   Mediastreamer2 is light. For example, on linux/x86 the full-featured shared library takes around 800 KB unstripped and compiled with -g (debug). Data messages that carry the media data within mediastreamer2 chains are optimized using the well-known System V mblk_t structure. This avoids copies for as long as possible and allows low-cost fragmentation and reassembly operations, which are very common, especially when processing video streams.

★   Mediastreamer2 is written in C.

★   Mediastreamer2 compiles on arm with gcc.

★   Mediastreamer2 has only oRTP and libc as minimal dependencies. Others (ffmpeg, speex, alsa...) can be added optionally if you need all features.

Thanks to its plugin architecture, mediastreamer2 can be extended to interface with hardware codecs, for example video codec DSPs.
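The copy-avoidance idea behind the mblk_t structure mentioned in the list above can be sketched as follows. This is not the oRTP or Mediastreamer2 implementation: `Msg`, `DataBlock` and the `msg_*` functions are hypothetical names chosen for this example. The sketch assumes the usual design of a small message header holding read/write pointers into a reference-counted data block, so that a fragment of a large video frame is just a second header pointing into the same bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Shared storage; freed when the last message referring to it is freed. */
typedef struct {
    uint8_t *base;
    size_t   size;
    int      refcount;
} DataBlock;

/* A message is a window [rptr, wptr) into a data block. */
typedef struct {
    DataBlock *data;
    uint8_t   *rptr;   /* first valid byte */
    uint8_t   *wptr;   /* one past the last valid byte */
} Msg;

Msg *msg_alloc(size_t size)
{
    Msg *m = malloc(sizeof *m);
    DataBlock *d = malloc(sizeof *d);
    uint8_t *buf = malloc(size);
    if (!m || !d || !buf) { free(m); free(d); free(buf); return NULL; }
    d->base = buf; d->size = size; d->refcount = 1;
    m->data = d; m->rptr = m->wptr = buf;
    return m;
}

size_t msg_len(const Msg *m)
{
    assert(m != NULL);
    return (size_t)(m->wptr - m->rptr);
}

/* Append bytes to the message; this is the only place data is copied. */
int msg_append(Msg *m, const void *src, size_t n)
{
    size_t room = m->data->size - (size_t)(m->wptr - m->data->base);
    if (n > room) return -1;
    memcpy(m->wptr, src, n);
    m->wptr += n;
    return 0;
}

/* New header over the same bytes: no payload copy, only a refcount bump. */
Msg *msg_dup(const Msg *m)
{
    Msg *v = malloc(sizeof *v);
    if (!v) return NULL;
    *v = *m;
    v->data->refcount++;
    return v;
}

/* Fragment: a view of `len` bytes starting `offset` bytes into the payload. */
Msg *msg_slice(const Msg *m, size_t offset, size_t len)
{
    size_t avail = msg_len(m);
    if (offset > avail || len > avail - offset) return NULL;
    Msg *v = msg_dup(m);
    if (!v) return NULL;
    v->rptr = m->rptr + offset;
    v->wptr = v->rptr + len;
    return v;
}

void msg_free(Msg *m)
{
    if (!m) return;
    if (--m->data->refcount == 0) {
        free(m->data->base);
        free(m->data);
    }
    free(m);
}
```

Splitting a 3000-byte frame into RTP-sized fragments then costs one small header allocation per fragment, and the frame's bytes stay valid until the last fragment has been freed.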




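2.  VoIP performance and quality testing: MCS MyVoIP

MCS MyVoIP accurately simulates the voice quality and performance of VoIP (voice over IP) telephony over a UDP data transfer between a server and a browser client. The connection ... is tested, and jitter, packet loss and the supported voice quality level are evaluated. The VoIP test can be configured for different codecs or packet sizes, with custom packet rates and test lengths. Tests can also be combined with bandwidth speed tests and network route diagnostics for deeper correlation analysis.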


3. VoIP client for Android phones: Sipdroid

Sipdroid is a SIP/VoIP client that runs on the Android phone platform.


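4. Web video conferencing software: VMukti

VMukti is web video conferencing software. It describes itself as the first open-source PBX and conferencing software, supporting voice/video communication from home or the office, desktop sharing and more. It is built on VoIP technology.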


5. VoIP application for iPhone: Siphon: http://code.google.com/p/siphon/

Siphon is a SIP/VoIP application for the iPhone with support for multiple languages, including Chinese.


6. Video4Linux2

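Video4Linux2 is the application programming interface for audio/video and image development on Linux. The source article uses the data structures and functions provided by Video4Linux2 to capture image data from a USB camera under Linux, and uses the GTK library to display and play back the video.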


7. Intel Media SDK (IPP)

The API simplifies video encoding, decoding and preprocessing. It supports encoding to H.264 and MPEG-2, and decoding of H.264, MPEG-2 and VC-1.


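8. Video effects library: frei0r

frei0r is a C library that provides a set of common video effects. Filters and mixers are controlled through a few simple parameters to produce different effects.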


9. C++ audio processing framework: GNU ccAudio2

ccAudio2 is a simple, highly portable, standalone C++ framework for processing audio data.


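10. Video capture API: VideoMan

VideoMan provides a set of video capture APIs. It supports several simultaneous video inputs (video cables, USB cameras, video files and so on), can process the input with OpenGL, and integrates easily with OpenCV, CUDA and similar tools for building computer vision systems.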


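11. Real-time streaming media transport toolkit: jrtplib

RTP is the standard way to handle real-time transport of streaming media. For real-time streaming programming on Linux, open-source RTP libraries such as LIBRTP and JRTPLIB are worth considering. JRTPLIB is an object-oriented RTP library designed to follow RFC 1889 (since superseded by RFC 3550), and it is a very good choice in many situations.

JRTPLIB is an RTP library implemented in C++ that currently runs on Windows, Linux, FreeBSD, Solaris, Unix, VxWorks and other operating systems.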
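To make concrete what such a library handles on the wire, here is a sketch of the 12-byte fixed RTP header, which is the same in RFC 1889 and RFC 3550. It is hand-written C and does not use JRTPLIB (which is a C++ library); the `RtpHeader` struct and the `rtp_pack` / `rtp_unpack` functions are hypothetical names for this example. The sketch assumes no padding, no header extension and no CSRC entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTP_HEADER_LEN 12

typedef struct {
    uint8_t  payload_type;   /* 7 bits, e.g. 0 for PCMU */
    uint8_t  marker;         /* 1 bit */
    uint16_t seq;            /* increments by one per packet */
    uint32_t timestamp;      /* in units of the payload clock rate */
    uint32_t ssrc;           /* identifies the sending source */
} RtpHeader;

/* Serialize the header in network byte order; returns bytes written or 0. */
size_t rtp_pack(const RtpHeader *h, uint8_t *buf, size_t buflen)
{
    assert(h != NULL && buf != NULL);
    if (buflen < RTP_HEADER_LEN) return 0;
    buf[0]  = 0x80;   /* version 2, no padding, no extension, CSRC count 0 */
    buf[1]  = (uint8_t)((h->marker ? 0x80 : 0x00) | (h->payload_type & 0x7F));
    buf[2]  = (uint8_t)(h->seq >> 8);
    buf[3]  = (uint8_t)(h->seq & 0xFF);
    buf[4]  = (uint8_t)(h->timestamp >> 24);
    buf[5]  = (uint8_t)(h->timestamp >> 16);
    buf[6]  = (uint8_t)(h->timestamp >> 8);
    buf[7]  = (uint8_t)(h->timestamp);
    buf[8]  = (uint8_t)(h->ssrc >> 24);
    buf[9]  = (uint8_t)(h->ssrc >> 16);
    buf[10] = (uint8_t)(h->ssrc >> 8);
    buf[11] = (uint8_t)(h->ssrc);
    return RTP_HEADER_LEN;
}

/* Parse a fixed header; returns 0 on success, -1 if too short or not RTP v2. */
int rtp_unpack(const uint8_t *buf, size_t len, RtpHeader *h)
{
    assert(buf != NULL && h != NULL);
    if (len < RTP_HEADER_LEN || (buf[0] >> 6) != 2) return -1;
    h->marker       = (uint8_t)(buf[1] >> 7);
    h->payload_type = (uint8_t)(buf[1] & 0x7F);
    h->seq          = (uint16_t)((buf[2] << 8) | buf[3]);
    h->timestamp    = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                      ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    h->ssrc         = ((uint32_t)buf[8] << 24) | ((uint32_t)buf[9] << 16) |
                      ((uint32_t)buf[10] << 8) |  (uint32_t)buf[11];
    return 0;
}
```

A sender would call `rtp_pack`, append the encoded payload, and send the result over UDP, incrementing `seq` by one and `timestamp` by the number of samples per packet each time. A library such as JRTPLIB wraps this bookkeeping, along with RTCP, behind its own API.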


12. Audio compression library: Speex


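13. SIP simulation tool: SIP Inspector

SIP Inspector is a tool for simulating different SIP messages and call scenarios. It can be used to create SIP signaling, customize SIP messages and work with incoming and outgoing message packets. It can also play RTP streams directly from a pcap file.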


14. Video codec: VP8

VP8: high-quality video coding, made available to everyone under a BSD-style royalty-free license.


15. x264 video encoder

x264 is a free, open-source video encoder for H.264/AVC. The build referred to here is the VFW codec version for the win32 platform.

Encoder features

  • 8x8 and 4x4 adaptive spatial transform
  • Adaptive B-frame placement
  • B-frames as references / arbitrary frame order
  • CAVLC/CABAC entropy coding
  • Custom quantization matrices
  • Intra: all macroblock types (16x16, 8x8, 4x4, and PCM with all predictions)
  • Inter P: all partitions (from 16x16 down to 4x4)
  • Inter B: partitions from 16x16 down to 8x8 (including skip/direct)
  • Interlacing (MBAFF)
  • Multiple reference frames
  • Ratecontrol: constant quantizer, constant quality, single or multipass ABR, optional VBV
  • Scenecut detection
  • Spatial and temporal direct mode in B-frames, adaptive mode selection
  • Parallel encoding on multiple CPUs
  • Predictive lossless mode
  • Psy optimizations for detail retention (adaptive quantization, psy-RD, psy-trellis)
  • Zones for arbitrarily adjusting bitrate distribution

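16. DirectShow decoder/encoder: FFDShow

FFDShow is an all-purpose DirectShow decoder and encoder. It can decode the common video formats and almost all audio formats, and it offers a rich set of post-processing options, such as sharpening the picture and adjusting its brightness. It also supports many subtitle formats and can make audio and video playback smoother.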


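17. High-quality web video compression format: WebM

The WebM project aims to develop a high-quality, open video format for a web that is open to everyone.

In contrast to the H.264 standard backed by Apple, the WebM standard proposed by Google is essentially VP8 video coding plus Vorbis (an open-source audio compression format free of patent restrictions).

WebM web video leans toward open source and is built around the HTML5 standard. Most notably, WebM is backed by software companies including Opera, Mozilla and Adobe, and by hardware companies including AMD, ARM, NVIDIA and Qualcomm, which gives it great potential. At the time of writing, YouTube, the world's largest video site, had also just started supporting WebM.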


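18. Android multimedia framework: OpenCore

OpenCore is an open-source multimedia framework released by Google together with PacketVideo. Its H.264 decoder is claimed to be the best of the open-source H.264 decoders currently available; it has been tested on win32 and armv4, with performance reported to be roughly 20% better.

Another common name for OpenCore is PacketVideo; it is the multimedia core of Android. Strictly speaking, PacketVideo is the name of a company, while OpenCore is the name of the software layer of this multimedia framework. Among Android developers the two terms mean essentially the same thing. Compared with Android's other libraries, the OpenCore codebase is very large. It is implemented in C++, defines a full-featured operating system porting layer, wraps each basic function in a class, and relies heavily on inheritance for the interfaces between layers.

OpenCore is a multimedia framework that, broadly speaking, consists of two main parts:

    * PVPlayer: provides media player functionality, handling playback of audio and video streams
    * PVAuthor: provides media recording functionality, handling capture of audio and video streams as well as still images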


19. oRTP lib 

oRTP - a Real-time Transport Protocol (RFC3550) stack under LGPL

Features:

★   Written in C, works under Linux (and probably any Unix) and Windows.

★   Implements RFC3550 (RTP) with an easy-to-use API offering both high-level and low-level access.

★   Includes support for multiple profiles, the AV profile (RFC3551) being the default.

★   Includes a packet scheduler to send and receive packets "on time", according to their timestamp. Scheduling is optional; RTP sessions can remain unscheduled.

★   Supports multiplexed I/O, so that hundreds of RTP sessions can be scheduled by a single thread.

★   Features an adaptive jitter algorithm for a receiver to adapt to the clockrate of the sender.

★   Supports part of RFC2833 for telephone events over RTP.

★   The API is well documented using gtk-doc.

★   Licensed under the GNU Lesser General Public License.

★   RTCP messages sent periodically since 0.7.0 (compound packet including sender report or receiver report + SDES)

★   Includes an API to parse incoming RTCP packets.

Todo features:

★   Multi stream sessions ?


20. WebRTC  http://baike.baidu.com/view/5855785.htm

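WebRTC is a technology for real-time video and audio communication inside the browser. It provides the core building blocks of video conferencing, including audio/video capture, encoding and decoding, network transport, video display (direct3d9 and directdraw) and video image processing, and it is cross-platform: windows, linux, mac, android. A local file can be used as a video source (virtual camera support), and WebRTC can also record audio and video to local files, which is a handy feature. The audio part of WebRTC covers devices, codecs (iLBC/iSAC/G722/PCM16/RED/AVT, NetEQ), encryption, sound files, audio processing, audio output, volume control, audio/video synchronization, and network transport and flow control (RTP/RTCP).

WebRTC uses the iLBC/iSAC/G722/PCM16/RED/AVT codecs.

  WebRTC also provides NetEQ, a jitter buffer and packet loss concealment module that improves audio quality while keeping delay to a minimum.
  Another core feature is audio mixing for voice conferencing.

Audio processing operates on the audio data and includes acoustic echo cancellation (AEC), AECM (the AEC variant for mobile devices), automatic gain control (AGC) and noise suppression, all aimed at improving sound quality.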

Building WebRTC  :   http://white313.blog.163.com/blog/static/2102620116314827580/



21. Virtual camera  http://baike.baidu.com/view/1366860.htm


22.  oSIP protocol stack:  http://hi.baidu.com/zui_chu_de_meng_xiang/blog/item/636d2f327847a193a9018ea1.html
