For developers in China, Agora provides a WebRTC mirror that makes downloading and building WebRTC much faster.
Mirror notes
Supported build targets: Linux, Android, iOS, and Windows.
The mirror is updated from time to time, following upstream releases.
Mirror address
The configuration and usage guide is on the same page; just follow the steps there.
The WebRTC codebase is large, and understanding it all in one pass is unrealistic. In this series I will try to answer three questions:
1. How do clients establish a connection with each other?
2. How do clients transfer data to each other?
3. What is the complete pipeline for capturing, previewing, encoding, transporting, decoding, and rendering audio and video?
This is the first article in the series. Starting from the Android platform, it analyzes how WebRTC implements camera capture.
Code location: webrtc/src/sdk/android/api/org/webrtc and webrtc/src/sdk/android/src/java/org/webrtc
VideoCapturer.java, CameraCapturer.java, Camera1Capturer.java, Camera2Capturer.java
Camera1Session.java, Camera2Session.java
Main classes for video capture
VideoCapturer
WebRTC exposes video capture to the application mainly through the VideoCapturer interface. Its implementations are ScreenCapturerAndroid, FileVideoCapturer and CameraCapturer, representing three different video sources: the screen, a file, and the camera. Because Android provides both the Camera1 and Camera2 APIs, CameraCapturer in turn has two subclasses, Camera1Capturer and Camera2Capturer.
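For reference, the VideoCapturer interface itself is small. The sketch below is lightly trimmed; the method names follow the WebRTC source of roughly this era:
public interface VideoCapturer {
  // Binds the capturer to a SurfaceTextureHelper (GPU frame path), an Android Context
  // and the observer that will receive the frames.
  void initialize(SurfaceTextureHelper surfaceTextureHelper, Context applicationContext,
      CapturerObserver capturerObserver);

  void startCapture(int width, int height, int framerate);
  void stopCapture() throws InterruptedException;
  void changeCaptureFormat(int width, int height, int framerate);
  void dispose();

  // True for ScreenCapturerAndroid, false for camera/file capturers.
  boolean isScreencast();
}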
VideoCapturer defines the CapturerObserver interface, shown below; implementing it is how a module receives the captured frames.
// Interface used for providing callbacks to an observer.
public interface CapturerObserver {
  // Notify if the camera have been started successfully or not.
  // Called on a Java thread owned by VideoCapturer.
  void onCapturerStarted(boolean success);
  void onCapturerStopped();

  // Delivers a captured frame. Called on a Java thread owned by VideoCapturer.
  void onByteBufferFrameCaptured(
      byte[] data, int width, int height, int rotation, long timeStamp);

  // Delivers a captured frame in a texture with id |oesTextureId|. Called on a Java thread
  // owned by VideoCapturer.
  void onTextureFrameCaptured(int width, int height, int oesTextureId, float[] transformMatrix,
      int rotation, long timestamp);

  // Delivers a captured frame. Called on a Java thread owned by VideoCapturer.
  void onFrameCaptured(VideoFrame frame);
}
In WebRTC, the AndroidVideoTrackSourceObserver class implements this interface and then dispatches the frames to the sinks, i.e. the modules that need image data, such as the encoder and the local preview.
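The essence of that class is a thin bridge from the Java callback to a native pointer. Below is a condensed, hypothetical sketch of the pattern (the real class implements every CapturerObserver callback and does more bookkeeping):
class FrameForwarder {
  private final long nativeSource;  // pointer to the native AndroidVideoTrackSource

  FrameForwarder(long nativeSource) { this.nativeSource = nativeSource; }

  // Called from the capturer thread; hands the YUV bytes straight to native code,
  // which wraps them in a VideoFrame and feeds the VideoBroadcaster.
  void onByteBufferFrameCaptured(byte[] data, int width, int height, int rotation, long timestamp) {
    nativeOnByteBufferFrameCaptured(nativeSource, data, data.length, width, height, rotation, timestamp);
  }

  private static native void nativeOnByteBufferFrameCaptured(
      long nativeSource, byte[] data, int length, int width, int height, int rotation, long timestamp);
}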
CameraEnumerator
CameraCapturer objects are created by a CameraEnumerator. Since a device can have front and back cameras, the CameraEnumerator assigns each camera a specific name and creates the CameraCapturer object from that name; Camera1Enumerator creates Camera1Capturer objects and Camera2Enumerator creates Camera2Capturer objects.
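A typical usage sketch (not taken from the demo): prefer Camera2 when the device supports it, then pick the front camera by name; the null events handler is only for brevity.
CameraVideoCapturer createFrontCameraCapturer(Context context) {
  CameraEnumerator enumerator = Camera2Enumerator.isSupported(context)
      ? new Camera2Enumerator(context)
      : new Camera1Enumerator(true /* captureToTexture */);
  for (String name : enumerator.getDeviceNames()) {
    if (enumerator.isFrontFacing(name)) {
      return enumerator.createCapturer(name, null /* eventsHandler */);
    }
  }
  return null;  // no front camera found
}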
camera1
- Create the Camera object: Camera.open
- Set the preview SurfaceTexture that will receive the frames (which live in GPU memory): camera.setPreviewTexture
- Choose suitable preview parameters (size, frame rate, focus): Parameters and camera.setParameters
- If you also need frame data in memory, set a buffer and a listener: camera.addCallbackBuffer and camera.setPreviewCallbackWithBuffer
- If you want the camera service to rotate the frames for you, set the display orientation: camera.setDisplayOrientation
- Start the preview: camera.startPreview
- Stop the preview: camera.stopPreview and camera.release
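A minimal sketch of the steps above, assuming the camera id and SurfaceTexture come from the caller; the size and fps values are hard-coded for illustration and should really come from getSupportedPreviewSizes()/getSupportedPreviewFpsRange():
import android.graphics.SurfaceTexture;
import android.hardware.Camera;
import java.io.IOException;

void startCamera1Preview(int cameraId, SurfaceTexture surfaceTexture) throws IOException {
  Camera camera = Camera.open(cameraId);              // 1. create the Camera object
  camera.setPreviewTexture(surfaceTexture);           // 2. frames land in this texture (GPU memory)

  Camera.Parameters params = camera.getParameters();  // 3. preview size / fps / focus
  params.setPreviewSize(1280, 720);
  params.setPreviewFpsRange(15000, 30000);
  camera.setParameters(params);

  // 4. optional: also receive NV21 frames in memory
  camera.addCallbackBuffer(new byte[1280 * 720 * 3 / 2]);
  camera.setPreviewCallbackWithBuffer((data, cam) -> cam.addCallbackBuffer(data));

  camera.setDisplayOrientation(90);                   // 5. let the camera service rotate frames
  camera.startPreview();                              // 6. start preview
  // 7. when done: camera.stopPreview(); camera.release();
}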
Camera2
- Create the CameraManager object; camera operations start with this "camera butler": context.getSystemService(Context.CAMERA_SERVICE)
- Create the CameraDevice object: cameraManager.openCamera
- Unlike Camera1, every Camera2 operation is asynchronous: openCamera takes a callback through which we receive the camera state events
- Opened successfully: CameraDevice.StateCallback#onOpened
- Once the camera object exists, start the preview session and set the data callback: camera.createCaptureSession, which again takes a callback
- Session configured successfully: CameraCaptureSession.StateCallback#onConfigured
- With the session open, set the data format (size, frame rate, focus) and issue the capture request: CaptureRequest.Builder and session.setRepeatingRequest
- Stop the preview: cameraCaptureSession.stopRepeating and cameraDevice.close
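A minimal sketch of the steps above; the CAMERA permission, camera id, target Surface, and background Handler are assumed to come from the caller, and error handling is trimmed:
import android.content.Context;
import android.hardware.camera2.CameraAccessException;
import android.hardware.camera2.CameraCaptureSession;
import android.hardware.camera2.CameraDevice;
import android.hardware.camera2.CameraManager;
import android.hardware.camera2.CaptureRequest;
import android.os.Handler;
import android.view.Surface;
import java.util.Collections;

void startCamera2Preview(Context context, String cameraId, Surface target, Handler handler)
    throws CameraAccessException {
  // 1. the "camera butler"
  CameraManager manager = (CameraManager) context.getSystemService(Context.CAMERA_SERVICE);
  // 2. open the camera; every Camera2 call is asynchronous and reports back through callbacks
  manager.openCamera(cameraId, new CameraDevice.StateCallback() {
    @Override public void onOpened(CameraDevice camera) {                            // 3. opened
      try {
        // 4. start the preview session, again with a callback
        camera.createCaptureSession(Collections.singletonList(target),
            new CameraCaptureSession.StateCallback() {
              @Override public void onConfigured(CameraCaptureSession session) {     // 5. session ready
                try {
                  // 6. describe the data we want (format, focus) and issue a repeating request
                  CaptureRequest.Builder builder =
                      camera.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);
                  builder.addTarget(target);
                  builder.set(CaptureRequest.CONTROL_AF_MODE,
                      CaptureRequest.CONTROL_AF_MODE_CONTINUOUS_VIDEO);
                  session.setRepeatingRequest(builder.build(), null, handler);
                } catch (CameraAccessException e) { camera.close(); }
              }
              @Override public void onConfigureFailed(CameraCaptureSession session) { camera.close(); }
            }, handler);
      } catch (CameraAccessException e) { camera.close(); }
    }
    @Override public void onDisconnected(CameraDevice camera) { camera.close(); }    // 7. stop: close
    @Override public void onError(CameraDevice camera, int error) { camera.close(); }
  }, handler);
}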
The CameraEnumerator interface is as follows:
public interface CameraEnumerator {
  public String[] getDeviceNames();
  public boolean isFrontFacing(String deviceName);
  public boolean isBackFacing(String deviceName);
  public List<CaptureFormat> getSupportedFormats(String deviceName);

  public CameraVideoCapturer createCapturer(
      String deviceName, CameraVideoCapturer.CameraEventsHandler eventsHandler);
}
CameraSession
CameraCapturer mainly implements the interface and maintains state; the actual interaction with the Android camera APIs is handled by the CameraSession object it creates. Again there are two implementations: Camera1Session wraps the Camera1 API (android.hardware.Camera) and Camera2Session wraps the Camera2 API (android.hardware.camera2). Both Camera1 and Camera2 can deliver frames through a SurfaceTexture, and WebRTC provides SurfaceTextureHelper to manage it; Camera1 can additionally deliver YUV data through a registered PreviewCallback.
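Depending on the WebRTC version, the wiring below is done either by the application or internally by PeerConnectionFactory.createVideoSource; either way the calls look roughly like this sketch (eglBase, appContext, capturer and observer are assumed to exist):
SurfaceTextureHelper helper =
    SurfaceTextureHelper.create("CaptureThread", eglBase.getEglBaseContext());
capturer.initialize(helper, appContext, observer);
capturer.startCapture(1280, 720, 30);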
CameraSession dispatches events and frames through the CreateSessionCallback and Events interfaces shown below; CameraCapturer implements both in order to receive and forward the events and image data.
public interface CreateSessionCallback {
  void onDone(CameraSession session);
  void onFailure(FailureType failureType, String error);
}

// Events are fired on the camera thread.
public interface Events {
  void onCameraOpening();
  void onCameraError(CameraSession session, String error);
  void onCameraDisconnected(CameraSession session);
  void onCameraClosed(CameraSession session);
  void onFrameCaptured(CameraSession session, VideoFrame frame);

  // The old way of passing frames. Will be removed eventually.
  void onByteBufferFrameCaptured(
      CameraSession session, byte[] data, int width, int height, int rotation, long timestamp);
  void onTextureFrameCaptured(CameraSession session, int width, int height, int oesTextureId,
      float[] transformMatrix, int rotation, long timestamp);
}
Video data representation
Video is a sequence of frames. Image formats fall into two families, RGB and YUV, each with several variants, but WebRTC represents them all uniformly as a VideoFrame: whatever the format, a frame is essentially a buffer, and only the buffer layout differs. Because frames have to be passed back and forth between the Java layer and the native layer, the frame types are defined on both sides.
At the Java layer, the main class is VideoFrame; its nested Buffer interface is defined as follows:
public interface Buffer {
  /**
   * Resolution of the buffer in pixels.
   */
  @CalledByNative("Buffer") int getWidth();
  @CalledByNative("Buffer") int getHeight();

  /**
   * Returns a memory-backed frame in I420 format. If the pixel data is in another format, a
   * conversion will take place. All implementations must provide a fallback to I420 for
   * compatibility with e.g. the internal WebRTC software encoders.
   */
  @CalledByNative("Buffer") I420Buffer toI420();

  /**
   * Reference counting is needed since a video buffer can be shared between multiple VideoSinks,
   * and the buffer needs to be returned to the VideoSource as soon as all references are gone.
   */
  @CalledByNative("Buffer") void retain();
  @CalledByNative("Buffer") void release();

  /**
   * Crops a region defined by |cropX|, |cropY|, |cropWidth| and |cropHeight|. Scales it to size
   * |scaleWidth| x |scaleHeight|.
   */
  @CalledByNative("Buffer")
  Buffer cropAndScale(
      int cropX, int cropY, int cropWidth, int cropHeight, int scaleWidth, int scaleHeight);
}
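A small usage sketch (assuming a VideoFrame named frame and java.nio.ByteBuffer): whatever the original format, toI420() yields memory-backed planes that can be read directly.
VideoFrame.I420Buffer i420 = frame.getBuffer().toI420();
ByteBuffer y = i420.getDataY();   // Y plane, stride i420.getStrideY()
ByteBuffer u = i420.getDataU();   // U plane
ByteBuffer v = i420.getDataV();   // V plane
// ... consume the planes ...
i420.release();                   // reference counting: hand the buffer back when done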
At the native layer, the main class is webrtc::VideoFrame; its VideoFrameBuffer interface is defined as follows:
class VideoFrameBuffer : public rtc::RefCountInterface {
 public:
  // New frame buffer types will be added conservatively when there is an
  // opportunity to optimize the path between some pair of video source and
  // video sink.
  enum class Type {
    kNative,
    kI420,
    kI420A,
    kI444,
  };

  // This function specifies in what pixel format the data is stored in.
  virtual Type type() const = 0;

  // The resolution of the frame in pixels. For formats where some planes are
  // subsampled, this is the highest-resolution plane.
  virtual int width() const = 0;
  virtual int height() const = 0;

  // Returns a memory-backed frame buffer in I420 format. If the pixel data is
  // in another format, a conversion will take place. All implementations must
  // provide a fallback to I420 for compatibility with e.g. the internal WebRTC
  // software encoders.
  virtual rtc::scoped_refptr<I420BufferInterface> ToI420() = 0;

  // These functions should only be called if type() is of the correct type.
  // Calling with a different type will result in a crash.
  // TODO(magjed): Return raw pointers for GetI420 once deprecated interface is
  // removed.
  rtc::scoped_refptr<I420BufferInterface> GetI420();
  rtc::scoped_refptr<const I420BufferInterface> GetI420() const;
  I420ABufferInterface* GetI420A();
  const I420ABufferInterface* GetI420A() const;
  I444BufferInterface* GetI444();
  const I444BufferInterface* GetI444() const;

 protected:
  ~VideoFrameBuffer() override {}
};
Comparing the two definitions above, the Java and native sides are quite similar, and both provide a conversion to I420. Taking the Java side as an example, there are several YUV buffer classes plus a texture buffer class. A YUV buffer's payload is a ByteBuffer, whose contents obviously differ from frame to frame. A texture buffer, which involves OpenGL, mainly holds a texture id and a transform matrix; those two stay the same across frames, because the id refers to a native-side object which in turn owns a buffer whose contents change every frame, again in YUV or RGB format.
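A sketch of how a consumer can distinguish the two Java-layer buffer families described above (again assuming a VideoFrame named frame):
VideoFrame.Buffer buffer = frame.getBuffer();
if (buffer instanceof VideoFrame.TextureBuffer) {
  VideoFrame.TextureBuffer tex = (VideoFrame.TextureBuffer) buffer;
  int textureId = tex.getTextureId();           // OpenGL texture id; pixels stay in GPU memory
  Matrix transform = tex.getTransformMatrix();  // android.graphics.Matrix, constant across frames
  boolean isOes = tex.getType() == VideoFrame.TextureBuffer.Type.OES;
} else {
  VideoFrame.I420Buffer i420 = buffer.toI420(); // memory-backed YUV, new bytes every frame
  // ... read planes ...
  i420.release();
}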
Video capture
Taking Camera1 with PreviewCallback-based frame delivery as the example, the capture and dispatch flow works as follows.
The central class in the dispatch flow is VideoBroadcaster: it keeps a std::vector of registered sink entries (each sink paired with its VideoSinkWants) and, as its OnFrame implementation below shows, hands every frame to each of them.
Once a frame is obtained from the camera, it is passed through AndroidVideoTrackSourceObserver to the native AndroidVideoTrackSource object, and then dispatched by VideoBroadcaster to the different sinks: to the encoder via VideoStreamEncoder, and to the Java-layer VideoSink objects, such as the SurfaceViewRenderer used for local preview, via VideoSinkWrapper.
The Java-layer VideoSink interface is defined below; onFrame is called back from the native VideoSinkWrapper.
public interface VideoSink {
  /**
   * Implementations should call frame.retain() if they need to hold a reference to the frame after
   * this function returns. Each call to retain() should be followed by a call to frame.release()
   * when the reference is no longer needed.
   */
  @CalledByNative void onFrame(VideoFrame frame);
}
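A trivial sink, for illustration only, that just counts the frames coming up from native code (SurfaceViewRenderer implements this same interface for on-screen preview):
VideoSink frameCounter = new VideoSink() {
  private int frameCount;

  @Override
  public void onFrame(VideoFrame frame) {
    frameCount++;
    // No retain()/release() needed: the frame is not kept after onFrame returns.
  }
};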
The path from the camera to AndroidVideoTrackSourceObserver is straightforward, so let's look at what happens after AndroidVideoTrackSourceObserver.
The native implementation behind AndroidVideoTrackSourceObserver's nativeOnByteBufferFrameCaptured is:
static void JNI_AndroidVideoTrackSourceObserver_OnByteBufferFrameCaptured(
    JNIEnv* jni,
    const JavaParamRef<jclass>&,
    jlong j_source,
    const JavaParamRef<jbyteArray>& j_frame,
    jint length,
    jint width,
    jint height,
    jint rotation,
    jlong timestamp) {
  AndroidVideoTrackSource* source =
      AndroidVideoTrackSourceFromJavaProxy(j_source);
  jbyte* bytes = jni->GetByteArrayElements(j_frame.obj(), nullptr);
  source->OnByteBufferFrameCaptured(bytes, length, width, height,
                                    jintToVideoRotation(rotation), timestamp);
  jni->ReleaseByteArrayElements(j_frame.obj(), bytes, JNI_ABORT);
}
Here source is an AndroidVideoTrackSource object; its OnByteBufferFrameCaptured eventually calls OnFrame of its parent class AdaptedVideoTrackSource, which is defined as follows:
void AdaptedVideoTrackSource::OnFrame(const webrtc::VideoFrame& frame) {
  rtc::scoped_refptr<webrtc::VideoFrameBuffer> buffer(
      frame.video_frame_buffer());
  /* Note that this is a "best effort" approach to
     wants.rotation_applied; apply_rotation_ can change from false to
     true between the check of apply_rotation() and the call to
     broadcaster_.OnFrame(), in which case we generate a frame with
     pending rotation despite some sink with wants.rotation_applied ==
     true was just added. The VideoBroadcaster enforces
     synchronization for us in this case, by not passing the frame on
     to sinks which don't want it. */
  if (apply_rotation() && frame.rotation() != webrtc::kVideoRotation_0 &&
      buffer->type() == webrtc::VideoFrameBuffer::Type::kI420) {
    /* Apply pending rotation. */
    broadcaster_.OnFrame(webrtc::VideoFrame(
        webrtc::I420Buffer::Rotate(*buffer->GetI420(), frame.rotation()),
        webrtc::kVideoRotation_0, frame.timestamp_us()));
  } else {
    broadcaster_.OnFrame(frame);
  }
}
Here broadcaster_ is a VideoBroadcaster object; its OnFrame loops over the registered sinks and dispatches the frame to each of them, as shown below:
void VideoBroadcaster::OnFrame(const webrtc::VideoFrame& frame) {
  rtc::CritScope cs(&sinks_and_wants_lock_);
  for (auto& sink_pair : sink_pairs()) {
    if (sink_pair.wants.rotation_applied &&
        frame.rotation() != webrtc::kVideoRotation_0) {
      // Calls to OnFrame are not synchronized with changes to the sink wants.
      // When rotation_applied is set to true, one or a few frames may get here
      // with rotation still pending. Protect sinks that don't expect any
      // pending rotation.
      RTC_LOG(LS_VERBOSE) << "Discarding frame with unexpected rotation.";
      continue;
    }
    if (sink_pair.wants.black_frames) {
      sink_pair.sink->OnFrame(webrtc::VideoFrame(
          GetBlackFrameBuffer(frame.width(), frame.height()), frame.rotation(),
          frame.timestamp_us()));
    } else {
      sink_pair.sink->OnFrame(frame);
    }
  }
}
If a sink is a VideoStreamEncoder object, the frame goes to the encoder; if it is a VideoSinkWrapper object, it goes to a Java-layer VideoSink such as the SurfaceViewRenderer used for local preview.
VideoSinkWrapper's OnFrame is defined as follows:
void VideoSinkWrapper::OnFrame(const VideoFrame& frame) {
  JNIEnv* jni = AttachCurrentThreadIfNeeded();
  Java_VideoSink_onFrame(jni, j_sink_, NativeToJavaFrame(jni, frame));
}
Java_VideoSink_onFrame performs the callback from the native layer into the Java-layer VideoSink object. The mechanism is standard JNI: look up the jmethodID from the Java method's signature, then invoke that method through the JNI call interface. Searching the sources for Java_VideoSink_onFrame finds nothing, however, because the function is auto-generated. In C/C++ development there are generally two ways to auto-generate code:
Macros: expansion is done by the preprocessor at compile time. This is common in C/C++, but it is not very flexible and struggles with more complex code.
Tools: code is generated before compilation by an external tool from some description (an IDL, for example). This is far more flexible, since writing a description is much simpler than writing piles of code, and it handles repetitive, mechanical code well; aidl, protobuf and gSOAP all take this approach.
WebRTC uses the second approach. After a build, a search from the source root finds the function in ./out/Debug/gen/sdk/android/generated_video_jni/jni/VideoSink_jni.h.
The content of this file is:
// Copyright 2019 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// This file is autogenerated by
//     base/android/jni_generator/jni_generator.py
// For
//     org/webrtc/VideoSink

#ifndef org_webrtc_VideoSink_JNI
#define org_webrtc_VideoSink_JNI

#include <jni.h>

#include "../../../../../../../sdk/android/src/jni/jni_generator_helper.h"

// Step 1: forward declarations.
JNI_REGISTRATION_EXPORT extern const char kClassPath_org_webrtc_VideoSink[];
const char kClassPath_org_webrtc_VideoSink[] = "org/webrtc/VideoSink";
// Leaking this jclass as we cannot use LazyInstance from some threads.
JNI_REGISTRATION_EXPORT base::subtle::AtomicWord g_org_webrtc_VideoSink_clazz =
    0;
#ifndef org_webrtc_VideoSink_clazz_defined
#define org_webrtc_VideoSink_clazz_defined
inline jclass org_webrtc_VideoSink_clazz(JNIEnv* env) {
  return base::android::LazyGetClass(env, kClassPath_org_webrtc_VideoSink,
                                     &g_org_webrtc_VideoSink_clazz);
}
#endif

// Step 2: method stubs.

static base::subtle::AtomicWord g_org_webrtc_VideoSink_onFrame = 0;
static void Java_VideoSink_onFrame(JNIEnv* env, const
    base::android::JavaRef<jobject>& obj, const base::android::JavaRef<jobject>&
    frame) {
  CHECK_CLAZZ(env, obj.obj(),
      org_webrtc_VideoSink_clazz(env));
  jmethodID method_id =
      base::android::MethodID::LazyGet<
          base::android::MethodID::TYPE_INSTANCE>(
          env, org_webrtc_VideoSink_clazz(env),
          "onFrame",
          "("
          "Lorg/webrtc/VideoFrame;"
          ")"
          "V",
          &g_org_webrtc_VideoSink_onFrame);

  env->CallVoidMethod(obj.obj(),
      method_id, frame.obj());
  jni_generator::CheckException(env);
}

#endif  // org_webrtc_VideoSink_JNI
As the header itself notes, the file is generated by the jni_generator.py script.
Dispatch flow
The creation and wiring of the main objects in the video dispatch flow is done by the following code in PeerConnectionClient:
private VideoTrack createVideoTrack(VideoCapturer capturer) {
  videoSource = factory.createVideoSource(capturer);
  capturer.startCapture(videoWidth, videoHeight, videoFps);

  localVideoTrack = factory.createVideoTrack(VIDEO_TRACK_ID, videoSource);
  localVideoTrack.setEnabled(renderVideo);
  localVideoTrack.addSink(localRender);
  return localVideoTrack;
}
The creation flow and the simplified set of main objects were summarized in a sequence diagram (yellow for Java-layer classes, purple for native-layer classes); the step numbers below follow that diagram.
Through this flow, the main objects (source, track, sink) are created in both the Java and native layers and tied together:
Steps 2~13: create a VideoSource from the VideoCapturer. The two are linked by the AndroidVideoTrackSourceObserver: both the observer and the VideoSource's nativeSource member point at the same native AndroidVideoTrackSource object, so frames coming out of the VideoCapturer travel through AndroidVideoTrackSourceObserver to the native AndroidVideoTrackSource, and are finally dispatched by VideoBroadcaster to the sinks.
Steps 13~18: create a VideoTrack from the VideoSource. This mainly creates the native VideoTrack object, whose video_source_ member is the AndroidVideoTrackSource created above; the track is thus tied to the source, and adding a sink to the track actually adds it to the source.
Steps 19~25: add a VideoSink to the VideoTrack. In the end this adds the newly created native VideoSinkWrapper object to the broadcaster_ member of AndroidVideoTrackSource; the VideoSinkWrapper's j_sink_ member references the Java-layer VideoSink object, which is how native code calls back into Java. Several VideoSinks can be added to one VideoTrack. The demo adds a SurfaceViewRenderer for local preview (what actually gets added to the track is its proxy, ProxyVideoSink; see the sketch at the end), while the VideoStreamEncoder used for encoding and transport is created and added by PeerConnection when the RTP session is set up. That is why the VideoTrack has to be registered with the PeerConnection object, as the following code shows:
mediaStream = factory.createLocalMediaStream("ARDAMS");
if (videoCallEnabled) {
  mediaStream.addTrack(createVideoTrack(videoCapturer));
}
mediaStream.addTrack(createAudioTrack());
peerConnection.addStream(mediaStream);
So the VideoTrack is first added to a MediaStream, and the MediaStream is then added to the PeerConnection object.
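The ProxyVideoSink mentioned above is essentially a forwarding sink whose real target can be swapped at runtime; a simplified sketch of the idea (the demo's own class carries a bit more logging):
private static class ProxyVideoSink implements VideoSink {
  private VideoSink target;

  @Override
  public synchronized void onFrame(VideoFrame frame) {
    if (target != null) {
      target.onFrame(frame);  // forward to the current target, e.g. a SurfaceViewRenderer
    }
  }

  public synchronized void setTarget(VideoSink target) {
    this.target = target;
  }
}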