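This article records what I learned, and the pitfalls I ran into, while developing a new feature: recording short videos of a live stream on the viewer side.
Requirements
Viewers can record a short video of the stream that is currently playing. On-screen user interactions, including gifts, chat, and bullet comments (danmaku), must be recorded at the same time and composited with the video stream.
Background
The player is Qiniu's PLPlayerKit. While playing a stream, the framework exposes the decoded stream data through two callback methods.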
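/**
Called back with the frame data that is about to be rendered.
This feature supports live streams only.
@param player The PLPlayer object that calls this method
@param frame YUV data of the frame about to be rendered.
Use CVPixelBufferGetPixelFormatType to get the YUV type.
Software decoding yields kCVPixelFormatType_420YpCbCr8Planar.
Hardware decoding yields kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange.
@param pts Presentation timestamp, in ms
@param sarNumerator
@param sarDenominator
Here sar stands for storage aspect ratio,
the display ratio of the video stream: sarNumerator sarDenominator
@discussion sarNumerator = 0 means the parameter is invalid
@since v2.4.3
*/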
- (void)player:(nonnull PLPlayer *)player willRenderFrame:(nullable CVPixelBufferRef)frame pts:(int64_t)pts sarNumerator:(int)sarNumerator sarDenominator:(int)sarDenominator;
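/**
Called back with audio data.
@param player The PLPlayer object that calls this method
@param audioBufferList Audio data
@param audioStreamDescription Audio format information
@param pts Presentation timestamp: the timestamp, relative to the SCR (system clock reference), at which the decoder presents the frame. The SCR can be understood as the time at which the decoder should start reading data from disk.
@param sampleFormat Sample bit depth, enum: PLPlayerAVSampleFormat
@return audioBufferList Audio data
@since v2.4.3
*/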
- (nonnull AudioBufferList *)player:(nonnull PLPlayer *)player willAudioRenderBuffer:(nonnull AudioBufferList *)audioBufferList asbd:(AudioStreamBasicDescription)audioStreamDescription pts:(int64_t)pts sampleFormat:(PLPlayerAVSampleFormat)sampleFormat;
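Analysis
When I received the requirement to render the user interactions together with the video, the first thing that came to mind was multi-texture blending in OpenGL: blend the texture created from the video stream with the texture created from the on-screen elements, output the resulting texture data, convert it to video data, synchronize it with the audio data using the pts from the callback interface, and write it to the video file.
GPUImage, which is built on OpenGL ES, already has good implementations of both the compositing and the video-writing steps in this pipeline. To avoid reinventing the wheel and to make sensible use of existing resources, I decided to build on GPUImage.
Video data
The video data available from the callback is a CVPixelBufferRef in either kCVPixelFormatType_420YpCbCr8Planar (software decoding) or kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (hardware decoding) format.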
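/**
@abstract Whether to use video toolbox hardware decoding.
@discussion With video toolbox enabled, the Player tries hardware decoding and switches back to software decoding if that fails.
@warning This parameter only takes effect for rtmp/flv live streams and is off by default. Supports iOS 8.0 and later.
@since v2.1.4
*/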
extern NSString * _Nonnull PLPlayerOptionKeyVideoToolbox;
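Although GPUImage supports kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange data well and has an established way of using it, for compatibility the implementation must accept kCVPixelFormatType_420YpCbCr8Planar video data as input.
Code
GPUImagePixelRender inherits from GPUImageOutput and joins the GPUImage filter chain as the class that outputs the video texture. Its initialization largely follows that of the GPUImageMovie class. The key point is the modified shader: kCVPixelFormatType_420YpCbCr8Planar stores the Y, U, and V data in three separate planes, so uploading them necessarily takes three textures, each with its own sampler.
// DTVRecordVideoFrame: data model class that holds the video data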
- (DTVRecordVideoFrame *)creatTextureYUV:(CVPixelBufferRef)pixelBuffer
{
OSType pixelType = CVPixelBufferGetPixelFormatType(pixelBuffer);
NSAssert(pixelType == kCVPixelFormatType_420YpCbCr8Planar, @"pixelType error ...");
int pixelWidth = (int)CVPixelBufferGetWidth(pixelBuffer);
int pixelHeight = (int)CVPixelBufferGetHeight(pixelBuffer);
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
DTVRecordVideoFrame *yuv = [[DTVRecordVideoFrame alloc] init];
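// Width and height of the video data
yuv.width = pixelWidth;
yuv.height = pixelHeight;
// The three Y, U, and V component planes.
// Note: the memcpy calls below assume tightly packed rows (bytes per row == plane width).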
size_t y_size = pixelWidth * pixelHeight;
uint8_t *yuv_y_frame = malloc(y_size);
uint8_t *y_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
memcpy(yuv_y_frame, y_frame, y_size);
yuv.Y = yuv_y_frame;
size_t u_size = y_size / 4;
uint8_t *yuv_u_frame = malloc(u_size);
uint8_t *u_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
memcpy(yuv_u_frame, u_frame, u_size);
yuv.U = yuv_u_frame;
size_t v_size = y_size / 4;
uint8_t *yuv_v_frame = malloc(v_size);
uint8_t *v_frame = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 2);
memcpy(yuv_v_frame, v_frame, v_size);
yuv.V = yuv_v_frame;
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
return yuv;
}
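The plane copies above treat each plane as one contiguous block, which only holds when a row occupies exactly as many bytes as the plane is wide. A pixel buffer may pad its rows, and Core Video reports the real row length through CVPixelBufferGetBytesPerRowOfPlane. The following is a minimal sketch in plain C (it compiles unchanged in an Objective-C file) of a row-by-row copy that skips any padding. The helper name copy_plane_packed is hypothetical and not part of the original project.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy one image plane into a newly allocated, tightly packed buffer.
 * src_stride is the number of bytes per source row and may be larger
 * than width when the rows are padded.
 * Returns NULL if the allocation fails; the caller frees the result. */
static uint8_t *copy_plane_packed(const uint8_t *src, size_t width,
                                  size_t height, size_t src_stride)
{
    assert(src != NULL && src_stride >= width);
    uint8_t *dst = malloc(width * height);
    if (dst == NULL) {
        return NULL;
    }
    for (size_t row = 0; row < height; row++) {
        /* Copy only the visible bytes of each row and skip the padding. */
        memcpy(dst + row * width, src + row * src_stride, width);
    }
    return dst;
}
```

In creatTextureYUV: the arguments would presumably come from CVPixelBufferGetBaseAddressOfPlane, CVPixelBufferGetWidthOfPlane, CVPixelBufferGetHeightOfPlane, and CVPixelBufferGetBytesPerRowOfPlane for plane indexes 0, 1, and 2.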
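Once the data has been extracted, creating the FBO and uploading the vertex and texture data are not covered in detail here; GPUImageMovie can serve as a reference. Below is the code that creates the texture objects: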
- (void)setupTexture:(DTVRecordVideoFrame *)videoFrame
{
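if (0 == _textures[0]) glGenTextures(3, _textures);
// The plane rows are tightly packed; the default unpack alignment of 4 would skew planes whose width is not a multiple of 4.
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);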
const uint8_t *pixelByte[3] = {videoFrame.Y , videoFrame.U , videoFrame.V};
const int widths[3] = { videoFrame.width, videoFrame.width / 2, videoFrame.width / 2 };
const int heights[3] = { videoFrame.height, videoFrame.height / 2, videoFrame.height / 2 };
for (int i = 0; i < 3; i++) {
glBindTexture(GL_TEXTURE_2D, _textures[i]);
glTexImage2D(GL_TEXTURE_2D,
0,
GL_LUMINANCE,
widths[i],
heights[i],
0,
GL_LUMINANCE,
GL_UNSIGNED_BYTE,
pixelByte[i]);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glBindTexture(GL_TEXTURE_2D, 0);
}
}
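Two details of the upload above are easy to get wrong: the chroma plane size for odd frame dimensions, and the row length that OpenGL expects when reading client memory. The sketch below works through both calculations in plain C. It assumes 4:2:0 subsampling with chroma dimensions rounded up, and one byte per pixel as with GL_LUMINANCE and GL_UNSIGNED_BYTE. The names plane_size, yuv420p_plane_size, and gl_unpack_row_bytes are made up for this illustration.

```c
#include <assert.h>

typedef struct {
    int width;
    int height;
} plane_size;

/* Size of plane 0 (Y), 1 (U) or 2 (V) of a 4:2:0 planar frame.
 * Chroma planes are half size in each direction, rounded up so that
 * odd frame dimensions do not lose the last column or row. */
static plane_size yuv420p_plane_size(int plane, int width, int height)
{
    assert(plane >= 0 && plane < 3 && width > 0 && height > 0);
    plane_size size = { width, height };
    if (plane > 0) {
        size.width = (width + 1) / 2;
        size.height = (height + 1) / 2;
    }
    return size;
}

/* Number of bytes OpenGL steps per row when reading one-byte pixels
 * from client memory with the given GL_UNPACK_ALIGNMENT. */
static int gl_unpack_row_bytes(int width, int alignment)
{
    assert(width > 0 && alignment > 0);
    return (width + alignment - 1) / alignment * alignment;
}
```

For a 540 x 960 frame the chroma planes are 270 pixels wide. With the default alignment of 4, OpenGL would step 272 bytes per row while the data only has 270, so each row would drift further out of place; an alignment of 1 matches tightly packed data.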
Now the shader. The shader code was learned from kxmovie and renders the three YUV components:
NSString *const kGPUImageYUVPlanarFragmentShaderString = SHADER_STRING
(
varying highp vec2 textureCoordinate;
uniform sampler2D s_texture_y;
uniform sampler2D s_texture_u;
uniform sampler2D s_texture_v;
void main()
{
highp float y = texture2D(s_texture_y, textureCoordinate).r;
highp float u = texture2D(s_texture_u, textureCoordinate).r - 0.5;
highp float v = texture2D(s_texture_v, textureCoordinate).r - 0.5;
highp float r = y + 1.402 * v;
highp float g = y - 0.344 * u - 0.714 * v;
highp float b = y + 1.772 * u;
gl_FragColor = vec4(r,g,b,1.0);
}
);
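To check the shader's arithmetic off the GPU, the same conversion can be written as a small C function. This is only a sketch that mirrors the coefficients above. It takes components already normalized to 0..1, as texture2D returns them, and it clamps the result explicitly, which is my addition (a fixed-point framebuffer would clamp the output anyway). Note that these coefficients treat the input as full range. The names rgb_color and yuv_to_rgb are hypothetical.

```c
#include <assert.h>

typedef struct {
    float r;
    float g;
    float b;
} rgb_color;

/* Limit a component to the displayable 0..1 range. */
static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Convert one normalized YUV sample to RGB using the same
 * coefficients as the fragment shader. */
static rgb_color yuv_to_rgb(float y, float u, float v)
{
    assert(y >= 0.0f && y <= 1.0f);
    /* Chroma is stored with an offset of 0.5, so recenter it on zero. */
    u -= 0.5f;
    v -= 0.5f;
    rgb_color color;
    color.r = clamp01(y + 1.402f * v);
    color.g = clamp01(y - 0.344f * u - 0.714f * v);
    color.b = clamp01(y + 1.772f * u);
    return color;
}
```

A sample with neutral chroma, such as yuv_to_rgb(0.5f, 0.5f, 0.5f), comes out as a mid gray with all three channels at 0.5, which is a quick sanity check that the U and V planes have not been swapped or offset.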
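View data
GPUImageUIElement exists to generate a texture from a view or layer, so in principle it could be used as is. However, not all of our on-screen animations are implemented through layer.contents; some are layer animations based on UIView or CAAnimation. If the view or layer is used directly, some animations do not show up at all. Anyone familiar with the CALayer layer tree will know why: a property on the modelLayer jumps to its final value as soon as it is modified, whereas the presentationLayer goes through the gradual change. What we normally access as view.layer is the modelLayer, which already holds the final value, so the animations in the resulting texture look odd.
With that in mind, I modified GPUImageUIElement so that it renders from the presentationLayer each time it creates a texture and captures data.
I also found that GPUImageUIElement creates a new FBO every time it updates its texture and never returns it to the cache for reuse, so I modified the following code, which runs once the FBO has been used: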
for (id currentTarget in targets)
{
if (currentTarget != self.targetToIgnoreForUpdates)
{
NSInteger indexOfObject = [targets indexOfObject:currentTarget];
NSInteger textureIndexOfTarget = [[targetTextureIndices objectAtIndex:indexOfObject] integerValue];
[currentTarget setInputSize:layerPixelSize atIndex:textureIndexOfTarget];
[currentTarget setInputFramebuffer:outputFramebuffer atIndex:textureIndexOfTarget];
[currentTarget newFrameReadyAtTime:kCMTimeIndefinite atIndex:textureIndexOfTarget];
}
}
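Compositing
PlanA: composite and draw one frame each time a video frame arrives. Our video streams usually run at 24 or 36 fps, while the screen refreshes at 60 Hz. In testing, refreshing at the video frame rate saves CPU, but when the video stalls the on-screen elements freeze too, and the animations are not smooth enough.
PlanB: refresh the on-screen elements with CADisplayLink, and refresh the video frame with the frame data as it arrives.
GPUImageMovieWriter writes the video data and saves it locally.
// Buffer video frames
- (void)addVideoPixelBuffer:(CVPixelBufferRef)pixelBuffer pts:(int64_t)videoPts fps:(int)videoFPS
{
// Enough data has already been buffered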
if (_hasFillFrame) {
return;
}
// Record the starting pts
if (!_firstFramePTS) _firstFramePTS = videoPts;
DTVRecordVideoFrame *videoframe = [self creatTextureYUV:pixelBuffer];
if (videoframe.Y == NULL || videoframe.U == NULL || videoframe.V == NULL ) {
NSLog(@"Invalid video frame");
return;
}
videoframe.pts = videoPts;
// Frame duration in ms; the first frame falls back to one frame at 24 fps
videoframe.duration = _previousFrame ? (videoPts - _previousFrame.pts) : (1 / 24.f * 1000);
// The frame's pts within the recorded video
videoframe.frameTime = CMTimeMake((videoPts - _firstFramePTS) * 600, 600 * 1000);
// Buffer the frame
[self.videoBuffer addObject:videoframe];
_previousFrame = videoframe;
if (self.videoBuffer.count > 3 && !self.displayLink) {
// Start cycling through the video frames
[self tick];
}
}
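The timestamp arithmetic in the method above can be checked in isolation. The sketch below reproduces it in plain C, with a small struct standing in for CMTime (a value divided by a timescale gives seconds). It assumes, as the callback documentation states, that pts is in milliseconds. The names media_time, media_time_from_pts_ms, media_time_seconds, and frame_duration_ms are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for CMTime: seconds = value / timescale. */
typedef struct {
    int64_t value;
    int32_t timescale;
} media_time;

/* Map a stream pts in ms to a time relative to the first recorded
 * frame, using the same 600 * 1000 timescale as the recording code. */
static media_time media_time_from_pts_ms(int64_t pts_ms, int64_t first_pts_ms)
{
    assert(pts_ms >= first_pts_ms);
    media_time t = { (pts_ms - first_pts_ms) * 600, 600 * 1000 };
    return t;
}

/* Convert a media_time to seconds. */
static double media_time_seconds(media_time t)
{
    return (double)t.value / t.timescale;
}

/* Duration of a frame in ms: the gap to the previous frame, or one
 * frame interval at fallback_fps when there is no previous frame. */
static double frame_duration_ms(int has_previous, int64_t pts_ms,
                                int64_t previous_pts_ms, int fallback_fps)
{
    assert(fallback_fps > 0);
    if (has_previous) {
        return (double)(pts_ms - previous_pts_ms);
    }
    return 1000.0 / fallback_fps;
}
```

A frame that arrives 500 ms after the first one maps to a value of 300000 at a timescale of 600000, that is 0.5 seconds into the recording. Multiplying both sides by 600 leaves the ratio unchanged, so the recorded time is simply the pts offset divided by 1000.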
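tick switches the current frame data according to the number of buffered frames and each frame's duration, and writes it to the video through GPUImageMovieWriter.
https://github.com/BradLarson/GPUImage/issues/1729 describes a fix for the problem that occurs when GPUImageMovieWriter writes AVFileTypeMPEG4.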
- (void)tick
{
if (self.videoBuffer.count < 1) {
if (_hasFillFrame) {
[self stopDisplayLinkTimer];
if (_movieWriter) {
[_movieWriter finishRecording];
[_blendFilter removeTarget:_movieWriter];
_movieWriter = nil;
}
if (self.completeBlock) self.completeBlock(_coverImage);
}
else{
_renderVideoFrame = NO;
NSLog(@"Stalled...");
}
}
else
{
_renderVideoFrame = YES;
DTVRecordVideoFrame *frameTexture = self.videoBuffer.firstObject;
if (!self.movieWriter) {
unlink([DefaultFuckVideoPath UTF8String]);
_movieWriter = [[GPUImageMovieWriter alloc] initWithMovieURL:[NSURL fileURLWithPath:DefaultFuckVideoPath] size:CGSizeMake(540, 960) fileType:AVFileTypeMPEG4 outputSettings:nil];
_movieWriter.encodingLiveVideo = YES;
_movieWriter.hasAudioTrack = YES;
_movieWriter.assetWriter.movieFragmentInterval = kCMTimeInvalid;
[self.pixelRender addTarget:self.blendFilter];
[self.layerRender addTarget:self.blendFilter];
[self.blendFilter addTarget:_movieWriter];
[_movieWriter startRecording];
}
[self startDisplayLinkTimer];
runAsynchronouslyOnVideoProcessingQueue(^{
[self.pixelRender processVideoFrame:frameTexture];
});
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(frameTexture.duration * NSEC_PER_MSEC)), dispatch_get_main_queue(), ^{
[self.videoBuffer removeObjectAtIndex:0];
[self tick];
});
// Use as the cover image
if (CMTimeGetSeconds(_previousFrame.frameTime) > 0.5f && !_coverImage) {
[self.blendFilter useNextFrameForImageCapture];
_coverImage = [self.blendFilter imageFromCurrentFramebuffer];
}
}
}
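The CADisplayLink callback writerFrame refreshes and captures the on-screen element data, which is blended with the current video frame _currentFrame by the GPUImageAlphaBlendFilter to produce the final frame.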
- (void)writerFrame
{
[self.layerRender updateWithPresentationLayer:_renderView.layer.presentationLayer];
}
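Audio
The audio must stay in sync with the video. Because the Qiniu interface delivers AudioBufferList data, it has to be converted to the CMSampleBufferRef data that GPUImageMovieWriter needs.
// Recompute the audio pts relative to the video pts
CMTime time = CMTimeMake((audioPts - _firstFramePTS) * 600, 600 * 1000);
// Convert to CMSampleBufferRef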
CMSampleBufferRef audioBuffer = NULL;
CMFormatDescriptionRef format = NULL;
CMSampleTimingInfo timing = {CMTimeMake(1, audioStreamDescription.mSampleRate),time, kCMTimeInvalid};
// Dividing by sizeof(UInt32) assumes 4-byte samples.
UInt32 size = audioBufferList->mBuffers->mDataByteSize / sizeof(UInt32);
UInt32 mNumberChannels = audioBufferList->mBuffers->mNumberChannels;
CMItemCount numSamples = (CMItemCount)size / mNumberChannels;
OSStatus status;
status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &audioStreamDescription, 0, NULL, 0, NULL, NULL, &format);
if (status != noErr) {
// format is still NULL here, and CFRelease(NULL) would crash.
return;
}
status = CMSampleBufferCreate(kCFAllocatorDefault,NULL,false,NULL,NULL,format,numSamples, 1, &timing, 0, NULL, &audioBuffer);
if (status != noErr) {
CFRelease(format);
return;
}
status = CMSampleBufferSetDataBufferFromAudioBufferList(audioBuffer, kCFAllocatorDefault,kCFAllocatorDefault, 0,audioBufferList);
if (status != noErr) {
CFRelease(audioBuffer);
CFRelease(format);
return;
}
if (_movieWriter) {
[_movieWriter processAudioBuffer:audioBuffer];
}
// Balance the two Create calls above; the writer is expected to retain the buffer for as long as it needs it.
CFRelease(audioBuffer);
CFRelease(format);
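The sample count passed to CMSampleBufferCreate above is derived by dividing the buffer size by sizeof(UInt32), which is only right when every sample is four bytes wide. A more general version takes the sample width from the audio format (AudioStreamBasicDescription carries mBitsPerChannel for this). The following plain C sketch shows the calculation; the function name audio_frame_count and its parameters are my own, not part of PLPlayerKit or Core Audio.

```c
#include <assert.h>
#include <stdint.h>

/* Number of audio frames (one sample per channel) held in a buffer.
 * data_byte_size:     size of the buffer in bytes
 * bytes_per_sample:   width of one sample of one channel, e.g. 2 for
 *                     16-bit integers or 4 for 32-bit floats
 * channels_in_buffer: channels interleaved in this buffer (1 for a
 *                     non-interleaved layout) */
static uint32_t audio_frame_count(uint32_t data_byte_size,
                                  uint32_t bytes_per_sample,
                                  uint32_t channels_in_buffer)
{
    assert(bytes_per_sample > 0 && channels_in_buffer > 0);
    return data_byte_size / (bytes_per_sample * channels_in_buffer);
}
```

A 4096-byte interleaved stereo buffer holds 512 frames of 32-bit samples but 1024 frames of 16-bit samples, so hard-coding four bytes would halve the count for 16-bit audio.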
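Summary
Building this feature taught me a lot: the CALayer layer tree, video data formats, audio conversion, and simple audio/video synchronization, and it deepened my understanding of GPUImage. I feel I gained a great deal.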