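Before going further, it helps to lay out what has to happen for a standalone subtitle file to become part of a video. The outline below may not be entirely accurate; reading the whole series gives a fuller picture.
Step 1, which we have already done, is parsing the text.
Step 2 is putting each subtitle into a CMSampleBufferRef.
Step 3 is appending the CMSampleBufferRef to an AVAssetWriterInput.
Step 4: if there is more than one set of subtitles, repeat step 3; all of the AVAssetWriterInputs together form an input group.
The troublesome part here is that the tracks of the original video, and the relationships between them, have to be read back and written into the new file.
Step 5 is using AVAssetWriter to write the input group into the video file.
This article builds on the previous one and shows how to put a subtitle into a CMSampleBufferRef.
First we need to create a CMFormatDescriptionRef which, as you can guess, is the structure describing the subtitle format. At the time of writing, the format used for AVMediaTypeSubtitle is tx3g. We also left a gap earlier: the second and third methods of the subtitle class have not been covered yet. Here is the second one: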
- (CMFormatDescriptionRef)copyFormatDescription
{
    // Create a subtitle 3g text format description with extensions
    NSDictionary *extensions = @{
        (id)kCMTextFormatDescriptionExtension_DisplayFlags : @(self.displayFlags),
        (id)kCMTextFormatDescriptionExtension_BackgroundColor : @{
            (id)kCMTextFormatDescriptionColor_Red : @0,
            (id)kCMTextFormatDescriptionColor_Green : @0,
            (id)kCMTextFormatDescriptionColor_Blue : @0,
            (id)kCMTextFormatDescriptionColor_Alpha : @255},
        (id)kCMTextFormatDescriptionExtension_DefaultTextBox : @{
            (id)kCMTextFormatDescriptionRect_Top : @0,
            (id)kCMTextFormatDescriptionRect_Left : @0,
            (id)kCMTextFormatDescriptionRect_Bottom : @0,
            (id)kCMTextFormatDescriptionRect_Right : @0},
        (id)kCMTextFormatDescriptionExtension_DefaultStyle : @{
            (id)kCMTextFormatDescriptionStyle_StartChar : @0,
            (id)kCMTextFormatDescriptionStyle_EndChar : @0,
            (id)kCMTextFormatDescriptionStyle_Font : @1,
            (id)kCMTextFormatDescriptionStyle_FontFace : @0,
            (id)kCMTextFormatDescriptionStyle_ForegroundColor : @{
                (id)kCMTextFormatDescriptionColor_Red : @255,
                (id)kCMTextFormatDescriptionColor_Green : @255,
                (id)kCMTextFormatDescriptionColor_Blue : @255,
                (id)kCMTextFormatDescriptionColor_Alpha : @255},
            (id)kCMTextFormatDescriptionStyle_FontSize : @255},
        (id)kCMTextFormatDescriptionExtension_HorizontalJustification : @0,
        (id)kCMTextFormatDescriptionExtension_VerticalJustification : @0,
        (id)kCMTextFormatDescriptionExtension_FontTable : @{@"1" : @"Sans-Serif"}};
    CMFormatDescriptionRef formatDescription;
    CMFormatDescriptionCreate(NULL, kCMMediaType_Subtitle, kCMTextFormatType_3GText, (__bridge CFDictionaryRef)extensions, &formatDescription);
    return formatDescription;
}
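Don't be intimidated; let's walk through it. We build a dictionary, declare a CMFormatDescriptionRef variable, and then call the create function, passing the address of that variable so the function can store the new format description in it. That is all there is to it.
The main thing now is to understand what the key-value pairs in this dictionary mean.
The structure is a little involved; every key used here can be found under CMTextFormatDescription Constants.
kCMTextFormatDescriptionExtension_DisplayFlags: a CFNumber holding the display flags. Here its value carries the forced-display setting we discussed earlier.
kCMTextFormatDescriptionExtension_BackgroundColor: a CFDictionary describing the background color in the RGBA model. Its value is: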
@{
(id)kCMTextFormatDescriptionColor_Red : @0,
(id)kCMTextFormatDescriptionColor_Green : @0,
(id)kCMTextFormatDescriptionColor_Blue : @0,
(id)kCMTextFormatDescriptionColor_Alpha : @255}
kCMTextFormatDescriptionExtension_DefaultTextBox: a CFDictionary describing where the subtitle is positioned. Its value is:
@{
(id)kCMTextFormatDescriptionRect_Top : @0,
(id)kCMTextFormatDescriptionRect_Left : @0,
(id)kCMTextFormatDescriptionRect_Bottom : @0,
(id)kCMTextFormatDescriptionRect_Right : @0}
kCMTextFormatDescriptionExtension_DefaultStyle: a CFDictionary describing the default subtitle style. It contains further keys, some of whose meanings I am still unsure about:
kCMTextFormatDescriptionStyle_StartChar: CFNumber (SInt16 for 3G, SInt32 for QT)
kCMTextFormatDescriptionStyle_EndChar: CFNumber (SInt16 for 3G, SInt32 for QT)
kCMTextFormatDescriptionStyle_Font: CFNumber (SInt16)
kCMTextFormatDescriptionStyle_FontFace: CFNumber (SInt8)
kCMTextFormatDescriptionStyle_ForegroundColor: a CFDictionary containing values for kCMTextFormatDescriptionColor_Red, kCMTextFormatDescriptionColor_Green, and so on. This also uses the RGBA model.
kCMTextFormatDescriptionStyle_FontSize
kCMTextFormatDescriptionExtension_HorizontalJustification: a CFNumber (SInt8) containing a CMTextJustificationValue; the horizontal alignment.
kCMTextFormatDescriptionExtension_VerticalJustification: a CFNumber (SInt8) containing a CMTextJustificationValue; the vertical alignment.
kCMTextFormatDescriptionExtension_FontTable: a CFDictionary whose keys are FontIDs as CFStrings and whose values are font names as CFStrings.
This is hard going; for several of these constants the official documentation currently gives no explanation at all.
Finally we call CMFormatDescriptionCreate. Let's look at its parameters:
OSStatus CMFormatDescriptionCreate (
CFAllocatorRef allocator,
CMMediaType mediaType,
FourCharCode mediaSubtype,
CFDictionaryRef extensions,
CMFormatDescriptionRef *descOut
);
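mediaType has several common values. We use kCMMediaType_Subtitle here, which should not be confused with kCMMediaType_Text.
mediaSubtype is declared as a FourCharCode, which looks a little odd. Here we pass a CMTextFormatType constant (a four-character code, not a structure), which identifies the subtitle format; the usual values are kCMTextFormatType_QTText and kCMTextFormatType_3GText.
extensions is the dictionary built above, bridged to a CFDictionaryRef.
descOut, as you have seen, is an out parameter: it receives the address of the formatDescription variable we declared earlier.
It is heavy going, and we still do not fully understand everything in this dictionary, but let's move on for now: we have created the format description for one subtitle line.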
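To make the FourCharCode idea less mysterious, here is a minimal sketch in plain C of how four ASCII characters such as tx3g pack into one 32-bit value. The helpers fourcc and fourcc_to_string are names I made up for illustration; they are not part of Core Media, which simply defines its constants as four-character literals.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four ASCII characters into a 32-bit code, first character in the
   most significant byte. Hypothetical helper, not a Core Media function. */
static uint32_t fourcc(const char *s)
{
    assert(s != NULL && strlen(s) == 4);
    return ((uint32_t)(uint8_t)s[0] << 24) |
           ((uint32_t)(uint8_t)s[1] << 16) |
           ((uint32_t)(uint8_t)s[2] << 8)  |
            (uint32_t)(uint8_t)s[3];
}

/* Unpack a code back into a NUL-terminated four-character string. */
static void fourcc_to_string(uint32_t code, char out[5])
{
    assert(out != NULL);
    out[0] = (char)(code >> 24);
    out[1] = (char)(code >> 16);
    out[2] = (char)(code >> 8);
    out[3] = (char)code;
    out[4] = '\0';
}
```

Unpacking a code this way is also handy when debugging, because it turns an opaque integer from a format description back into something readable.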
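Now for the third method, copySampleBuffer.
This buffer holds our description, the subtitle text, the forced flag and so on, so the first job is to work out the size of the buffer.
A diagram of the layout would help here; unfortunately there does not yet seem to be much material explaining these structures.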
const char *text = self.text.UTF8String;
// Set up the sample size: 16-bit length word + text (+ optional 'frcd' atom)
uint16_t textLength = 0;
size_t sampleSize = 0;
if (text != NULL)
    textLength = (uint16_t)strlen(text); // terminator not included; assumes the text is shorter than 65536 bytes
sampleSize = textLength + sizeof(uint16_t);
if (self.forced)
    sampleSize += (sizeof(uint32_t) * 2); // for the 'frcd' atom
// Space for the length word, the text and the extensions. A byte pointer (uint8_t *) keeps the offset arithmetic in bytes.
uint8_t *samplePtr = malloc(sampleSize);
if (samplePtr == NULL)
    return NULL;
// The length word is stored big-endian
uint16_t textLengthBigEndian = CFSwapInt16HostToBig(textLength);
memcpy(samplePtr, &textLengthBigEndian, sizeof(textLengthBigEndian));
if (textLength)
    memcpy((samplePtr + sizeof(uint16_t)), text, textLength);
uint8_t *ptr = samplePtr + sizeof(uint16_t) + textLength;
if (self.forced)
{
    // Append the forced atom: 32-bit size, then 32-bit type, both big-endian.
    // memcpy is used because ptr is not guaranteed to be 4-byte aligned.
    uint32_t atomSize = CFSwapInt32HostToBig((uint32_t)(sizeof(uint32_t) * 2));
    uint32_t atomType = CFSwapInt32HostToBig('frcd');
    memcpy(ptr, &atomSize, sizeof(atomSize));
    memcpy(ptr + sizeof(atomSize), &atomType, sizeof(atomType));
}
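The resulting layout, field by field, is:
uint16_t textLength;  // CFSwapInt16HostToBig(textLength)
uint8_t  text[];      // textLength bytes of UTF-8, no terminator
uint32_t forcedSize;  // CFSwapInt32HostToBig(sizeof(uint32_t) * 2), present only when forced
uint32_t forcedType;  // CFSwapInt32HostToBig('frcd'), present only when forced
The integer fields go through the byte-order helpers CFSwapInt16HostToBig and CFSwapInt32HostToBig. The text itself needs no swapping: UTF-8 is a sequence of single bytes, so it has no byte order to adjust.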
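A few things need explaining here, and they will also help you find your way around the QuickTime File Format. The structure we just described is what that specification calls subtitle sample data.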
Subtitle sample data consists of a 16-bit word that specifies the length (number of bytes) of the subtitle text, followed by the subtitle text and then by optional sample extensions. The subtitle text is Unicode text, encoded either as UTF-8 text or UTF-16 text beginning with a UTF-16 BYTE ORDER MARK ('\uFEFF') in big or little endian order. There is no null termination for the text.
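Subtitle sample data starts with the length of the subtitle text, followed by the text itself, followed by optional subtitle sample extensions. Each extension is an atom, so it is worth understanding the atom structure: it generally begins with two fields, size and type.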
The basic data unit in a QuickTime file is the atom. Each atom contains size and type fields that precede any other data. The size field indicates the total number of bytes in the atom, including the size and type fields. The type field specifies the type of data stored in the atom and, by implication, the format of that data. In some cases, the size and type fields are followed by a version field and a flags field. An atom with these version and flags fields is sometimes called a full atom.
That explains where the structure of the forced atom comes from.
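To check my own understanding of the layout, here is a small plain C sketch that walks a finished subtitle sample: it skips the length word and the text, then steps through the extension atoms looking for a given type. sample_has_atom is a hypothetical helper, not an Apple API, and it is deliberately simplified: it treats any atom size below 8 as malformed and does not handle the special size values the QuickTime format allows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return true if the sample carries an extension atom of the given type. */
static bool sample_has_atom(const uint8_t *sample, size_t size, uint32_t type)
{
    assert(sample != NULL);
    if (size < 2)
        return false;
    /* Skip the 16-bit length word and the text that follows it. */
    size_t offset = 2 + (size_t)get_be16(sample);
    while (offset <= size && size - offset >= 8) {
        uint32_t atomSize = get_be32(sample + offset);
        uint32_t atomType = get_be32(sample + offset + 4);
        if (atomSize < 8 || atomSize > size - offset)
            return false; /* malformed or truncated atom */
        if (atomType == type)
            return true;
        offset += atomSize; /* the size field covers the whole atom */
    }
    return false;
}
```

For a forced subtitle with the text "hi", the sample is 12 bytes long: 00 02, then the two text bytes, then 00 00 00 08 and the four characters frcd.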
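Once this structure is complete, we need to put it in a CMBlockBuffer. CMBlockBuffer is a key structure: it makes up the data portion of a sample buffer. What does it do? According to the documentation: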
This document describes the Core Media objects that you use to move blocks of memory through a processing system.
A CMBlockBuffer is a CFType object that represents a contiguous range of data offsets (from zero to CMBlockBufferGetDataLength) across a possibly noncontiguous memory region. The memory region is composed of memory blocks and buffer references. The buffer references can in turn refer to additional regions. CMBlockBuffer uses CMAttachment protocol to propagate memory blocks.
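A CMBlockBuffer is an object used to move blocks of memory through a processing system. It represents a contiguous range of data offsets over memory that may not itself be contiguous. How should we read that? My understanding is that the data in a CMBlockBuffer may live in different regions, some in memory blocks and some in other buffer references; CMBlockBuffer hides those storage details and lets you address the data simply by offsets from 0 to CMBlockBufferGetDataLength.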
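As a rough mental model only, the idea of contiguous offsets over scattered memory can be sketched in plain C like this. Segment, total_length and copy_bytes are names I invented for illustration; this is not how Core Media implements CMBlockBuffer. The closest real call is CMBlockBufferCopyDataBytes, which likewise copies a range addressed by offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One piece of memory backing part of the logical buffer (hypothetical type). */
typedef struct {
    const uint8_t *data;
    size_t length;
} Segment;

/* Logical length: the sum of all segment lengths. */
static size_t total_length(const Segment *segs, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += segs[i].length;
    return total;
}

/* Copy `length` bytes starting at logical `offset`, crossing segment
   boundaries as needed. Returns false if the range is out of bounds. */
static bool copy_bytes(const Segment *segs, size_t count,
                       size_t offset, size_t length, uint8_t *dst)
{
    assert(segs != NULL && dst != NULL);
    size_t total = total_length(segs, count);
    if (offset > total || length > total - offset)
        return false;
    for (size_t i = 0; i < count && length > 0; i++) {
        if (offset >= segs[i].length) {
            offset -= segs[i].length; /* range starts in a later segment */
            continue;
        }
        size_t n = segs[i].length - offset;
        if (n > length)
            n = length;
        memcpy(dst, segs[i].data + offset, n);
        dst += n;
        length -= n;
        offset = 0; /* later segments are read from their start */
    }
    return true;
}
```

The caller only ever deals in offsets from 0 to the total length and never needs to know where one segment ends and the next begins. In our case the subtitle sample lives in a single malloc'd block, so the block buffer has just one such piece.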
Let's finish the remaining part.
CMBlockBufferRef dataBuffer = NULL;
// Passing kCFAllocatorMalloc as the block allocator lets the block buffer free samplePtr when it is released
CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault, samplePtr, sampleSize, kCFAllocatorMalloc, NULL, 0, sampleSize, 0, &dataBuffer);
CMSampleTimingInfo sampleTiming;
sampleTiming.duration = self.timeRange.duration;
sampleTiming.presentationTimeStamp = self.timeRange.start;
sampleTiming.decodeTimeStamp = kCMTimeInvalid;
CMFormatDescriptionRef formatDescription = [self copyFormatDescription];
// Initialized to NULL so that a failed create call returns NULL rather than garbage
CMSampleBufferRef sampleBuffer = NULL;
CMSampleBufferCreate(kCFAllocatorDefault, dataBuffer, true, NULL, NULL, formatDescription, 1, 1, &sampleTiming, 1, &sampleSize, &sampleBuffer);
if (formatDescription)
    CFRelease(formatDescription);
if (dataBuffer)
    CFRelease(dataBuffer);
return sampleBuffer;
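1. Create the block buffer from the memory block, that is, the subtitle sample data structure we just built.
2. Fill in the sample timing info: mainly the duration, the presentation timestamp and the decode timestamp.
3. Create the format description for this subtitle sample.
4. Create the sample buffer from the block buffer, the timing info and the format description. It holds the data, the timing information and the format description.
5. Release the references we own.
For more information, see: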
https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap4/qtff4.html#//apple_ref/doc/uid/TP40000939-CH206-18745
https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap3/qtff3.html#//apple_ref/doc/uid/TP40000939-CH205-SW82
http://mczonk.de/reading-chapters-with-avfoundation/
http://www.verious.com/qa/cmsample-buffer-get-audio-buffer-list-with-retained-block-buffer-documentation/