Processing Video with AVFoundation

Our project includes video-editing features that were built directly on AVFoundation, and the results have been good.
This post takes the time to summarize the common usage patterns.

Basics

  • CMTime — media time (used for video timing)
  • AVAsset — asset information
  • AVURLAsset — an asset created from a URL
  • AVAssetTrack — an asset track, e.g. an audio or video track
  • AVMutableComposition — a collection of tracks forming an asset; tracks can be added and removed
  • AVMutableCompositionTrack — a mutable track for use in a composition
  • AVMutableVideoComposition — a collection of video-processing instructions
  • AVMutableVideoCompositionInstruction — a single video-processing instruction
  • AVMutableVideoCompositionLayerInstruction — a per-track instruction; must be added to an AVMutableVideoCompositionInstruction
  • AVMutableAudioMix — audio configuration
  • AVMutableAudioMixInputParameters — audio-processing parameters
  • AVPlayerItem — manages a media asset's basic information and playback state
  • AVPlayer — plays video; it does not display anything itself, you create an AVPlayerLayer and add it to a view
  • AVAssetExportSession — exports an asset

(Figure: the rough structure of the information inside a video. Source: https://www.jianshu.com/p/3c585899c455)

CMTime

To guarantee timing precision, AVFoundation measures time with CMTime (from Core Media). A CMTime consists of a value and a timescale:

public var value: CMTimeValue
public var timescale: CMTimeScale

timescale can be read as the number of slices one second is divided into, and value as the number of slices taken; the resulting time is value / timescale:

let time3 = CMTime(value: 1, timescale: 30)
print(time3.seconds) // 0.0333…

A CMTime can also be created from a duration in seconds:

let time2 = CMTime(seconds: 1, preferredTimescale: 2)
print(time2.value) // 2

CMTime also supports addition and subtraction:

let time = time2 + time3 

The result is plain fraction arithmetic: the resulting timescale is the least common multiple of the timescales of all operands.
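
With time2 (1 s stored as 2/2) and time3 (1/30 s) from above, the expected result — assuming CMTimeAdd settles on the least common multiple, LCM(2, 30) = 30 — would be:

print(time.value, time.timescale) // 31 30, i.e. 31/30 s
print(time.seconds)               // ≈ 1.033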

When working with video, timescale is usually set to 600.
This is because frame rates differ from video to video — some are 30 fps, some 24 fps, some 60 fps — and 600 is a convenient common multiple of the usual rates, so each of their frame durations can be represented exactly.
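
For instance, one frame at each of the common rates has an exact representation with timescale 600:

// one frame at 24 / 30 / 60 fps, all exact with timescale 600
let frame24 = CMTime(value: 25, timescale: 600) // 600 / 24
let frame30 = CMTime(value: 20, timescale: 600) // 600 / 30
let frame60 = CMTime(value: 10, timescale: 600) // 600 / 60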

AVAsset

Represents an asset — a video, audio file, image, and so on — including its tracks, duration, type, and other information.

AVURLAsset is the AVAsset subclass for assets created from a URL; the URL can be a remote URL or a local file URL:

let asset = AVURLAsset(url: videoUrl)
let duration = asset.duration
CMTimeShow(duration)

let tracks = asset.tracks
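
Note that reading duration or tracks synchronously can block the calling thread while the asset is inspected, especially for network-backed URLs. A minimal sketch of loading the keys asynchronously instead (via the AVAsynchronousKeyValueLoading API):

asset.loadValuesAsynchronously(forKeys: ["duration", "tracks"]) {
    var error: NSError?
    guard asset.statusOfValue(forKey: "duration", error: &error) == .loaded else { return }
    CMTimeShow(asset.duration) // safe to read now
}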

AVAssetTrack

Track information for an asset. A track can be seen as the smallest unit an asset is built from; one asset is made up of several tracks.
Track types:

public static let video: AVMediaType
public static let audio: AVMediaType
public static let text: AVMediaType
public static let closedCaption: AVMediaType
public static let subtitle: AVMediaType
public static let timecode: AVMediaType
public static let metadata: AVMediaType
public static let muxed: AVMediaType
public static let metadataObject: AVMediaType
public static let depthData: AVMediaType

A typical video contains video, audio, and sometimes subtitle tracks.
The main information a track carries:

open var timeRange: CMTimeRange { get } // time range within the asset
open var naturalSize: CGSize { get } // natural width and height
open var minFrameDuration: CMTime { get } // minimum frame duration (1 / max frame rate)
open var preferredVolume: Float { get } // preferred volume
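
For example, reading a few of these from the first video track of the asset above:

if let videoTrack = asset.tracks(withMediaType: .video).first {
    print(videoTrack.naturalSize)        // e.g. (1920.0, 1080.0)
    print(videoTrack.nominalFrameRate)   // frames per second
    print(videoTrack.preferredTransform) // carries the orientation information
}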

AVMutableComposition

An AVAsset subclass that also represents an asset; think of it as an editable asset whose tracks can be combined, removed, and so on.

AVMutableCompositionTrack

An AVAssetTrack subclass that also represents a track; think of it as an editable track into which other tracks can be inserted.

AVMutableVideoComposition

Describes how the video is processed: background color, render size, and so on. It is required when adding watermarks or transition animations.

AVMutableAudioMix

Audio configuration; among other things, it lets you change the video's volume.


(Figure: the relationships between assets, tracks, and related objects, from the official documentation.)

AVAssetExportSession

After the track, audio, and video configuration described above, an export session writes out a new asset.

Implementation

These classes are easy to mix up in real code; to make their relationships more intuitive I drew a UML diagram. (Figure: UML diagram of the classes above.)

Because the assets and tracks are being modified, the mutable types AVMutableComposition and AVMutableCompositionTrack are used throughout (each has an immutable superclass).

Merging multiple videos into one

Approach:
In the desired order, separate out each source video's video and audio tracks, append them along the timeline onto one new video track and one new audio track, and then build a new asset (an AVMutableComposition) from those two tracks.

// Create the composition and its editable tracks
let composition = AVMutableComposition()
// video track (kCMPersistentTrackID_Invalid lets the system assign a track ID automatically)
let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid)
// audio track
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
var insertTime = CMTime.zero
for url in urls {
    autoreleasepool {
        // Load the asset and separate its video and audio tracks
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        
        // Append each source video track onto the single composition video track (AVMutableCompositionTrack)
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                // insert the track's time range starting at insertTime
                try videoCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        
        // Append each source audio track onto the single composition audio track (AVMutableCompositionTrack)
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        
        /* Inserting whole assets via insertTimeRange creates multiple tracks;
           every video after the first renders as a black screen:
          let assetTimeRage = CMTimeRange(start: .zero, duration: asset.duration)
          do {
              try composition.insertTimeRange(assetTimeRage, of: asset, at: insertTime)
          } catch let e {
              callback(false, e)
              return
          }
        */

        insertTime = insertTime + asset.duration
    }
}

At this point we have the merged composition; exporting it produces the final video data:

guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetPassthrough) else {
    callback?(false, nil)
    return
}
exportSession.outputFileType = .mp4
exportSession.outputURL = outputUrl
exportSession.shouldOptimizeForNetworkUse = true
exportSession.exportAsynchronously {
    switch exportSession.status {
        case .completed:
            callback?(true, nil)
        default:
            callback?(false, exportSession.error)
    }
}
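
The export runs asynchronously; if you need a progress bar you can poll the session's progress property (a simple sketch using a repeating timer):

let timer = Timer.scheduledTimer(withTimeInterval: 0.5, repeats: true) { timer in
    print(exportSession.progress) // 0.0 ... 1.0
    if exportSession.status != .waiting && exportSession.status != .exporting {
        timer.invalidate()
    }
}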

preset:
The preset determines the codec and quality of the exported video.
AVAssetExportPresetPassthrough keeps the source's original encoding; note that with this preset, audioMix and videoComposition settings take no effect.
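
Not every preset works with every asset and output file type; AVAssetExportSession can check this up front, which helps avoid the export errors discussed at the end of this post:

let presets = AVAssetExportSession.exportPresets(compatibleWith: composition)
print(presets) // all presets usable with this asset
AVAssetExportSession.determineCompatibility(ofExportPreset: AVAssetExportPresetPassthrough,
                                            with: composition,
                                            outputFileType: .mp4) { compatible in
    print(compatible)
}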

P.S.:
AVMutableComposition also provides

- (BOOL)insertTimeRange:(CMTimeRange)timeRange ofAsset:(AVAsset *)asset atTime:(CMTime)startTime error:(NSError * _Nullable * _Nullable)outError;

which inserts a whole asset directly, with no need to separate tracks. I originally tried merging videos this way, but only the first video displayed properly and the rest were black — presumably because the composition then contains multiple video tracks and playback cannot switch between them.

Fixing video orientation and unifying video size

The merge above leaves two problems:

  1. The device's orientation while recording determines the video's orientation; any video whose orientation is not the default landscapeRight comes out rotated in the export. Video orientation mirrors device orientation (see the orientationFromVideo sketch after the next code block):
  • portrait: home button at the bottom, shot in portrait — the content is rotated 90°
  • landscapeLeft: home button on the left, shot in landscape — the content is rotated 180°
  • portraitUpsideDown: home button at the top, shot in portrait — the content is rotated 270°
  • landscapeRight: home button on the right, shot in landscape (the default display orientation)
  2. The source videos may have different sizes, and merging them into one video usually requires a unified size.

Both points are addressed by configuring an AVMutableVideoComposition:

let composition = AVMutableComposition()
guard let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid) else {
    callback(false, nil)
    return
}
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
// the layer instruction transforms the video layer of the composition track;
// one instruction is enough, since it targets the single composition track and
// receives a transform per insertion time (appending it once per clip would duplicate it)
let vcLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoCompositionTrack)
let layerInstructions = [vcLayerInstruction]
var insertTime = CMTime.zero
for url in urls {
    autoreleasepool {
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                try videoCompositionTrack.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
                
                // Adjust orientation and scale via the transform so every clip fits one renderSize
                var trans = insertVideoTrack.preferredTransform
                let size = insertVideoTrack.naturalSize
                let orientation = orientationFromVideo(assetTrack: insertVideoTrack)
                switch orientation {
                    case .portrait:
                        let scale = renderSize.height / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: size.height, y: 0)
                        trans = trans.rotated(by: .pi / 2.0)
                    case .landscapeLeft:
                        let scale = renderSize.width / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: size.width, y: size.height + (renderSize.height - size.height * scale) / scale / 2.0)
                        trans = trans.rotated(by: .pi)
                    case .portraitUpsideDown:
                        let scale = renderSize.height / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: 0, y: size.width)
                        trans = trans.rotated(by: .pi / 2.0 * 3)
                    case .landscapeRight:
                        // default orientation
                        let scale = renderSize.width / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: 0, y: (renderSize.height - size.height * scale) / scale / 2.0)
                }
                
                vcLayerInstruction.setTransform(trans, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        
        insertTime = insertTime + asset.duration
    }
}
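
The code above (and the watermark section later) calls orientationFromVideo(assetTrack:), a helper that is not shown in the post. Here is a minimal sketch of how it can be implemented by matching the track's preferredTransform — the enum and the exact checks are my assumption:

enum VideoOrientation {
    case portrait, landscapeLeft, portraitUpsideDown, landscapeRight
}

func orientationFromVideo(assetTrack: AVAssetTrack) -> VideoOrientation {
    let t = assetTrack.preferredTransform
    if t.a == 0 && t.b == 1.0 && t.c == -1.0 && t.d == 0 {
        return .portrait             // content rotated 90°
    } else if t.a == -1.0 && t.b == 0 && t.c == 0 && t.d == -1.0 {
        return .landscapeLeft        // content rotated 180°
    } else if t.a == 0 && t.b == -1.0 && t.c == 1.0 && t.d == 0 {
        return .portraitUpsideDown   // content rotated 270°
    } else {
        return .landscapeRight       // identity: the default orientation
    }
}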

When exporting, assign the videoComposition to the export session:

let videoComposition = AVMutableVideoComposition()
// videoComposition requires both a frameDuration and a renderSize
videoComposition.frameDuration = CMTime(value: 1, timescale: 30) // 30 fps
videoComposition.renderSize = renderSize
let vcInstruction = AVMutableVideoCompositionInstruction()
vcInstruction.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
vcInstruction.backgroundColor = UIColor.red.cgColor // background color of the video
vcInstruction.layerInstructions = layerInstructions
videoComposition.instructions = [vcInstruction]
guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPreset1280x720) else {
    callback(false, nil)
    return
}
exportSession.videoComposition = videoComposition
exportSession.outputFileType = .mp4
exportSession.outputURL = outputUrl
exportSession.shouldOptimizeForNetworkUse = true
exportSession.exportAsynchronously {
    switch exportSession.status {
        case .completed:
            callback(true, nil)
        default:
            callback(false, exportSession.error)
    }
}

Scaling, rotating, and translating the video are all done by changing the CGAffineTransform on the AVMutableVideoCompositionLayerInstruction. A video's CGAffineTransform behaves exactly like a UIView's: it is a matrix, and changing its values is what ultimately transforms the video.

A CGAffineTransform holds six values — a, b, c, d, tx, ty — and each point is mapped through the matrix as:

x' = a·x + c·y + tx
y' = b·x + d·y + ty

For the theory and usage of CGAffineTransform, see https://www.jianshu.com/p/ca7f9bc62429 and https://www.jianshu.com/p/a848d6b5a4b5

With the processing above, footage that previously came out sideways now renders upright. (Figure: result screenshot.)

Adding audio

Adding audio to an existing video is useful for narration, background music, and the like.
The idea is the same as ever: insert the new audio data as an extra track alongside the original ones:

var audioParameters: [AVMutableAudioMixInputParameters] = []
let asset = AVURLAsset(url: videoUrl)
let composition = AVMutableComposition()
do {
    try composition.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration), of: asset, at: .zero)
} catch let e {
    callback(false, e)
    return
}
let audioAsset = AVURLAsset(url: audioUrl)
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
let audioTracks = audioAsset.tracks(withMediaType: .audio)
for audioTrack in audioTracks {
    let adParameter = AVMutableAudioMixInputParameters(track: audioTrack)
    adParameter.setVolume(1, at: .zero)
    audioParameters.append(adParameter)
    
    do {
        try audioCompositionTrack?.insertTimeRange(audioTrack.timeRange, of: audioTrack, at: .zero)
    } catch let e {
        callback(false, e)
        return
    }
}
// AVAssetExportPresetPassthrough fails here: Code=-11838 "Operation Stopped"
guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetMediumQuality) else {
    callback(false, nil)
    return
}
// adjust the audio via the mix
let audioMix = AVMutableAudioMix()
audioMix.inputParameters = audioParameters
exportSession.audioMix = audioMix

An AVMutableAudioMix is configured while the audio is added; through the mix you can adjust each track's volume. Volume ranges from 0 to 1 (silent to full).
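
Besides a constant volume, the input parameters also support ramps — for example, fading the added music out over its last 3 seconds (a sketch that would sit inside the loop above, next to the setVolume call; the 3-second window is arbitrary):

let fadeDuration = CMTime(seconds: 3, preferredTimescale: 600)
let fadeStart = audioAsset.duration - fadeDuration
adParameter.setVolumeRamp(fromStartVolume: 1, toEndVolume: 0,
                          timeRange: CMTimeRange(start: fadeStart, duration: fadeDuration))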

Removing track data

Everything so far adds tracks; sometimes a track needs to be removed instead, e.g. to strip a video's original sound:

let asset = AVURLAsset(url: url)
let composition = AVMutableComposition()
do {
    try composition.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration), of: asset, at: .zero)
} catch let e {
    exportback?(false, e)
    return nil
}
let tracks = composition.tracks(withMediaType: type)
for track in tracks {
    composition.removeTrack(track)
}

Trimming video

Trim a video down to a given time range:

public static func cutVideo(url: URL, outputUrl: URL, secondsRange: ClosedRange<Double>, callback: @escaping VideoResult) {
    let asset = AVURLAsset(url: url)
    let composition = AVMutableComposition()
    do {
        let timeRange = CMTimeRange(start: CMTime(seconds: secondsRange.lowerBound, preferredTimescale: timescale), end: CMTime(seconds: secondsRange.upperBound, preferredTimescale: timescale))
        try composition.insertTimeRange(timeRange, of: asset, at: .zero)
    } catch let e {
        callback(false, e)
        return
    }
    
    exportVideo(composition, AVAssetExportPresetPassthrough, outputUrl, callback)
}
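
A hypothetical call site (VideoHelper stands in for whatever type hosts these helpers; timescale is assumed to be a constant such as 600, and VideoResult a (Bool, Error?) -> Void callback):

VideoHelper.cutVideo(url: srcUrl, outputUrl: dstUrl, secondsRange: 1.0...5.0) { success, error in
    print(success, error ?? "no error")
}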

Grabbing the image at a specific frame

This is typically used for a video's cover image.
AVAssetImageGenerator extracts image data from an asset at a given CMTime:

let imageGenerator = AVAssetImageGenerator(asset: asset)
imageGenerator.appliesPreferredTrackTransform = true
// .zero tolerances request the exact frame
imageGenerator.requestedTimeToleranceAfter = .zero
imageGenerator.requestedTimeToleranceBefore = .zero
var actualTime: CMTime = .zero
do {
    let time = CMTime(seconds: seconds, preferredTimescale: timescale)
    let imageRef = try imageGenerator.copyCGImage(at: time, actualTime: &actualTime)
    print(actualTime)
    return UIImage(cgImage: imageRef)
} catch {
    return nil
}

requestedTimeToleranceAfter and requestedTimeToleranceBefore set the allowed error around the requested time. Setting both to .zero returns the exact frame, at the cost of a somewhat slower lookup.
actualTime receives the video time of the frame actually returned; with both tolerances at .zero it equals the requested time.
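
When several thumbnails are needed at once (e.g. for a scrubber strip), the generator also has an asynchronous batch API; a sketch with arbitrary sample times:

let times = [1.0, 2.0, 3.0].map { NSValue(time: CMTime(seconds: $0, preferredTimescale: 600)) }
imageGenerator.generateCGImagesAsynchronously(forTimes: times) { requestedTime, cgImage, actualTime, result, error in
    if result == .succeeded, let cgImage = cgImage {
        let thumbnail = UIImage(cgImage: cgImage)
        // hand the thumbnail to the UI on the main queue
    }
}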

Saving video to the system photo album

This requires the Photos framework:

let photoLibrary = PHPhotoLibrary.shared()
photoLibrary.performChanges {
    PHAssetChangeRequest.creationRequestForAssetFromVideo(atFileURL: url)
} completionHandler: { success, error in
    callback(success, error)
}
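
Saving fails without photo-library permission, so it is worth requesting authorization first (a sketch; remember the matching usage-description key in Info.plist):

PHPhotoLibrary.requestAuthorization { status in
    guard status == .authorized else {
        callback(false, nil)
        return
    }
    PHPhotoLibrary.shared().performChanges {
        PHAssetChangeRequest.creationRequestForAssetFromVideo(atFileURL: url)
    } completionHandler: { success, error in
        callback(success, error)
    }
}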

Adding a watermark to a video

Watermarking uses the videoComposition.animationTool mentioned earlier:

// instructions
var trans = videoCompositionTrack.preferredTransform
let size = videoCompositionTrack.naturalSize
let orientation = orientationFromVideo(assetTrack: videoCompositionTrack)
switch orientation {
    case .portrait:
        trans = CGAffineTransform(translationX: size.height, y: 0)
        trans = trans.rotated(by: .pi / 2.0)
    case .landscapeLeft:
        trans = CGAffineTransform(translationX: size.width, y: size.height)
        trans = trans.rotated(by: .pi)
    case .portraitUpsideDown:
        trans = CGAffineTransform(translationX: 0, y: size.width)
        trans = trans.rotated(by: .pi / 2.0 * 3)
    case .landscapeRight:
        // default orientation
        break
}
let vcLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoCompositionTrack)
vcLayerInstruction.setTransform(trans, at: .zero)
let videoComposition = AVMutableVideoComposition()
videoComposition.frameDuration = CMTime(value: 1, timescale: 30)
videoComposition.renderSize = composition.naturalSize
let vcInstruction = AVMutableVideoCompositionInstruction()
vcInstruction.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
vcInstruction.layerInstructions = [vcLayerInstruction]
videoComposition.instructions = [vcInstruction]
// animationTool
let renderFrame = CGRect(origin: .zero, size: size)
let imageLayer = CALayer()
let textLayer = CATextLayer()
let watermarkLayer = CALayer()
let videoLayer = CALayer()
let animationLayer = CALayer()
// the watermark layer can host several sublayers (an image and a text layer here)
watermarkLayer.frame = wmframe
imageLayer.frame = watermarkLayer.bounds
imageLayer.contents = wmImage.cgImage
textLayer.frame = watermarkLayer.bounds
textLayer.string = wmText
textLayer.foregroundColor = UIColor.red.cgColor
textLayer.fontSize = 30
watermarkLayer.addSublayer(imageLayer)
watermarkLayer.addSublayer(textLayer)
watermarkLayer.masksToBounds = true
watermarkLayer.backgroundColor = UIColor.red.cgColor
// the video layer
videoLayer.frame = renderFrame
// the Core Animation parent layer
animationLayer.frame = renderFrame
animationLayer.addSublayer(videoLayer)
animationLayer.addSublayer(watermarkLayer)
let animationTool = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: videoLayer, in: animationLayer)
videoComposition.animationTool = animationTool
guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPreset1280x720) else {
    callback(false, nil)
    return
}
exportSession.videoComposition = videoComposition
exportVideo(exportSession, outputUrl, callback)

Adding a watermark involves three layers:

  • animationLayer
  • videoLayer
  • the watermark layer, watermarkLayer

animationLayer is the parent; both videoLayer and the watermark layer must be added to it.
videoLayer is where the video frames are rendered; it needs no content of its own, only a frame.

videoComposition.animationTool operates on layers, so beyond watermarks it can also drive all kinds of transition animations — see the sketch below.
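
One detail when animating these layers: in Core Animation a beginTime of 0 means CACurrentMediaTime(), so an animation meant to start at the beginning of the video must use AVCoreAnimationBeginTimeAtZero. A sketch that fades the watermark in over the first 2 seconds:

let fadeIn = CABasicAnimation(keyPath: "opacity")
fadeIn.fromValue = 0.0
fadeIn.toValue = 1.0
fadeIn.beginTime = AVCoreAnimationBeginTimeAtZero // not 0, which would mean "now"
fadeIn.duration = 2.0
fadeIn.isRemovedOnCompletion = false
watermarkLayer.add(fadeIn, forKey: "fadeIn")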

(Figure: the final watermarked result.)

Errors when exporting with AVAssetExportSession

Exporting with AVAssetExportSession regularly fails with rather puzzling errors.
Common error codes:

  • Code=-11821 "Cannot Decode"
    A workaround is discussed here: https://stackoverflow.com/questions/55144267/iphone-xr-xs-avassetexportsession-status-failed-with-error
  • Code=-11838 "Operation Stopped"
  • Code=-11800 "The operation could not be completed"

Most of these come down to mismatches between the video's preset, resolution, or codec. Apart from choosing a preset, AVAssetExportSession offers no fine-grained control over the output encoding; when you need that, you have to reach for another AVFoundation class, AVAssetWriter. I will cover that in a separate post when time allows:

Exporting video with AVAssetReader and AVAssetWriter


Full code: https://github.com/momoAI/VideoComposition-AV



References:
https://developer.apple.com/library/archive/documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/03_Editing.html#//apple_ref/doc/uid/TP40010188-CH8-SW1
