Live Streaming Datasets: Downloading a Chunk-Based Video Dataset Encoded with MPEG DASH

Contents

      • Introduction to MPEG DASH
      • Downloading the chunks of an MPEG DASH video
        • Video source
        • Steps for downloading chunks
        • Code

Introduction to MPEG DASH

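  • Manifest.mpd (MPD: Media Presentation Description) is the index file of an MPEG DASH encoded video (manifest files end in .mpd). It describes how the whole MPEG DASH stream is composed: the video and audio streams, the video chunks, and each stream's bitrate, frame rate, codec, and bandwidth. It plays the same role as the m3u8 file of an HLS encoded video. The MPD is an XML document, and the URLs used for the HTTP GET downloads can be constructed from its contents. For details, see "The three mainstream streaming protocols MPEG DASH, HLS, and Smooth Streaming, and their manifest file fields explained".
  • The advantage is that the streaming server does not need to cache the entire stream of the video. The client sends a request to the video server, the server returns manifest.mpd, and the client can then download any chunk by combining the video URL with a byte range (that is, the chunk's URL), which is more flexible. For details, see how video files are processed in the Pensieve paper.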
    [Image 1]
    Figure 1 illustrates the end-to-end process of streaming a video over HTTP today. As shown, a player embedded in a client application first sends a token to a video service provider for authentication. The provider responds with a manifest file that directs the client to a CDN hosting the video and lists the available bitrates for the video. The client then requests video chunks one by one, using an adaptive bitrate (ABR) algorithm. These algorithms use a variety of different inputs (e.g., playback buffer occupancy, throughput measurements, etc.) to select the bitrate for future chunks. As chunks are downloaded, they are played back to the client; note that playback of a given chunk cannot begin until the entire chunk has been downloaded.
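
Since the MPD is an XML document, the stream information described above can be read with a standard XML parser. The following is a minimal sketch, not the author's code: SAMPLE_MPD is a hypothetical, heavily trimmed manifest whose ids, bandwidths, and codec strings are made up for illustration, and list_representations is a helper name introduced here. Only the XML namespace is the one real DASH manifests declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down manifest used only for illustration;
# the ids, bandwidths, and codec strings are made up.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="video_low" bandwidth="500000" codecs="avc1.4d401e"/>
      <Representation id="video_high" bandwidth="2000000" codecs="avc1.4d401f"/>
    </AdaptationSet>
  </Period>
</MPD>"""

# MPD elements live in this XML namespace, so lookups must be qualified.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def list_representations(mpd_text):
    """Return (id, bandwidth in bit/s, codecs) for every Representation."""
    root = ET.fromstring(mpd_text)
    path = "./mpd:Period/mpd:AdaptationSet/mpd:Representation"
    return [
        (rep.get("id"), int(rep.get("bandwidth")), rep.get("codecs"))
        for rep in root.findall(path, NS)
    ]


for rep_id, bandwidth, codecs in list_representations(SAMPLE_MPD):
    print(rep_id, bandwidth, codecs)
```

For a real video, the text of the downloaded Manifest.mpd would be passed in place of SAMPLE_MPD; real manifests carry more attributes and elements than this sketch reads.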

Downloading the chunks of an MPEG DASH video

Video source

The most commonly used video source today is the DASH264 JavaScript reference client test page, whose videos are encoded with H.264/MPEG-4 AVC.

Steps for downloading chunks

  1. Get the Manifest.mpd of the desired video from the DASH264 JavaScript reference client test page. The Period element of Manifest.mpd shows how many video streams the video contains, along with each stream's codec, bandwidth, and so on.
    [Image 2]
  2. Get the base_url of the desired video from the DASH264 JavaScript reference client test page.
    [Image 3]
  3. Using the Period/SegmentTemplate information in Manifest.mpd, build the URL of every chunk (.m4s file, a video segment) of every video stream, then download the chunks one by one with requests.get(url).
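
Step 3 can be sketched as a small template expansion. This is an illustration under stated assumptions: the media template string below is assumed to mirror the file naming used by the download script in this article and should be checked against the SegmentTemplate of the actual manifest. expand_template and chunk_urls are helper names introduced here, and the sketch handles only the plain $RepresentationID$ and $Number$ identifiers, not formatted variants such as $Number%05d$.

```python
def expand_template(template, representation_id, number):
    """Substitute the two plain SegmentTemplate identifiers in a media template."""
    return (
        template
        .replace("$RepresentationID$", representation_id)
        .replace("$Number$", str(number))
    )


def chunk_urls(base_url, template, representation_id, start_number, count):
    """Build the URLs of `count` consecutive chunks, starting at start_number."""
    return [
        base_url + expand_template(template, representation_id, n)
        for n in range(start_number, start_number + count)
    ]


# Assumed values: the base URL is the one used in this article; the template
# is a guess that reproduces the chunk names used by the download script.
BASE_URL = "https://dash.akamaized.net/envivio/EnvivioDash3/"
MEDIA_TEMPLATE = "$RepresentationID$-270146-i-$Number$.m4s"

for url in chunk_urls(BASE_URL, MEDIA_TEMPLATE, "v1_257", 1, 3):
    print(url)
```

Keeping the template as data rather than hard-coding the string concatenation makes it easier to reuse the same code for a different video whose manifest uses another naming scheme.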

Code

  • Downloading the chunk video files
# This script downloads every video chunk of each encoding.
# The example video uses 2 s chunks, 97 chunks in total.

import os

import requests

# Base URL of the video, taken from the DASH264 JavaScript reference client test page
base_url = "https://dash.akamaized.net/envivio/EnvivioDash3/"
# Video streams of this video, as listed in Manifest.mpd
medias = ["v%s_257" % (i + 1,) for i in range(9)]


# Create one output folder per video stream
def mkdir():
    for media_dir in medias:
        os.makedirs(media_dir, exist_ok=True)


# Download the chunk named chunk_id of the video stream media_name
def download(media_name, chunk_id):
    # Build the chunk URL following the URL format given in SegmentTemplate
    url = base_url + media_name + "-" + chunk_id + ".m4s"
    # Fetch the chunk content over HTTP
    r = requests.get(url, timeout=30)
    if r.status_code == 200:
        with open(media_name + "/" + chunk_id + ".m4s", "wb") as f:
            f.write(r.content)
        return True
    return False


def main():
    mkdir()
    for m in medias:
        # Header.m4s holds the codec information of the stream; it is the stream's initialization segment
        download(m, "Header")
        # Start at i = startNumber, as given in SegmentTemplate
        i = 1
        # Keep going until a chunk is no longer available
        while download(m, "270146-i-" + str(i)):
            print("Downloaded %s %s" % (m, i))
            i += 1


if __name__ == "__main__":
    main()
  • Reading chunk sizes
import os

TOTAL_VIDEO_CHUNCK = 97
BITRATE_LEVELS = 9
VIDEO_PATH = '../chunk_datasets/Envivio/'
VIDEO_FOLDER = 'v'

# assume chunks are in ../chunk_datasets/Envivio/v[1..9]_257
# video_size_0 is read from v9_257 and video_size_8 from v1_257

for bitrate in range(BITRATE_LEVELS):
    with open('video_size_' + str(bitrate), 'w') as f:
        for chunk_num in range(1, TOTAL_VIDEO_CHUNCK+1):
            video_chunk_path = VIDEO_PATH + \
                                VIDEO_FOLDER + str(BITRATE_LEVELS - bitrate) + "_257" + \
                                '/' + "270146-i-" + str(chunk_num) + '.m4s'
            chunk_size = os.path.getsize(video_chunk_path)
            f.write(str(chunk_size) + '\n')
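
As a quick sanity check on the size files, the average bitrate of each chunk follows from its size and duration: bitrate = size in bytes × 8 / duration. The sketch below assumes the 2 s chunk duration stated in this article; read_sizes and chunk_bitrates_kbps are helper names introduced here, and the sizes in the example call are made-up numbers, not measurements.

```python
CHUNK_DURATION_S = 2  # chunk length stated in this article


def read_sizes(path):
    """Read one chunk size (in bytes) per line, as written by the script above."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


def chunk_bitrates_kbps(sizes_bytes, chunk_duration_s=CHUNK_DURATION_S):
    """Average bitrate of each chunk in kbit/s: bytes * 8 / seconds / 1000."""
    return [size * 8 / chunk_duration_s / 1000 for size in sizes_bytes]


# Made-up sizes for illustration: 250,000 B and 500,000 B chunks of 2 s each.
print(chunk_bitrates_kbps([250000, 500000]))
```

On real data, chunk_bitrates_kbps(read_sizes('video_size_0')) would give the per-chunk bitrates of one stream, which can be compared with the bandwidth declared for it in the manifest.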

If you have any questions, feel free to leave a comment.
