Original article: http://www.zhiboshequ.com/news/604.html
As mobile internet applications reach massive scale, the video playback experience on mobile devices is getting more and more attention.
On November 17, 2016, the HTML5 MSE (Media Source Extensions) standard, driven by internet giants such as Google and Microsoft, was officially published. This signals that the HLS protocol originally released by Apple will soon fade from the stage and, like Adobe Flash Player on the PC, be replaced by newer, more broadly applicable technical standards. These changes will ultimately deliver a much better video playback experience to end users and will no doubt be widely supported and embraced by the industry.
Below is the MSE Recommendation; some companies in China have already implemented it in their products according to this standard.
The original text of the standard follows:
Please check the errata for any errors or issues reported since publication.
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright © 2016 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
This specification extends HTMLMediaElement [HTML51] to allow JavaScript to generate media streams for playback. Allowing JavaScript to generate streams facilitates a variety of use cases like adaptive streaming and time shifting live streams.
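As a non-normative illustration of the pipeline this enables, the sketch below (not part of the specification) attaches a MediaSource to a video element and appends fetched segments; the segment URLs and MIME/codec string are assumed placeholders.

```js
// Minimal MSE sketch: attach a MediaSource, append segments, end the stream.
// The URLs and the codec string are illustrative assumptions, not spec text.
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);   // MediaSource object URL

mediaSource.addEventListener('sourceopen', async () => {
  const mime = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';
  const sourceBuffer = mediaSource.addSourceBuffer(mime);

  for (const url of ['init.mp4', 'seg1.m4s', 'seg2.m4s']) {
    const data = await (await fetch(url)).arrayBuffer();
    sourceBuffer.appendBuffer(data);
    // appendBuffer() is asynchronous; wait for this append to complete.
    await new Promise(resolve =>
      sourceBuffer.addEventListener('updateend', resolve, { once: true }));
  }
  mediaSource.endOfStream();   // signal that no more media will be appended
}, { once: true });
```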
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
The working group maintains a list of all bug reports. New features for this specification are expected to be incubated in the Web Platform Incubator Community Group.
One editorial issue (removing the exposure of createObjectURL(mediaSource) in workers) was addressed since the previous publication. For the list of changes done since the previous version, see the commits.
By publishing this Recommendation, W3C expects the functionality specified in this Recommendation will not be affected by changes to File API. The Working Group will continue to track these specifications.
This document was published by the HTML Media Extensions Working Group as a Recommendation. If you wish to make comments regarding this document, the GitHub repository is preferred for discussion of this specification. Historical discussion can also be found in the mailing list archives.
In September 2016, the Working Group used an implementation report to move this document to Recommendation.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
This section is non-normative.
This specification allows JavaScript to dynamically construct media streams for <audio> and <video>.
This specification was designed with the following goals in mind:
This specification defines:
The track buffers that provide coded frames for the enabled audioTracks, the selected videoTracks, and the "showing" or "hidden" textTracks. All these tracks are associated with SourceBuffer objects in the activeSourceBuffers list.
A presentation timestamp range used to filter out coded frames while appending. The append window represents a single continuous time range with a single start time and end time. Coded frames with presentation timestamp within this range are allowed to be appended to the SourceBuffer while coded frames outside this range are filtered out. The append window start and end times are controlled by the appendWindowStart and appendWindowEnd attributes respectively.
A unit of media data that has a presentation timestamp, a decode timestamp, and a coded frame duration.
The duration of a coded frame. For video and text, the duration indicates how long the video frame or text SHOULD be displayed. For audio, the duration represents the sum of all the samples contained within the coded frame. For example, if an audio frame contained 441 samples @44100Hz the frame duration would be 10 milliseconds.
The sum of a coded frame presentation timestamp and its coded frame duration. It represents the presentation timestamp that immediately follows the coded frame.
A group of coded frames that are adjacent and have monotonically increasing decode timestamps without any gaps. Discontinuities detected by the coded frame processing algorithm and abort() calls trigger the start of a new coded frame group.
The decode timestamp indicates the latest time at which the frame needs to be decoded assuming instantaneous decoding and rendering of this and any dependant frames (this is equal to the presentation timestamp of the earliest frame, in presentation order, that is dependant on this frame). If frames can be decoded out of presentation order, then the decode timestamp MUST be present in or derivable from the byte stream. The user agent MUST run the append error algorithm if this is not the case. If frames cannot be decoded out of presentation order and a decode timestamp is not present in the byte stream, then the decode timestamp is equal to the presentation timestamp.
A sequence of bytes that contain all of the initialization information required to decode a sequence of media segments. This includes codec initialization data, Track ID mappings for multiplexed segments, and timestamp offsets (e.g., edit lists).
The byte stream format specifications in the byte stream format registry [MSE-REGISTRY] contain format specific examples.
A sequence of bytes that contain packetized & timestamped media data for a portion of the media timeline. Media segments are always associated with the most recently appended initialization segment.
The byte stream format specifications in the byte stream format registry [MSE-REGISTRY] contain format specific examples.
A MediaSource object URL is a unique Blob URI [FILE-API] created by createObjectURL(). It is used to attach a MediaSource object to an HTMLMediaElement.
These URLs are the same as a Blob URI, except that anything in the definition of that feature that refers to File and Blob objects is hereby extended to also apply to MediaSource objects.
The origin of the MediaSource object URL is the relevant settings object of this during the call to createObjectURL().
For example, the origin of the MediaSource object URL affects the way that the media element is consumed by canvas.
The parent media source of a SourceBuffer object is the MediaSource object that created it.
The presentation start time is the earliest time point in the presentation and specifies the initial playback position and earliest possible position. All presentations created using this specification have a presentation start time of 0.
For the purposes of determining if HTMLMediaElement.buffered contains a TimeRange that includes the current playback position, implementations MAY choose to allow a current playback position at or after presentation start time and before the first TimeRange to play the first TimeRange if that TimeRange starts within a reasonably short time, like 1 second, after presentation start time. This allowance accommodates the reality that muxed streams commonly do not begin all tracks precisely at presentation start time. Implementations MUST report the actual buffered range, regardless of this allowance.
The presentation interval of a coded frame is the time interval from its presentation timestamp to the presentation timestamp plus the coded frame's duration. For example, if a coded frame has a presentation timestamp of 10 seconds and a coded frame duration of 100 milliseconds, then the presentation interval would be [10, 10.1). Note that the start of the range is inclusive, but the end of the range is exclusive.
The order that coded frames are rendered in the presentation. The presentation order is achieved by ordering coded frames in monotonically increasing order by their presentation timestamps.
A reference to a specific time in the presentation. The presentation timestamp in a coded frame indicates when the frame SHOULD be rendered.
A position in a media segment where decoding and continuous playback can begin without relying on any previous data in the segment. For video this tends to be the location of I-frames. In the case of audio, most audio frames can be treated as a random access point. Since video tracks tend to have a more sparse distribution of random access points, the location of these points are usually considered the random access points for multiplexed streams.
The specific byte stream format specification that describes the format of the byte stream accepted by a SourceBuffer instance. The byte stream format specification, for a SourceBuffer object, is selected based on the type passed to the addSourceBuffer() call that created the object.
A specific set of tracks distributed across one or more SourceBuffer objects owned by a single MediaSource instance.
Implementations MUST support at least 1 MediaSource object with the following configurations:
MediaSource objects MUST support each of the configurations above, but they are only required to support one configuration at a time. Supporting multiple configurations at once or additional configurations is a quality of implementation issue.
A byte stream format specific structure that provides the Track ID, codec configuration, and other metadata for a single track. Each track description inside a single initialization segment has a unique Track ID. The user agent MUST run the append error algorithm if the Track ID is not unique within the initialization segment.
A Track ID is a byte stream format specific identifier that marks sections of the byte stream as being part of a specific track. The Track ID in a track description identifies which sections of a media segment belong to that track.
The MediaSource object represents a source of media data for an HTMLMediaElement. It keeps track of the readyState for this source as well as a list of SourceBuffer objects that can be used to add media data to the presentation. MediaSource objects are created by the web application and then attached to an HTMLMediaElement. The application uses the SourceBuffer objects in sourceBuffers to add media data to this source. The HTMLMediaElement fetches this media data from the MediaSource object when it is needed during playback.
Each MediaSource object has a live seekable range variable that stores a normalized TimeRanges object. This variable is initialized to an empty TimeRanges object when the MediaSource object is created, is maintained by setLiveSeekableRange() and clearLiveSeekableRange(), and is used in HTMLMediaElement Extensions to modify HTMLMediaElement.seekable behavior.
enum ReadyState
{
"closed",
"open",
"ended"
};
Enumeration | Description
---|---
closed | Indicates the source is not currently attached to a media element.
open | The source has been opened by a media element and is ready for data to be appended to the SourceBuffer objects in sourceBuffers.
ended | The source is still attached to a media element, but endOfStream() has been called.
enum EndOfStreamError {
"network",
"decode"
};
Enumeration | Description
---|---
network | Terminates playback and signals that a network error has occurred. JavaScript applications SHOULD use this status code to terminate playback with a network error, for example, if a network error occurs while fetching media data.
decode | Terminates playback and signals that a decoding error has occurred. JavaScript applications SHOULD use this status code to terminate playback with a decode error, for example, if a parsing error occurs while processing out-of-band media data.
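A sketch of how a player's download loop might surface a failed segment fetch through endOfStream("network"); the function and URL are hypothetical.

```js
// Report a failed segment download as a media-element network error.
async function fetchAndAppend(mediaSource, sourceBuffer, url) {
  try {
    const response = await fetch(url);
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    sourceBuffer.appendBuffer(await response.arrayBuffer());
  } catch (err) {
    // Requires readyState "open" and no SourceBuffer with updating === true.
    mediaSource.endOfStream('network');
  }
}
```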
[Constructor]
interface MediaSource : EventTarget {
    readonly attribute SourceBufferList    sourceBuffers;
    readonly attribute SourceBufferList    activeSourceBuffers;
    readonly attribute ReadyState          readyState;
             attribute unrestricted double duration;
             attribute EventHandler        onsourceopen;
             attribute EventHandler        onsourceended;
             attribute EventHandler        onsourceclose;
    SourceBuffer   addSourceBuffer(DOMString type);
    void           removeSourceBuffer(SourceBuffer sourceBuffer);
    void           endOfStream(optional EndOfStreamError error);
    void           setLiveSeekableRange(double start, double end);
    void           clearLiveSeekableRange();
    static boolean isTypeSupported(DOMString type);
};
sourceBuffers of type SourceBufferList, readonly
Contains the list of SourceBuffer objects associated with this MediaSource. When readyState equals "closed" this list will be empty. Once readyState transitions to "open" SourceBuffer objects can be added to this list by using addSourceBuffer().
activeSourceBuffers of type SourceBufferList, readonly
Contains the subset of sourceBuffers that are providing the selected video track, the enabled audio track(s), and the "showing" or "hidden" text track(s).
SourceBuffer objects in this list MUST appear in the same order as they appear in the sourceBuffers attribute; e.g., if only sourceBuffers[0] and sourceBuffers[3] are in activeSourceBuffers, then activeSourceBuffers[0] MUST equal sourceBuffers[0] and activeSourceBuffers[1] MUST equal sourceBuffers[3].
The Changes to selected/enabled track state section describes how this attribute gets updated.
readyState of type ReadyState, readonly
Indicates the current state of the MediaSource object. When the MediaSource is created readyState MUST be set to "closed".
duration of type unrestricted double
Allows the web application to set the presentation duration. The duration is initially set to NaN when the MediaSource object is created.
On getting, run the following steps:
If the readyState attribute is "closed" then return NaN and abort these steps.
Return the current value of the attribute.
On setting, run the following steps:
If the value being set is negative or NaN then throw a TypeError exception and abort these steps.
If the readyState attribute is not "open" then throw an InvalidStateError exception and abort these steps.
If the updating attribute equals true on any SourceBuffer in sourceBuffers, then throw an InvalidStateError exception and abort these steps.
Run the duration change algorithm with new duration set to the value being assigned to this attribute.
The duration change algorithm will adjust new duration higher if there is any currently buffered coded frame with a higher end time. appendBuffer() and endOfStream() can update the duration under certain circumstances.
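A sketch of setting the duration while respecting the preconditions above (readyState "open", no SourceBuffer updating); the 120-second value is an arbitrary example.

```js
// Set the presentation duration only when the MediaSource will accept it.
function trySetDuration(mediaSource, seconds) {
  for (let i = 0; i < mediaSource.sourceBuffers.length; i++) {
    if (mediaSource.sourceBuffers[i].updating) return false; // retry later
  }
  if (mediaSource.readyState !== 'open') return false;
  mediaSource.duration = seconds;   // e.g. trySetDuration(mediaSource, 120)
  return true;
}
```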
onsourceopen of type EventHandler
The event handler for the sourceopen event.
onsourceended of type EventHandler
The event handler for the sourceended event.
onsourceclose of type EventHandler
The event handler for the sourceclose event.
addSourceBuffer
Adds a new SourceBuffer
to sourceBuffers
.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString |
✘ | ✘ |
SourceBuffer
When this method is invoked, the user agent must run the following steps:
TypeError
exception and abort these steps.SourceBuffer
objects in sourceBuffers
, then throw aNotSupportedError
exception and abort these steps.QuotaExceededError
exception and abort these steps.
For example, a user agent MAY throw a QuotaExceededError
exception if the media element has reached theHAVE_METADATA
readyState. This can occur if the user agent's media engine does not support adding more tracks during playback.
readyState
attribute is not in the"open"
state then throw anInvalidStateError
exception and abort these steps.SourceBuffer
object and associated resources.mode
attribute on the new object to
"sequence"
.
mode
attribute on the new object to
"segments"
.
sourceBuffers
andqueue a task tofire a simple event namedaddsourcebuffer
atsourceBuffers
.removeSourceBuffer
Removes a SourceBuffer
from sourceBuffers
.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
sourceBuffer | SourceBuffer |
✘ | ✘ |
void
When this method is invoked, the user agent must run the following steps:
sourceBuffers
then throw aNotFoundError
exception and abort these steps.updating
attribute equals true, then run the following steps:
updating
attribute to false.abort
atsourceBuffer.updateend
atsourceBuffer.AudioTrackList
object returned bysourceBuffer.audioTracks
.AudioTrackList
object returned by theaudioTracks
attribute on the HTMLMediaElement.AudioTrack
object in theSourceBuffer audioTracks list, run the following steps:
sourceBuffer
attribute on theAudioTrack
object to null.AudioTrack
object from theHTMLMediaElement audioTracks list.
This should trigger AudioTrackList
[HTML51] logic to queue a task to fire a trusted event named removetrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized to theAudioTrack
object, at theHTMLMediaElement audioTracks list. If the enabled
attribute on theAudioTrack
object was true at the beginning of this removal step, then this should also triggerAudioTrackList
[HTML51] logic to queue a task to fire a simple event named change
at theHTMLMediaElement audioTracks list
AudioTrack
object from theSourceBuffer audioTracks list.
This should trigger AudioTrackList
[HTML51] logic to queue a task to fire a trusted event named removetrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized to theAudioTrack
object, at theSourceBuffer audioTracks list. If the enabled
attribute on theAudioTrack
object was true at the beginning of this removal step, then this should also triggerAudioTrackList
[HTML51] logic to queue a task to fire a simple event named change
at theSourceBuffer audioTracks list
VideoTrackList
object returned bysourceBuffer.videoTracks
.VideoTrackList
object returned by thevideoTracks
attribute on the HTMLMediaElement.VideoTrack
object in theSourceBuffer videoTracks list, run the following steps:
sourceBuffer
attribute on theVideoTrack
object to null.VideoTrack
object from theHTMLMediaElement videoTracks list.
This should trigger VideoTrackList
[HTML51] logic to queue a task to fire a trusted event named removetrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized to theVideoTrack
object, at theHTMLMediaElement videoTracks list. If the selected
attribute on theVideoTrack
object was true at the beginning of this removal step, then this should also triggerVideoTrackList
[HTML51] logic to queue a task to fire a simple event named change
at theHTMLMediaElement videoTracks list
VideoTrack
object from theSourceBuffer videoTracks list.
This should trigger VideoTrackList
[HTML51] logic to queue a task to fire a trusted event named removetrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized to theVideoTrack
object, at theSourceBuffer videoTracks list. If the selected
attribute on theVideoTrack
object was true at the beginning of this removal step, then this should also triggerVideoTrackList
[HTML51] logic to queue a task to fire a simple event named change
at theSourceBuffer videoTracks list
TextTrackList
object returned bysourceBuffer.textTracks
.TextTrackList
object returned by thetextTracks
attribute on the HTMLMediaElement.TextTrack
object in theSourceBuffer textTracks list, run the following steps:
sourceBuffer
attribute on theTextTrack
object to null.TextTrack
object from theHTMLMediaElement textTracks list.
This should trigger TextTrackList
[HTML51] logic to queue a task to fire a trusted event named removetrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized to theTextTrack
object, at theHTMLMediaElement textTracks list. If the mode
attribute on theTextTrack
object was"showing"
or"hidden"
at the beginning of this removal step, then this should also triggerTextTrackList
[HTML51] logic to queue a task to fire a simple event named change
at theHTMLMediaElement textTracks list.
TextTrack
object from theSourceBuffer textTracks list.
This should trigger TextTrackList
[HTML51] logic to queue a task to fire a trusted event named removetrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized to theTextTrack
object, at theSourceBuffer textTracks list. If the mode
attribute on theTextTrack
object was"showing"
or"hidden"
at the beginning of this removal step, then this should also triggerTextTrackList
[HTML51] logic to queue a task to fire a simple event named change
at theSourceBuffer textTracks list.
activeSourceBuffers
, then removesourceBuffer fromactiveSourceBuffers
andqueue a task tofire a simple event namedremovesourcebuffer
at theSourceBufferList
returned byactiveSourceBuffers
.sourceBuffers
andqueue a task tofire a simple event namedremovesourcebuffer
at theSourceBufferList
returned bysourceBuffers
.endOfStream
Signals the end of the stream.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
error | EndOfStreamError |
✘ | ✔ |
void
When this method is invoked, the user agent must run the following steps:
If the readyState attribute is not in the "open" state then throw an InvalidStateError exception and abort these steps.
If the updating attribute equals true on any SourceBuffer in sourceBuffers, then throw an InvalidStateError exception and abort these steps.
Run the end of stream algorithm with the error parameter set to error.
setLiveSeekableRange
Updates the live seekable range variable used in HTMLMediaElement Extensions to modify HTMLMediaElement.seekable behavior.
Parameter | Type | Nullable | Optional | Description
---|---|---|---|---
start | double | ✘ | ✘ | The start of the range, in seconds measured from presentation start time. While set, and if duration equals positive Infinity, HTMLMediaElement.seekable will return a non-empty TimeRanges object with a lowest range start timestamp no greater than start.
end | double | ✘ | ✘ | The end of range, in seconds measured from presentation start time. While set, and if duration equals positive Infinity, HTMLMediaElement.seekable will return a non-empty TimeRanges object with a highest range end timestamp no less than end.
void
When this method is invoked, the user agent must run the following steps:
If the readyState attribute is not "open" then throw an InvalidStateError exception and abort these steps.
If start is negative or greater than end, then throw a TypeError exception and abort these steps.
Set live seekable range to be a new normalized TimeRanges object containing a single range whose start position is start and end position is end.
clearLiveSeekableRange
Updates the live seekable range variable used in HTMLMediaElement Extensions to modify HTMLMediaElement.seekable behavior.
void
When this method is invoked, the user agent must run the following steps:
If the readyState attribute is not "open" then throw an InvalidStateError exception and abort these steps.
If live seekable range contains a range, then set live seekable range to be a new empty TimeRanges object.
isTypeSupported, static
Check to see whether the MediaSource is capable of creating SourceBuffer objects for the specified MIME type.
If true is returned from this method, it only indicates that the MediaSource implementation is capable of creating SourceBuffer objects for the specified MIME type. An addSourceBuffer() call SHOULD still fail if sufficient resources are not available to support the addition of a new SourceBuffer.
This method returning true implies that HTMLMediaElement.canPlayType() will return "maybe" or "probably" since it does not make sense for a MediaSource to support a type the HTMLMediaElement knows it cannot play.
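A common pattern, sketched here, is to probe a few candidate types before creating the SourceBuffer; the candidate strings are examples only, and a true result still leaves QuotaExceededError possible at addSourceBuffer() time.

```js
// Pick the first container/codec combination the implementation can buffer.
function pickSupportedType(candidates) {
  return candidates.find(type => MediaSource.isTypeSupported(type)) || null;
}

const type = pickSupportedType([
  'video/webm; codecs="vp9, opus"',
  'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
]);
```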
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString |
✘ | ✘ |
boolean
When this method is invoked, the user agent must run the following steps:
Event name | Interface | Dispatched when...
---|---|---
sourceopen | Event | readyState transitions from "closed" to "open" or from "ended" to "open".
sourceended | Event | readyState transitions from "open" to "ended".
sourceclose | Event | readyState transitions from "open" to "closed" or "ended" to "closed".
A MediaSource object can be attached to a media element by assigning a MediaSource object URL to the media element src attribute or the src attribute of a <source> inside a media element.
If the resource fetch algorithm was invoked with a media provider object that is a MediaSource object or a URL record whose object is a MediaSource object, then let mode be local, skip the first step in the resource fetch algorithm (which may otherwise set mode to remote) and add the steps and clarifications below to the "Otherwise (mode is local)" section of the resource fetch algorithm.
The resource fetch algorithm's first step is expected to eventually align with selecting local mode for URL records whose objects are media provider objects. The intent is that if the HTMLMediaElement's src attribute or selected child's src attribute is a blob: URL matching a MediaSource object URL when the respective src attribute was last changed, then that MediaSource object is used as the media provider object and current media resource in the local mode logic in the resource fetch algorithm. This also means that the remote mode logic that includes observance of any preload attribute is skipped when a MediaSource object is attached. Even with that eventual change to [HTML51], the execution of the following steps at the beginning of the local mode logic is still required when the current media resource is a MediaSource object.
Relative to the action which triggered the media element's resource selection algorithm, these steps are asynchronous. The resource fetch algorithm is run after the task that invoked the resource selection algorithm is allowed to continue and a stable state is reached. Implementations may delay the steps in the "Otherwise" clause, below, until the MediaSource object is ready for use.
readyState
is NOT set to
"closed"
readyState
attribute to"open"
.sourceopen
at theMediaSource
.appendBuffer()
.MediaSource
is attached.An attached MediaSource does not use the remote mode steps in the resource fetch algorithm, so the media element will not fire "suspend" events. Though future versions of this specification will likely remove "progress" and "stalled" events from a media element with an attached MediaSource, user agents conforming to this version of the specification may still fire these two events as these [HTML51] references changed after implementations of this specification stabilized.
The following steps are run in any case where the media element is going to transition toNETWORK_EMPTY andqueue a task to fire a simple event named emptied at the media element. These steps SHOULD be run right before the transition.
readyState
attribute to"closed"
.duration
to NaN.SourceBuffer
objects from activeSourceBuffers
.removesourcebuffer
atactiveSourceBuffers
.SourceBuffer
objects from sourceBuffers
.removesourcebuffer
atsourceBuffers
.sourceclose
at theMediaSource
.Going forward, this algorithm is intended to be externally called and run in any case where the attachedMediaSource
, if any, must be detached from the media element. ItMAY be called on HTMLMediaElement [HTML51] operations like load() and resource fetch algorithm failures in addition to, or in place of, when the media element transitions toNETWORK_EMPTY. Resource fetch algorithm failures are those which abort either the resource fetch algorithm or the resource selection algorithm, with the exception that the "Final step" [HTML51] is not considered a failure that triggers detachment.
Run the following steps as part of the "Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is, until it has decoded enough data to play back that position" step of the seek algorithm:
The media element looks for media segments containing the new playback position in each SourceBuffer object in activeSourceBuffers. Any position within a TimeRange in the current value of the HTMLMediaElement.buffered attribute has all necessary media segments buffered for that position.
TimeRange
of
HTMLMediaElement.buffered
HTMLMediaElement.readyState
attribute is greater thanHAVE_METADATA
, then set theHTMLMediaElement.readyState
attribute toHAVE_METADATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
appendBuffer()
call causes thecoded frame processing algorithm to set the HTMLMediaElement.readyState
attribute to a value greater thanHAVE_METADATA
.
The web application can use buffered
andHTMLMediaElement.buffered
to determine what the media element needs to resume playback.
If the readyState
attribute is"ended"
and thenew playback position is within a TimeRange
currently inHTMLMediaElement.buffered
, then the seek operation must continue to completion here even if one or more currently selected or enabled track buffers' largest range end timestamp is less thannew playback position. This condition should only occur due to logic inbuffered
whenreadyState
is"ended"
.
The following steps are periodically run during playback to make sure that all of the SourceBuffer objects in activeSourceBuffers have enough data to ensure uninterrupted playback. Changes to activeSourceBuffers also cause these steps to run because they affect the conditions that trigger state transitions.
Having enough data to ensure uninterrupted playback is an implementation specific condition where the user agent determines that it currently has enough data to play the presentation without stalling for a meaningful period of time. This condition is constantly evaluated to determine when to transition the media element into and out of the HAVE_ENOUGH_DATA ready state. These transitions indicate when the user agent believes it has enough data buffered or it needs more data respectively.
An implementation MAY choose to use bytes buffered, time buffered, the append rate, or any other metric it sees fit to determine when it has enough data. The metrics used MAY change during playback so web applications SHOULD only rely on the value of HTMLMediaElement.readyState to determine whether more data is needed or not.
When the media element needs more data, the user agent SHOULD transition it from HAVE_ENOUGH_DATA to HAVE_FUTURE_DATA early enough for a web application to be able to respond without causing an interruption in playback. For example, transitioning when the current playback position is 500ms before the end of the buffered data gives the application roughly 500ms to append more data before playback stalls.
HTMLMediaElement.readyState
attribute equals
HAVE_NOTHING
:
HTMLMediaElement.buffered
does not contain a
TimeRange
for the current playback position:
HTMLMediaElement.readyState
attribute toHAVE_METADATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
HTMLMediaElement.buffered
contains a
TimeRange
that includes the current playback position and enough data to ensure uninterrupted playback:
HTMLMediaElement.readyState
attribute toHAVE_ENOUGH_DATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
HAVE_CURRENT_DATA
.HTMLMediaElement.buffered
contains a
TimeRange
that includes the current playback position and some time beyond the current playback position, then run the following steps:
HTMLMediaElement.readyState
attribute toHAVE_FUTURE_DATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
HAVE_CURRENT_DATA
.HTMLMediaElement.buffered
contains a
TimeRange
that ends at the current playback position and does not have a range covering the time immediately after the current position:
HTMLMediaElement.readyState
attribute toHAVE_CURRENT_DATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
During playback activeSourceBuffers needs to be updated if the selected video track, the enabled audio track(s), or a text track mode changes. When one or more of these changes occur the following steps need to be followed.
SourceBuffer
associated with the previously selected video track is not associated with any other enabled tracks, run the following steps:
SourceBuffer
from activeSourceBuffers
.removesourcebuffer
atactiveSourceBuffers
SourceBuffer
associated with the newly selected video track is not already inactiveSourceBuffers
, run the following steps:
SourceBuffer
to activeSourceBuffers
.addsourcebuffer
atactiveSourceBuffers
SourceBuffer
associated with this track is not associated with any other enabled or selected track, then run the following steps:
SourceBuffer
associated with the audio track from activeSourceBuffers
removesourcebuffer
atactiveSourceBuffers
SourceBuffer
associated with this track is not already in
activeSourceBuffers
, then run the following steps:
SourceBuffer
associated with the audio track to activeSourceBuffers
addsourcebuffer
atactiveSourceBuffers
"disabled"
and the
SourceBuffer
associated with this track is not associated with any other enabled or selected track, then run the following steps:
SourceBuffer
associated with the text track from activeSourceBuffers
removesourcebuffer
atactiveSourceBuffers
"showing"
or
"hidden"
and the
SourceBuffer
associated with this track is not already in
activeSourceBuffers
, then run the following steps:
SourceBuffer
associated with the text track to activeSourceBuffers
addsourcebuffer
atactiveSourceBuffers
Follow these steps when duration
needs to change to anew duration.
duration
is equal tonew duration, then return.SourceBuffer
objects in sourceBuffers
, then throw anInvalidStateError
exception and abort these steps.
Duration reductions that would truncate currently buffered media are disallowed. When truncation is necessary, useremove()
to reduce the buffered range before updatingduration
.
SourceBuffer
objects in sourceBuffers
.This condition can occur because the coded frame removal algorithm preserves coded frames that start before the start of the removal range.
Update duration to new duration.
Update media duration to new duration and run the HTMLMediaElement duration change algorithm.
This algorithm gets called when the application signals the end of stream via an endOfStream() call or an algorithm needs to signal a decode error. This algorithm takes an error parameter that indicates whether an error will be signalled.
readyState
attribute value to"ended"
.sourceended
at theMediaSource
.SourceBuffer
objects in sourceBuffers
.
This allows the duration to properly reflect the end of the appended media segments. For example, if the duration was explicitly set to 10 seconds and only media segments for 0 to 5 seconds were appended before endOfStream() was called, then the duration will get updated to 5 seconds.
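The duration adjustment described in this note is observable from script; a sketch, assuming the application tracks when its last segment has been appended:

```js
// Once every segment has been appended and the SourceBuffer is idle, signal
// normal end of stream. If the explicit duration (say 10 s) exceeds the
// highest buffered end time (say 5 s), the duration is trimmed accordingly.
function finishPresentation(mediaSource, sourceBuffer) {
  if (sourceBuffer.updating) {
    sourceBuffer.addEventListener('updateend',
      () => finishPresentation(mediaSource, sourceBuffer), { once: true });
    return;
  }
  mediaSource.endOfStream();              // no argument: normal completion
  console.log('duration is now', mediaSource.duration);
}
```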
"network"
HTMLMediaElement.readyState
attribute equals
HAVE_NOTHING
HTMLMediaElement.readyState
attribute is greater than
HAVE_NOTHING
"decode"
HTMLMediaElement.readyState
attribute equals
HAVE_NOTHING
HTMLMediaElement.readyState
attribute is greater than
HAVE_NOTHING
enum AppendMode {
"segments",
"sequence"
};
Enumeration | Description
---|---
segments | The timestamps in the media segment determine where the coded frames are placed in the presentation. Media segments can be appended in any order.
sequence | Media segments will be treated as adjacent in time independent of the timestamps in the media segment. Coded frames in a new media segment will be placed immediately after the coded frames in the previous media segment. The timestampOffset attribute will be updated if a new offset is needed to make the new media segments adjacent to the previous media segment. Setting the timestampOffset attribute in "sequence" mode allows a media segment to be placed at a specific position in the timeline without any knowledge of the timestamps in the media segment.
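A sketch of opting into "sequence" mode, which is convenient when every segment's internal timestamps start at zero (for example, independently encoded clips); the MIME string and the already-open mediaSource variable are assumptions.

```js
// Lay independently timestamped clips out back-to-back on the timeline.
const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
sb.mode = 'sequence';   // each appended segment follows the previous one
```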
interface SourceBuffer : EventTarget {
             attribute AppendMode          mode;
    readonly attribute boolean             updating;
    readonly attribute TimeRanges          buffered;
             attribute double              timestampOffset;
    readonly attribute AudioTrackList      audioTracks;
    readonly attribute VideoTrackList      videoTracks;
    readonly attribute TextTrackList       textTracks;
             attribute double              appendWindowStart;
             attribute unrestricted double appendWindowEnd;
             attribute EventHandler        onupdatestart;
             attribute EventHandler        onupdate;
             attribute EventHandler        onupdateend;
             attribute EventHandler        onerror;
             attribute EventHandler        onabort;
    void appendBuffer(BufferSource data);
    void abort();
    void remove(double start, unrestricted double end);
};
mode
of type
AppendMode
Controls how a sequence of media segments are handled. This attribute is initially set by addSourceBuffer() after the object is created.
On getting, Return the initial value or the last value that was successfully set.
On setting, run the following steps:
sourceBuffers
attribute of theparent media source, then throw an InvalidStateError
exception and abort these steps.updating
attribute equals true, then throw anInvalidStateError
exception and abort these steps."segments"
, then throw aTypeError
exception and abort these steps.If the readyState
attribute of theparent media source is in the "ended"
state then run the following steps:
readyState
attribute of theparent media source to "open"
sourceopen
at theparent media source.InvalidStateError
and abort these steps."sequence"
, then set thegroup start timestamp to thegroup end timestamp.updating
of type
boolean
, readonly
Indicates whether the asynchronous continuation of an appendBuffer()
orremove()
operation is still being processed. This attribute is initially set to false when the object is created.
buffered
of type
TimeRanges
, readonly
Indicates what TimeRanges are buffered in the SourceBuffer. This attribute is initially set to an empty TimeRanges object when the object is created.
When the attribute is read the following steps MUST occur:
sourceBuffers
attribute of theparent media source then throw an InvalidStateError
exception and abort these steps.SourceBuffer
object.TimeRange
object containing a single range from 0 tohighest end time.SourceBuffer
, run the following steps:
Text track-buffers are included in the calculation of highest end time, above, but excluded from the buffered range calculation here. They are not necessarily continuous, nor should any discontinuity within them trigger playback stall when the other media tracks are continuous over the same time range.
readyState
is"ended"
, then set the end time on the last range intrack ranges to highest end time.timestampOffset
of type
double
Controls the offset applied to timestamps inside subsequent media segments that are appended to this SourceBuffer. The timestampOffset is initially set to 0 which indicates that no offset is being applied.
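For example, in "segments" mode a segment whose internal timestamps start at 0 can be shifted to 30 seconds on the presentation timeline by setting the offset before the append; the names and numbers below are illustrative.

```js
// Place the next appended segment(s) at the 30-second mark on the timeline.
function appendAtOffset(sourceBuffer, segmentData, offsetSeconds) {
  if (sourceBuffer.updating) throw new Error('SourceBuffer is busy');
  sourceBuffer.timestampOffset = offsetSeconds;   // e.g. 30
  sourceBuffer.appendBuffer(segmentData);
}
```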
On getting, Return the initial value or the last value that was successfully set.
On setting, run the following steps:
sourceBuffers
attribute of theparent media source, then throw an InvalidStateError
exception and abort these steps.updating
attribute equals true, then throw anInvalidStateError
exception and abort these steps.If the readyState
attribute of theparent media source is in the "ended"
state then run the following steps:
readyState
attribute of theparent media source to "open"
sourceopen
at theparent media source.InvalidStateError
and abort these steps.mode
attribute equals"sequence"
, then set thegroup start timestamp tonew timestamp offset.audioTracks
of type
AudioTrackList
, readonly
AudioTrack
objects created by this object.
videoTracks
of type
VideoTrackList
, readonly
VideoTrack
objects created by this object.
textTracks
of type
TextTrackList
, readonly
TextTrack
objects created by this object.
appendWindowStart
of type
double
The presentation timestamp for the start of the append window. This attribute is initially set to the presentation start time.
On getting, Return the initial value or the last value that was successfully set.
On setting, run the following steps:
sourceBuffers
attribute of theparent media source, then throw an InvalidStateError
exception and abort these steps.updating
attribute equals true, then throw anInvalidStateError
exception and abort these steps.appendWindowEnd
then throw aTypeError
exception and abort these steps.appendWindowEnd
of type
unrestricted double
The presentation timestamp for the end of the append window. This attribute is initially set to positive Infinity.
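A sketch of using the append window to keep only the frames of an appended segment that fall within [10, 20) seconds; the bounds are arbitrary examples.

```js
// Admit only coded frames whose presentation interval lies inside [10, 20) s.
function appendWindowed(sourceBuffer, segmentData) {
  sourceBuffer.appendWindowStart = 10;   // set start first (must stay < end)
  sourceBuffer.appendWindowEnd = 20;
  sourceBuffer.appendBuffer(segmentData);
  // abort() resets the window to [presentation start time, +Infinity).
}
```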
On getting, Return the initial value or the last value that was successfully set.
On setting, run the following steps:
sourceBuffers
attribute of theparent media source, then throw an InvalidStateError
exception and abort these steps.updating
attribute equals true, then throw anInvalidStateError
exception and abort these steps.TypeError
and abort these steps.appendWindowStart
then throw aTypeError
exception and abort these steps.onupdatestart
of type
EventHandler
The event handler for the updatestart
event.
onupdate
of type
EventHandler
The event handler for the update
event.
onupdateend
of type
EventHandler
The event handler for the updateend
event.
onerror
of type
EventHandler
The event handler for the error
event.
onabort
of type
EventHandler
The event handler for the abort
event.
appendBuffer
Appends the segment data in a BufferSource [WEBIDL] to the source buffer.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
data | BufferSource |
✘ | ✘ |
void
When this method is invoked, the user agent must run the following steps:
updating
attribute to true.updatestart
at thisSourceBuffer
object.abort
Aborts the current segment and resets the segment parser.
void
When this method is invoked, the user agent must run the following steps:
sourceBuffers
attribute of theparent media source then throw an InvalidStateError
exception and abort these steps.readyState
attribute of theparent media source is not in the "open"
state then throw anInvalidStateError
exception and abort these steps.InvalidStateError
exception and abort these steps.updating
attribute equals true, then run the following steps:
updating
attribute to false.abort
at thisSourceBuffer
object.updateend
at thisSourceBuffer
object.appendWindowStart
to thepresentation start time.appendWindowEnd
to positive Infinity.remove
Removes media for a specific time range.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
start | double |
✘ | ✘ | The start of the removal range, in seconds measured from presentation start time. |
end | unrestricted double |
✘ | ✘ | The end of the removal range, in seconds measured from presentation start time. |
void
When this method is invoked, the user agent must run the following steps:
sourceBuffers
attribute of theparent media source then throw an InvalidStateError
exception and abort these steps.updating
attribute equals true, then throw anInvalidStateError
exception and abort these steps.duration
equals NaN, then throw aTypeError
exception and abort these steps.duration
, then throw aTypeError
exception and abort these steps.TypeError
exception and abort these steps.If the readyState
attribute of theparent media source is in the "ended"
state then run the following steps:
readyState
attribute of theparent media source to "open"
sourceopen
at the parent media source.
A track buffer stores the track descriptions and coded frames for an individual track. The track buffer is updated as initialization segments and media segments are appended to the SourceBuffer.
Each track buffer has a last decode timestamp variable that stores the decode timestamp of the last coded frame appended in the current coded frame group. The variable is initially unset to indicate that no coded frames have been appended yet.
Each track buffer has a last frame duration variable that stores the coded frame duration of the last coded frame appended in the current coded frame group. The variable is initially unset to indicate that no coded frames have been appended yet.
Each track buffer has a highest end timestamp variable that stores the highest coded frame end timestamp across all coded frames in the current coded frame group that were appended to this track buffer. The variable is initially unset to indicate that no coded frames have been appended yet.
Each track buffer has a need random access point flag variable that keeps track of whether the track buffer is waiting for a random access point coded frame. The variable is initially set to true to indicate that random access point coded frame is needed before anything can be added to the track buffer.
Each track buffer has a track buffer ranges variable that represents the presentation time ranges occupied by the coded frames currently stored in the track buffer.
For track buffer ranges, these presentation time ranges are based on presentation timestamps, frame durations, and potentially coded frame group start times for coded frame groups across track buffers in a muxed SourceBuffer.
For specification purposes, this information is treated as if it were stored in a normalized TimeRanges object. Intersected track buffer ranges are used to report HTMLMediaElement.buffered, and MUST therefore support uninterrupted playback within each range of HTMLMediaElement.buffered.
These coded frame group start times differ slightly from those mentioned in the coded frame processing algorithm in that they are the earliest presentation timestamp across all track buffers following a discontinuity. Discontinuities can occur within the coded frame processing algorithm or result from the coded frame removal algorithm, regardless of mode. The threshold for determining disjointness of track buffer ranges is implementation-specific. For example, to reduce unexpected playback stalls, implementations MAY approximate the coded frame processing algorithm's discontinuity detection logic by coalescing adjacent ranges separated by a gap smaller than 2 times the maximum frame duration buffered so far in this track buffer. Implementations MAY also use coded frame group start times as range start times across track buffers in a muxed SourceBuffer to further reduce unexpected playback stalls.
Event name | Interface | Dispatched when...
---|---|---
updatestart | Event | updating transitions from false to true.
update | Event | The append or remove has successfully completed. updating transitions from true to false.
updateend | Event | The append or remove has ended.
error | Event | An error occurred during the append. updating transitions from true to false.
abort | Event | The append or remove was aborted by an abort() call. updating transitions from true to false.
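Because a SourceBuffer accepts only one append or remove at a time, applications typically serialize operations on the updateend event; a sketch:

```js
// Serialize appends: start the next one only after 'updateend' fires.
function createAppendQueue(sourceBuffer) {
  const pending = [];
  const pump = () => {
    if (!sourceBuffer.updating && pending.length) {
      sourceBuffer.appendBuffer(pending.shift());
    }
  };
  sourceBuffer.addEventListener('updateend', pump);
  return data => { pending.push(data); pump(); };   // enqueue function
}
```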
All SourceBuffer objects have an internal append state variable that keeps track of the high-level segment parsing state. It is initially set to WAITING_FOR_SEGMENT and can transition to the following states as data is appended.
Append state name | Description
---|---
WAITING_FOR_SEGMENT | Waiting for the start of an initialization segment or media segment to be appended.
PARSING_INIT_SEGMENT | Currently parsing an initialization segment.
PARSING_MEDIA_SEGMENT | Currently parsing a media segment.
The input buffer is a byte buffer that is used to hold unparsed bytes across appendBuffer() calls. The buffer is empty when the SourceBuffer object is created.
The buffer full flag keeps track of whether appendBuffer() is allowed to accept more bytes. It is set to false when the SourceBuffer object is created and gets updated as data is appended and removed.
The group start timestamp variable keeps track of the starting timestamp for a new coded frame group in the "sequence" mode. It is unset when the SourceBuffer object is created and gets updated when the mode attribute equals "sequence" and the timestampOffset attribute is set, or the coded frame processing algorithm runs.
The group end timestamp variable stores the highest coded frame end timestamp across all coded frames in the current coded frame group. It is set to 0 when the SourceBuffer object is created and gets updated by the coded frame processing algorithm.
The group end timestamp stores the highest coded frame end timestamp across all track buffers in a SourceBuffer. Therefore, care should be taken in setting the mode attribute when appending multiplexed segments in which the timestamps are not aligned across tracks.
The generate timestamps flag is a boolean variable that keeps track of whether timestamps need to be generated for the coded frames passed to the coded frame processing algorithm. This flag is set by addSourceBuffer() when the SourceBuffer object is created.
When the segment parser loop algorithm is invoked, run the following steps:
If the append state equalsWAITING_FOR_SEGMENT, then run the following steps:
If the append state equalsPARSING_INIT_SEGMENT, then run the following steps:
If the append state equalsPARSING_MEDIA_SEGMENT, then run the following steps:
The frequency at which the coded frame processing algorithm is run is implementation-specific. The coded frame processing algorithm MAY be called when the input buffer contains the complete media segment or it MAY be called multiple times as complete coded frames are added to the input buffer.
If the SourceBuffer is full and cannot accept more media data, then set the buffer full flag to true.
When the parser state needs to be reset, run the following steps:
mode
attribute equals"sequence"
, then set thegroup start timestamp to thegroup end timestampThis algorithm is called when an error occurs during an append.
updating
attribute to false.error
at thisSourceBuffer
object.updateend
at thisSourceBuffer
object."decode"
When an append operation begins, the following steps are run to validate and prepare the SourceBuffer.
SourceBuffer
has been removed from the sourceBuffers
attribute of theparent media source then throw an InvalidStateError
exception and abort these steps.updating
attribute equals true, then throw anInvalidStateError
exception and abort these steps.HTMLMediaElement.error
attribute is not null, then throw anInvalidStateError
exception and abort these steps.If the readyState
attribute of theparent media source is in the "ended"
state then run the following steps:
readyState
attribute of theparent media source to "open"
sourceopen
at theparent media source.If the buffer full flag equals true, then throw aQuotaExceededError
exception and abort these step.
This is the signal that the implementation was unable to evict enough data to accommodate the append or the append is too big. The web applicationSHOULD useremove()
to explicitly free up space and/or reduce the size of the append.
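A sketch of that recovery path: on QuotaExceededError, evict media behind the playhead with remove() and retry the append; the 30-second retention window is an arbitrary choice.

```js
// Free space behind the playhead on QuotaExceededError, then retry.
function appendWithEviction(video, sourceBuffer, data) {
  try {
    sourceBuffer.appendBuffer(data);
  } catch (err) {
    if (err.name !== 'QuotaExceededError') throw err;
    const evictEnd = Math.max(0, video.currentTime - 30);
    if (evictEnd <= 0) return;             // nothing safe to evict yet
    sourceBuffer.remove(0, evictEnd);      // asynchronous removal
    sourceBuffer.addEventListener('updateend',
      () => sourceBuffer.appendBuffer(data), { once: true });
  }
}
```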
When appendBuffer()
is called, the following steps are run to process the appended data.
updating
attribute to false.update
at thisSourceBuffer
object.updateend
at thisSourceBuffer
object.Follow these steps when a caller needs to initiate a JavaScript visible range removal operation that blocks other SourceBuffer updates:
updating
attribute to true.updatestart
at thisSourceBuffer
object.updating
attribute to false.update
at thisSourceBuffer
object.updateend
at thisSourceBuffer
object.The following steps are run when the segment parser loop successfully parses a complete initialization segment:
Each SourceBuffer object has an internal first initialization segment received flag that tracks whether the first initialization segment has been appended and received by this algorithm. This flag is set to false when the SourceBuffer is created and updated by the algorithm below.
duration
attribute if it currently equals NaN:
If the first initialization segment received flag is false, then run the following steps:
User agents MAY consider codecs, that would otherwise be supported, as "not supported" here if the codecs were not specified in the type parameter passed to addSourceBuffer().
For example, MediaSource.isTypeSupported('video/webm;codecs="vp8,vorbis"') may return true, but if addSourceBuffer() was called with 'video/webm;codecs="vp8"' and a Vorbis track appears in the initialization segment, then the user agent MAY use this step to trigger a decode error.
For each audio track in the initialization segment, run following steps:
AudioTrack
object.id
property onnew audio track.language
property onnew audio track.label
property onnew audio track.kind
property onnew audio track.If audioTracks
.length
equals 0, then run the following steps:
enabled
property onnew audio track to true.audioTracks
attribute on thisSourceBuffer
object.
This should trigger AudioTrackList
[HTML51] logic to queue a task to fire a trusted event named addtrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized tonew audio track, at theAudioTrackList
object referenced by theaudioTracks
attribute on thisSourceBuffer
object.
audioTracks
attribute on the HTMLMediaElement.
This should trigger AudioTrackList
[HTML51] logic to queue a task to fire a trusted event named addtrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized tonew audio track, at theAudioTrackList
object referenced by theaudioTracks
attribute on the HTMLMediaElement.
For each video track in the initialization segment, run following steps:
VideoTrack
object.id
property onnew video track.language
property onnew video track.label
property onnew video track.kind
property onnew video track.If videoTracks
.length
equals 0, then run the following steps:
selected
property onnew video track to true.videoTracks
attribute on thisSourceBuffer
object.
This should trigger VideoTrackList
[HTML51] logic to queue a task to fire a trusted event named addtrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized tonew video track, at theVideoTrackList
object referenced by thevideoTracks
attribute on thisSourceBuffer
object.
videoTracks
attribute on the HTMLMediaElement.
This should trigger VideoTrackList
[HTML51] logic to queue a task to fire a trusted event named addtrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized tonew video track, at theVideoTrackList
object referenced by thevideoTracks
attribute on the HTMLMediaElement.
For each text track in the initialization segment, run following steps:
TextTrack
object.id
property onnew text track.language
property onnew text track.label
property onnew text track.kind
property onnew text track.mode
property onnew text track equals"showing"
or"hidden"
, then setactive track flag to true. textTracks
attribute on thisSourceBuffer
object.
This should trigger TextTrackList
[HTML51] logic to queue a task to fire a trusted event named addtrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized tonew text track, at theTextTrackList
object referenced by thetextTracks
attribute on thisSourceBuffer
object.
textTracks
attribute on the HTMLMediaElement.
This should trigger TextTrackList
[HTML51] logic to queue a task to fire a trusted event named addtrack
, that does not bubble and is not cancelable, and that uses theTrackEvent
interface, with thetrack
attribute initialized tonew text track, at theTextTrackList
object referenced by thetextTracks
attribute on the HTMLMediaElement.
SourceBuffer
to activeSourceBuffers
.addsourcebuffer
atactiveSourceBuffers
If the HTMLMediaElement.readyState
attribute isHAVE_NOTHING
, then run the following steps:
sourceBuffers
havefirst initialization segment received flag set to false, then abort these steps.HTMLMediaElement.readyState
attribute toHAVE_METADATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement. This particular transition should trigger HTMLMediaElement logic to queue a task to fire a simple event named loadedmetadata
at the media element.
HTMLMediaElement.readyState
attribute is greater thanHAVE_CURRENT_DATA
, then set theHTMLMediaElement.readyState
attribute toHAVE_METADATA
.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
When complete coded frames have been parsed by the segment parser loop then the following steps are run:
For each coded frame in the media segment run the following steps:
Special processing may be needed to determine the presentation and decode timestamps for timed text frames since this information may not be explicitly present in the underlying format or may be dependent on the order of the frames. Some metadata text tracks, like MPEG2-TS PSI data, may only have implied timestamps. Format specific rules for these situations SHOULD be in the byte stream format specifications or in separate extension specifications.
Implementations don't have to internally store timestamps in a double precision floating point representation. This representation is used here because it is the representation for timestamps in the HTML spec. The intention here is to make the behavior clear without adding unnecessary complexity to the algorithm to deal with the fact that adding a timestampOffset may cause a timestamp rollover in the underlying timestamp representation used by the byte stream format. Implementations can use any internal timestamp representation they wish, but the addition of timestampOffset SHOULD behave in a similar manner to what would happen if a double precision floating point representation was used.
If mode equals "sequence" and group start timestamp is set, then run the following steps:
Set timestampOffset equal to group start timestamp - presentation timestamp.
If timestampOffset is not 0, then run the following steps:
Add timestampOffset to the presentation timestamp.
Add timestampOffset to the decode timestamp.
If mode equals "segments":
If mode equals "sequence":
If presentation timestamp is less than appendWindowStart, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.
Some implementations MAY choose to collect some of these coded frames with presentation timestamp less than appendWindowStart and use them to generate a splice at the first coded frame that has a presentation timestamp greater than or equal to appendWindowStart even if that frame is not a random access point. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement.
If frame end timestamp is greater than appendWindowEnd, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.
Some implementations MAY choose to collect coded frames with presentation timestamp less than appendWindowEnd and frame end timestamp greater than appendWindowEnd and use them to generate a splice across the portion of the collected coded frames within the append window at time of collection, and the beginning portion of later processed frames which only partially overlap the end of the collected coded frames. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement. In conjunction with collecting coded frames that span appendWindowStart, implementations MAY thus support gapless audio splicing.
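As a non-normative example, an application that only wants the portion of a fetched segment between 30 and 40 seconds can set the append window before appending; frames outside the window are dropped as described above. The mediaSegment variable is a hypothetical ArrayBuffer.
// Non-normative sketch: using the append window to trim appended frames.
// 'sourceBuffer' is an existing SourceBuffer and 'mediaSegment' is a
// hypothetical ArrayBuffer covering roughly [28, 42) on the media timeline.
sourceBuffer.appendWindowStart = 30;   // frames presented before 30 s are dropped
sourceBuffer.appendWindowEnd = 40;     // frames ending after 40 s are dropped
sourceBuffer.addEventListener('updateend', function once() {
  sourceBuffer.removeEventListener('updateend', once);
  // buffered now only reflects coded frames that fit inside [30, 40).
  console.log(sourceBuffer.buffered.start(0), sourceBuffer.buffered.end(0));
});
sourceBuffer.appendBuffer(mediaSegment);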
This is to compensate for minor errors in frame timestamp computations that can appear when converting back and forth between double precision floating point numbers and rationals. This tolerance allows a frame to replace an existing one as long as it is within 1 microsecond of the existing frame's start time. Frames that come slightly before an existing frame are handled by the removal step below.
Removing all coded frames until the next random access point is a conservative estimate of the decoding dependencies since it assumes all frames between the removed frames and the next random access point depended on the frames that were removed.
The greater than check is needed because bidirectional prediction between coded frames can cause presentation timestamp to not be monotonically increasing even though the decode timestamps are monotonically increasing.
Set timestampOffset equal to frame end timestamp.
If the HTMLMediaElement.readyState attribute is HAVE_METADATA and the new coded frames cause HTMLMediaElement.buffered to have a TimeRange for the current playback position, then set the HTMLMediaElement.readyState attribute to HAVE_CURRENT_DATA.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
If the HTMLMediaElement.readyState attribute is HAVE_CURRENT_DATA and the new coded frames cause HTMLMediaElement.buffered to have a TimeRange that includes the current playback position and some time beyond the current playback position, then set the HTMLMediaElement.readyState attribute to HAVE_FUTURE_DATA.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
If the HTMLMediaElement.readyState attribute is HAVE_FUTURE_DATA and the new coded frames cause HTMLMediaElement.buffered to have a TimeRange that includes the current playback position and enough data to ensure uninterrupted playback, then set the HTMLMediaElement.readyState attribute to HAVE_ENOUGH_DATA.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
If the media segment contains data beyond the current duration, then run the duration change algorithm with new duration set to the maximum of the current duration and the group end timestamp.
Follow these steps when coded frames for a specific time range need to be removed from the SourceBuffer:
For each track buffer in this source buffer, run the following steps:
Let remove end timestamp be the current value of duration.
If this track buffer has a random access point timestamp that is greater than or equal to end, then update remove end timestamp to that random access point timestamp.
Random access point timestamps can be different across tracks because the dependencies between coded frames within a track are usually different than the dependencies in another track.
For each removed frame, if the frame has a decode timestamp equal to the last decode timestamp for the frame's track, run the following steps:
If mode equals "segments":
If mode equals "sequence":
Removing all coded frames until the next random access point is a conservative estimate of the decoding dependencies since it assumes all frames between the removed frames and the next random access point depended on the frames that were removed.
If this object is in activeSourceBuffers, the current playback position is greater than or equal to start and less than the remove end timestamp, and HTMLMediaElement.readyState is greater than HAVE_METADATA, then set the HTMLMediaElement.readyState attribute to HAVE_METADATA and stall playback.
Per HTMLMediaElement ready states
[HTML51] logic, HTMLMediaElement.readyState
changes may trigger events on the HTMLMediaElement.
This transition occurs because media data for the current position has been removed. Playback cannot progress until media for the current playback position is appended or the selected/enabled tracks change.
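A non-normative sketch of triggering this algorithm from script: SourceBuffer.remove() runs the coded frame removal algorithm asynchronously for the given range.
// Non-normative sketch: removing a buffered range from script.
function removeRange(sourceBuffer, start, end) {
  if (sourceBuffer.updating) {
    // remove() throws InvalidStateError while another append/remove is pending,
    // so wait for it to finish first.
    sourceBuffer.addEventListener('updateend', function retry() {
      sourceBuffer.removeEventListener('updateend', retry);
      removeRange(sourceBuffer, start, end);
    });
    return;
  }
  sourceBuffer.remove(start, end);
}

// Example: drop everything buffered in the first minute.
removeRange(mediaSource.sourceBuffers[0], 0, 60);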
This algorithm is run to free up space in this source buffer when new data is appended.
Implementations MAY use different methods for selecting removal ranges so web applications SHOULD NOT depend on a specific behavior. The web application can use the buffered attribute to observe whether portions of the buffered data have been evicted.
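A non-normative sketch of how an application can cope with buffers that cannot be freed automatically; the segment variable is a hypothetical ArrayBuffer and the back-buffer size is an arbitrary choice:
// Non-normative sketch: appends can fail with QuotaExceededError when the
// user agent cannot evict enough data itself; the application can then remove
// already-played ranges and retry.
function tryAppend(sourceBuffer, video, segment) {
  try {
    sourceBuffer.appendBuffer(segment);
  } catch (e) {
    if (e.name !== 'QuotaExceededError') throw e;
    var end = Math.max(0, video.currentTime - 10);  // keep a 10 s back-buffer
    if (end <= 0) return;                           // nothing safe to remove yet
    sourceBuffer.addEventListener('updateend', function retry() {
      sourceBuffer.removeEventListener('updateend', retry);
      tryAppend(sourceBuffer, video, segment);
    });
    sourceBuffer.remove(0, end);
  }
}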
Follow these steps when the coded frame processing algorithm needs to generate a splice frame for two overlapping audio coded frames:
The timestamps used by this algorithm are rounded to the nearest audio sample boundary (i.e., floor(x * sample_rate + 0.5) / sample_rate).
For example, given an audio track buffer with a sample rate of 8000 Hz, a frame already in the track buffer with a presentation timestamp of 10, and a new coded frame with a presentation timestamp of 10.01255:
presentation timestamp and decode timestamp are updated to 10.0125 since 10.01255 is closer to 10 + 100/8000 (10.0125) than 10 + 101/8000 (10.012625).
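A minimal, non-normative sketch of the rounding used in the worked example above:
// Non-normative sketch of the sample-boundary rounding shown above.
function roundToSampleBoundary(timestamp, sampleRate) {
  return Math.floor(timestamp * sampleRate + 0.5) / sampleRate;
}
roundToSampleBoundary(10.01255, 8000);  // 10.0125, i.e. 10 + 100/8000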
Some implementations MAY apply fades to/from silence to coded frames on either side of the inserted silence to make the transition less jarring.
This is intended to allow the new coded frame to be added to the track buffer as if the overlapped frame had not been in the track buffer to begin with.
If the new coded frame is less than 5 milliseconds in duration, then coded frames that are appended after the new coded frame will be needed to properly render the splice.
See the audio splice rendering algorithm for details on how this splice frame is rendered.
The following steps are run when a spliced frame, generated by the audio splice frame algorithm, needs to be rendered by the media element:
Follow these steps when the coded frame processing algorithm needs to generate a splice frame for two overlapping timed text coded frames:
This is intended to allow the new coded frame to be added to the track buffer as if it hadn't overlapped any frames in the track buffer to begin with.
SourceBufferList is a simple container object for SourceBuffer
objects. It provides read-only array access and fires events when the list is modified.
interface SourceBufferList : EventTarget {
  readonly attribute unsigned long length;
  attribute EventHandler onaddsourcebuffer;
  attribute EventHandler onremovesourcebuffer;
  getter SourceBuffer (unsigned long index);
};
length of type unsigned long, readonly
Indicates the number of SourceBuffer objects in the list.
onaddsourcebuffer of type EventHandler
The event handler for the addsourcebuffer event.
onremovesourcebuffer of type EventHandler
The event handler for the removesourcebuffer event.
getter
Allows the SourceBuffer objects in the list to be accessed with an array operator (i.e., []).
| Parameter | Type | Nullable | Optional | Description |
|---|---|---|---|---|
| index | unsigned long | ✘ | ✘ | |

Return type: SourceBuffer
When this method is invoked, the user agent must run the following steps:
If index is greater than or equal to the length attribute, then return undefined and abort these steps.
Return the index'th SourceBuffer object in the list.

| Event name | Interface | Dispatched when... |
|---|---|---|
| addsourcebuffer | Event | When a SourceBuffer is added to the list. |
| removesourcebuffer | Event | When a SourceBuffer is removed from the list. |
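A non-normative sketch of observing the sourceBuffers list and using the indexed getter; whether the chosen MIME type is accepted depends on the implementation:
// Non-normative sketch: observing the sourceBuffers list.
var video = document.querySelector('video');
var mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', function () {
  mediaSource.sourceBuffers.addEventListener('addsourcebuffer', function () {
    console.log('sourceBuffers length is now', mediaSource.sourceBuffers.length);
  });
  mediaSource.sourceBuffers.addEventListener('removesourcebuffer', function () {
    console.log('a SourceBuffer was removed from the list');
  });
  var sourceBuffer = mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"');
  console.log(mediaSource.sourceBuffers[0] === sourceBuffer);  // true
});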
This section specifies extensions to the URL [FILE-API] object definition.
[Exposed=Window]
partial interface URL {
  static DOMString createObjectURL(MediaSource mediaSource);
};
createObjectURL, static
Creates URLs for MediaSource objects.
This algorithm is intended to mirror the behavior of the createObjectURL() [FILE-API] method, which does not auto-revoke the created URL. Web authors are encouraged to use revokeObjectURL() [FILE-API] for any MediaSource object URL that is no longer needed for attachment to a media element.
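A non-normative sketch of attaching a MediaSource with an object URL and revoking it once it is no longer needed, as the note above encourages:
// Non-normative sketch: attach a MediaSource via an object URL, then revoke it.
var video = document.querySelector('video');
var mediaSource = new MediaSource();
var objectUrl = URL.createObjectURL(mediaSource);
video.src = objectUrl;
mediaSource.addEventListener('sourceopen', function () {
  // The element is attached, so the URL is no longer needed.
  URL.revokeObjectURL(objectUrl);
});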
| Parameter | Type | Nullable | Optional | Description |
|---|---|---|---|---|
| mediaSource | MediaSource | ✘ | ✘ | |

Return type: DOMString
When this method is invoked, the user agent must run the following steps:
This section specifies what existing attributes on the HTMLMediaElement
MUST return when a MediaSource
is attached to the element.
The HTMLMediaElement.seekable attribute returns a new static normalized TimeRanges object created based on the following steps:
If duration equals NaN:
Return an empty TimeRanges object.
If duration equals positive Infinity:
If the HTMLMediaElement.buffered attribute returns an empty TimeRanges object, then return an empty TimeRanges object and abort these steps.
Return a single range with a start time of 0 and an end time equal to the highest end time reported by the HTMLMediaElement.buffered attribute.
Otherwise:
Return a single range with a start time of 0 and an end time equal to duration.
The HTMLMediaElement.buffered attribute returns a static normalized TimeRanges object based on the following steps.
Let intersection ranges equal an empty TimeRanges object.
If activeSourceBuffers.length does not equal 0 then run the following steps:
Let active ranges be the ranges returned by buffered for each SourceBuffer object in activeSourceBuffers.
Let highest end time be the largest range end time in the active ranges.
Let intersection ranges equal a TimeRange object containing a single range from 0 to highest end time.
For each SourceBuffer object in activeSourceBuffers run the following steps:
Let source ranges equal the ranges returned by the buffered attribute on the current SourceBuffer.
If readyState is "ended", then set the end time on the last range in source ranges to highest end time.
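A non-normative sketch of inspecting the ranges the element reports while a MediaSource is attached:
// Non-normative sketch: reading the buffered and seekable ranges.
function logRanges(label, ranges) {
  for (var i = 0; i < ranges.length; i++) {
    console.log(label, ranges.start(i).toFixed(3) + ' - ' + ranges.end(i).toFixed(3));
  }
}
var video = document.querySelector('video');
logRanges('buffered', video.buffered);   // intersection of active SourceBuffer ranges
logRanges('seekable', video.seekable);   // derived from duration as described above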
This section specifies extensions to the HTML AudioTrack definition.
partial interface AudioTrack {
  readonly attribute SourceBuffer? sourceBuffer;
};
sourceBuffer of type SourceBuffer, readonly, nullable
Returns the SourceBuffer that created this track. Returns null if this track was not created by a SourceBuffer or the SourceBuffer has been removed from the sourceBuffers attribute of its parent media source.
This section specifies extensions to the HTML VideoTrack
definition.
partial interface VideoTrack {
  readonly attribute SourceBuffer? sourceBuffer;
};
sourceBuffer of type SourceBuffer, readonly, nullable
Returns the SourceBuffer that created this track. Returns null if this track was not created by a SourceBuffer or the SourceBuffer has been removed from the sourceBuffers attribute of its parent media source.
This section specifies extensions to the HTML TextTrack
definition.
partial interface TextTrack {
  readonly attribute SourceBuffer? sourceBuffer;
};
sourceBuffer of type SourceBuffer, readonly, nullable
Returns the SourceBuffer that created this track. Returns null if this track was not created by a SourceBuffer or the SourceBuffer has been removed from the sourceBuffers attribute of its parent media source.
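A non-normative sketch of mapping a track back to the SourceBuffer that created it; note that the videoTracks and audioTracks collections may not be exposed by every browser:
// Non-normative sketch: finding the SourceBuffer behind a track.
var video = document.querySelector('video');
video.videoTracks.addEventListener('addtrack', function (e) {
  var track = e.track;
  if (track.sourceBuffer !== null) {
    console.log('track', track.id, 'was created by a SourceBuffer');
  } else {
    console.log('track', track.id, 'did not come from a SourceBuffer');
  }
});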
The bytes provided through appendBuffer() for a SourceBuffer form a logical byte stream. The format and semantics of these byte streams are defined in byte stream format specifications. The byte stream format registry [MSE-REGISTRY] provides mappings between a MIME type that may be passed to addSourceBuffer() or isTypeSupported() and the byte stream format expected by a SourceBuffer created with that MIME type. Implementations are encouraged to register mappings for byte stream formats they support to facilitate interoperability. The byte stream format registry [MSE-REGISTRY] is the authoritative source for these mappings. If an implementation claims to support a MIME type listed in the registry, its SourceBuffer implementation MUST conform to the byte stream format specification listed in the registry entry.
The byte stream format specifications in the registry are not intended to define new storage formats. They simply outline the subset of existing storage format structures that implementations of this specification will accept.
Byte stream format parsing and validation is implemented in the segment parser loop algorithm.
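A non-normative sketch of choosing a MIME type whose byte stream format the user agent supports; whether either candidate is accepted depends on the implementation and its registered byte stream format mappings, and 'mediaSource' is assumed to be an open MediaSource:
// Non-normative sketch: picking a supported byte stream format.
var candidates = [
  'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',   // ISO BMFF byte stream
  'video/webm; codecs="vp9, opus"'                // WebM byte stream
];
var supportedType = candidates.find(function (t) {
  return MediaSource.isTypeSupported(t);
});
if (supportedType) {
  var sourceBuffer = mediaSource.addSourceBuffer(supportedType);
} else {
  console.log('No supported byte stream format among the candidates');
}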
This section provides general requirements for all byte stream format specifications:
AudioTrack, VideoTrack, and TextTrack attribute values from data in initialization segments.
If the byte stream format covers a format similar to one covered in the in-band tracks spec [INBANDTRACKS], then it SHOULD try to use the same attribute mappings so that Media Source Extensions playback and non-Media Source Extensions playback provide the same track information.
The number and type of tracks are not consistent.
For example, if the first initialization segment has 2 audio tracks and 1 video track, then all initialization segments that follow it in the byte stream MUST describe 2 audio tracks and 1 video track.
Codec changes across initialization segments.
For example, a byte stream that starts with an initialization segment that specifies a single AAC track and later contains an initialization segment that specifies a single AMR-WB track is not allowed. Support for multiple codecs is handled with multiple SourceBuffer objects.
Video frame size changes. The user agent MUST support seamless playback.
This will cause the display region of the video element to change size if the web application does not use CSS or the width and height attributes to constrain the element size.
Audio channel count changes. The user agent MAY support this seamlessly and could trigger downmixing.
This is a quality of implementation issue because changing the channel count may require reinitializing the audio device, resamplers, and channel mixers which tends to be audible.
buffered
attribute.
This is intended to simplify switching between audio streams where the frame boundaries don't always line up across encodings (e.g., Vorbis).
For example, if I1 is associated with M1, M2, M3 then the above MUST hold for all the combinations I1+M1, I1+M2, I1+M1+M2, I1+M2+M3, etc.
Byte stream specifications MUST at a minimum define constraints which ensure that the above requirements hold. Additional constraints MAY be defined, for example to simplify implementation.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, SHOULD, and SHOULD NOT are to be interpreted as described in [RFC2119].
Example use of the Media Source Extensions
<script>
function onSourceOpen(videoTag, e) {
var mediaSource = e.target;
if (mediaSource.sourceBuffers.length > 0)
return;
var sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
videoTag.addEventListener('seeking', onSeeking.bind(videoTag, mediaSource));
videoTag.addEventListener('progress', onProgress.bind(videoTag, mediaSource));
var initSegment = GetInitializationSegment();
if (initSegment == null) {
// Error fetching the initialization segment. Signal end of stream with an error.
mediaSource.endOfStream("network");
return;
}
// Append the initialization segment.
var firstAppendHandler = function(e) {
var sourceBuffer = e.target;
sourceBuffer.removeEventListener('updateend', firstAppendHandler);
// Append some initial media data.
appendNextMediaSegment(mediaSource);
};
sourceBuffer.addEventListener('updateend', firstAppendHandler);
sourceBuffer.appendBuffer(initSegment);
}
function appendNextMediaSegment(mediaSource) {
if (mediaSource.readyState == "closed")
return;
// If we have run out of stream data, then signal end of stream.
if (!HaveMoreMediaSegments()) {
mediaSource.endOfStream();
return;
}
// Make sure the previous append is not still pending.
if (mediaSource.sourceBuffers[0].updating)
return;
var mediaSegment = GetNextMediaSegment();
if (!mediaSegment) {
// Error fetching the next media segment.
mediaSource.endOfStream("network");
return;
}
// NOTE: If mediaSource.readyState == "ended", this appendBuffer() call will
// cause mediaSource.readyState to transition to "open". The web application
// should be prepared to handle multiple "sourceopen" events.
mediaSource.sourceBuffers[0].appendBuffer(mediaSegment);
}
function onSeeking(mediaSource, e) {
var video = e.target;
if (mediaSource.readyState == "open") {
// Abort current segment append.
mediaSource.sourceBuffers[0].abort();
}
// Notify the media segment loading code to start fetching data at the
// new playback position.
SeekToMediaSegmentAt(video.currentTime);
// Append a media segment from the new playback position.
appendNextMediaSegment(mediaSource);
}
function onProgress(mediaSource, e) {
appendNextMediaSegment(mediaSource);
}
</script>
<video id="v" autoplay></video>
<script>
var video = document.getElementById('v');
var mediaSource = new MediaSource();
mediaSource.addEventListener('sourceopen', onSourceOpen.bind(this, video));
video.src = window.URL.createObjectURL(mediaSource);
</script>
This section is non-normative.
The video playback quality metrics described in previous revisions of this specification (e.g., sections 5 and 10 of the Candidate Recommendation) are now being developed as part of [MEDIA-PLAYBACK-QUALITY]. Some implementations may have implemented the earlier draft VideoPlaybackQuality object and the HTMLVideoElement extension method getVideoPlaybackQuality() described in those previous revisions.
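A non-normative sketch of feature-detecting that draft API, which may be absent since it is now specified elsewhere:
// Non-normative sketch: feature-detecting the draft playback quality API.
var video = document.querySelector('video');
if (typeof video.getVideoPlaybackQuality === 'function') {
  var quality = video.getVideoPlaybackQuality();
  console.log('dropped frames:', quality.droppedVideoFrames,
              'total frames:', quality.totalVideoFrames);
} else {
  console.log('getVideoPlaybackQuality() is not available in this browser');
}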