This document may refer to IsoMedia files. IsoMedia is a generic name for all formats based on the MPEG-4 Part 12 specification: MP4, 3GP and MJ2K files. Support for MJ2K files has not been tested in GPAC yet.
As of version 0.2.4, MP4Box performs in-place rewrite of IsoMedia files (the input file is overwritten). You can change this behaviour by using the -out Filename option.
For older versions, when MP4Box is used to modify an existing IsoMedia file, the original file (for example AFILE.mp4) is NOT overwritten; the resulting file is stored in out_AFILE.mp4. To specify another name for the resulting file, use the -out Filename option.
As of version 0.2.4, MP4Box always stores the file with 0.5 second interleaving and meta-data at the beginning, making it suitable for HTTP streaming.
MP4Box usually generates a temporary file when creating a new IsoMedia file. The location of this temporary file is OS-dependent, and the drive/partition on which it is created may not have enough space or may not be writable. In such a case, you can specify a temporary file location with the -tmp path_to_dir option.
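As a rough illustration of the -out and -tmp options described above, the following Python sketch only assembles an argument list; it does not run MP4Box. The helper name mp4box_command and the file and directory names are made up for the example.

```python
from typing import List, Optional


def mp4box_command(input_file: str, options: List[str],
                   out: Optional[str] = None,
                   tmp_dir: Optional[str] = None) -> List[str]:
    """Build an MP4Box argument list (hypothetical helper, nothing is executed)."""
    cmd = ["MP4Box", *options]
    if tmp_dir is not None:
        # -tmp: directory in which MP4Box creates its temporary file(s)
        cmd += ["-tmp", tmp_dir]
    if out is not None:
        # -out: store the result in another file instead of rewriting the input
        cmd += ["-out", out]
    cmd.append(input_file)
    return cmd


cmd = mp4box_command("AFILE.mp4", ["-isma"], out="result.mp4", tmp_dir="/var/tmp")
print(" ".join(cmd))
```

If MP4Box is installed, the resulting list can be handed to subprocess.run(cmd, check=True).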
MP4Box does NOT perform audio/video/image transcoding (re-encoding media tracks to a different coded format). If you need to transcode content, you will need other tools.
As of version 0.2.2, you don't need to follow any specific option ordering at prompt.
Please be aware that this page documents the latest version of MP4Box and may therefore give details on options available only on GPAC CVS. If your version of MP4Box does not support an option please upgrade.
Most of these options are used to specify how to store a given file, either just created/converted or existing.
-tmp dir : specifies where the temporary file(s) used by MP4Box shall be created. This is quite useful on Windows systems, where the user may not have the rights to create temporary files. By default, MP4Box uses the OS temporary file handling as provided in C stdio.
-inter Duration : interleaves media data in chunks of the desired duration (in seconds). This is useful to optimize the file for HTTP/FTP streaming or to reduce disk access. All meta-data are placed first in the file, allowing a player to start playback while downloading the content. By default MP4Box always stores files with half a second interleaving and performs drift checking between tracks while interleaving. Specifying a 0 interleaving time will result in the file being stored without interleaving, with all meta-data placed at the beginning of the file.
-tight : performs sample-based interleaving of media tracks (WARNING: the created file is much larger). This is normally used when hinting a file, in order to reduce disk seeks at server side (depending on server implementation).
-flat : forces flat storage of the file: media data placed at the beginning of the file without interleaving, and meta-data at the end of the file. When used with -add to create a new file, no temporary file is created (faster storage).
-frag time_ms : fragments the file with fragments of the given duration. Movie fragmenting allows meta-data (timing and so on) to be interleaved with media data rather than placed at the beginning or at the end of the file. Fragmenting a file will always disable interleaving.
-out fileName : stores the modified file in a different file, rather than overwriting the input file.
-new : forces creation of a new destination file. This is useful when importing media in batch processes, for example. If not set and an existing file with the given name is found, all media import operations will be done on this file. This option is ignored when encoding scenes.
-no-sys : removes all MPEG-4 systems tracks and keeps an empty InitialObjectDescriptor in the file for MPEG-4 Level@Profile indications.
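Note that -inter is documented above in seconds while -frag takes milliseconds. The Python sketch below keeps the two units apart; the function names are hypothetical, and the units are those given on this page, so they should be checked against the help output of your own MP4Box build.

```python
from typing import List


def inter_option(seconds: float) -> List[str]:
    """-inter Duration, with Duration in seconds (0 disables interleaving)."""
    if seconds < 0:
        raise ValueError("interleaving duration cannot be negative")
    return ["-inter", str(seconds)]


def frag_option(seconds: float) -> List[str]:
    """-frag time_ms: convert a duration in seconds to whole milliseconds."""
    if seconds <= 0:
        raise ValueError("fragment duration must be positive")
    return ["-frag", str(round(seconds * 1000))]


print(inter_option(0.5))   # the documented default interleaving
print(frag_option(2.5))    # 2.5 s fragments expressed in milliseconds
```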
-no-iod : removes the file InitialObjectDescriptor.
-isma : converts file to ISMA 1.0 specification. This is extremely useful since most MPEG-4 players only understand ISMA-like content. All systems information and track numbering are rewritten to comply with the specification.
WARNING: some media tracks may be removed.
-3gp : converts to 3GPP specification. This will remove all MPEG-4 Systems information, leaving only the audio/video/text media tracks supported by 3GPP. This option is always turned on when the file extension is '3gp' or '3g2'.
WARNING: some media tracks may be removed.
-brand ABCD[:v] : sets the major brand of a file. Brands are used to identify the most common usage of a file (MPEG-4 presentation, 3GP movie, etc...). If 'v' is set, also sets the version of the brand (default version is 0).
-ab ABCD : adds an alternate brand to the file. Alternate brands are used to identify the other possible usages of a file (for example, whether a 3GP file is also compliant with MPEG-4, etc...).
-rb ABCD : removes an alternate brand from the file.
-rem trackID : removes given track from file.
-par trackID=PAR : sets the pixel aspect ratio of the given track. PAR can be "none" to remove PAR info, or of the form "N:D" where N is the PAR numerator and D its denominator. Only supported for MPEG-4 Visual and MPEG-4 AVC/H264.
-lang [trackID=]lang : sets the language of the given track, or of all tracks if trackID is not specified. The language can be either an ISO 639-1 2-char code, an ISO 639-2 3-char code, or the full language name. To get the listing of supported languages, use MP4Box -languages.
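To make the -par value format concrete, here is a small Python sketch that checks a "none" or "N:D" value and shows the effect of a pixel aspect ratio on display width. The helper names and the 720-pixel example are illustrative only and are not part of MP4Box.

```python
from fractions import Fraction


def par_argument(track_id: int, par: str) -> str:
    """Return the value passed to -par: 'trackID=none' or 'trackID=N:D'."""
    if par != "none":
        num, sep, den = par.partition(":")
        try:
            valid = sep == ":" and int(num) > 0 and int(den) > 0
        except ValueError:
            valid = False
        if not valid:
            raise ValueError(f"PAR must be 'none' or 'N:D', got {par!r}")
    return f"{track_id}={par}"


def display_width(coded_width: int, par: str) -> Fraction:
    """Width at which the picture is shown once the PAR N:D is applied."""
    num, den = (int(x) for x in par.split(":"))
    return coded_width * Fraction(num, den)


print(["MP4Box", "-par", par_argument(1, "16:15"), "file.mp4"])
print(display_width(720, "16:15"))
```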
-delay trackID=TIME : sets track start-time offset, specified in milliseconds.
-name trackID=NAME : sets track handler name. Handler name is sometimes used to identify the track content (for example, audio language).
-cprt string : adds copyright to file.
-chap chap_file : adds chapter information located in chap_file to the destination file. Chapter extensions have been introduced by Nero and are NOT standard extensions of the IsoMedia file format, so some players may not understand them.
The following syntaxes are supported in the chapter text file, with one chapter entry per line:
Some existing MP4 files may use MPEG-4 Visual tracks with B-Frames in an improper way. There is currently no automatic cleaning of such files in MP4Box, but reimporting the track will solve the problem. To do this:
The conversion syntax is MP4Box -add inputFile destinationFile. This option is used to import media from several sources. You can specify up to 20 -add in common MP4Box builds. This process will create the destination file if it does not exist, and add the track(s) to it. If you wish to erase the destination file, just add the -new option.
MP4Box can import a desired amount of the input file rather than the whole file. To do this, use the syntax -add inputFile%N, where N is the number of seconds you wish to import from the input. MP4Box cannot start importing from a random point in the input; it always imports from the beginning.
When using the -add option, MP4Box will automatically create default BIFS and OD tracks to make the resulting file compliant with the ISMA 1.0 standard if possible. If the destination file extension is .3gp or .3g2, MP4Box will automatically make the file 3GP(2) compliant. This means that MP4Box will always remove any systems tracks when using -add; you may prevent this by using the -keepsys option. If the destination file extension is .m4a, MP4Box will automatically set up the information needed by iTunes.
When using the -add option to import an existing IsoMedia file, MP4Box will automatically REMOVE ALL TRACKS not complying with the MPEG-4 or 3GPP(2) specifications. If you want to keep such tracks, use the -keepall option.
Note on text import : When importing SRT or SUB files, MP4Box will choose default layout options to make the subtitle appear at the bottom of the video. You SHOULD NOT import such files before any video track is added to the destination file, otherwise the results will likely not be useful (default SRT/SUB importing uses default serif font, fontSize 18 and display size 400x60). For more details on 3GPP timed text, please go here.
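The inputFile%N form can be sketched as follows in Python. The helper add_source is hypothetical, the file names are placeholders, and treating N as a whole number of seconds is an assumption of this sketch.

```python
from typing import Optional


def add_source(input_file: str, seconds: Optional[int] = None) -> str:
    """Return the value for -add, optionally limited to the first N seconds."""
    if seconds is None:
        return input_file
    if seconds <= 0:
        raise ValueError("the imported duration must be positive")
    # inputFile%N: import only the first N seconds of the input
    return f"{input_file}%{seconds}"


# -new erases an existing destination file before importing
cmd = ["MP4Box", "-add", add_source("video.avi", 30), "-new", "dest.mp4"]
print(" ".join(cmd))
```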
There are several media-specific options which can be used when importing media. To know which options are supported for non-IsoMedia files, use the -info option for the desired media track, for example MP4Box -info 2 file.mpg.
-dref : MP4Box can import media data without copying it; this is called data referencing. The resulting file only contains the meta-data of the presentation (frame sizes, timing, etc...) and references media data in the original file. This is extremely useful when developing content, since importing and storage of the MP4 file is much faster and the resulting file much smaller. Use the -dref option to enable data referencing.
Note : Data referencing may fail on some files because it requires the framed data (e.g. an IsoMedia sample) to be contiguous in the original file, which is not always the case depending on the original interleaving or bitstream format.
-sbr : forces importing the AAC-ADTS file as AAC SBR (aka HE-AAC, aka aacPlus) with backward compatible signaling (i.e. non SBR aware decoders should play the file).
-sbrx : forces importing the AAC-ADTS file as AAC SBR (aka HE-AAC, aka aacPlus) with non-backward compatible signaling (i.e. non SBR aware decoders should NOT play the file).
Note : MP4Box cannot detect whether AAC input is regular or SBR AAC, so you must use one of these options if you want to import AAC SBR files.
-nodrop : Some AVI files may have non-coded frames (n-VOPs) introduced by the encoder. By default, MP4Box will discard these frames, hence producing a variable frame-rate visual stream. You can force MP4Box to keep constant frame-rate by specifying -nodrop while importing the AVI file.
-packed : When importing raw MPEG-4 Video, forces considering the bitstream as the dump of an AVI Packed Bitstream (removes all n-VOPs and imports as constant FPS).
-fps FrameRate : If possible, will override the original video frame rate. This option is also used when importing SUB text files to specify the SUB framerate. Framerate is a double-precision number.
-mpeg4 : This option forces MPEG-4 stream descriptions for formats having several description syntax available (QCELP, EVRC and SMV audio).
-agg N : Aggregates N audio frames in an IsoMedia sample. This option is only valid for some 3GP(2) audio formats (AMR, QCELP, EVRC and SMV audio). The maximum acceptable value is 15.
When importing several tracks/sources in one pass, these options are set for all imported streams and applied to each source where relevant. If you need to specify these options per stream, the syntax is:
MP4Box -add stream[:opt1:...:optN] dest.mp4
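The stream[:opt1:...:optN] form is simply the stream name followed by colon-separated options. A minimal Python sketch of building it is shown below; stream_spec is a hypothetical helper, and opt1 and opt2 are placeholders standing in for real import options, exactly as in the syntax line above.

```python
from typing import Iterable


def stream_spec(stream: str, opts: Iterable[str] = ()) -> str:
    """Join a source name and its per-stream options with ':' separators."""
    return ":".join([stream, *opts])


# Two sources, each carrying its own (placeholder) option list
cmd = ["MP4Box",
       "-add", stream_spec("video.avi", ["opt1"]),
       "-add", stream_spec("audio.aac", ["opt1", "opt2"]),
       "dest.mp4"]
print(" ".join(cmd))
```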
Note on OGG Support : MP4Box can import OGG files containing either Vorbis audio or Theora video. This feature is experimental and support for these media formats in IsoMedia files is NOT STANDARDIZED anywhere. This should only be used for development and R&D purposes, and you must be aware that files created this way may be unusable, even with future versions of GPAC.
MP4Box can split IsoMedia files by size, duration or extract a given part of the file to new IsoMedia file(s). This process requires that at most one track in the input file has non random-access points (typically one video track at most). This process will also ignore all MPEG-4 Systems tracks and hint tracks, but will try to split private media tracks.
-split time_in_seconds : splits the input file in a sequence of files lasting at most the specified time. Depending on random access distribution in the file (sync samples), the duration of the resulting files may be less than specified.
-splits size_in_kb : splits the input file in a sequence of files of maximum specified size. Depending on random access distribution in the file (sync samples), the size of the resulting files may be less than specified.
-splitx StartTime:EndTime : extracts a subfile from the input file. StartTime and EndTime are specified in seconds. Depending on random access distribution in the file (sync samples), the startTime will be adjusted to the previous random access time in the file.
-cat a_file : concatenates a_file to the input file (samples are added to existing tracks rather than to new tracks). The usage is the same as -add: you may use non-IsoMedia input files (for example, AVIs or MPEGs) and concatenate them directly into a new IsoMedia file. This process will remove all MPEG-4 systems tracks from the final file and make it compliant with ISMA or 3GP, just like the -add process. You can instruct MP4Box not to remove MPEG-4 systems tracks by specifying -keepsys.
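For the splitting options above, a little arithmetic helps when planning output. Since each file produced by -split lasts at most the requested time, the number of files is at least the total duration divided by that time, rounded up; the actual count can be higher because cuts land on sync samples. The Python sketch below shows this lower bound and the StartTime:EndTime form used by -splitx. The helper names are hypothetical.

```python
import math


def min_split_count(total_seconds: float, max_seconds: float) -> int:
    """Lower bound on the number of files produced by -split max_seconds."""
    if total_seconds <= 0 or max_seconds <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(total_seconds / max_seconds)


def splitx_argument(start: float, end: float) -> str:
    """Return the StartTime:EndTime value for -splitx (both in seconds)."""
    if not 0 <= start < end:
        raise ValueError("expected 0 <= StartTime < EndTime")
    return f"{start}:{end}"


print(min_split_count(600, 90))      # a 10 minute file cut in 90 s pieces
print(splitx_argument(10, 25.5))
```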
IsoMedia File Hinting consists in creating special tracks in the file that contain transport protocol specific information and optionally multiplexing information. These tracks are then used by the server to create the actual packets being sent over the network, in other words they provide the server 'hints' regarding packet building, hence their names: Hint Tracks.
MP4Box can generate these hint tracks for the RTP protocol (the most widely used protocol for multimedia streaming). The resulting file can then be streamed to clients with any streaming server understanding the IsoMedia file format and hint tracks, such as Apple's QTSS/DSS servers.
-hint : hints the given file for RTP/RTSP
-mtu size : specifies the desired maximum packet size, or MTU (Maximum Transmission Unit). This must be chosen carefully: specifying too large packets will result in undesired packet fragmentation at lower transport layers. The default size when hinting is 1450 bytes (including the 12-byte RTP header).
-multi [maxptime] : enables sample concatenation in a single RTP packet for payload formats supporting it. maxptime is an optional integer specifying the maximum packet duration in milliseconds, used for some audio payloads. Its default value is 100 ms.
-copy : forces hinted data to be copied to the hint track. This speeds up packet building at server side but takes much more space on disk.
-rate clock_rate : specifies the rtp clock rate in Hz when no default one exists for the given RTP payload. The default rate of most AV formats is 90000 Hz or the audio sample rate.
-mpeg4 : forces usage of MPEG-4 Generic Payload whenever possible.
-latm : forces usage of LATM payload for MPEG-4 AAC.
-static : enables usage of static RTP payload IDs (pre-defined IDs as specified in RTP). By default MP4Box always uses dynamic payload IDs, since some players do not recognize static ones.
-sdp_ex string : adds the given text to the movie SDP information (-sdp_ex "a=x-test: an sdp test") or to a track (-sdp_ex "N:a=x-test", where N is the hint track or its base track ID). This will take care of SDP line ordering. WARNING: You cannot add arbitrary content to SDP, please refer to RFC2327 for more info.
-unhint : removes all hint tracks and SDP information from the file. This can be useful since MP4Box doesn't remove any existing hint tracks when hinting the file.
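To give a feel for the -mtu value, the Python sketch below subtracts the 12-byte RTP header from the packet size and estimates how many packets a large sample would need. It is a simplification: it ignores any payload-format-specific headers, so real packet counts may differ. The function names are hypothetical.

```python
import math

RTP_HEADER_BYTES = 12    # fixed RTP header size, counted inside the MTU
DEFAULT_MTU = 1450       # default packet size used when hinting


def payload_budget(mtu: int = DEFAULT_MTU) -> int:
    """Bytes left for media data in one packet once the RTP header is removed."""
    if mtu <= RTP_HEADER_BYTES:
        raise ValueError("MTU must be larger than the RTP header")
    return mtu - RTP_HEADER_BYTES


def packets_for_sample(sample_bytes: int, mtu: int = DEFAULT_MTU) -> int:
    """Rough packet count for one sample (payload headers are ignored)."""
    return math.ceil(sample_bytes / payload_budget(mtu))


print(payload_budget())
print(packets_for_sample(5000))
```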
For advanced users, MP4Box can allow you to specify special options of the MPEG-4 Generic RTP payload format:
-ocr : forces all media tracks in the file to be served synchronized. This is needed because most streaming servers don't support desynchronized tracks in a single file. Be extremely careful when designing MPEG-4 interactive presentations for streaming, since you will have to take the streaming server capabilities into account. MP4Box generates warnings when the file timeline can be ambiguously interpreted by the server.
-iod : prevents ISMA-like IOD generation in SDP. MP4Box automatically detects ambiguous (ISMA/non-ISMA) files but nobody's perfect. This shouldn't be used with -isma option.
-rap : signals random access points in the payload.
-ts : signals AU timestamps in the payload. This option is automatically turned on when B-Frames (or similar) are detected in the media.
-size : signals AU size in the payload.
-idx : signals AU sequence number in the payload.
MP4Box always detects the best payload possible and, when none is found, falls back to the MPEG-4 Generic payload. The configuration of the MPEG-4 Generic payload is quite complex, so MP4Box always computes the most suitable configuration for you.
Q&As:
-info : prints some file information. File can be an IsoMedia file or any file supported by MP4Box for import.
-info TrackID : prints extended track information for IsoMedia files, and supported import flags for other files.
-std : dumps to stdout instead of file.
-diso : creates XML dump of the file structure.
-drtp : creates XML dump of all hint tracks samples of a hinted mp4 file.
-dcr : creates XML dump of all ISMACryp tracks.
-dts : dumps DTS (decoding timestamp) and CTS (composition timestamp) of all tracks, reporting found errors.
-sdp : creates SDP file associated with a hinted mp4 file.
-ttxt : converts input subtitle (SRT, SUB) to GPAC TTXT format.
-ttxt TrackID : dumps text track to TTXT XML format.
-srt : converts input subtitle (TTXT, SUB) to SRT format.
-srt TrackID : dumps text track to SRT format.
-raw TrackID : extracts track to its native format.
-raws TrackID : extracts each track sample to a file. To extract a single sample, use -raws TrackID:N
-avi TrackID : extracts visual track in avi format (MPEG-4 Visual and AVC/H264 supported).
-nhnt TrackID : extracts track in NHNT format.
-nhml TrackID : extracts track in NHML format.
-qcp TrackID : same as -raw but defaults to QCP file for EVRC/SMV.
-aviraw track : extracts avi track to its native format. track can be one of video, audio, audioN N being the number of the audio track.
-single TrackID : extracts track in a new MP4 with a single track.
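Two of the extraction options above take structured values: -raws accepts TrackID or TrackID:N, and -aviraw accepts video, audio or audioN. The Python sketch below builds and checks these values; the helper names and the file names in the example are illustrative and not part of MP4Box.

```python
import re
from typing import Optional


def raws_argument(track_id: int, sample: Optional[int] = None) -> str:
    """TrackID extracts every sample, TrackID:N extracts sample N only."""
    return str(track_id) if sample is None else f"{track_id}:{sample}"


def aviraw_argument(track: str) -> str:
    """Accept 'video', 'audio' or 'audioN' (N = number of the audio track)."""
    if not re.fullmatch(r"video|audio\d*", track):
        raise ValueError(f"unsupported -aviraw track: {track!r}")
    return track


print(["MP4Box", "-raws", raws_argument(2, 15), "file.mp4"])
print(["MP4Box", "-aviraw", aviraw_argument("audio2"), "file.avi"])
```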
-saf : remux input file to a SAF multiplex. This can also be used directly when encoding a LASeR content.
-mp4 : specifies input file is to be encoded. Supports .bt (BT), .xmt (XMT-A), .wrl (VRML97), .swf (Flash) and SVG/LASeR (.svg or .xsr) input. For more details on flash input, try MP4Box -h swf. For more details on BT/XMT-A, go here.
-def : encodes node and route names, rather than just binary identifiers. This is useful when developing content, otherwise the decoded scene quickly becomes messy.
-log : generates a log file for the BIFS encoder and for the LASeR encoder/decoder. The log is only useful to debug the scene codecs.
-ms : specifies the media source to check for track importing. This is needed when no MuxInfo is present in the BT file, although this is not recommended. By default, MP4Box looks for tracks in MYFILE.mp4 when encoding MYFILE.bt
-bt : dumps scene in a BT file.
-xmt : dumps scene in an XMT-A file.
-wrl : dumps scene into VRML97 format - unknown/incompatible nodes are removed.
-x3d : dumps scene into X3D/XML format - unknown/incompatible nodes are removed.
-x3dv : dumps scene into X3D/text format - unknown/incompatible nodes are removed.
-lsr : dumps scene in a LASeR+XML file.
-svg : dumps LASeR scene root node to an SVG file.
Note : conversion from VRML-based scene graphs to/from SVG-based scene graphs is not supported.
-resolution res : specifies the resolution to use when encoding points. Value ranges from -8 to 7, and all coordinates are multiplied by 2^res. The default resolution used is 0.
-coord-bits bits : Number of bits used to encode a point coordinate. Default value is 12 bits.
-scale-bits bits : Number of extra bits used to encode a scale factor (scale factors are therefore encoded on coord_bits+scale_bits). Default value is 0 bits.
-auto-quant res : resolution is given as if using -resolution but coord-bits and scale-bits are computed dynamically. The default resolution used is 0.
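The arithmetic behind -resolution, -coord-bits and -scale-bits can be sketched in a few lines of Python. This only mirrors the description above (coordinates multiplied by 2^res, scale factors stored on coord_bits+scale_bits); it is not the encoder's actual code, and the function names are hypothetical.

```python
def scale_coordinate(value: float, res: int = 0) -> float:
    """Multiply a coordinate by 2^res, with res limited to the range -8..7."""
    if not -8 <= res <= 7:
        raise ValueError("resolution must be between -8 and 7")
    return value * 2.0 ** res


def scale_factor_bits(coord_bits: int = 12, scale_bits: int = 0) -> int:
    """Number of bits on which a scale factor is encoded."""
    return coord_bits + scale_bits


print(scale_coordinate(1.5, 3))      # finer resolution
print(scale_coordinate(100, -2))     # coarser resolution
print(scale_factor_bits(12, 4))
```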
-carousel time : inserts random access points at the desired frequency, specified in milliseconds. This cannot be used with the -sync or -shadow option.
-shadow time : inserts random access points at the desired frequency, specified in milliseconds. This cannot be used with the -sync or -carousel option. The difference with -carousel is that random access samples can only be inserted as a substitution to existing samples, therefore their frequency is not guaranteed.
-sync time : forces sync sample at the desired frequency by replacing the original sample. Time is specified in milliseconds. This cannot be used with the -shadow or -carousel option.
-crypt drm_file : encrypts the IsoMedia file according to the rules specified in drm_file.
-decrypt drm_file : decrypts IsoMedia file. drm_file is optional if the keys are stored within the file.
-set-kms [trackID=]kms_uri : changes the URI of the key management system for the specified track, or for all tracks in the file if no trackID is given.
-set-meta args : assign the given type to the meta container (similar to file branding). Arguments syntax is ABCD[:tk=ID] where:
-add-item args : adds a file resource to the meta container. Arguments syntax is file_path + options (':' separated) with the following options:
Note : a file_path of this or self means the item is the containing file itself.
-rem-item args : removes the given resource from the meta container. Arguments syntax is item_ID[:tk=ID].
-set-primary args : sets the given item as primary item for the meta container. A primary item is the item used when no XML information is available in the meta container. Arguments syntax is item_ID[:tk=ID].
-set-xml args : sets XML data of the meta container. Arguments syntax is xml_file_path[:tk=ID][:binary], where binary specifies that the XML is not in plain text.
-rem-xml [tk=ID] : removes XML data from the meta container.
-dump-xml args : dumps XML data of the meta container to a file. Arguments syntax is output_file_path[:tk=ID].
-dump-item args : dumps the given item to a file. Arguments syntax is item_ID[:tk=ID][:path=fileName], where path is the output file name.
-package : packages the input XML file into an ISO container. All local media referenced (except hyperlinks) are added to the file (only 'href' and 'url' attributes are currently processed). THIS IS AN EXPERIMENTAL FEATURE, NOT FULLY TESTED.
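The meta options above share a colon-separated argument style. As an example, the item_ID[:tk=ID][:path=fileName] form used by -dump-item could be assembled as in this Python sketch; the helper name and the file names are made up for the illustration.

```python
from typing import Optional


def dump_item_argument(item_id: int, track_id: Optional[int] = None,
                       path: Optional[str] = None) -> str:
    """Build item_ID[:tk=ID][:path=fileName] for -dump-item."""
    parts = [str(item_id)]
    if track_id is not None:
        parts.append(f"tk={track_id}")    # optional tk=ID part
    if path is not None:
        parts.append(f"path={path}")      # output file name
    return ":".join(parts)


cmd = ["MP4Box", "-dump-item", dump_item_argument(2, path="cover.jpg"), "file.mp4"]
print(" ".join(cmd))
```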
-nodes : prints list of MPEG-4 nodes supported in this MP4Box build.
-node NodeName : prints MPEG-4 node syntax: fields, their type, event type, default value and quantization info if any. Note that this works only for nodes supported in the current build.
-xnodes : prints list of X3D nodes supported in this MP4Box build.
-xnode NodeName : prints X3D node syntax: fields, their type, event type and default value. Note that this works only for nodes supported in the current build.
-snode NodeName : prints possible attributes and properties of the SVG node. Note that this works only for nodes supported in the current build.
-languages : prints list of supported languages and their ISO 639 associated codes.