xvid API

From www.xvid.org

Encode:
+--------------------------------------------------------------------+
      Short explanation for the XviD data structures and routines

  The encoding part

       If you have further questions, visit http://www.xvid.org
+--------------------------------------------------------------------+

Document version :
$Id: xvid-encoder.txt,v 1.3 2002/06/27 14:49:05 edgomez Exp $

+--------------------------------------------------------------------+
| Abstract
+--------------------------------------------------------------------+

This document presents the basic structures and API of XviD. It tries
to explain how to use them to obtain a simple profile compliant MPEG4
stream by feeding the encoder with a sequence of frames.

+-------------------------------------------------------------------+
| Document
+-------------------------------------------------------------------+

 

     Chapter 1 : The XviD version
+-----------------------------------------------------------------+

The XviD version is defined at library compilation time using the
constant defined in xvid.h:

#define API_VERSION ((2 << 16) | (1))

Where 2 stands for the major XviD version, and 1 for the minor version
number.
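
The packing described above can be sketched as follows. This is only
an illustration: api_major, api_minor and print_api_version are
hypothetical helper names, not functions from xvid.h, and the layout
(major in the upper 16 bits, minor in the lower 16 bits) is read off
the #define shown above.

```c
#include <stdio.h>

/* Copied from the text above; real code gets this from xvid.h. */
#define API_VERSION ((2 << 16) | (1))

/* Hypothetical helpers (not part of xvid.h): split a packed version. */
static int api_major(int version) { return (version >> 16) & 0xffff; }
static int api_minor(int version) { return version & 0xffff; }

/* Print a packed version number as "major.minor". */
static void print_api_version(int version)
{
    printf("XviD API %d.%d\n", api_major(version), api_minor(version));
}
```

Calling print_api_version(API_VERSION) with the value above prints the
major and minor parts separated by a dot.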

The current version of the API is 2.1 and should be incremented each
time a user defined structure is modified (XVID_INIT_PARAM,
XVID_ENC_PARAM ... we will discuss them later).

When you're writing a program/library which uses the XviD library, you
must  check  your  XviD  API  version against  the  available  library
version.  We will  see how  to check  the version  number in  the next
chapter.

 

   Chapter 2 : The XVID_INIT_PARAM
+-----------------------------------------------------------------+


typedef struct
{
int cpu_flags; [in/out]
int api_version; [out]
int core_build; [out]
} XVID_INIT_PARAM;

Used in:  xvid_init(NULL, 0, &xinit, NULL);

This structure is used and filled by the xvid_init function depending
on the cpu_flags value.

List of valid flags for the cpu_flags member :

- XVID_CPU_MMX      : cpu feature
- XVID_CPU_MMXEXT   : cpu feature
- XVID_CPU_SSE      : cpu feature
- XVID_CPU_SSE2     : cpu feature
- XVID_CPU_3DNOW    : cpu feature
- XVID_CPU_3DNOWEXT : cpu feature
- XVID_CPU_TSC      : cpu feature
- XVID_CPU_IA64     : cpu feature
- XVID_CPU_CHKONLY  : command
- XVID_CPU_FORCE    : command

In order to set a flag : xinit.cpu_flags |= desired_flag_constant;

1st case : you call xvid_init without setting the XVID_CPU_CHKONLY or
the XVID_CPU_FORCE flag. The xvid_init function automatically detects
the host cpu features and fills the cpu_flags member. The xvid_init
function also initializes all internal function pointers according to
the detected features and then returns XVID_ERR_OK.

2nd case :  you call xvid_init setting the  XVID_CPU_CHKONLY flag, the
xvid_init function will  just detect the host cpu  features and return
XVID_ERR_OK without  initializing the internal  function pointers (NB:
The XviD library is not usable after such a call to xvid_init).

3rd case  : you call  xvid_init with the cpu_flags  XVID_CPU_FORCE and
desired feature  flags set up  (eg : XVID_CPU_SSE |  XVID_CPU_MMX). In
this case you  force XviD to use the given cpu  features passed in the
cpu_flags member. Use this if you know what you're doing.
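
The three cases above can be summed up in a small sketch. Everything
in it is an assumption made for illustration: the bit values are
placeholders (the real ones come from xvid.h), classify_cpu_flags and
init_mode are hypothetical names, and the text does not say what
happens when both command flags are set, so the order of the tests
below is arbitrary.

```c
/* Placeholder bit values for illustration only; real code must take
 * these constants from xvid.h. */
enum {
    XVID_CPU_MMX     = 1 << 0,
    XVID_CPU_SSE     = 1 << 2,
    XVID_CPU_CHKONLY = 1 << 29,
    XVID_CPU_FORCE   = 1 << 30
};

/* Hypothetical names for the three cases described in the text. */
typedef enum { INIT_AUTODETECT, INIT_CHECK_ONLY, INIT_FORCED } init_mode;

/* Tell which case a given cpu_flags value selects. */
static init_mode classify_cpu_flags(int cpu_flags)
{
    if (cpu_flags & XVID_CPU_CHKONLY)
        return INIT_CHECK_ONLY;   /* 2nd case: detect only, library unusable */
    if (cpu_flags & XVID_CPU_FORCE)
        return INIT_FORCED;       /* 3rd case: use the features passed in */
    return INIT_AUTODETECT;       /* 1st case: detect and initialize */
}
```

For example, XVID_CPU_FORCE | XVID_CPU_SSE | XVID_CPU_MMX falls into
the 3rd case, while a cpu_flags of 0 falls into the 1st.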

NB for PowerPC archs : the ppc arch has no automatic detection, so the
library must be compiled for a specific ppc target using the right
Makefile (the cpu_flags member is irrelevant for these archs). Use
Makefile.linuxppc for standard ppc optimized functions and
Makefile.linuxppc_altivec for altivec simd optimized functions.

NB for IA64 archs : there are optimized ia64 assembly functions provided
in the library, but they must be forced using the
XVID_CPU_FORCE|XVID_CPU_IA64 pair of flags.

To check the XviD library version against your own XviD header file,
you just have to call the xvid_init function (no matter the cpu_flags)
and compare the returned xinit.api_version integer with your
API_VERSION number. The core_build member is not relevant at the
moment but is reserved for future use (once XviD has reached a
certain stability in its API and releases).
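
A minimal sketch of that comparison, assuming xinit has already been
filled by xvid_init(NULL, 0, &xinit, NULL). The structure and the
API_VERSION value are copied from this document so that the fragment
compiles on its own; api_version_matches is a hypothetical helper
name, and a strict equality test is only one possible policy.

```c
/* Copied from this document; real code gets both from xvid.h. */
#define API_VERSION ((2 << 16) | (1))

typedef struct
{
    int cpu_flags;
    int api_version;
    int core_build;
} XVID_INIT_PARAM;

/* Hypothetical helper: returns 1 when the version reported by the
 * library equals the one of the header we compiled against. */
static int api_version_matches(const XVID_INIT_PARAM *xinit)
{
    return xinit->api_version == API_VERSION;
}
```

A program would typically refuse to continue when this returns 0,
since the structures it passes may not have the layout the library
expects.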

 

Chapter 3 : XVID_ENC_PARAM structure
+-----------------------------------------------------------------+


typedef struct
{
int width, height; [in]
int fincr, fbase; [in]
int rc_bitrate; [in]
int rc_reaction_delay_factor; [in]
int rc_averaging_period; [in]
int rc_buffer; [in]
int max_quantizer; [in]
int min_quantizer; [in]
int max_key_interval; [in]

void *handle; [out]
}
XVID_ENC_PARAM;

Used in:    xerr = xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL);

This structure has to be filled to create a new encoding instance:

- width and height.

They have to be set to the size of the image to be encoded.

- fincr and fbase (<0 forces default value 25fps - [25,1]).

They are the MPEG way of defining the framerate. If you have an
integer framerate, say 24, 25 or 30fps, use fincr=1, fbase=framerate.
However, if the framerate is non-integer, like 23.996fps, you can
e.g. multiply by 1000, getting fincr=1000 and fbase=23996, giving
you integer values again.
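
The multiply-by-1000 trick can be sketched like this. fps_to_fraction
and gcd_int are hypothetical helpers, not XviD functions. Reducing the
fraction afterwards is an addition of this sketch: it keeps the same
fbase/fincr ratio and makes integer framerates come out as fincr=1.
The framerate is assumed to be positive.

```c
/* Hypothetical helper: greatest common divisor (Euclid's algorithm). */
static int gcd_int(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper: turn a positive framerate into an fincr/fbase
 * pair by scaling with 1000, then reducing the fraction. */
static void fps_to_fraction(double fps, int *fincr, int *fbase)
{
    int base = (int)(fps * 1000.0 + 0.5);   /* round to nearest integer */
    int incr = 1000;
    int g = gcd_int(base, incr);

    *fbase = base / g;
    *fincr = incr / g;
}
```

With 25.0 this gives fincr=1, fbase=25. With 23.996 the unreduced pair
is 1000/23996, which reduces to fincr=250, fbase=5999.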

- rc_bitrate (<0 forces default value : 900000).

This is the desired target bitrate. XviD will try to do its best to
respect this setting but keep in mind XviD is still in development and
it has not been tuned for very low bitrates.

- All other rc_xxxx parameters are for the bit rate controller, which
   uses them to respect your rc_bitrate setting as well as it can
   (<0 forces default values).

The defaults are good enough and you should not change them.

ToDo :  describe briefly their impact  on the bit  rate variations and
the rc_bitrate setting respect.

- min_quantizer and max_quantizer (<0 forces default values : 1,31).

These 2 members limit the range of allowed quantizers. Normally,
the quantizer range is [1..31], so min=1 and max=31.

NB : the HIGHER the quantizer, the LOWER the quality.
     the HIGHER the quantizer, the HIGHER the compression ratio. 

min_quant=1 is somewhat overkill; min_quant=2 is good enough.
max_quant depends on what you encode: leave it at 31, or lower it to
something like 15 or 10 for better quality (but encoding at a very low
bitrate might fail then).

- max_key_interval (<0 forces default value : 10*framerate == 10s)

This is the maximum number of frames between two keyframes
(I-frames). Keyframes are also inserted dynamically at scene breaks.
It is important to have some keyframes, even in longer scenes, if you
want to seek in the resulting file, because seeking is only
possible from one keyframe to the next. However, keyframes are much
larger than non-keyframes, so do not use too many of them. A value of
framerate*10 is normally a good choice.

- handle

This is the returned internal encoder instance.
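
Putting the advice of this chapter together, a structure could be
filled as sketched below. The structure layout is copied from the
listing above so that the fragment compiles on its own;
enc_param_init is a hypothetical helper, and it assumes an integer
framerate. Negative values are used wherever the text says the
library falls back to its defaults.

```c
#include <stddef.h>

/* Layout copied from the listing above; real code uses xvid.h. */
typedef struct
{
    int width, height;
    int fincr, fbase;
    int rc_bitrate;
    int rc_reaction_delay_factor;
    int rc_averaging_period;
    int rc_buffer;
    int max_quantizer;
    int min_quantizer;
    int max_key_interval;
    void *handle;
} XVID_ENC_PARAM;

/* Hypothetical helper: fill the structure for an integer framerate. */
static void enc_param_init(XVID_ENC_PARAM *p, int width, int height, int fps)
{
    p->width  = width;
    p->height = height;
    p->fincr  = 1;                      /* integer framerate: fincr=1 */
    p->fbase  = fps;
    p->rc_bitrate               = -1;   /* <0: library default */
    p->rc_reaction_delay_factor = -1;   /* <0: library default */
    p->rc_averaging_period      = -1;   /* <0: library default */
    p->rc_buffer                = -1;   /* <0: library default */
    p->min_quantizer    = 2;            /* 1 is described as overkill */
    p->max_quantizer    = 31;
    p->max_key_interval = 10 * fps;     /* about 10 seconds */
    p->handle = NULL;                   /* set by the library on create */
}
```

The filled structure is then what gets passed as param1 in
xvid_encore(NULL, XVID_ENC_CREATE, &xparam, NULL).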

 

      Chapter 4 : the XVID_ENC_FRAME structure.
+-----------------------------------------------------------------+

typedef struct
{
int general; [in]
int motion; [in]
void *bitstream; [in]
int length; [out]

void *image; [in]
int colorspace; [in]

unsigned char *quant_intra_matrix;  [in]
unsigned char *quant_inter_matrix;  [in]
int quant;     [in]
int intra;     [in/out]

HINTINFO hint;     [in/out]
}
XVID_ENC_FRAME;

Used in:
  xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);

This is the main structure used to encode a frame. It gives hints to
the encoder on how to process an image.

- general flag member.

The general flag member informs XviD on general algorithm choices made
by the library client.

Valid flags :

    - XVID_CUSTOM_QMATRIX  :  informs  xvid  to use  the  custom  user
      matrices.

    - XVID_H263QUANT   :  informs  xvid   to  use   H263  quantization
      algorithm.

    - XVID_MPEGQUANT   :  informs  xvid   to  use   MPEG  quantization
      algorithm.

    - XVID_HALFPEL  : informs  xvid  to perform  a  half pixel  motion
      estimation.

    - XVID_ADAPTIVEQUANT : informs xvid to perform an adaptive
      quantization.

    - XVID_LUMIMASKING : informs xvid to use a lumimasking algorithm.

    - XVID_LATEINTRA : ???

    - XVID_INTERLACING  : informs  xvid  to use  the MPEG4  interlaced
      mode.

    - XVID_TOPFIELDFIRST : ???

    - XVID_ALTERNATESCAN : ???

    - XVID_HINTEDME_GET  : informs  xvid to  return  Motion Estimation
      vectors from the ME encoder algorithm. Used during a first pass.

    - XVID_HINTEDME_SET :  informs xvid to  use the user  given motion
      estimation vectors as hints  for the encoder ME algorithms. Used
      during a 2nd pass.

    - XVID_INTER4V : forces XviD to search a vector for each 8x8 block
      within the 16x16  Macro Block. This mode should  be used only if
      the  XVID_HALFPEL mode is  activated (this  could change  in the
      future).

    - XVID_ME_ZERO : forces XviD to use the zero ME algorithm.

    - XVID_ME_LOGARITHMIC  :  forces   XviD  to  use  the  logarithmic
      ME algorithm.

    - XVID_ME_FULLSEARCH  : forces  XviD  to use  the  full search  ME
      algorithm.

    - XVID_ME_PMVFAST : forces XviD to use the PMVFAST ME algorithm.

    - XVID_ME_EPZS : forces XviD to use the EPZS ME algorithm.

ToDo :  fill the void entries  in flags, and describe  briefly each ME
algorithm.

- motion member.

Valid flags for  16x16 motion estimation (no XVID_INTER4V  flag in the
general flag).

    - PMV_ADVANCEDDIAMOND16 : XviD has a modified diamond algorithm
      that performs a bit faster than the original one. Use this flag
      if you want to use the speed optimized diamond search. The
      quality loss is not big (better quality than square search but
      less than the normal diamond search).

    - PMV_HALFPELDIAMOND16 : switches the search algorithm from 1 or 2
      full pixels precision to 1 or 2 half pixel precision.

    - PMV_HALFPELREFINE16  :  After normal  diamond  search, an  extra
      halfpel refinement step is  performed.  Should always be used if
      XVID_HALFPEL is  on, because it  gives a rather big  increase in
      quality.

    - PMV_EXTSEARCH16 :  Normal PMVfast predicts one  start vector and
      does diamond search around this position. EXTSEARCH means that 2
      more  start vectors  are used:  (0,0) and  median  predictor and
      diamond search  is done for  those, too.  Makes  search slightly
      slower, but quality sometimes gets better.

    - PMV_EARLYSTOP16 : PMVfast and EPZS stop the search if the current
      best is below some dynamic threshold. No diamond search is done,
      only halfpel refinement (if active). Without EARLYSTOP, diamond
      search is always done. That would be much slower, but would not
      really lead to better quality.

    - PMV_QUICKSTOP16   :  like  EARLYSTOP,   but  not   even  halfpel
      refinement is  done. Normally worse  quality, so it  defaults to
      off. Might be removed, too.

    - PMV_UNRESTRICTED16   :  "unrestricted  ME"   is  a   feature  of
      MPEG4. It's not  implemented, so this flag is  ignored (not even
      checked).

    - PMV_OVERLAPPING16 :  same as unrestricted.  Not implemented, nor
      checked.

    - PMV_USESQUARES16  : Replace  the  diamond search  with a  square
      search.


Valid flags when using the 4 vectors prediction mode. They have the
same meaning as their 16x16 counterparts, so we only give the list :

    - PMV_ADVANCEDDIAMOND8
    - PMV_HALFPELDIAMOND8
    - PMV_HALFPELREFINE8
    - PMV_EXTSEARCH8
    - PMV_EARLYSTOP8
    - PMV_QUICKSTOP8
    - PMV_UNRESTRICTED8
    - PMV_OVERLAPPING8
    - PMV_USESQUARES8
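
Two of the rules given above (XVID_INTER4V only together with
XVID_HALFPEL, and halfpel refinement whenever XVID_HALFPEL is on) can
be enforced by a small helper. This is a sketch under assumptions: the
bit values are placeholders (the real ones come from xvid.h),
reconcile_flags is a hypothetical name, and applying the refinement
rule to PMV_HALFPELREFINE8 is inferred from the remark that the 8x8
flags mean the same as their 16x16 counterparts.

```c
/* Placeholder bit values for illustration only; real code must take
 * these constants from xvid.h. */
enum { XVID_HALFPEL = 1 << 0, XVID_INTER4V = 1 << 1 };
enum { PMV_HALFPELREFINE16 = 1 << 0, PMV_HALFPELREFINE8 = 1 << 1 };

/* Hypothetical helper: make general and motion flags consistent. */
static void reconcile_flags(int *general, int *motion)
{
    if (*general & XVID_HALFPEL) {
        /* Refinement is recommended whenever halfpel ME is on. */
        *motion |= PMV_HALFPELREFINE16;
        if (*general & XVID_INTER4V)
            *motion |= PMV_HALFPELREFINE8;
    } else {
        /* 4 vectors mode should only be used with halfpel ME. */
        *general &= ~XVID_INTER4V;
    }
}
```

The helper only touches the flags named here; every other bit in
general and motion is left as the caller set it.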

- quant member.

The quantizer is the value by which the DCT coefficients are divided,
which zeroes the less important coefficients (important according to
the target bitrate, not the image quality :-).

Valid values :

     - 0 (zero) : the rate controller chooses the right quantizer
       for you. Typically used in ABR encoding or during the first pass
       of a VBR encoding session.

     - != 0 : you force the encoder to use this specific
       quantizer value. It is clamped to the interval
       [1..31]. Typically used during the 2nd pass of a VBR encoding
       session.

- intra member.

[in usage]
The intra value decides whether the frame is going to be a keyframe
or not.

Valid values :

    - 1 : forces the encoder  to create a keyframe. Mainly used during
      a VBR 2nd pass.

    - 0 : forces the encoder not to create a keyframe. Mainly used
      during a VBR 2nd pass.

    - -1 : lets the encoder decide (based on contents and
       max_key_interval). Mainly used in ABR mode and during a 1st
       VBR pass.

[out usage]

When first set to -1, the encoder returns the effective keyframe state
of the frame.

    - 0 : the resulting frame is not a keyframe

    - 1 : the resulting frame is a keyframe (scene change).

    - 2 : the resulting frame is a keyframe (max_key_interval
      reached).
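
How the quant and intra members might be set for each pass, and how
the returned intra value can be read, is sketched below.
frame_decision is a hypothetical reduced structure holding only these
two members (real code sets them in XVID_ENC_FRAME), and the three
functions are illustrative helpers, not part of XviD.

```c
/* Hypothetical reduced structure: only the two members discussed. */
typedef struct
{
    int quant;
    int intra;
} frame_decision;

/* 1st pass (or ABR): leave both choices to the encoder. */
static void first_pass_decision(frame_decision *d)
{
    d->quant = 0;    /* rate controller chooses the quantizer */
    d->intra = -1;   /* encoder decides about keyframes */
}

/* 2nd pass: force the values chosen by the application. */
static void second_pass_decision(frame_decision *d, int quant, int keyframe)
{
    if (quant < 1)  quant = 1;    /* same [1..31] range as the encoder */
    if (quant > 31) quant = 31;
    d->quant = quant;
    d->intra = keyframe ? 1 : 0;
}

/* Describe the intra value returned after encoding with intra = -1. */
static const char *describe_intra_result(int intra)
{
    switch (intra) {
    case 0:  return "not a keyframe";
    case 1:  return "keyframe (scene change)";
    case 2:  return "keyframe (max_key_interval reached)";
    default: return "unknown";
    }
}
```

Note that this sketch clamps on the application side as well, so a
requested quantizer of 0 becomes 1 in second_pass_decision instead of
handing control back to the rate controller.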

- quant_intra_matrix and quant_inter_matrix members.

These are pointers to a pair of user quantization matrices. You
must set the general XVID_CUSTOM_QMATRIX flag to make sure XviD uses
them.

When set to NULL, the default XviD matrices are used.

NB : each time the matrices change, XviD must write a header into the
bitstream, so try not to change these matrices very often. This will
save space.

 

       Chapter 5 : The XVID_ENC_STATS structure
+-----------------------------------------------------------------+


typedef struct
{
int quant; // [out] frame quantizer
int hlength; // [out] header length (bytes)
int kblks, mblks, ublks; // [out]

} XVID_ENC_STATS;

Used in:
  xerr = xvid_encore(enchandle, XVID_ENC_ENCODE, &xframe, &xstats);

In this structure the encoder returns statistical data about the
encoding process, e.g. to be saved for two-pass encoding. quant is
the quantizer chosen for this frame (if you let ratecontrol do it).
hlength is the length of the frame's header, including motion
information etc. kblks, mblks, ublks are unused at the moment.
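
Saving these values for a second pass could look like the sketch
below. The structure is copied from the listing above; the log line
format and the name format_pass1_line are inventions of this sketch
(XviD does not define a log format here), and the intra and length
arguments stand for the values returned in XVID_ENC_FRAME.

```c
#include <stdio.h>

/* Layout copied from the listing above; real code uses xvid.h. */
typedef struct
{
    int quant;
    int hlength;
    int kblks, mblks, ublks;
} XVID_ENC_STATS;

/* Hypothetical helper: format one first-pass log line as
 * "frame intra quant length hlength". Returns what snprintf returns. */
static int format_pass1_line(char *buf, size_t size, int frame_no,
                             int intra, int length,
                             const XVID_ENC_STATS *stats)
{
    return snprintf(buf, size, "%d %d %d %d %d",
                    frame_no, intra, stats->quant, length, stats->hlength);
}
```

The resulting string can be written to a file with fputs, one line per
encoded frame, and parsed again before the 2nd pass.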

 

Chapter 6 : The xvid_encore function
+-----------------------------------------------------------------+


int xvid_encore(void * handle,
int opt,
void * param1,
void * param2);


XviD uses a single-function API, so  everything you want to do is done
by  this routine.  The  opt  parameter chooses  the  behaviour of  the
routine:

XVID_ENC_CREATE: create a new encoder, XVID_ENC_PARAM in param1, a
handle to the new encoder is returned in its handle member.

XVID_ENC_ENCODE: encode one frame, XVID_ENC_FRAME-structure in param1,
XVID_ENC_STATS  in param2  (or  NULL,  if you  are  not interested  in
statistical data).

XVID_ENC_DESTROY: shut down this encoder, do not use handle afterwards.
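
The single-function style and the create / encode / destroy order can
be illustrated with a mock. To be clear, demo_encore below is NOT the
library: it is a stand-in with the same four-argument shape that
encodes nothing, the demo_* structures are reduced hypothetical
versions of the real ones, and the numeric values of the constants
(including XVID_ERR_FAIL, which this document does not mention) are
placeholders for whatever xvid.h defines.

```c
#include <stdlib.h>

/* Placeholder values for illustration; real ones come from xvid.h. */
enum { XVID_ERR_FAIL = -1, XVID_ERR_OK = 0 };
enum { XVID_ENC_ENCODE = 0, XVID_ENC_CREATE = 1, XVID_ENC_DESTROY = 2 };

/* Reduced, hypothetical stand-ins for the real structures. */
typedef struct { int width, height; void *handle; } demo_enc_param;
typedef struct { int length; } demo_enc_frame;
typedef struct { int frames_encoded; } demo_encoder;

/* Mock with the same shape as xvid_encore; opt selects the action. */
static int demo_encore(void *handle, int opt, void *param1, void *param2)
{
    (void)param2;   /* statistics are not produced by this mock */

    switch (opt) {
    case XVID_ENC_CREATE: {
        demo_enc_param *p = param1;
        demo_encoder *enc = calloc(1, sizeof *enc);
        if (p == NULL || enc == NULL) {
            free(enc);
            return XVID_ERR_FAIL;
        }
        p->handle = enc;            /* handle comes back in the struct */
        return XVID_ERR_OK;
    }
    case XVID_ENC_ENCODE: {
        demo_encoder *enc = handle;
        demo_enc_frame *f = param1;
        if (enc == NULL || f == NULL)
            return XVID_ERR_FAIL;
        enc->frames_encoded++;
        f->length = 0;              /* a real encoder reports a size here */
        return XVID_ERR_OK;
    }
    case XVID_ENC_DESTROY:
        free(handle);               /* handle must not be used afterwards */
        return XVID_ERR_OK;
    default:
        return XVID_ERR_FAIL;
    }
}
```

A caller creates the encoder once with a NULL handle, then passes the
handle found in the parameter structure to every encode call, and
finally destroys it.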

 

Decode:
XviD core API overview: Decoding
+-----------------------------------------------------------------+
* Short explanation for the XviD data structures and routines
*
*                       decoding part
*
* if you have further questions, visit http://www.xvid.org
*
+-----------------------------------------------------------------+

/* these are structures/routines from xvid.h needed for decoding */

+-----------------------------------------------------------------+

#define API_VERSION ((1 << 16) | (0))

This is the revision of the xvid.h file that you have in front of you.
Check it against the library's version.
+-----------------------------------------------------------------+
typedef struct
{
int cpu_flags; [in/out]
int api_version; [out]
int core_build; [out]
} XVID_INIT_PARAM;

This is filled by xvid_init with the correct CPU flags for initialization
(auto-detect), unless you pass flags to it (cpu_flags!=0). Do not do that
unless you really know what you are doing.
api_version can (should) be checked against API_VERSION, to see if you
have the right core library.

Used in:  xvid_init(NULL, 0, &xinit, NULL);

+-----------------------------------------------------------------+

typedef struct
{
int width; [in] (should be a multiple of 16, max is )
int height; [in]    (should be a multiple of 16, max is )
void *handle; [out]
} XVID_DEC_PARAM;

When creating a decoder, you have to provide it with the height and width
of the image to decode (this is _not_ in the bitstream itself!).
In handle a unique handle is given back, which has to be used to identify
this instance of decoding.

Used in:  xerr = xvid_decore(NULL, XVID_DEC_CREATE, &xparam, NULL);
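
Since width and height should be multiples of 16, a caller may want to
check them before creating the decoder. The helpers below are
hypothetical (not part of xvid.h), and rejecting non-positive sizes is
an extra assumption of this sketch.

```c
/* Hypothetical helper: 1 if n is a positive multiple of 16. */
static int is_multiple_of_16(int n)
{
    return n > 0 && (n % 16) == 0;
}

/* Hypothetical helper: 1 if both dimensions follow the advice above. */
static int dec_size_ok(int width, int height)
{
    return is_multiple_of_16(width) && is_multiple_of_16(height);
}
```

For instance 352x288 passes this check, while 350x288 does not.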

+-----------------------------------------------------------------+

typedef struct
{
void * bitstream; [in]
int length; [in]

void * image; [in]
int stride; [in]
int colorspace; [in]
} XVID_DEC_FRAME;

This is the main structure for decoding itself. You provide the
MPEG4 bitstream and its length.
image is the position where the decoded picture should be stored.
stride is the difference between the memory address of the first pixel of
a row in the image and the first pixel of the next row. If the image is
going to be one big block, then stride=width, but by making it larger you
can create an "edged" picture.
colorspace gives the output format for the image; XVID_CSP_RGB24 or
XVID_CSP_YV12 might be the most common.
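
The role of stride can be shown with a little address arithmetic. The
sketch assumes a single plane with one byte per sample and a stride
counted in samples, which matches the stride=width remark above; how
stride applies to packed formats is not covered. plane_index and
plane_size are hypothetical helper names.

```c
#include <stddef.h>

/* Hypothetical helper: index of sample (x, y) in a plane whose rows
 * start stride samples apart. */
static size_t plane_index(int x, int y, int stride)
{
    return (size_t)y * (size_t)stride + (size_t)x;
}

/* Hypothetical helper: number of samples to allocate for the plane. */
static size_t plane_size(int height, int stride)
{
    return (size_t)height * (size_t)stride;
}
```

With width 352 and stride 384, each row is followed by 32 unused
samples, which is the "edged" layout mentioned above; sample (3, 2)
then sits at index 2*384 + 3.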

A special case is XVID_CSP_USER. If you use this, then *image will not be
filled with the image but with a structure that contains pointers to the
decoder's internal representation of it. That's faster, because no memcopy
is involved, but don't use it if you don't know what you're doing.

Used in:   xerr = xvid_decore(dechandle, XVID_DEC_DECODE, &xframe, NULL);

+-----------------------------------------------------------------+

int xvid_decore(void * handle, [in/out]
int opt, [in]
void * param1, [in]
void * param2); [in]


XviD uses a single-function API, so everything you want to do is done by
this routine. The opt parameter chooses the behaviour of the routine:

XVID_DEC_CREATE:   create a new decoder, XVID_DEC_PARAM in param1,
   a handle to the new decoder is returned in handle

XVID_DEC_DECODE:   decode one frame, XVID_DEC_FRAME-structure in param1

XVID_DEC_DESTROY:  shut down this decoder, do not use handle afterwards
 
