There are three types of pictures (or frames) used in video compression: I-frames, P-frames, and B-frames. An I-frame is an 'Intra-coded picture', in effect a fully-specified picture, like a conventional static image file. P-frames and B-frames hold only part of the image information, so they need less space to store than an I-frame, and thus improve video compression rates.
A P-frame ('Predicted picture') holds only the changes in the image from the previous frame. For example, in a scene where a car moves across a stationary background, only the car's movements need to be encoded. The encoder does not need to store the unchanging background pixels in the P-frame, thus saving space. P-frames are also known as delta-frames. A B-frame ('Bi-predictive picture') saves even more space by using differences between the current frame and both the preceding and following frames to specify its content.
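The delta idea behind a P-frame can be illustrated with a deliberately simplified sketch. It stores only the pixels that differ from the previous frame. Real codecs use motion-compensated prediction plus a coded residual rather than a per-pixel difference, and the function names `encode_delta` and `decode_delta` are hypothetical, chosen only for this illustration.

```python
def encode_delta(previous, current):
    """Return the pixels of `current` that differ from `previous` as {index: value}."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same size")
    return {i: cur for i, (prev, cur) in enumerate(zip(previous, current)) if prev != cur}


def decode_delta(previous, delta):
    """Rebuild the current frame by applying the stored changes to the previous frame."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame


# A one-row "scene": background pixels are 0, the car is 9 and moves one pixel right.
frame_1 = [0, 9, 9, 0, 0, 0]
frame_2 = [0, 0, 9, 9, 0, 0]

delta = encode_delta(frame_1, frame_2)
print(delta)  # {1: 0, 3: 9} -- only two of the six pixels are stored
assert decode_delta(frame_1, delta) == frame_2
```

The unchanged background pixels never appear in `delta`, which is the space saving described above.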
Pictures
Pictures that are used as a reference for predicting other pictures are referred to as reference pictures. In such designs, the pictures that are coded without prediction from other pictures are called the I pictures. Pictures that use prediction from a single reference picture (or a single picture for prediction of each region) are called the P pictures. And pictures that use a prediction signal that is formed as a (possibly weighted) average of two reference pictures are called the B pictures.
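As a rough sketch of the B-picture prediction signal described above, the following forms a (possibly weighted) average of two reference blocks and computes what is left for the encoder to code. The names `bi_predict` and `residual`, the default weights of 0.5, and the sample values are assumptions made for illustration; they are not taken from any particular standard.

```python
def bi_predict(ref0, ref1, w0=0.5, w1=0.5):
    """Form a prediction block as a weighted average of two reference blocks."""
    if len(ref0) != len(ref1):
        raise ValueError("reference blocks must have the same size")
    return [round(w0 * a + w1 * b) for a, b in zip(ref0, ref1)]


def residual(actual, prediction):
    """Difference that would still have to be coded after prediction."""
    return [a - p for a, p in zip(actual, prediction)]


past = [100, 100, 100, 100]      # block from an earlier reference picture
future = [120, 120, 120, 120]    # block from a later reference picture
current = [110, 111, 110, 109]   # block being coded

prediction = bi_predict(past, future)   # plain average: [110, 110, 110, 110]
print(residual(current, prediction))    # [0, 1, 0, -1]
```

The small residual values show why averaging two references can be cheaper to code than predicting from one reference alone.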
Slices
In the H.264/MPEG-4 AVC standard, the granularity at which prediction types are established is brought down to a lower level, called the slice level of the representation. A slice is a spatially distinct region of a picture that is encoded separately from any other region in the same picture. In that standard, instead of I pictures, P pictures, and B pictures, there are I slices, P slices, and B slices.
Macroblocks
Strictly speaking, the term picture is more general than frame, as a picture can be either a frame or a field. A frame is a complete image captured during a known time interval, and a field is the set of odd-numbered or even-numbered scanning lines composing a partial image. When video is sent in interlaced-scan format, each frame is sent as the field of odd-numbered lines followed by the field of even-numbered lines. Informally, the term "frame" is often used even when the picture in question is actually a field.
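The relationship between a frame and its two fields can be sketched as follows. A frame is modelled here as a list of scan lines numbered from 1; the function names `split_into_fields` and `weave` are hypothetical and used only for this example.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into its odd and even fields.

    Lines are numbered from 1, so the odd field holds lines 1, 3, 5, ...
    """
    odd_field = frame[0::2]    # lines 1, 3, 5, ...
    even_field = frame[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave the two fields back into a full frame."""
    frame = []
    for i in range(max(len(odd_field), len(even_field))):
        if i < len(odd_field):
            frame.append(odd_field[i])
        if i < len(even_field):
            frame.append(even_field[i])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5"]
odd, even = split_into_fields(frame)   # sent as the odd field, then the even field
assert weave(odd, even) == frame
```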
Typically, pictures are segmented into macroblocks, and individual prediction types can be selected on a macroblock basis rather than being the same for the entire picture.
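A minimal sketch of this segmentation is given below. The text does not state a macroblock size; the 16x16 default here is an assumption (it matches the luma macroblock size used in MPEG-2 and H.264). The `ALLOWED_MACROBLOCK_TYPES` table reflects the usual rule that I pictures contain only intra macroblocks, P pictures may add predicted ones, and B pictures may add bi-predicted ones. All names are hypothetical.

```python
# Which macroblock prediction types each picture type may contain
# (an illustrative table, not an API of any real codec library).
ALLOWED_MACROBLOCK_TYPES = {
    "I": {"intra"},
    "P": {"intra", "predicted"},
    "B": {"intra", "predicted", "bi-predicted"},
}


def macroblock_grid(width, height, mb_size=16):
    """Return (columns, rows) of macroblocks needed to cover the picture."""
    cols = -(-width // mb_size)    # ceiling division
    rows = -(-height // mb_size)
    return cols, rows


def is_allowed(picture_type, macroblock_type):
    """Check whether a macroblock type may appear in a given picture type."""
    return macroblock_type in ALLOWED_MACROBLOCK_TYPES[picture_type]


print(macroblock_grid(1920, 1080))   # (120, 68): the last row is only partly filled
assert is_allowed("P", "intra")
assert not is_allowed("I", "predicted")
```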
Furthermore, in the H.264 video codec standard, the picture can be segmented into sequences of macroblocks called slices, and instead of selecting I, B and P picture types, the encoder can choose the prediction style separately for each individual slice. H.264 also defines several additional types of pictures and slices.
Multi-frame motion estimation can increase the quality of the video at the same compression ratio. SI- and SP-frames (defined for the Extended profile) can improve error resilience. When such frames are used along with a smart decoder, it is possible to recover the broadcast streams of damaged DVDs.
Intra-coded frames (or slices, I-frames, or key frames)
Typically require more bits to encode than other picture types.
Often, I-frames are used for random access and as references for the decoding of other pictures. Intra refresh periods of a half-second are common in such applications as digital television broadcast and DVD storage. Longer refresh periods may be used in some environments. For example, in videoconferencing systems it is common to send I-frames very infrequently.
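The link between the intra refresh period and random access can be sketched as follows. The example assumes a constant frame rate and, as a simplification, treats every non-I picture as a P picture that depends on the one before it. The names `frame_types` and `random_access_point` and the 30 frames-per-second figure are assumptions made for this illustration.

```python
def frame_types(num_frames, fps, refresh_seconds=0.5):
    """Assign 'I' at every intra refresh point and 'P' everywhere else."""
    interval = max(1, round(fps * refresh_seconds))   # frames between I-frames
    return ["I" if n % interval == 0 else "P" for n in range(num_frames)]


def random_access_point(types, target):
    """Find the nearest I-frame at or before `target`, where decoding can start."""
    for n in range(target, -1, -1):
        if types[n] == "I":
            return n
    raise ValueError("no I-frame at or before the target frame")


types = frame_types(num_frames=40, fps=30)    # I-frames at 0, 15 and 30
start = random_access_point(types, target=22)
print(start, 22 - start + 1)                  # 15 8: decode 8 frames to display frame 22
```

With a longer refresh period, as in videoconferencing, fewer bits go to I-frames, but a seek may have to decode many more pictures before reaching the target.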
Predicted frames (or slices)
Bi-directional predicted frames (or slices), a.k.a. B pictures