Rtp/Rtcp协议

RFC 3550:

RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers.

MAC header IP header UDP header RTP message

RTP header, version 2:

00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Ver P X CC M PT Sequence Number
Timestamp
SSRC
CSRC [0..15] :::

Ver, Version. 2 bits.
RTP version number. Always set to 2.

P, Padding. 1 bit.
If set, this packet contains one or more additional padding bytes at the end which are not part of the payload. The last byte of the padding contains a count of how many padding bytes should be ignored. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.

X, Extension. 1 bit.
If set, the fixed header is followed by exactly one header extension.

CC, CSRC count. 4 bits.
The number of CSRC identifiers that follow the fixed header.

M, Marker. 1 bit.
The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream. A profile may define additional marker bits or specify that there is no marker bit by changing the number of bits in the payload type field.

PT, Payload Type. 7 bits.
Identifies the format of the RTP payload and determines its interpretation by the application. A profile specifies a default static mapping of payload type codes to payload formats. Additional payload type codes may be defined dynamically through non-RTP means. An RTP sender emits a single RTP payload type at any given time; this field is not intended for multiplexing separate media streams.

PT Name Type Clock rate (Hz) Audio channels References
0 PCMU Audio 8000 1 RFC 3551
1 1016 Audio 8000 1 RFC 3551
2 G721 Audio 8000 1 RFC 3551
3 GSM Audio 8000 1 RFC 3551
4 G723 Audio 8000 1  
5 DVI4 Audio 8000 1 RFC 3551
6 DVI4 Audio 16000 1 RFC 3551
7 LPC Audio 8000 1 RFC 3551
8 PCMA Audio 8000 1 RFC 3551
9 G722 Audio 8000 1 RFC 3551
10 L16 Audio 44100 2 RFC 3551
11 L16 Audio 44100 1 RFC 3551
12 QCELP Audio 8000 1  
13 CN Audio 8000 1 RFC 3389
14 MPA Audio 90000   RFC 2250, RFC 3551
15 G728 Audio 8000 1 RFC 3551
16 DVI4 Audio 11025 1  
17 DVI4 Audio 22050 1  
18 G729 Audio 8000 1  
19 reserved Audio      
20
-
24
         
25 CellB Video 90000   RFC 2029
26 JPEG Video 90000   RFC 2435
27          
28 nv Video 90000   RFC 3551
29
30
         
31 H261 Video 90000   RFC 2032
32 MPV Video 90000   RFC 2250
33 MP2T Audio/Video 90000   RFC 2250
34 H263 Video 90000    
35
-
71
         
72
-
76
reserved       RFC 3550
77
-
95
         
96
-
127
dynamic       RFC 3551
dynamic GSM-HR Audio 8000 1  
dynamic GSM-EFR Audio 8000 1  
dynamic L8 Audio variable variable  
dynamic RED Audio      
dynamic VDVI Audio variable 1  
dynamic BT656 Video 90000    
dynamic H263-1998 Video 90000    
dynamic MP1S Video 90000    
dynamic MP2P Video 90000    
dynamic BMPEG Video 90000    

Sequence Number. 16 bits.
The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number is random (unpredictable) to make known-plaintext attacks on encryption more difficult, even if the source itself does not encrypt, because the packets may flow through a translator that does.

Timestamp. 32 bits.
The timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations. The resolution of the clock must be sufficient for the desired synchronization accuracy and for measuring packet arrival jitter (one tick per video frame is typically not sufficient). The clock frequency is dependent on the format of data carried as payload and is specified statically in the profile or payload format specification that defines the format, or may be specified dynamically for payload formats defined through non-RTP means. If RTP packets are generated periodically, the nominal sampling instant as determined from the sampling clock is to be used, not a reading of the system clock. As an example, for fixed-rate audio the timestamp clock would likely increment by one for each sampling period. If an audio application reads blocks covering 160 sampling periods from the input device, the timestamp would be increased by 160 for each such block, regardless of whether the block is transmitted in a packet or dropped as silent.

SSRC, Synchronization source. 32 bits.
Identifies the synchronization source. The value is chosen randomly, with the intent that no two synchronization sources within the same RTP session will have the same SSRC. Although the probability of multiple sources choosing the same identifier is low, all RTP implementations must be prepared to detect and resolve collisions. If a source changes its source transport address, it must also choose a new SSRC to avoid being interpreted as a looped source.

CSRC, Contributing source. 32 bits.
An array of 0 to 15 CSRC elements identifying the contributing sources for the payload contained in this packet. The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 may be identified. CSRC identifiers are inserted by mixers, using the SSRC identifiers of contributing sources. For example, for audio packets the SSRC identifiers of all sources that were mixed together to create a packet are listed, allowing correct talker indication at the receiver.

Glossary:

CSRC, Contributing source.
(RFC 1889) A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer. The mixer inserts a list of the SSRC identifiers of the sources that contributed to the generation of a particular packet into the RTP header of that packet. This list is called the CSRC list. An example application is audio conferencing where a mixer indicates all the talkers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current talker, even though all the audio packets contain the same SSRC identifier.

End system.
(RFC 1889) An application that generates the content to be sent in RTP packets and/or consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a particular RTP session, but typically only one.

Mixer.
(RFC 3550) An intermediate system that receives RTP packets from one or more sources, possibly changes the data format, combines the packets in some manner and then forwards a new RTP packet. Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream. Thus, all data packets originating from a mixer will be identified as having the mixer as their synchronization source.

Monitor.
(RFC 1889) An application that receives RTCP packets sent by participants in an RTP session, in particular the reception reports, and estimates the current quality of service for distribution monitoring, fault diagnosis and long-term statistics. The monitor function is likely to be built into the application(s) participating in the session, but may also be a separate application that does not otherwise participate and does not send or receive the RTP data packets. These are called third party monitors.

RTP packet.
(RFC 1889) A data packet consisting of the fixed RTP header, a possibly empty list of contributing sources, and the payload data. Some underlying protocols may require an encapsulation of the RTP packet to be defined. Typically one packet of the underlying protocol contains a single RTP packet, but several RTP packets may be contained if permitted by the encapsulation method.

RTP payload.
(RFC 1889) The data transported by RTP in a packet, for example audio samples or compressed video data.

RTP session.
(RFC 1889) The association among a set of participants communicating with RTP. For each participant, the session is defined by a particular pair of destination transport addresses (one network address plus a port pair for RTP and RTCP). The destination transport address pair may be common for all participants, as in the case of IP multicast, or may be different for each, as in the case of individual unicast network addresses plus a common port pair. In a multimedia session, each medium is carried in a separate RTP session with its own RTCP packets. The multiple RTP sessions are distinguished by different port number pairs and/or different multicast addresses.

SSRC, Synchronization source.
(RFC 1889) The source of a stream of RTP packets, identified by a 32-bit numeric SSRC identifier carried in the RTP header so as not to be dependent upon the network address. All packets from a synchronization source form part of the same timing and sequence number space, so a receiver groups packets by synchronization source for playback. Examples of synchronization sources include the sender of a stream of packets derived from a signal source such as a microphone or a camera, or an RTP mixer. A synchronization source may change its data format, e.g., audio encoding, over time. The SSRC identifier is a randomly chosen value meant to be globally unique within a particular RTP session. A participant need not use the same SSRC identifier for all the RTP sessions in a multimedia session; the binding of the SSRC identifiers is provided through RTCP. If a participant generates multiple streams in one RTP session, for example from separate video cameras, each must be identified as a different SSRC.

Translator.
(RFC 1889) An intermediate system that forwards RTP packets with their synchronization source identifier intact. Examples of translators include devices that convert encodings without mixing, replicators from multicast to unicast, and application- level filters in firewalls.

 

你可能感兴趣的:(session,video,header,application,encryption,audio)