12. Timestamps in H.264 RTP Packets
This time, let's analyze how live555 stamps RTP packets with timestamps, taking H.264 as the example.
void H264VideoRTPSink::doSpecialFrameHandling(unsigned /*fragmentationOffset*/,
    unsigned char* /*frameStart*/,
    unsigned /*numBytesInFrame*/,
    struct timeval framePresentationTime,
    unsigned /*numRemainingBytes*/)
{
  // Set the RTP 'M' (marker) bit iff
  // 1/ The most recently delivered fragment was the end of (or the only fragment of) an NAL unit, and
  // 2/ This NAL unit was the last NAL unit of an 'access unit' (i.e. video frame).
  if (fOurFragmenter != NULL) {
    H264VideoStreamFramer* framerSource
      = (H264VideoStreamFramer*) (fOurFragmenter->inputSource());
    // This relies on our fragmenter's source being a "H264VideoStreamFramer".
    if (fOurFragmenter->lastFragmentCompletedNALUnit()
        && framerSource != NULL
        && framerSource->pictureEndMarker()) {
      setMarkerBit();
      framerSource->pictureEndMarker() = False;
    }
  }

  setTimestamp(framePresentationTime);
}

The function first checks whether this is the last packet of a frame; if so, it sets the RTP 'M' (marker) bit, and then sets the timestamp. Where does this timestamp come from? We need to find out who calls doSpecialFrameHandling(). A search of the code shows that it is called by MultiFramedRTPSink::afterGettingFrame1(), whose presentationTime parameter is passed straight through to doSpecialFrameHandling(). MultiFramedRTPSink::afterGettingFrame1() itself was handed to the source as the 'after getting' callback when getNextFrame() was called on that source. Which source? H264FUAFragmenter -- remember the "sneaky substitution" (暗渡陈仓) from an earlier chapter, where the sink quietly swapped in H264FUAFragmenter as its source? So after H264FUAFragmenter obtains a NAL unit, it invokes MultiFramedRTPSink::afterGettingFrame1(); in other words, H264FUAFragmenter::afterGettingFrame1() calls MultiFramedRTPSink::afterGettingFrame1(). The presentation time, in turn, comes from the framer upstream of the fragmenter. Tracing further upstream, the bytes (and the presentation time that accompanies them) are delivered through StreamParser::afterGettingBytes():
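To make this chain easier to follow, here is a minimal, self-contained sketch of the callback pattern described above. This is NOT the real live555 code: the classes and signatures are deliberately simplified stand-ins for H264FUAFragmenter and MultiFramedRTPSink.

#include <cstdio>
#include <sys/time.h>

typedef void AfterGettingFunc(void* clientData, struct timeval presentationTime);

// Simplified stand-in for H264FUAFragmenter:
struct FragmenterSketch {
  AfterGettingFunc* fAfterFunc;
  void* fAfterClientData;

  // The sink registers its callback here (cf. getNextFrame()):
  void getNextFrame(AfterGettingFunc* f, void* clientData) {
    fAfterFunc = f;
    fAfterClientData = clientData;
  }
  // Once a fragment of a NAL unit is ready, fire the callback,
  // forwarding the NAL unit's presentation time:
  void deliverFragment(struct timeval pt) {
    fAfterFunc(fAfterClientData, pt);
  }
};

// Simplified stand-in for MultiFramedRTPSink / H264VideoRTPSink:
struct SinkSketch {
  static void afterGettingFrame1(void* clientData, struct timeval pt) {
    ((SinkSketch*) clientData)->doSpecialFrameHandling(pt);
  }
  void doSpecialFrameHandling(struct timeval pt) {
    // In the real sink this is setTimestamp(pt):
    std::printf("stamping RTP packet with %ld.%06ld\n",
                (long) pt.tv_sec, (long) pt.tv_usec);
  }
};

int main() {
  FragmenterSketch frag;
  SinkSketch sink;
  frag.getNextFrame(SinkSketch::afterGettingFrame1, &sink); // sink requests data
  struct timeval pt = {1000, 40000};
  frag.deliverFragment(pt); // fragment ready -> sink's callback fires
  return 0;
}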
void StreamParser::afterGettingBytes(void* clientData,
    unsigned numBytesRead,
    unsigned /*numTruncatedBytes*/,
    struct timeval presentationTime,
    unsigned /*durationInMicroseconds*/)
{
  StreamParser* parser = (StreamParser*) clientData;
  if (parser != NULL) parser->afterGettingBytes1(numBytesRead, presentationTime);
}

void StreamParser::afterGettingBytes1(unsigned numBytesRead,
    struct timeval presentationTime)
{
  // Sanity check: Make sure we didn't get too many bytes for our bank:
  if (fTotNumValidBytes + numBytesRead > BANK_SIZE) {
    fInputSource->envir()
        << "StreamParser::afterGettingBytes() warning: read "
        << numBytesRead << " bytes; expected no more than "
        << BANK_SIZE - fTotNumValidBytes << "\n";
  }

  fLastSeenPresentationTime = presentationTime;

  unsigned char* ptr = &curBank()[fTotNumValidBytes];
  fTotNumValidBytes += numBytesRead;

  // Continue our original calling source where it left off:
  restoreSavedParserState();
  // Sigh... this is a crock; things would have been a lot simpler
  // here if we were using threads, with synchronous I/O...
  fClientContinueFunc(fClientContinueClientData, ptr, numBytesRead, presentationTime);
}

fClientContinueFunc is MPEGVideoStreamFramer::continueReadProcessing(), and notice that the presentation time is passed into fClientContinueFunc.
void MPEGVideoStreamFramer::continueReadProcessing(void* clientData,
    unsigned char* /*ptr*/,
    unsigned /*size*/,
    struct timeval /*presentationTime*/)
{
  MPEGVideoStreamFramer* framer = (MPEGVideoStreamFramer*) clientData;
  framer->continueReadProcessing();
}

Note that the incoming presentationTime argument is ignored here. So the real timestamp is computed inside MPEGVideoStreamFramer -- except that H264VideoStreamFramer does not use the timestamp-computing functions in MPEGVideoStreamFramer; it computes the timestamp differently. Actually, H264VideoStreamFramer does not compute it itself either: it delegates the job to H264VideoStreamParser. In which function? In parse()!
unsigned H264VideoStreamParser::parse()
{
  try {
    // The stream must start with a 0x00000001:
    if (!fHaveSeenFirstStartCode) {
      // Skip over any input bytes that precede the first 0x00000001:
      u_int32_t first4Bytes;
      while ((first4Bytes = test4Bytes()) != 0x00000001) {
        get1Byte();
        setParseState(); // ensures that we progress over bad data
      }
      skipBytes(4); // skip this initial code
      setParseState();
      fHaveSeenFirstStartCode = True; // from now on
    }
    if (fOutputStartCodeSize > 0) {
      // Include a start code in the output:
      save4Bytes(0x00000001);
    }

    // Then save everything up until the next 0x00000001 (4 bytes) or 0x000001 (3 bytes), or we hit EOF.
    // Also make note of the first byte, because it contains the "nal_unit_type":
    u_int8_t firstByte;
    if (haveSeenEOF()) {
      // We hit EOF the last time that we tried to parse this data,
      // so we know that the remaining unparsed data forms a complete NAL unit:
      unsigned remainingDataSize = totNumValidBytes() - curOffset();
      if (remainingDataSize == 0) (void) get1Byte(); // forces another read, which will cause EOF to get handled for real this time
      if (remainingDataSize == 0) return 0;
      firstByte = get1Byte();
      saveByte(firstByte);
      while (--remainingDataSize > 0) {
        saveByte(get1Byte());
      }
    } else {
      u_int32_t next4Bytes = test4Bytes();
      firstByte = next4Bytes >> 24;
      while (next4Bytes != 0x00000001 && (next4Bytes & 0xFFFFFF00) != 0x00000100) {
        // We save at least some of "next4Bytes".
        if ((unsigned) (next4Bytes & 0xFF) > 1) {
          // Common case: 0x00000001 or 0x000001 definitely doesn't begin anywhere in
          // "next4Bytes", so we save all of it:
          save4Bytes(next4Bytes);
          skipBytes(4);
        } else {
          // Save the first byte, and continue testing the rest:
          saveByte(next4Bytes >> 24);
          skipBytes(1);
        }
        next4Bytes = test4Bytes();
      }
      // Assert: next4Bytes starts with 0x00000001 or 0x000001,
      // and we've saved all previous bytes (forming a complete NAL unit).
      // Skip over these remaining bytes, up until the start of the next NAL unit:
      if (next4Bytes == 0x00000001) {
        skipBytes(4);
      } else {
        skipBytes(3);
      }
    }

    u_int8_t nal_ref_idc = (firstByte & 0x60) >> 5;
    u_int8_t nal_unit_type = firstByte & 0x1F;
    switch (nal_unit_type) {
      case 6: { // Supplemental enhancement information (SEI)
        analyze_sei_data();
        // Later, perhaps adjust "fPresentationTime" if we saw a "pic_timing" SEI payload??? #####
        break;
      }
      case 7: { // Sequence parameter set
        // First, save a copy of this NAL unit, in case the downstream object wants to see it:
        usingSource()->saveCopyOfSPS(fStartOfFrame + fOutputStartCodeSize,
            fTo - fStartOfFrame - fOutputStartCodeSize);

        // Parse this NAL unit to check whether frame rate information is present:
        unsigned num_units_in_tick, time_scale, fixed_frame_rate_flag;
        analyze_seq_parameter_set_data(num_units_in_tick, time_scale, fixed_frame_rate_flag);
        if (time_scale > 0 && num_units_in_tick > 0) {
          usingSource()->fFrameRate = time_scale / (2.0 * num_units_in_tick);
        } else {
        }
        break;
      }
      case 8: { // Picture parameter set
        // Save a copy of this NAL unit, in case the downstream object wants to see it:
        usingSource()->saveCopyOfPPS(fStartOfFrame + fOutputStartCodeSize,
            fTo - fStartOfFrame - fOutputStartCodeSize);
      }
    }

    // Update the timestamp variables:
    usingSource()->setPresentationTime();

    // If this NAL unit is a VCL NAL unit, we also scan the start of the next NAL unit,
    // to determine whether this NAL unit ends the current 'access unit'.
    // We need this information to figure out when to increment "fPresentationTime".
    // (RTP streamers also need to know this in order to figure out whether or not to set the "M" bit.)
    Boolean thisNALUnitEndsAccessUnit = False; // until we learn otherwise
    if (haveSeenEOF()) {
      // There is no next NAL unit, so we assume that this one ends the current 'access unit':
      thisNALUnitEndsAccessUnit = True;
    } else {
      Boolean const isVCL = nal_unit_type <= 5 && nal_unit_type > 0;
      // Would need to include type 20 for SVC and MVC #####
      if (isVCL) {
        u_int32_t first4BytesOfNextNALUnit = test4Bytes();
        u_int8_t firstByteOfNextNALUnit = first4BytesOfNextNALUnit >> 24;
        u_int8_t next_nal_ref_idc = (firstByteOfNextNALUnit & 0x60) >> 5;
        u_int8_t next_nal_unit_type = firstByteOfNextNALUnit & 0x1F;
        if (next_nal_unit_type >= 6) {
          // The next NAL unit is not a VCL; therefore, we assume that this NAL unit
          // ends the current 'access unit':
          thisNALUnitEndsAccessUnit = True;
        } else {
          // The next NAL unit is also a VCL. We need to examine it a little to figure out
          // if it's a different 'access unit'.
          // (We use many of the criteria described in section 7.4.1.2.4 of the H.264 specification.)
          Boolean IdrPicFlag = nal_unit_type == 5;
          Boolean next_IdrPicFlag = next_nal_unit_type == 5;
          if (next_IdrPicFlag != IdrPicFlag) {
            // IdrPicFlag differs in value
            thisNALUnitEndsAccessUnit = True;
          } else if (next_nal_ref_idc != nal_ref_idc && next_nal_ref_idc * nal_ref_idc == 0) {
            // nal_ref_idc differs in value with one of the nal_ref_idc values being equal to 0
            thisNALUnitEndsAccessUnit = True;
          } else if ((nal_unit_type == 1 || nal_unit_type == 2 || nal_unit_type == 5)
                     && (next_nal_unit_type == 1 || next_nal_unit_type == 2 || next_nal_unit_type == 5)) {
            // Both this and the next NAL units begin with a "slice_header".
            // Parse this (for each), to get parameters that we can compare:

            // Current NAL unit's "slice_header":
            unsigned frame_num, pic_parameter_set_id, idr_pic_id;
            Boolean field_pic_flag, bottom_field_flag;
            analyze_slice_header(fStartOfFrame + fOutputStartCodeSize, fTo, nal_unit_type,
                frame_num, pic_parameter_set_id, idr_pic_id, field_pic_flag, bottom_field_flag);

            // Next NAL unit's "slice_header":
            u_int8_t next_slice_header[NUM_NEXT_SLICE_HEADER_BYTES_TO_ANALYZE];
            testBytes(next_slice_header, sizeof next_slice_header);
            unsigned next_frame_num, next_pic_parameter_set_id, next_idr_pic_id;
            Boolean next_field_pic_flag, next_bottom_field_flag;
            analyze_slice_header(next_slice_header, &next_slice_header[sizeof next_slice_header],
                next_nal_unit_type, next_frame_num, next_pic_parameter_set_id, next_idr_pic_id,
                next_field_pic_flag, next_bottom_field_flag);

            if (next_frame_num != frame_num) {
              // frame_num differs in value
              thisNALUnitEndsAccessUnit = True;
            } else if (next_pic_parameter_set_id != pic_parameter_set_id) {
              // pic_parameter_set_id differs in value
              thisNALUnitEndsAccessUnit = True;
            } else if (next_field_pic_flag != field_pic_flag) {
              // field_pic_flag differs in value
              thisNALUnitEndsAccessUnit = True;
            } else if (next_bottom_field_flag != bottom_field_flag) {
              // bottom_field_flag differs in value
              thisNALUnitEndsAccessUnit = True;
            } else if (next_IdrPicFlag == 1 && next_idr_pic_id != idr_pic_id) {
              // IdrPicFlag is equal to 1 for both and idr_pic_id differs in value
              // Note: We already know that IdrPicFlag is the same for both.
              thisNALUnitEndsAccessUnit = True;
            }
          }
        }
      }
    }

    // NOTE! NOTE! NOTE! The timestamp is computed here!!
    if (thisNALUnitEndsAccessUnit) {
      usingSource()->fPictureEndMarker = True;
      ++usingSource()->fPictureCount;

      // Note that the presentation time for the next NAL unit will be different:
      struct timeval& nextPT = usingSource()->fNextPresentationTime; // alias
      nextPT = usingSource()->fPresentationTime;
      double nextFraction = nextPT.tv_usec / 1000000.0 + 1 / usingSource()->fFrameRate;
      unsigned nextSecsIncrement = (long) nextFraction;
      nextPT.tv_sec += (long) nextSecsIncrement;
      nextPT.tv_usec = (long) ((nextFraction - nextSecsIncrement) * 1000000);
    }
    setParseState();

    return curFrameSize();
  } catch (int /*e*/) {
    return 0; // the parsing got interrupted
  }
}
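A quick aside on the bit operations applied to firstByte above: per the H.264 spec, a NAL unit header byte is laid out as forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), nal_unit_type (5 bits). A tiny standalone check, using 0x65 (a typical IDR-slice header byte) as the illustrative input:

#include <cstdio>

int main() {
  unsigned char firstByte = 0x65;                    // 0110 0101: typical IDR-slice NAL header
  unsigned nal_ref_idc    = (firstByte & 0x60) >> 5; // bits 5-6 -> 3
  unsigned nal_unit_type  = firstByte & 0x1F;        // low 5 bits -> 5 (IDR slice)
  std::printf("nal_ref_idc=%u nal_unit_type=%u\n", nal_ref_idc, nal_unit_type);
  return 0;
}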
Whenever a new frame begins, a new timestamp is computed. The timestamp is kept in fNextPresentationTime, and is handed over to fPresentationTime inside usingSource()->setPresentationTime().
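This two-variable scheme is easy to see in isolation. Below is a minimal, self-contained sketch -- NOT the real classes, and the 25 fps frame rate is an assumption for illustration (live555 normally derives fFrameRate from the SPS). Every NAL unit is stamped with fPresentationTime, and fNextPresentationTime is advanced by one frame interval only when a NAL unit ends an access unit:

#include <cstdio>
#include <sys/time.h>

struct FramerSketch {
  double fFrameRate;
  struct timeval fPresentationTime;
  struct timeval fNextPresentationTime;

  // Cf. usingSource()->setPresentationTime(): called once per parsed NAL unit.
  void setPresentationTime() { fPresentationTime = fNextPresentationTime; }

  // Cf. the "thisNALUnitEndsAccessUnit" branch in parse():
  void endOfAccessUnit() {
    struct timeval& nextPT = fNextPresentationTime; // alias
    nextPT = fPresentationTime;
    double nextFraction = nextPT.tv_usec / 1000000.0 + 1 / fFrameRate;
    unsigned nextSecsIncrement = (long) nextFraction;
    nextPT.tv_sec += (long) nextSecsIncrement;
    nextPT.tv_usec = (long) ((nextFraction - nextSecsIncrement) * 1000000);
  }
};

int main() {
  FramerSketch f;
  f.fFrameRate = 25.0;                  // assumed: 25 fps -> 40 ms per frame
  f.fNextPresentationTime.tv_sec = 1000;
  f.fNextPresentationTime.tv_usec = 0;

  for (int i = 0; i < 3; ++i) {         // three NAL units of the same frame:
    f.setPresentationTime();            // all stamped 1000.000000
    std::printf("NAL %d stamped %ld.%06ld\n", i,
                (long) f.fPresentationTime.tv_sec,
                (long) f.fPresentationTime.tv_usec);
  }
  f.endOfAccessUnit();                  // the third NAL unit ended the frame
  f.setPresentationTime();              // first NAL of the next frame: 40 ms later
  std::printf("next frame stamped %ld.%06ld\n",
              (long) f.fPresentationTime.tv_sec,
              (long) f.fPresentationTime.tv_usec);
  return 0;
}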
Wow -- the call relationships among live555's classes really are tortuous, which does make the code somewhat hard to maintain! And my write-up here isn't all that clear either; it makes even me dizzy. If it has made you dizzy too, that's perfectly normal!
fPresentationTime holds the time as a struct timeval (seconds plus microseconds); convertToRTPTimestamp() converts it into a 32-bit RTP timestamp. See the function:
u_int32_t RTPSink::convertToRTPTimestamp(struct timeval tv)
{
  // Begin by converting from "struct timeval" units to RTP timestamp units:
  u_int32_t timestampIncrement = (fTimestampFrequency * tv.tv_sec);
  timestampIncrement += (u_int32_t) ((2.0 * fTimestampFrequency * tv.tv_usec + 1000000.0) / 2000000);
  // note: rounding

  // Then add this to our 'timestamp base':
  if (fNextTimestampHasBeenPreset) {
    // Make the returned timestamp the same as the current "fTimestampBase",
    // so that timestamps begin with the value that was previously preset:
    fTimestampBase -= timestampIncrement;
    fNextTimestampHasBeenPreset = False;
  }

  u_int32_t const rtpTimestamp = fTimestampBase + timestampIncrement;
  return rtpTimestamp;
}

The conversion essentially rescales a time measured in seconds into a time measured in ticks of the timestamp frequency: after the conversion, the unit is no longer one second but 1/fTimestampFrequency of a second -- for H.264 video, 1/90000 of a second. The result is then truncated to 32 bits.
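To see the conversion concretely, here is a standalone reproduction of the arithmetic. The 90000 Hz clock is the standard RTP rate for H.264 video; the timestamp base here is an arbitrary stand-in (live555 initializes the real one randomly):

#include <cstdio>
#include <sys/time.h>

int main() {
  unsigned fTimestampFrequency = 90000;  // standard RTP clock rate for H.264 video
  unsigned fTimestampBase = 0x12345678;  // arbitrary; live555 picks a random base

  struct timeval tv = {2, 500000};       // 2.5 seconds
  unsigned timestampIncrement = (unsigned) (fTimestampFrequency * tv.tv_sec);
  timestampIncrement += (unsigned) ((2.0 * fTimestampFrequency * tv.tv_usec + 1000000.0) / 2000000);
  // 2 * 90000 + round(0.5 * 90000) = 180000 + 45000 = 225000 ticks

  unsigned rtpTimestamp = fTimestampBase + timestampIncrement;
  std::printf("increment=%u rtpTimestamp=0x%08x\n", timestampIncrement, rtpTimestamp);
  return 0;
}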