Today, in one of my projects, I found that a client pulling data from a live555 RTSP server over TCP can still run into dropped packets and corrupted (mosaic) video. Since the transport is TCP, network packet loss can be ruled out, so the live555 code needs to be examined to understand why sending over TCP can still lose packets.
In live555, RTP packets are sent over TCP in RTPInterface::sendRTPorRTCPPacketOverTCP:
  do {
    u_int8_t framingHeader[4];
    framingHeader[0] = '$';
    framingHeader[1] = streamChannelId;
    framingHeader[2] = (u_int8_t) ((packetSize&0xFF00)>>8);
    framingHeader[3] = (u_int8_t) (packetSize&0xFF);
    if (!sendDataOverTCP(socketNum, framingHeader, 4, False)) break;
    if (!sendDataOverTCP(socketNum, packet, packetSize, True)) break;
#ifdef DEBUG_SEND
    fprintf(stderr, "sendRTPorRTCPPacketOverTCP: completed\n"); fflush(stderr);
#endif
    return True;
  } while (0);
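This is the RTSP "interleaved binary data" framing from RFC 2326 §10.12: each RTP or RTCP packet is prefixed by a '$' byte, a channel id, and a 16-bit big-endian length. For context, a minimal client-side sketch of parsing one such frame from the TCP byte stream (not live555 code; the recvAll helper is illustrative, POSIX recv() shown):

#include <cstdint>
#include <vector>
#include <sys/socket.h>   // POSIX recv(); on Windows use <winsock2.h> instead

// Read exactly 'len' bytes from a connected TCP socket (illustrative helper).
static bool recvAll(int sock, uint8_t* buf, size_t len) {
  size_t got = 0;
  while (got < len) {
    ssize_t n = recv(sock, buf + got, len - got, 0);
    if (n <= 0) return false;                  // error or connection closed
    got += static_cast<size_t>(n);
  }
  return true;
}

// Parse one interleaved frame: '$' | channel id | 16-bit big-endian length | payload.
static bool readInterleavedFrame(int sock, uint8_t& channelId, std::vector<uint8_t>& payload) {
  uint8_t header[4];
  if (!recvAll(sock, header, 4)) return false;
  if (header[0] != '$') return false;          // not an interleaved data frame
  channelId = header[1];
  uint16_t packetSize = (uint16_t(header[2]) << 8) | header[3];
  payload.resize(packetSize);
  return packetSize == 0 || recvAll(sock, payload.data(), packetSize);
}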
The actual send operation is performed by RTPInterface::sendDataOverTCP:
  int sendResult = send(socketNum, (char const*)data, dataSize, 0/*flags*/);
  if (sendResult < (int)dataSize) {
    // The TCP send() failed - at least partially.
    unsigned numBytesSentSoFar = sendResult < 0 ? 0 : (unsigned)sendResult;
    if (numBytesSentSoFar > 0 || (forceSendToSucceed && envir().getErrno() == EAGAIN)) {
      // The OS's TCP send buffer has filled up (because the stream's bitrate has exceeded
      // the capacity of the TCP connection!).
      // Force this data write to succeed, by blocking if necessary until it does:
      unsigned numBytesRemainingToSend = dataSize - numBytesSentSoFar;
#ifdef DEBUG_SEND
      fprintf(stderr, "sendDataOverTCP: resending %d-byte send (blocking)\n", numBytesRemainingToSend); fflush(stderr);
#endif
      makeSocketBlocking(socketNum, RTPINTERFACE_BLOCKING_WRITE_TIMEOUT_MS);
      sendResult = send(socketNum, (char const*)(&data[numBytesSentSoFar]), numBytesRemainingToSend, 0/*flags*/);
      if ((unsigned)sendResult != numBytesRemainingToSend) {
        // The blocking "send()" failed, or timed out. In either case, we assume that the
        // TCP connection has failed (or is 'hanging' indefinitely), and we stop using it
        // (for both RTP and RTCP).
        // (If we kept using the socket here, the RTP or RTCP packet write would be in an
        //  incomplete, inconsistent state.)
#ifdef DEBUG_SEND
        fprintf(stderr, "sendDataOverTCP: blocking send() failed (delivering %d bytes out of %d); closing socket %d\n", sendResult, numBytesRemainingToSend, socketNum); fflush(stderr);
#endif
        removeStreamSocket(socketNum, 0xFF);
        return False;
      }
      makeSocketNonBlocking(socketNum);
      return True;
    } else if (sendResult < 0 && envir().getErrno() != EAGAIN) {
      // Because the "send()" call failed, assume that the socket is now unusable, so stop
      // using it (for both RTP and RTCP):
      removeStreamSocket(socketNum, 0xFF);
    }
    return False;
  }
  return True;
Roughly, this code first issues one non-blocking send(). If that send() writes only part of the data (a return value greater than 0 but less than dataSize), or if it fails with EAGAIN while forceSendToSucceed is true, the socket is switched to blocking mode with a 500 ms timeout (RTPINTERFACE_BLOCKING_WRITE_TIMEOUT_MS) and a blocking send() is used to force the remaining bytes out. When debugging on Windows, however, the non-blocking send() returned either dataSize or a value less than 0 with errno EAGAIN; a partial send of fewer than dataSize bytes never occurred, so the blocking-send path was not entered.
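That observation can be reproduced outside of live555 with a small test: put a TCP socket into non-blocking mode and watch what send() returns once the OS send buffer fills up. A minimal Winsock sketch (the connected SOCKET s and the test data are assumptions for illustration, not live555 code):

#include <winsock2.h>   // link with ws2_32.lib; WSAStartup/connect omitted
#include <cstdio>

// Assumes 's' is an already-connected TCP SOCKET.
void probeNonBlockingSend(SOCKET s, const char* data, int dataSize) {
  u_long nonBlocking = 1;
  ioctlsocket(s, FIONBIO, &nonBlocking);        // switch the socket to non-blocking mode

  int sent = send(s, data, dataSize, 0);
  if (sent == dataSize) {
    printf("full send: %d bytes\n", sent);
  } else if (sent == SOCKET_ERROR && WSAGetLastError() == WSAEWOULDBLOCK) {
    // The case observed on Windows: the send buffer is full and nothing was accepted.
    printf("send buffer full (WSAEWOULDBLOCK), 0 bytes accepted\n");
  } else {
    // A partial send (0 < sent < dataSize) was never seen in the tests described above.
    printf("partial send or other error: %d\n", sent);
  }
}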
A send() return value less than 0 occurs when the TCP send buffer is already full, or when the TCP sliding window is full. Let's first look at how live555 sets the TCP send buffer.
The live555 interfaces for changing the send buffer size are setSendBufferTo and increaseSendBufferTo. setSendBufferTo is never called inside live555 itself; increaseSendBufferTo is called in OnDemandServerMediaSubsession::getStreamParameters:
    unsigned streamBitrate;
    FramedSource* mediaSource
      = createNewStreamSource(clientSessionId, streamBitrate);
    ...
    if (rtpGroupsock != NULL) {
      // Try to use a big send buffer for RTP - at least 0.1 second of
      // specified bandwidth and at least 50 KB
      unsigned rtpBufSize = streamBitrate * 25 / 2; // 1 kbps * 0.1 s = 12.5 bytes
      if (rtpBufSize < 50 * 1024) rtpBufSize = 50 * 1024;
      increaseSendBufferTo(envir(), rtpGroupsock->socketNum(), rtpBufSize);
    }
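For reference, increaseSendBufferTo (declared in live555's GroupsockHelper) ultimately raises the socket's SO_SNDBUF option. The following is a minimal sketch of that idea using plain socket calls, not live555's exact implementation: request the desired size and back off toward the current size if the OS rejects it:

#include <sys/socket.h>   // setsockopt/getsockopt; on Windows use <winsock2.h> instead

// Illustrative only: try to grow a socket's send buffer toward 'requestedSize' bytes.
static unsigned increaseSendBufferToSketch(int sock, unsigned requestedSize) {
  unsigned curSize = 0;
  socklen_t len = sizeof curSize;
  getsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char*)&curSize, &len);

  while (requestedSize > curSize) {
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   (char*)&requestedSize, sizeof requestedSize) >= 0) {
      return requestedSize;                        // the OS accepted this size
    }
    requestedSize = (requestedSize + curSize) / 2; // otherwise try a smaller size
  }

  len = sizeof curSize;
  getsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char*)&curSize, &len);
  return curSize;                                  // whatever the OS actually gave us
}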
In other words, the send buffer is sized to hold about 0.1 s of data at the bitrate estimated by the mediaSource, and createNewStreamSource is normally implemented in our own OnDemandServerMediaSubsession subclass. I had never noticed that this estimated bitrate affects the send buffer size, and had simply copied the value 90000 from the example code. Looking at the code, this 90000 is in kbps, so the corresponding rtpBufSize is 90000 * 25 / 2 = 1,125,000 bytes, which is more than big enough. Since an undersized TCP send buffer can be ruled out, what remains is the TCP sliding window being full, which would mean either that the sender sends too fast or that the receiver returns ACKs too slowly.
The analysis in the struck-out paragraph above is incorrect. See the follow-up article 《live555调用increaseSendBufferTo分析》 (an analysis of live555's calls to increaseSendBufferTo).
In addition, Wireshark captures also rule out the TCP sliding window being full. In the capture, the SYN packet sent when the client connects carries a Window scale of 2, meaning the advertised window value is shifted left by two bits, i.e. multiplied by 4. In the subsequent traffic the client's window gradually grows to over 260,000 bytes. Yet at the moment send() fails with WSAEWOULDBLOCK, only a little over 20,000 bytes of the window are in use (obtained by subtracting the acknowledgment number of the last received ACK from the sequence number of the packets already sent), which is nowhere near full. Therefore it must be the sender's send buffer that is full.
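As a quick sanity check on those numbers (the 65535 base window value and the sequence/acknowledgment numbers below are assumptions for illustration; the capture only shows the scale factor, the roughly 260,000-byte window, and the roughly 20,000 bytes in use):

#include <cstdint>
#include <cstdio>

int main() {
  // Advertised receive window: the raw 16-bit window field shifted by the window scale.
  uint32_t windowField = 65535, windowScale = 2;           // 65535 is an assumed typical value
  uint32_t advertisedWindow = windowField << windowScale;  // 262,140 bytes (~ "260,000+")

  // Bytes in flight = next sequence number to send - last acknowledgment number received.
  uint32_t nextSeqToSend = 1234567, lastAckReceived = 1213000;  // made-up capture values
  uint32_t bytesInFlight = nextSeqToSend - lastAckReceived;     // ~21,567 bytes, far below the window

  printf("window=%u bytes, in flight=%u bytes\n", advertisedWindow, bytesInFlight);
  return 0;
}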