Today, in one of my projects, I found that a client pulling data from a live555 RTSP server over TCP can still run into dropped packets and corrupted (mosaic) video. Since the transport is TCP, network packet loss can be ruled out, so the live555 code needs to be examined to understand why sending over TCP can still lose packets.
In live555, RTP packets are sent over TCP in RTPInterface::sendRTPorRTCPPacketOverTCP:
  do {
    u_int8_t framingHeader[4];
    framingHeader[0] = '$';
    framingHeader[1] = streamChannelId;
    framingHeader[2] = (u_int8_t) ((packetSize&0xFF00)>>8);
    framingHeader[3] = (u_int8_t) (packetSize&0xFF);
    if (!sendDataOverTCP(socketNum, framingHeader, 4, False)) break;
    if (!sendDataOverTCP(socketNum, packet, packetSize, True)) break;
#ifdef DEBUG_SEND
    fprintf(stderr, "sendRTPorRTCPPacketOverTCP: completed\n"); fflush(stderr);
#endif
    return True;
  } while (0);
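This is the RTSP "interleaved binary data" framing from RFC 2326 §10.12: each RTP or RTCP packet is prefixed by a '$' byte, a channel id, and a 16-bit big-endian length. For context, a minimal client-side sketch of parsing one such frame from the TCP byte stream (not live555 code; the recvAll helper is illustrative, POSIX recv() shown):

#include <cstdint>
#include <vector>
#include <sys/socket.h>   // POSIX recv(); on Windows use <winsock2.h> instead

// Read exactly 'len' bytes from a connected TCP socket (illustrative helper).
static bool recvAll(int sock, uint8_t* buf, size_t len) {
  size_t got = 0;
  while (got < len) {
    ssize_t n = recv(sock, buf + got, len - got, 0);
    if (n <= 0) return false;                  // error or connection closed
    got += static_cast<size_t>(n);
  }
  return true;
}

// Parse one interleaved frame: '$' | channel id | 16-bit big-endian length | payload.
static bool readInterleavedFrame(int sock, uint8_t& channelId, std::vector<uint8_t>& payload) {
  uint8_t header[4];
  if (!recvAll(sock, header, 4)) return false;
  if (header[0] != '$') return false;          // not an interleaved data frame
  channelId = header[1];
  uint16_t packetSize = (uint16_t(header[2]) << 8) | header[3];
  payload.resize(packetSize);
  return packetSize == 0 || recvAll(sock, payload.data(), packetSize);
}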
The actual send operation is performed by RTPInterface::sendDataOverTCP:
  int sendResult = send(socketNum, (char const*)data, dataSize, 0/*flags*/);
  if (sendResult < (int)dataSize) {
    // The TCP send() failed - at least partially.
    unsigned numBytesSentSoFar = sendResult < 0 ? 0 : (unsigned)sendResult;
    if (numBytesSentSoFar > 0 || (forceSendToSucceed && envir().getErrno() == EAGAIN)) {
      // The OS's TCP send buffer has filled up (because the stream's bitrate has exceeded
      // the capacity of the TCP connection!).
      // Force this data write to succeed, by blocking if necessary until it does:
      unsigned numBytesRemainingToSend = dataSize - numBytesSentSoFar;
#ifdef DEBUG_SEND
      fprintf(stderr, "sendDataOverTCP: resending %d-byte send (blocking)\n", numBytesRemainingToSend); fflush(stderr);
#endif
      makeSocketBlocking(socketNum, RTPINTERFACE_BLOCKING_WRITE_TIMEOUT_MS);
      sendResult = send(socketNum, (char const*)(&data[numBytesSentSoFar]), numBytesRemainingToSend, 0/*flags*/);
      if ((unsigned)sendResult != numBytesRemainingToSend) {
        // The blocking "send()" failed, or timed out. In either case, we assume that the
        // TCP connection has failed (or is 'hanging' indefinitely), and we stop using it
        // (for both RTP and RTCP).
        // (If we kept using the socket here, the RTP or RTCP packet write would be in an
        //  incomplete, inconsistent state.)
#ifdef DEBUG_SEND
        fprintf(stderr, "sendDataOverTCP: blocking send() failed (delivering %d bytes out of %d); closing socket %d\n", sendResult, numBytesRemainingToSend, socketNum); fflush(stderr);
#endif
        removeStreamSocket(socketNum, 0xFF);
        return False;
      }
      makeSocketNonBlocking(socketNum);
      return True;
    } else if (sendResult < 0 && envir().getErrno() != EAGAIN) {
      // Because the "send()" call failed, assume that the socket is now unusable, so stop
      // using it (for both RTP and RTCP):
      removeStreamSocket(socketNum, 0xFF);
    }
    return False;
  }
  return True;
Roughly, this code first issues one non-blocking send(). If that send() writes only part of the data (a return value greater than 0 but less than dataSize), or if it fails with EAGAIN while forceSendToSucceed is true, the socket is switched to blocking mode with a 500 ms timeout (RTPINTERFACE_BLOCKING_WRITE_TIMEOUT_MS) and a blocking send() is used to force the remaining bytes out. When debugging on Windows, however, the non-blocking send() returned either dataSize or a value less than 0 with errno EAGAIN; a partial send of fewer than dataSize bytes never occurred, so the blocking-send path was not entered.
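That observation can be reproduced outside of live555 with a small test: put a TCP socket into non-blocking mode and watch what send() returns once the OS send buffer fills up. A minimal Winsock sketch (the connected SOCKET s and the test data are assumptions for illustration, not live555 code):

#include <winsock2.h>   // link with ws2_32.lib; WSAStartup/connect omitted
#include <cstdio>

// Assumes 's' is an already-connected TCP SOCKET.
void probeNonBlockingSend(SOCKET s, const char* data, int dataSize) {
  u_long nonBlocking = 1;
  ioctlsocket(s, FIONBIO, &nonBlocking);        // switch the socket to non-blocking mode

  int sent = send(s, data, dataSize, 0);
  if (sent == dataSize) {
    printf("full send: %d bytes\n", sent);
  } else if (sent == SOCKET_ERROR && WSAGetLastError() == WSAEWOULDBLOCK) {
    // The case observed on Windows: the send buffer is full and nothing was accepted.
    printf("send buffer full (WSAEWOULDBLOCK), 0 bytes accepted\n");
  } else {
    // A partial send (0 < sent < dataSize) was never seen in the tests described above.
    printf("partial send or other error: %d\n", sent);
  }
}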
A send() return value less than 0 occurs when the TCP send buffer is already full, or when the TCP sliding window is full. Let's first look at how live555 sets the TCP send buffer.
The live555 interfaces for changing the send buffer size are setSendBufferTo and increaseSendBufferTo. setSendBufferTo is never called inside live555 itself; increaseSendBufferTo is called in OnDemandServerMediaSubsession::getStreamParameters:
    unsigned streamBitrate;
    FramedSource* mediaSource
      = createNewStreamSource(clientSessionId, streamBitrate);
    ...
    if (rtpGroupsock != NULL) {
      // Try to use a big send buffer for RTP - at least 0.1 second of
      // specified bandwidth and at least 50 KB
      unsigned rtpBufSize = streamBitrate * 25 / 2; // 1 kbps * 0.1 s = 12.5 bytes
      if (rtpBufSize < 50 * 1024) rtpBufSize = 50 * 1024;
      increaseSendBufferTo(envir(), rtpGroupsock->socketNum(), rtpBufSize);
    }
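For reference, increaseSendBufferTo (declared in live555's GroupsockHelper) ultimately raises the socket's SO_SNDBUF option. The following is a minimal sketch of that idea using plain socket calls, not live555's exact implementation: request the desired size and back off toward the current size if the OS rejects it:

#include <sys/socket.h>   // setsockopt/getsockopt; on Windows use <winsock2.h> instead

// Illustrative only: try to grow a socket's send buffer toward 'requestedSize' bytes.
static unsigned increaseSendBufferToSketch(int sock, unsigned requestedSize) {
  unsigned curSize = 0;
  socklen_t len = sizeof curSize;
  getsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char*)&curSize, &len);

  while (requestedSize > curSize) {
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   (char*)&requestedSize, sizeof requestedSize) >= 0) {
      return requestedSize;                        // the OS accepted this size
    }
    requestedSize = (requestedSize + curSize) / 2; // otherwise try a smaller size
  }

  len = sizeof curSize;
  getsockopt(sock, SOL_SOCKET, SO_SNDBUF, (char*)&curSize, &len);
  return curSize;                                  // whatever the OS actually gave us
}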
In other words, the send buffer is sized to hold about 0.1 s of data at the bitrate estimated by the mediaSource, and createNewStreamSource is normally implemented in our own OnDemandServerMediaSubsession subclass. I had never noticed that this estimated bitrate affects the send buffer size, and had simply copied the value 90000 from the example code. Looking at the code, this 90000 is in kbps, so the corresponding rtpBufSize is 90000 * 25 / 2 = 1,125,000 bytes, which is more than big enough. Since an undersized TCP send buffer can be ruled out, what remains is the TCP sliding window being full, which would mean either that the sender sends too fast or that the receiver returns ACKs too slowly.
The analysis in the struck-out paragraph above is incorrect. See the follow-up article 《live555调用increaseSendBufferTo分析》 (an analysis of live555's calls to increaseSendBufferTo).
In addition, Wireshark captures also rule out the TCP sliding window being full. In the capture, the SYN packet sent when the client connects carries a Window scale of 2, meaning the advertised window value is shifted left by two bits, i.e. multiplied by 4. In the subsequent traffic the client's window gradually grows to over 260,000 bytes. Yet at the moment send() fails with WSAEWOULDBLOCK, only a little over 20,000 bytes of the window are in use (obtained by subtracting the acknowledgment number of the last received ACK from the sequence number of the packets already sent), which is nowhere near full. Therefore it must be the sender's send buffer that is full.
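As a quick sanity check on those numbers (the 65535 base window value and the sequence/acknowledgment numbers below are assumptions for illustration; the capture only shows the scale factor, the roughly 260,000-byte window, and the roughly 20,000 bytes in use):

#include <cstdint>
#include <cstdio>

int main() {
  // Advertised receive window: the raw 16-bit window field shifted by the window scale.
  uint32_t windowField = 65535, windowScale = 2;           // 65535 is an assumed typical value
  uint32_t advertisedWindow = windowField << windowScale;  // 262,140 bytes (~ "260,000+")

  // Bytes in flight = next sequence number to send - last acknowledgment number received.
  uint32_t nextSeqToSend = 1234567, lastAckReceived = 1213000;  // made-up capture values
  uint32_t bytesInFlight = nextSeqToSend - lastAckReceived;     // ~21,567 bytes, far below the window

  printf("window=%u bytes, in flight=%u bytes\n", advertisedWindow, bytesInFlight);
  return 0;
}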