AllJoyn 的JoinSession() 返回timeout问题

在项目中AllJoyn一直有个问题困扰着我们:client在加入session时调用JoinSession()函数有时会timeout失败。

 

注意:是“有时”失败,而有时又运行的很好。这种不确定性问题向来都让人崩溃。

在AllJoyn官方论坛上也有人提到这个问题,他们开发者说在之前的版本都已经修复了。但之后又有人遇到这种问题,并且说使用异步的JoinSessionAsync()成功几率会高点。但经我们实际测试,JoinSessionAsync()和JoinSession()没什么区别。

 

前段时间的突出问题在于:DTV上运行AllJoyn,JoinSession()失败几率非常高。后来经过与AllJoyn的Sample比对代码,发现我们的LinkTimeout被设置为0了,将其改为40这个问题就暂时消失了。

(LinkTimeout为什么被设置为0?我们都没有映象,于是翻出最早的代码,发现我写的初始版本是20,经同事重构后的版本也是20,但从HQ转一圈回来后就变为0了--重要的是我们谁都没发现这个参数被改为0了。他们为什么要改这个参数?--我们不知道,也不可能知道,因为负责这个模块的人都换了好几个。。。还有笔者更悲催的吗?)

 

 

但最近的测试中这个问题又出现了,经过这么久的折腾我几乎束手无策,不知道该怎么解决这个问题--我都有了去仔细跟读AllJoyn庞大而复杂的源代码的冲动了。

还好,我发现了其他办法。

 

1.启用AllJoyn内部日志。

在AllJoyn初始化前调用如下代码:

QCC_UseOSLogging(true);
QCC_SetLogLevels("ALLJOYN=7;ALL=1");
QCC_SetDebugLevel("ALL", 7);
QCC_SetDebugLevel("ICE", 1);
QCC_SetDebugLevel("IPNS", 1);
QCC_SetDebugLevel("TIMER", 1);
关于这代码干啥的就不说了,翻一下AllJoyn相关源码就知道了。

 

然后运行程序,在logcat中查看AllJoyn输出的日志。

经过仔细查看,终于发现了端倪。

 

JoinSession() timeout的日志:

09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::50cc:f8ff:feac:498f
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss 109.123.117.49
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 15:26:26.608: W/TCP(12712): 1.910 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::52cc:f8ff:feac:498f
09-06 15:26:26.608: D/NETWORK(12712): 1.910 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 1, sockfd = <>)
09-06 15:26:26.608: D/NETWORK(12712): 1.910 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:195 | Connect(sockfd = 122, remoteAddr = 109.123.117.48, remotePort = 9955)
09-06 15:26:26.623: I/THREAD(12712): 1.925 DEBUG THREAD lepDisp common/os/posix/Thread.cc:220 | Thread function exited: lepDisp --> 0x0
09-06 15:26:26.623: D/THREAD(12712): 1.925 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 5aba2b88 : running]
09-06 15:26:26.623: I/THREAD(12712): 1.926 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5aba2b88] Joining thread [lepDisp - 5aba2b88]
09-06 15:26:26.623: I/THREAD(12712): 1.926 DEBUG THREAD lepDisp common/os/posix/Thread.cc:435 | Joined thread lepDisp
09-06 15:26:26.638: W/THREAD(12712): 1.943 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:170 | Thread::~Thread() destroying lepDisp - 0
09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:319 | Thread::Stop() thread is dead [lepDisp]
09-06 15:26:26.638: D/THREAD(12712): 1.943 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 0 : not running]
09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5a798120] Joining thread [lepDisp - 0]
09-06 15:26:26.638: I/THREAD(12712): 1.943 DEBUG THREAD lepDisp common/os/posix/Thread.cc:378 | Thread::Join() thread is dead [lepDisp]
09-06 15:26:26.638: W/THREAD(12712): 1.943 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:182 | Thread::~Thread() destroyed lepDisp - 0 -- started:29 running:24 joined:5
09-06 15:26:26.743: D/NETWORK(12712): 2.047 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:557 | RecvFrom(sockfd = 115, remoteAddr = <invalid IP address>, remotePort = 9956, buf = <>, len = 1454, received = <>)
09-06 15:26:26.743: I/NETWORK(12712): 2.047 DEBUG NETWORK IpNameServiceImpl common/os/posix/Socket.cc:569 | Received 119 bytes, remoteAddr = 109.123.117.48, remotePort = 9956
09-06 15:26:26.743: W/IPNS(12712): 2.047 HL_DBG IPNS IpNameServiceImpl .../IpNameServiceImpl.cc:3599 | IpNameServiceImpl::Run(): Got IPNS message from "109.123.117.48"
09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...on/ns/IpNsProtocol.cc:1576 | Header::Deserialize(): IsAt::Deserialize() answer 0
09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:686 | IsAt::Deserialize()
09-06 15:26:26.743: I/NS(12712): 2.048 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:733 | IsAt::Deserialize(): G flag 1
09-06 15:26:26.743: I/NS(12712): 2.049 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:736 | IsAt::Deserialize(): C flag 1
09-06 15:26:26.743: I/NS(12712): 2.049 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:739 | IsAt::Deserialize(): T flag 1

...


09-06 15:27:46.413: I/NS(12712): 81.719 DEBUG NS IpNameServiceImpl ...on/ns/IpNsProtocol.cc:1034 | IsAt::Deserialize(): StringData::Deserialize() name 0
09-06 15:27:46.413: I/NS(12712): 81.719 DEBUG NS IpNameServiceImpl ...emon/ns/IpNsProtocol.cc:81 | StringData::Deserialize()
09-06 15:27:46.418: I/NS(12712): 81.720 DEBUG NS IpNameServiceImpl ...mon/ns/IpNsProtocol.cc:108 | StringData::Deserialize(): com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362 from buffer
09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3714 | TCPTransport::FoundCallback::Found(): busAddr = "r4addr=109.123.117.48,r4port=9955"
09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3792 | TCPTransport::FoundCallback::Found(): newBusAddr = "tcp:r4addr=109.123.117.48,r4port=9955".
09-06 15:27:46.418: I/TCP(12712): 81.720 DEBUG TCP IpNameServiceImpl ...aemon/TCPTransport.cc:3798 | TCPTransport::FoundCallback::Found(): FoundNames(): tcp:r4addr=109.123.117.48,r4port=9955
09-06 15:27:46.418: D/ALLJOYN_OBJ(12712): 81.721 TRACE ALLJOYN_OBJ IpNameServiceImpl .../daemon/AllJoynObj.cc:3495 | AllJoynObj::FoundNames(busAddr = "tcp:r4addr=109.123.117.48,r4port=9955", guid = "7e08ad9db1299e3a505206188e09b165", names = com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362, ttl = 120)
09-06 15:27:46.418: D/THREAD(12712): 81.721 TRACE THREAD IpNameServiceImpl common/os/posix/Thread.cc:333 | Thread::Alert() [NameReaper: running]
09-06 15:27:52.183: I/IFCONFIG(12712): 87.489 DEBUG IFCONFIG IpNameServiceImpl ...posix/IfConfigLinux.cc:563 | IfConfig(): The Linux way
09-06 15:27:52.193: D/NETWORK(12712): 87.495 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 2, sockfd = <>)
09-06 15:27:52.193: D/NETWORK(12712): 87.496 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:272 | Bind(sockfd = 115, localAddr = 0.0.0.0, localPort = 9956)
09-06 15:27:52.198: D/NETWORK(12712): 87.500 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:171 | Socket(addrFamily = 10, type = 2, sockfd = <>)
09-06 15:27:52.198: D/NETWORK(12712): 87.501 TRACE NETWORK IpNameServiceImpl common/os/posix/Socket.cc:272 | Bind(sockfd = 116, localAddr = ::, localPort = 9956)
09-06 15:27:56.683: I/LOCAL_TRANSPORT(12712): 91.986 DEBUG LOCAL_TRANSPORT replyTimer .../src/LocalTransport.cc:841 | Timed out waiting for METHOD_REPLY with serial 5
09-06 15:27:56.683: W/ALLJOYN(12712): 91.987 HL_DBG ALLJOYN replyTimer ...ore/src/Message_Gen.cc:988 | MarshalMessage: 80+0 ERROR[5] org.alljoyn.Bus.Timeout
09-06 15:27:56.683: D/THREAD(12712): 91.987 TRACE THREAD replyTimer common/os/posix/Thread.cc:333 | Thread::Alert() [lepDisp: running]
09-06 15:27:56.688: W/THREAD(12712): 91.991 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:164 | Thread::Thread() created lepDisp - 0 -- started:29 running:24 joined:5
09-06 15:27:56.688: D/THREAD(12712): 91.993 TRACE THREAD lepDisp common/os/posix/Thread.cc:301 | Thread::Start() [lepDisp] pid = 5cbd8918
09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp .../src/LocalTransport.cc:465 | Pushing ERROR[5] org.alljoyn.Bus.Timeout into local endpoint
09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp .../src/LocalTransport.cc:703 | LocalEndpoint::RemoveReplyHandler for serial=5
09-06 15:27:56.688: I/LOCAL_TRANSPORT(12712): 91.993 DEBUG LOCAL_TRANSPORT lepDisp ...src/LocalTransport.cc:1018 | Matched reply for serial #5
09-06 15:27:56.688: I/THREAD(12712): 91.993 DEBUG THREAD external common/os/posix/Thread.cc:205 | Thread::RunInternal: lepDisp (pid=5cbd8918)
09-06 15:27:56.688: I/THREAD(12712): 91.994 DEBUG THREAD lepDisp common/os/posix/Thread.cc:216 | Starting thread: lepDisp
09-06 15:27:56.693: I/ALLJOYN(12712): 91.994 DEBUG ALLJOYN lepDisp ...e/src/Message_Parse.cc:647 | Unmarshaled
09-06 15:27:56.693: I/ALLJOYN(12712): <message endianness="LITTLE" type="ERROR" version="1" body_len="0" serial="6">
09-06 15:27:56.693: I/ALLJOYN(12712): <header_fields>
09-06 15:27:56.693: I/ALLJOYN(12712): <header field="ERROR_NAME">
09-06 15:27:56.693: I/ALLJOYN(12712): <string>org.alljoyn.Bus.Timeout</string>
09-06 15:27:56.693: I/ALLJOYN(12712): </header>
09-06 15:27:56.693: I/ALLJOYN(12712): <header field="REPLY_SERIAL">
09-06 15:27:56.693: I/ALLJOYN(12712): <uint32>5</uint32>
09-06 15:27:56.693: I/ALLJOYN(12712): </header>
09-06 15:27:56.693: I/ALLJOYN(12712): <header field="SENDER">
09-06 15:27:56.693: I/ALLJOYN(12712): <string>:_Eds_BwS.2</string>
09-06 15:27:56.693: I/ALLJOYN(12712): </header>
09-06 15:27:56.693: I/ALLJOYN(12712): </header_fields>
09-06 15:27:56.693: I/ALLJOYN(12712): </message>
09-06 15:27:56.693: E/ALLJOYN(12712): 91.996 ****** ERROR ALLJOYN lepDisp .../src/BusAttachment.cc:1477 | org.alljoyn.Bus.JoinSession returned ERROR_MESSAGE (error=org.alljoyn.Bus.Timeout): ER_BUS_REPLY_IS_ERROR_MESSAGE
09-06 15:27:56.693: D/THREAD(12712): 91.996 TRACE THREAD lepDisp common/os/posix/Thread.cc:333 | Thread::Alert() [lepDisp: running]
09-06 15:27:56.693: V/JACK(12712): [ContextSharing]AllJoyn::_join_session(): join session:com.samsung.contextware.share.acbbc6b64c8cdf53448fbaccdd46441931326180362 failed.ER_BUS_REPLY_IS_ERROR_MESSAGE

 

...


09-06 15:29:31.303: I/RENDEZVOUS_SERVER_CONNECTION(12712): 186.606 DEBUG RENDEZVOUS_SERVER_CONNECTION DiscoveryManager ...tion.cc:72 | RendezvousServerConnection::~RendezvousServerConnection()
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.606 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:1158 | Run: Server connect return status = ER_UNABLE_TO_CONNECT_TO_RENDEZVOUS_SERVER
09-06 15:29:31.303: D/PROXIMITY_SCAN_ENGINE(12712): 186.607 TRACE PROXIMITY_SCAN_ENGINE DiscoveryManager ...ScanEngine.cc:320 | ProximityScanEngine::StopScan() called
09-06 15:29:31.303: I/PROXIMITY_SCAN_ENGINE(12712): 186.607 DEBUG PROXIMITY_SCAN_ENGINE DiscoveryManager ...ScanEngine.cc:336 | ProximityScanEngine::StopScan() completed
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3267 | DiscoveryManager::GetWaitTimeOut()
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3273 | DiscoveryManager::GetWaitTimeOut(): timeout= 0xffffffff tNow = 0x2d8ef
09-06 15:29:31.303: I/ICE_DISCOVERY_MANAGER(12712): 186.607 DEBUG ICE_DISCOVERY_MANAGER DiscoveryManager ...ryManager.cc:3305 | DiscoveryManager::GetWaitTimeOut(): timeout = -1
09-06 15:29:36.218: E/NETWORK(12712): 191.520 ****** ERROR NETWORK JoinS-1 common/os/posix/Socket.cc:214 | Connecting (sockfd = 122) to 109.123.117.48 9955: 110 - Connection timed out: ER_OS_ERROR
09-06 15:29:36.218: E/TCP(12712): 191.520 ****** ERROR TCP JoinS-1 ...aemon/TCPTransport.cc:2760 | TCPTransport::Connect(): Failed: ER_OS_ERROR
09-06 15:29:36.218: I/ALLJOYN(12712): 191.521 DEBUG ALLJOYN JoinS-1 ...core/src/BusEndpoint.cc:45 | Invalidating endpoint type=0

 

JoinSesssion() 成功的日志:

09-06 17:03:39.428: W/TCP(7411): 1.731 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::50cc:f8ff:feac:498f
09-06 17:03:39.428: W/TCP(7411): 1.731 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss 109.123.117.49
09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2713 | TCPTransport::Connect(): Checking interface wlan0
09-06 17:03:39.428: W/TCP(7411): 1.732 HL_DBG TCP JoinS-1 ...aemon/TCPTransport.cc:2715 | TCPTransport::Connect(): Interface UP with addresss fe80::52cc:f8ff:feac:498f
09-06 17:03:39.428: D/NETWORK(7411): 1.732 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:171 | Socket(addrFamily = 2, type = 1, sockfd = <>)
09-06 17:03:39.428: D/NETWORK(7411): 1.733 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:195 | Connect(sockfd = 120, remoteAddr = 109.123.117.129, remotePort = 9955)
09-06 17:03:39.438: I/THREAD(7411): 1.742 DEBUG THREAD lepDisp common/os/posix/Thread.cc:220 | Thread function exited: lepDisp --> 0x0
09-06 17:03:39.438: D/THREAD(7411): 1.742 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 5cbd2f38 : running]
09-06 17:03:39.438: I/THREAD(7411): 1.743 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5cbd2f38] Joining thread [lepDisp - 5cbd2f38]
09-06 17:03:39.438: I/THREAD(7411): 1.743 DEBUG THREAD lepDisp common/os/posix/Thread.cc:435 | Joined thread lepDisp
09-06 17:03:39.448: W/THREAD(7411): 1.754 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:170 | Thread::~Thread() destroying lepDisp - 0
09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:319 | Thread::Stop() thread is dead [lepDisp]
09-06 17:03:39.448: D/THREAD(7411): 1.754 TRACE THREAD lepDisp common/os/posix/Thread.cc:368 | Thread::Join() [lepDisp - 0 : not running]
09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:372 | [lepDisp - 5aba3450] Joining thread [lepDisp - 0]
09-06 17:03:39.448: I/THREAD(7411): 1.754 DEBUG THREAD lepDisp common/os/posix/Thread.cc:378 | Thread::Join() thread is dead [lepDisp]
09-06 17:03:39.448: W/THREAD(7411): 1.754 HL_DBG THREAD lepDisp common/os/posix/Thread.cc:182 | Thread::~Thread() destroyed lepDisp - 0 -- started:28 running:24 joined:4
09-06 17:03:39.498: D/NETWORK(7411): 1.802 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:467 | Send(sockfd = 120, *buf = <>, len = 1, sent = <>)
09-06 17:03:39.498: W/THREAD(7411): 1.802 HL_DBG THREAD JoinS-1 common/os/posix/Thread.cc:164 | Thread::Thread() created auth - 0 -- started:28 running:24 joined:4
09-06 17:03:39.498: I/ALLJOYN(7411): 1.802 DEBUG ALLJOYN JoinS-1 ...re/src/EndpointAuth.cc:370 | EndpointAuth::Establish authMechanisms="ANONYMOUS"
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.802 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:699 | SASL Responder mechanisms ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:275 | Responder starting auth conversation ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:294 | Current authSet ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:220 | Initialized authMechanism ANONYMOUS
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:224 | New Responder state WAIT_FOR_DATA
09-06 17:03:39.498: I/ALLJOYN_AUTH(7411): 1.803 DEBUG ALLJOYN_AUTH JoinS-1 ...core/src/SASLEngine.cc:452 | Responder sending AUTH ANONYMOUS
09-06 17:03:39.498: D/NETWORK(7411): 1.803 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:467 | Send(sockfd = 120, *buf = <>, len = 16, sent = <>)
09-06 17:03:39.498: I/ALLJOYN(7411): 1.803 DEBUG ALLJOYN JoinS-1 ...re/src/EndpointAuth.cc:433 | Sent AUTH ANONYMOUS
09-06 17:03:39.498: D/NETWORK(7411): 1.803 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
09-06 17:03:39.568: D/NETWORK(7411): 1.869 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
09-06 17:03:39.568: D/NETWORK(7411): 1.870 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)
09-06 17:03:39.568: D/NETWORK(7411): 1.870 TRACE NETWORK JoinS-1 common/os/posix/Socket.cc:527 | Recv(sockfd = 120, buf = <>, len = 1, received = <>)

 

相关代码如下:

QStatus TCPTransport::Connect(const char* connectSpec, const SessionOpts& opts, BusEndpoint& newEp)
{
QCC_DbgHLPrintf(("TCPTransport::Connect(): %s", connectSpec));

QStatus status;
bool isConnected = false;

...


SocketFd sockFd = -1;
status = Socket(QCC_AF_INET, QCC_SOCK_STREAM, sockFd);
if (status == ER_OK) {
/* Turn off Nagle */
status = SetNagle(sockFd, false);
}

if (status == ER_OK) {
/*
* We got a socket, now tell TCP to connect to the remote address and
* port.
*/
status = qcc::Connect(sockFd, ipAddr, port);
if (status == ER_OK) {
/*
* We now have a TCP connection established, but DBus (the wire
* protocol which we are using) requires that every connection,
* irrespective of transport, start with a single zero byte. This
* is so that the Unix-domain socket transport used by DBus can pass
* SCM_RIGHTS out-of-band when that byte is sent.
*/
uint8_t nul = 0;
size_t sent;

status = Send(sockFd, &nul, 1, sent);
if (status != ER_OK) {
QCC_LogError(status, ("TCPTransport::Connect(): Failed to send initial NUL byte"));
}
isConnected = true;
} else {
QCC_LogError(status, ("TCPTransport::Connect(): Failed"));
}
} else {
QCC_LogError(status, ("TCPTransport::Connect(): qcc::Socket() failed"));
}

}

QStatus Connect(SocketFd sockfd, const IPAddress& remoteAddr, uint16_t remotePort)
{
QStatus status = ER_OK;
int ret;
struct sockaddr_storage addr;
socklen_t addrLen = sizeof(addr);

QCC_DbgTrace(("Connect(sockfd = %d, remoteAddr = %s, remotePort = %hu)",
sockfd, remoteAddr.ToString().c_str(), remotePort));

status = MakeSockAddr(remoteAddr, remotePort, &addr, addrLen);
if (status != ER_OK) {
return status;
}

ret = connect(static_cast<int>(sockfd), reinterpret_cast<struct sockaddr*>(&addr), addrLen);
if (ret == -1) {
if ((errno == EINPROGRESS) || (errno == EALREADY)) {
status = ER_WOULDBLOCK;
} else if (errno == EISCONN) {
status = ER_OK;
} else if (errno == ECONNREFUSED) {
status = ER_CONN_REFUSED;
} else {
status = ER_OS_ERROR;
QCC_LogError(status, ("Connecting (sockfd = %u) to %s %d: %d - %s", sockfd,
remoteAddr.ToString().c_str(), remotePort,
errno, strerror(errno)));
}
} else {
int flags = fcntl(sockfd, F_GETFL, 0);
ret = fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);

if (ret == -1) {
status = ER_OS_ERROR;
QCC_LogError(status, ("Connect fcntl (sockfd = %u) to O_NONBLOCK: %d - %s", sockfd, errno, strerror(errno)));
/* Higher level code is responsible for closing the socket */
}
}

return status;
}

 

根据日志做了一个简单的流程对比:

时间(s) JoinSession() timeout JoinSession()成功
0
开始连接
Connect(sockfd = 122, remoteAddr = 109.123.117.48, remotePort = 9955)
开始连接

Connect(sockfd = 120, remoteAddr = 109.123.117.129, remotePort = 9955)
0
连接成功,发送数据
Send(sockfd = 120, *buf = <>, len = 1, sent = <>)
0
接收数据
Recv(sockfd = 120, buf = <>, len = 1, received = <>)
90
JoinSession() timeout
LocalTransport.cc:841 | Timed out waiting for METHOD_REPLY with serial 5
BusAttachment.cc:1477 | org.alljoyn.Bus.JoinSession returned ERROR_MESSAGE
(error=org.alljoyn.Bus.Timeout): ER_BUS_REPLY_IS_ERROR_MESSAGE

190 tcp connect() timeout
Connecting (sockfd = 122) to 109.123.117.48 9955: 110 - Connection timed out: ER_OS_ERROR

 

通过上面对比,可以发现JoinSession() timeout的根源在于tcp的系统调用connect阻塞超时了,即如下代码:

ret = connect(static_cast<int>(sockfd), reinterpret_cast<struct sockaddr*>(&addr), addrLen);

找到问题点了,可为什么会timeout呢?根据一般的tcp编程经验,很有可能是网络或者Server端出问题了。

会有什么问题呢?不知道。

 

刚在Android上安装好了tcpdump,下周抓包一看就知道啦!

你可能感兴趣的:(session)