关于socket-error-10054的一点认知

今天开发网络模型,又遇到了10054的错误,这是之前已经遇到过,但是一直没有处理,今天在网上查找方案,看到一篇帖子,觉得有点道理,拷贝下来,大家参详一二,希望大牛给出正确的解释。帖子是英文,大概意思我明白,不过翻译不太好,原帖奉上。

问题描述:

 I have a C/S program. Client use socket to send a file to server, after send approximate more than 700k data, client(on win7) will receive a socket 10054 error which means Connection reset by peer.

Server worked on CentOS 5.4, client is windows7 virtual machine run in virtual box. client and server communicate via a virtual network interface. The command port(send log) is normal, but the data port(send file) have the problem. If it was caused by wrong configuration of socket buffer size or something else? If anyone can help me check the problem. Thanks.

Every time I call socket send a buffer equals 4096 byte send(socket, buffer, 4096, 0 )

CentOS socket config.



down vote favorite


#sysctl -a
...
net.ipv4.tcp_rmem = 4096        87380   4194304
net.ipv4.tcp_wmem = 4096        16384   4194304
net.ipv4.tcp_mem = 196608       262144  393216
net.ipv4.tcp_dsack = 1
net.ipv4.tcp_ecn = 0
net.ipv4.tcp_reordering = 3
net.ipv4.tcp_fack = 1



I'm not quite understand what the socket buffer configuration means, if this will cause the receive incomplete result problem?

解答:


down vote accepted

It's almost definitely a bug in your code. Most likely, one side thinks the other side has timed out and so closes the connection abnormally. The most common way this happens it that you call a receive function to get data, but you actually already got that data and just didn't realize it. So you're waiting for data that you have already received and thus time out.

For example:

1) Client sends a message.

2) Client sends another message.

3) Server reads both messages but thinks it only got one, sends an acknowledge.

4) Client receives acknowledge, waits for second acknowledge which server will never send.

5) Server waits for second message which it actually already received.

Now the server is waiting for the client and the client is waiting for the server. The server was coded incorrectly and didn't realize that it actually got two messages in one go. TCP does not preserve message boundaries.

If you tell me more about your protocol, I can probably tell you in more detail what went wrong. What constitutes a message? Which side sends when? Are there any acknowledgements? And so on.

But the short version is that each side is probably waiting for the other.

Most likely, the connection reset by peer is a symptom. Your problem occurs, one side times out and aborts the connection. That causes the other side to get a connection reset because the other side aborted the connection.


2 down vote favorite

I have a C/S program. Client use socket to send a file to server, after send approximate more than 700k data, client(on win7) will receive a socket 10054 error which means Connection reset by peer.

Server worked on CentOS 5.4, client is windows7 virtual machine run in virtual box. client and server communicate via a virtual network interface. The command port(send log) is normal, but the data port(send file) have the problem. If it was caused by wrong configuration of socket buffer size or something else? If anyone can help me check the problem. Thanks.

Every time I call socket send a buffer equals 4096 byte send(socket, buffer, 4096, 0 )

CentOS socket config.

4 down vote accepted

It's almost definitely a bug in your code. Most likely, one side thinks the other side has timed out and so closes the connection abnormally. The most common way this happens it that you call a receive function to get data, but you actually already got that data and just didn't realize it. So you're waiting for data that you have already received and thus time out.

For example:

1) Client sends a message.

2) Client sends another message.

3) Server reads both messages but thinks it only got one, sends an acknowledge.

4) Client receives acknowledge, waits for second acknowledge which server will never send.

5) Server waits for second message which it actually already received.

Now the server is waiting for the client and the client is waiting for the server. The server was coded incorrectly and didn't realize that it actually got two messages in one go. TCP does not preserve message boundaries.

If you tell me more about your protocol, I can probably tell you in more detail what went wrong. What constitutes a message? Which side sends when? Are there any acknowledgements? And so on.

But the short version is that each side is probably waiting for the other.

Most likely, the connection reset by peer is a symptom. Your problem occurs, one side times out and aborts the connection. That causes the other side to get a connection reset because the other side aborted the connection.

你可能感兴趣的:(关于socket-error-10054的一点认知)