Learn socket

Socket


    Sockets let you send raw streams of bytes back and forth between two computers, giving you fairly low-level access to the TCP/IP protocol. See the File I/O Amanuensis for sample code to do that. In TCP/IP each computer has a name, such as roedy.mindprod.com. However, various TCP/IP programs could be running on that computer, so each Socket is assigned a number called a port. The HTTP server would usually be assigned 80. DbAnywhere is usually 8889. This way you can specify which service on the local or remote machine you want to connect with. The Socket is specified as host:port, like this: roedy.mindprod.com:8889.

    Translator's note: working with a socket resembles file I/O. In Unix, system resources, network resources included, are accessed as files.
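As a rough illustration of the host:port notation, the sketch below splits such a string and opens a connection to it. The class name Endpoint and its methods are made up for this example; only java.net.Socket and its (host, port) constructor come from the Java library. It assumes a plain host name or IPv4 address (an IPv6 literal would need different parsing).

```java
import java.io.IOException;
import java.net.Socket;

/** Hypothetical helper: holds a host and port parsed from "host:port". */
class Endpoint {
    final String host;
    final int port;

    Endpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Parses text such as "roedy.mindprod.com:8889"; the last colon separates the port. */
    static Endpoint parse(String text) {
        int colon = text.lastIndexOf(':');
        if (colon <= 0 || colon == text.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got " + text);
        }
        String host = text.substring(0, colon);
        int port = Integer.parseInt(text.substring(colon + 1));
        return new Endpoint(host, port);
    }

    /** Opens a connected Socket; the caller is responsible for closing it. */
    Socket connect() throws IOException {
        return new Socket(host, port);
    }
}
```

A caller would write something like Endpoint.parse("roedy.mindprod.com:8889").connect() and then use the returned Socket's input and output streams.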


    Flush


    If you write to a Socket, you usually need to call flush to force the data out onto the net. If you fail to do that, you could wait forever for a response because your complete query was never sent. You don’t need flush if you are sending a steady stream of data that will push earlier data out onto the net.
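A minimal sketch of the flush advice, assuming a line-based query protocol (the newline terminator and the name QuerySender are inventions of this example, not something the text prescribes):

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class QuerySender {
    /** Writes one newline-terminated query and flushes so it actually leaves the buffer. */
    static void sendQuery(Socket socket, String query) throws IOException {
        OutputStream out = new BufferedOutputStream(socket.getOutputStream());
        out.write((query + "\n").getBytes(StandardCharsets.UTF_8));
        // Without this flush the bytes may sit in the BufferedOutputStream
        // while the caller blocks waiting for a reply that never comes.
        out.flush();
    }
}
```

Because the method flushes before returning, it is safe to wrap the stream afresh on each call; a long-lived connection would more likely keep one buffered stream and flush it after each complete query.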



    Blocking Read

    If you read from a Socket, you can hang waiting forever if you use a blocking read. Socket.setSoTimeout controls the timeout. Even without a timeout, the read will eventually die when the Socket connection fails. This will happen when:

        * You close the Socket at this end.
        * The far end sends a disconnect signal.
        * TCP cannot get an acknowledgement for packets it has sent, even after several retransmissions. These packets could either be data sent by the application, or keep-alive messages (if keep-alive has been turned on). Don’t confuse this with the meaningless HTTP Keep-Alive parameter.

    Translator's note: HTTP keep-alive means the client and server hold a persistent connection open, which avoids the cost of repeatedly establishing connections. It can greatly improve efficiency, especially for sites that serve static resources.
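The timeout behaviour described above can be sketched as follows. The class TimedReader and its -2 return value are conventions of this example only; setSoTimeout and SocketTimeoutException are the standard java.net API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

class TimedReader {
    /**
     * Reads one byte, waiting at most timeoutMillis.
     * Returns the byte (0..255), -1 at end of stream, or -2 if the read timed out.
     * The -2 sentinel is a convention of this sketch, not part of the Java API.
     */
    static int readByteOrTimeout(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis); // 0 would mean wait forever
        InputStream in = socket.getInputStream();
        try {
            return in.read();
        } catch (SocketTimeoutException e) {
            // The socket is still usable after a timeout; only this read gave up.
            return -2;
        }
    }
}
```

A return of -1 corresponds to the far end disconnecting in an orderly way, while a broken connection surfaces as an IOException from read.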



    Timeouts


    Java offers Socket.setSoTimeout to control how long you are willing to wait for a read to complete and Socket.setSoLinger to control how long close lingers (waits when there are still unsent data). When one end shuts down, the other end should keep reading until it has consumed any buffered data and seen the close, and only then close itself. setSoTimeout has no effect on how long you are willing to wait for a write (how long you are willing to wait for the other end to accept data), just on how long you are willing to wait for the other end to produce data.

    To add to the misery, Windows partially ignores the timeout. On connect, the JVM tries to resolve the hostname to an IP address. Windows tries a NetBIOS name service query on UDP port 137 with a timeout of 1.5 seconds, ignores any ICMP port unreachable packets and repeats this two more times, adding up to 4.5 seconds. I suggest putting critical hostnames in your HOSTS file to make sure they are resolved quickly. Another possibility is turning off NetBIOS altogether and running pure TCP/IP on your LAN.

    Translator's note: I am not sure I translated the sentence about shutting down accurately. In practice it can be understood this way: when one end closes the connection unilaterally, it should let the other end know, so that the other end learns of the close and closes too.
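The sketch below shows where each of these limits is set, plus a connect timeout, which the text does not mention but which Socket.connect(SocketAddress, int) provides. The class name TimeoutConfig and the parameter names are assumptions of this example. Note that the hostname lookup happens in the InetSocketAddress constructor, outside all three limits, which is consistent with the resolution delay described above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class TimeoutConfig {
    /** Opens a socket with separate limits for connecting, reading and lingering on close. */
    static Socket open(String host, int port, int connectMillis, int readMillis, int lingerSeconds)
            throws IOException {
        // Name resolution happens here and is not covered by any timeout below.
        InetSocketAddress address = new InetSocketAddress(host, port);
        Socket socket = new Socket();
        try {
            socket.connect(address, connectMillis);  // limit on establishing the connection
            socket.setSoTimeout(readMillis);         // limit on each blocking read, not on writes
            socket.setSoLinger(true, lingerSeconds); // how long close() may wait for unsent data
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the half-configured socket
            throw e;
        }
    }
}
```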



    Disconnect Detection


    Since TCP/IP sends no packets except when there is traffic, without Socket.setKeepAlive( true ) it has no way of noticing a disconnect until you start trying to send (or to a certain extent receive) traffic again. Java has the Socket.setKeepAlive( true ) method to ask TCP/IP to handle heartbeat probing without any data packets or application programming. Unfortunately, you can’t tell it how frequently to send the heartbeat probes. If the other end does not respond in time, you will get a socket exception on your pending read. Heartbeat packets in both directions let the other end know you are still there. A heartbeat packet is just an ordinary TCP/IP ack packet without any piggybacking data.
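Turning the option on is a one-line call. The wrapper below (KeepAliveSetup is a name invented here) also reads the setting back so the caller can confirm it took effect:

```java
import java.net.Socket;
import java.net.SocketException;

class KeepAliveSetup {
    /** Asks the TCP stack to probe an idle connection; the probe interval is chosen by the OS. */
    static boolean enableKeepAlive(Socket socket) throws SocketException {
        socket.setKeepAlive(true);
        return socket.getKeepAlive(); // reports whether SO_KEEPALIVE is now enabled
    }
}
```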


    When the applications are idling, your applications could periodically send tiny heartbeat messages to each other. The receiver could just ignore them. However, they force the TCP/IP protocol to check if the other end is still alive. These are not part of the TCP/IP protocol. You would have to build them into your application protocols. They act as are-you-still-alive? messages. I have found Java’s connection continuity testing to be less than 100% reliable. My bullet-proof technique to detect disconnect is to have the server send an application-level heartbeat packet if it has not sent some packet in the last 30 seconds. It has to send some message every 30 seconds, not necessarily a dummy heartbeat packet. The heartbeat packets thus only appear when the server is idling. Otherwise normal traffic acts as the heartbeat. The Applet detects the lack of traffic on disconnect and automatically restarts the connection. The downside is your applications have to be aware of these heartbeats and they have to fit into whatever other protocol you are using, unlike relying on TCP/IP level heartbeats.
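One way the idle-only heartbeat could be structured is sketched below. This is not the author's code: the class IdleHeartbeat, the one-byte heartbeat value 0 and the explicit clock parameter are all assumptions made for illustration, and a real protocol would need framing so the receiver can tell heartbeats from ordinary messages.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Sends a one-byte heartbeat only when nothing else has been written recently. */
class IdleHeartbeat {
    static final int HEARTBEAT = 0; // hypothetical message type; a real protocol defines its own

    private final OutputStream out;
    private final long idleMillis;
    private long lastSentMillis;

    IdleHeartbeat(OutputStream out, long idleMillis, long nowMillis) {
        this.out = out;
        this.idleMillis = idleMillis;
        this.lastSentMillis = nowMillis;
    }

    /** Sends a normal message; ordinary traffic counts as proof of life. */
    synchronized void send(byte[] message, long nowMillis) throws IOException {
        out.write(message);
        out.flush();
        lastSentMillis = nowMillis;
    }

    /** Call periodically, e.g. once a second; returns true if a heartbeat was sent. */
    synchronized boolean tick(long nowMillis) throws IOException {
        if (nowMillis - lastSentMillis < idleMillis) {
            return false; // recent traffic already served as the heartbeat
        }
        out.write(HEARTBEAT);
        out.flush();
        lastSentMillis = nowMillis;
        return true;
    }
}
```

With idleMillis set to 30000 and System.currentTimeMillis() passed as the clock, a timer thread calling tick once a second would produce the behaviour described: heartbeats appear only while the server is idling. The receiving side would pair this with a read timeout somewhat longer than 30 seconds and treat a timeout as a lost connection.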


    However, it is simpler to use the built-in Socket.setKeepAlive( true ) method to ask TCP/IP to handle the heartbeat probing without any data packets or application programming. Each end with nothing to say just periodically sends an empty data packet with its current sequence, acknowledgement and window numbers.


    The advantage of application level heartbeats is they let you know the applications at both ends are alive, not just the communications software.



Server Side Socketing


    For a server to accept connections from the outside world, first it opens a ServerSocket on a port, not yet connected to any client in particular.


    ServerSocket serverSocket = new ServerSocket( port );

    Then it calls accept, which blocks until a call comes in.

    Socket clientSocket = serverSocket.accept();

    At that point a new ordinary Socket gets created that is connected to the incoming caller. Usually the server would spin off a Thread, or assign a Thread from a pool, to deal with the new Socket, and then loop back to do another accept.
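Putting the two calls into the accept loop just described might look like the sketch below. The echo behaviour is a stand-in for whatever the real server does with each caller, and the names EchoServer, serve and echo are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;

class EchoServer {
    /** Accepts callers until serverSocket is closed, handing each one to the pool. */
    static void serve(ServerSocket serverSocket, ExecutorService pool) {
        while (!serverSocket.isClosed()) {
            try {
                Socket clientSocket = serverSocket.accept(); // blocks until a call comes in
                pool.execute(() -> echo(clientSocket));      // then loop back for the next caller
            } catch (IOException e) {
                return; // the ServerSocket was closed, or accept failed; stop serving
            }
        }
    }

    /** Hypothetical per-connection work: echo bytes back until the caller hangs up. */
    static void echo(Socket clientSocket) {
        try (Socket socket = clientSocket) {
            InputStream in = socket.getInputStream();
            OutputStream out = socket.getOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                out.flush();
            }
        } catch (IOException e) {
            // this caller went away; other connections are unaffected
        }
    }
}
```

A caller would typically run serve(new ServerSocket(port), Executors.newFixedThreadPool(n)) on its own thread and close the ServerSocket to stop accepting.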


    You can set up your miniature server even if you don’t have a domain name. Clients can reach you by ip:port, e.g. 65.110.21.43:2222. Even if you are behind a firewall, clients can use the external facing IP of the firewall. You must then configure your firewall to let incoming calls through and to direct them to the correct server on the LAN.

    Translator's note: on a wide area network this requires a fixed IP address of your own, which generally costs more than a domain name. Within a LAN you can simply try LAN addresses.



    Flow Control


    With Socket.setReceiveBufferSize() you can hint to the underlying OS how much incoming data to buffer. It is not obligated to listen to you. Don’t confuse this with the buffer on the BufferedInputStream. This is the lower level buffer on the raw socket. Large buffers are not always desirable. Using small buffers can tell the other end you are getting behind, and it won’t send data as quickly. If the data is real time, and the amount of data sent is variable depending on how fast you process it, large buffers mean you can get way behind and never catch up.

    Translator's note: with a large buffer, data already received may still be waiting to be read and processed, yet plenty of space remains, so the other end keeps sending to fill it and unprocessed data piles up. The ideal is a small buffer: process the current data, then receive and process the next.
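Since the size is only a hint, it can be worth reading back what the OS actually granted. A minimal sketch, with BufferHint as a made-up wrapper name:

```java
import java.net.Socket;
import java.net.SocketException;

class BufferHint {
    /** Requests a receive buffer size and returns what the OS actually granted. */
    static int requestReceiveBuffer(Socket socket, int bytes) throws SocketException {
        socket.setReceiveBufferSize(bytes);   // a hint only; the OS may round or ignore it
        return socket.getReceiveBufferSize(); // may differ from the request
    }
}
```

The call works on a Socket that has not been connected yet, which is when the Java documentation says large values should be set.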


    There is a mysterious method Socket.setTcpNoDelay( true ) to "disable Nagle’s algorithm". As is typical, there is no explanation of what Nagle’s algorithm is. My TCP/IP text book makes no mention of it. If you are dealing with near real-time data then you may want to look into disabling Nagle’s algorithm. That algorithm attempts to ensure that TCP doesn’t send lots of undersized IP packets by buffering up submitted data and holding it, typically for a few milliseconds, to see if you are going to give it some more data that could go into the same packet. I am not sure if flush is sufficient to send a packet on its way immediately.
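For completeness, disabling the algorithm on a connection looks like this (LowLatency is a name chosen for this sketch; the getter is used only to confirm the option was accepted):

```java
import java.net.Socket;
import java.net.SocketException;

class LowLatency {
    /** Disables Nagle's algorithm so small writes are sent without waiting to be coalesced. */
    static boolean disableNagle(Socket socket) throws SocketException {
        socket.setTcpNoDelay(true);
        return socket.getTcpNoDelay(); // true if TCP_NODELAY is now set
    }
}
```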



    Graceful Shutdown


    If you simply close a socket, you can lose data that were previously sent but which have not yet been delivered. You may chop things off in mid message. So, how to shut down gracefully? My approach is this. When the client wants to shut down, it sends a close message. The server echoes it back, and on receipt of the echoed close message, the client closes the socket. That way the client is guaranteed to be unhooked from waiting on a read, and you are guaranteed the server and client each received the last remaining messages before the socket was closed.

    Translator's note: the echoed message serves as the server's confirmation of the close. The confirmation and the original close message may well have the same content.
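The close handshake could be sketched as below for a line-based protocol. The literal "CLOSE" message, the class GracefulClose and its method names are assumptions of this example; the text only specifies that the client sends a close message and the server echoes it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class GracefulClose {
    static final String CLOSE = "CLOSE"; // hypothetical close message for a line-based protocol

    /**
     * Client side: announce the close, keep reading until the echo arrives, then close.
     * Pass the reader already in use on this socket so no buffered data is lost.
     * Returns how many late messages arrived before the echo.
     */
    static int closeClient(Socket socket, BufferedReader in) throws IOException {
        try (Socket s = socket) {
            sendLine(s, CLOSE);
            int drained = 0;
            String line;
            while ((line = in.readLine()) != null && !line.equals(CLOSE)) {
                drained++; // a message that was already in flight; a real client would process it
            }
            return drained;
        }
    }

    /** Server side: call after reading a CLOSE line; echo it back, then close. */
    static void answerClose(Socket socket) throws IOException {
        try (Socket s = socket) {
            sendLine(s, CLOSE);
        }
    }

    private static void sendLine(Socket socket, String line) throws IOException {
        OutputStream out = socket.getOutputStream();
        out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
        out.flush();
    }
}
```

Setting a read timeout on the client socket before calling closeClient keeps the client from waiting forever if the server never echoes.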
