Java Network Programming Notes

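1 TCP overhead

a  Connection setup takes a three-way handshake: c->syn->s, s->syn ack->c, c->ack->s

b  Connection teardown takes a four-way handshake: c->fin->s, s->ack->c, s->fin->c, c->ack->s

c  The computation needed to keep data in order, acknowledge segments, and so on

d  Retransmissions caused by network congestion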


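2 Binding a ServerSocket to a well-known port may require superuser privileges. The backlog parameter of ServerSocket(int port, int backlog) sets the length of the connection queue: incoming connections are completed ahead of accept() and wait in that queue, which speeds up the TCP connect phase. The default is 50.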

 

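backlog is the maximum number of pending connections, that is, connections that have been completed but not yet returned by accept(). It is not a limit on the number of simultaneous connections. Once the queue is full, further connection attempts are refused or ignored, depending on the platform.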

 

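To raise throughput you can request a larger receive buffer with ServerSocket.setReceiveBufferSize, but it must be set before bind: create the socket with the no-argument constructor, set the size, and then call ServerSocket.bind(SocketAddress endpoint). Sockets returned by accept() take their proposed receive buffer size from this setting.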
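A minimal sketch of that sequence: create the socket unbound, request a receive buffer, then bind with an explicit backlog. The class name ServerSocketSetup, the choice of the loopback address, and the sizes in the usage note are assumptions made for illustration, and the operating system may grant a different buffer size than the one requested.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of the JDK.
final class ServerSocketSetup {
    /** Opens a listening socket on the loopback interface. */
    static ServerSocket open(int port, int backlog, int receiveBufferBytes) throws IOException {
        ServerSocket server = new ServerSocket();            // no-arg constructor: not yet bound
        try {
            server.setReceiveBufferSize(receiveBufferBytes); // only a hint; must come before bind
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), backlog);
            return server;
        } catch (IOException e) {
            server.close();                                  // do not leak the socket on failure
            throw e;
        }
    }
}
```

For example, ServerSocketSetup.open(8080, 100, 128 * 1024) would listen on port 8080 with a queue of 100 pending connections; passing port 0 lets the system pick a free port.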

 

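3 For network writes, a good practice for throughput is to wrap the socket's output stream in a java.io.BufferedOutputStream. Buffering in user space reduces the number of system calls, and with them the switches between user mode and kernel mode. The buffer is usually made larger than the value passed to ServerSocket.setReceiveBufferSize.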

 

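4 To avoid object-stream deadlock when building both an object input stream and an object output stream on the same socket, construct the output stream first and the input stream second. The ObjectInputStream constructor blocks until it has read the stream header written by the peer's ObjectOutputStream, so two ends that both construct the input stream first wait on each other forever.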
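The ordering can be sketched as follows. Both ends run the same exchange method over a loopback connection; the class name ObjectStreamOrder, the method names, and the "ping"/"pong" messages are made up for the example.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class.
final class ObjectStreamOrder {
    /** Sends one object and reads one object; safe to run on both ends at once. */
    static Object exchange(Socket socket, Object message)
            throws IOException, ClassNotFoundException {
        // 1. Output stream first, flushed so its header reaches the peer.
        ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
        out.flush();
        // 2. Input stream second: its constructor blocks until the peer's header arrives.
        ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
        out.writeObject(message);
        out.flush();
        return in.readObject();
    }

    /** Returns { what the client received, what the server received }. */
    static String[] demo() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Future<Object> serverSide = pool.submit(() -> {
                try (Socket s = server.accept()) {
                    return exchange(s, "pong");
                }
            });
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                Object reply = exchange(client, "ping");
                return new String[] { (String) reply, (String) serverSide.get() };
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Swapping steps 1 and 2 on both ends would leave each constructor waiting for a header that the other side has not written yet.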

 

5 TCP half-close: once shutdownOutput() has completed, the peer's read returns EOF after it has consumed the data already sent, and stops blocking.
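A small loopback sketch of a half-close, assuming Java 9 or later for InputStream.readAllBytes(). The class HalfCloseDemo and its upper-casing "server" are invented for the example; the point is that the server's read only returns once the client has shut down its output.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class.
final class HalfCloseDemo {
    /** Client side: send the request, half-close, then read the reply up to EOF. */
    static String request(Socket socket, String text) throws IOException {
        socket.getOutputStream().write(text.getBytes(StandardCharsets.UTF_8));
        socket.shutdownOutput();   // sends FIN; the input side of this socket stays open
        return new String(socket.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
    }

    /** Server side: read until EOF (the client's half-close), reply in upper case, close. */
    static void serveOnce(ServerSocket server) {
        try (Socket s = server.accept()) {
            byte[] all = s.getInputStream().readAllBytes();
            String upper = new String(all, StandardCharsets.UTF_8).toUpperCase(Locale.ROOT);
            s.getOutputStream().write(upper.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String demo() throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> serveOnce(server));
            serverThread.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                return request(client, "hello");
            } finally {
                serverThread.join();
            }
        }
    }
}
```

Without the shutdownOutput() call, both sides would block: the server waiting for EOF and the client waiting for a reply.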

 

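6 A TCP connection can be closed with socket.close(), socket.getOutputStream().close(), or socket.getInputStream().close(). The better choice is socket.getOutputStream().close(), because it flushes whatever has not been flushed yet. Calling any one of the three is enough. isClosed() only reports whether the local socket has been closed; it cannot tell us whether the remote end has closed.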

 

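7 Set a timeout on socket reads (setSoTimeout) so that a read cannot block forever. As a rule of thumb, set the timeout to twice the expected response time. A timeout only affects blocking reads that start after it has been set.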
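A sketch of a read timeout in action, using a loopback peer that accepts the connection and then never writes. ReadTimeoutDemo and readTimesOut are hypothetical names, and the 200 ms in the usage note is arbitrary.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class.
final class ReadTimeoutDemo {
    /** Returns true if the read gave up after the timeout instead of blocking forever. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket silentPeer = server.accept()) {   // accepted, but never writes anything
            client.setSoTimeout(timeoutMillis);       // applies to reads started after this call
            try {
                client.getInputStream().read();
                return false;                         // data or EOF arrived, no timeout
            } catch (SocketTimeoutException e) {
                return true;                          // the socket itself is still usable
            }
        }
    }
}
```

Calling ReadTimeoutDemo.readTimesOut(200) should return true after roughly 200 ms.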

 

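8 Every socket has a send buffer and a receive buffer. These buffers live in the kernel's address space, not in the JVM. The default size is decided by the operating system; the figure noted here is 2 KB, though many current systems default to considerably more. The send buffer size can be changed at any time before the socket is closed, with java.net.Socket.setSendBufferSize(int), but the value is only a hint, not an absolute setting. A larger buffer means fewer network writes and less interference from congestion control, so TCP runs more efficiently and throughput goes up, on a principle similar to http://en.wikipedia.org/wiki/Nagle's_algorithm.

As a rule, make the send buffer at least three times the MSS and at least as large as the peer's receive buffer. Make the receive buffer large as well, so that it does not hold the send buffer back.

A BufferedOutputStream or ByteBuffer used on top of the socket should normally be sized to match.

buffer size (bits) = bandwidth (bits/sec) * delay (sec). This is much like choosing a thread count so that the CPU never sits idle. In plain terms, the aim is never to let the buffer run empty, so that the link is kept full at all times.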
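The bandwidth-delay formula is easy to turn into a helper. This is a sketch with an invented class name (BufferSizing); it takes the delay in milliseconds so that the arithmetic stays in integers, and returns bytes rather than bits because that is what setSendBufferSize expects.

```java
// Hypothetical helper for sizing socket buffers.
final class BufferSizing {
    /** Bandwidth-delay product in bytes, rounded up to a whole byte. */
    static long bandwidthDelayBytes(long bandwidthBitsPerSecond, long delayMillis) {
        long bits = bandwidthBitsPerSecond * delayMillis / 1000; // bits in flight
        return (bits + 7) / 8;                                    // bits -> bytes
    }
}
```

For a 100 Mbit/s path with a 20 ms delay, bandwidthDelayBytes(100_000_000L, 20) gives 250000, so a send buffer of about 250 KB would be needed to keep that path full.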

 

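9 Nagle's algorithm improves network efficiency and reduces congestion by delaying small packets and coalescing them into larger ones. It is on by default and can be turned off with setTcpNoDelay(true). It is normally left on, except where real-time interaction is required. If you really do need to turn it off, there is a neat approach: use a BufferedOutputStream whose buffer is larger than the largest request or response, set the socket send and receive buffers to the same value, write each request or response in a single operation, and then call BufferedOutputStream.flush() so that the network is used to the full.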

 

Nagle's algorithm in pseudocode:

if there is new data to send
  if the window size >= MSS and available data is >= MSS
    send complete MSS segment now
  else
    if there is unconfirmed data still in the pipe
      enqueue data in the buffer until an acknowledge is received
    else
      send data immediately
    end if
  end if
end if

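In summary, with Nagle enabled, data is sent when one of the following holds:

  • several small packets have accumulated into a full-sized one;
  • there is no unacknowledged small-packet data in flight;
  • the small packet has waited in the buffer for 200 ms;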
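The decision in the pseudocode can be written as a small pure function, which makes the three outcomes easy to see. This is only an illustration of the logic; the real algorithm lives inside the TCP stack, and the names NagleDecision, Action, and decide are invented here.

```java
// Hypothetical model of the Nagle decision; not an API of any TCP stack.
final class NagleDecision {
    enum Action { SEND_FULL_SEGMENT, QUEUE_UNTIL_ACK, SEND_IMMEDIATELY }

    /** Decides what to do with newly available data. */
    static Action decide(int windowSize, int availableBytes, int mss, boolean unackedDataInFlight) {
        if (windowSize >= mss && availableBytes >= mss) {
            return Action.SEND_FULL_SEGMENT;   // enough for a complete segment
        }
        if (unackedDataInFlight) {
            return Action.QUEUE_UNTIL_ACK;     // small data waits for the outstanding ACK
        }
        return Action.SEND_IMMEDIATELY;        // nothing in flight: send the small packet
    }
}
```

With an MSS of 1460, a 10-byte write on an idle connection is sent at once, while the same write behind unacknowledged data is queued.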

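10 setSoLinger is used to close a socket gracefully: with linger enabled, close() blocks until the data still in the send buffer has been sent, or until the linger timeout expires. The drawback is that closing becomes slower.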

 

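11 Keep-alive is of marginal value. It probes whether a connection is still up and whether the peer is still active. It is controversial, and it is an optional feature rather than a required part of the TCP standard. The probes also consume network bandwidth, and when the peer does not respond the socket is put into the reset state and can no longer be read or written. It is generally not recommended.

Consider replacing it with an application-level heartbeat.

Reference: http://hi.baidu.com/tantea/blog/item/580b9d0218f981793812bb7b.html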

 

12  setTrafficClass sets the traffic class. It is only a hint, and the actual effect depends on the implementation. The classes are IPTOS_LOWCOST (0x02), IPTOS_RELIABILITY (0x04), IPTOS_THROUGHPUT (0x08), IPTOS_LOWDELAY (0x10)

 

13 Chinese translation of the API: http://hi.baidu.com/%EC%C5%BF%E1%D0%A1%B7%E5/blog/item/5d8e0f58aee147471038c29d.html

 
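14 Java NIO opened a new era. It provides non-blocking I/O and multiplexing through readiness selectors and an event-driven model. A thread is no longer dedicated to each request, which greatly reduces the number of threads and the memory they need, and improves scalability.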
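A compact sketch of a selector-driven echo server, assuming Java 11 or later (for readNBytes in the demo client). One thread handles accepts and reads for every connection. The class SelectorEcho and its methods are invented for the example, and to stay short it returns after the first client disconnects and writes replies in a simple loop instead of registering for OP_WRITE.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo class.
final class SelectorEcho {
    /** Echoes on a single thread; returns the number of bytes echoed once a client disconnects. */
    static int serveUntilFirstDisconnect(ServerSocketChannel server) throws IOException {
        int echoed = 0;
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            while (true) {
                selector.select();                        // blocks until some channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel ch = server.accept();
                        if (ch != null) {
                            ch.configureBlocking(false);
                            ch.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel ch = (SocketChannel) key.channel();
                        buf.clear();
                        int n = ch.read(buf);
                        if (n < 0) {                      // end of stream: client closed
                            ch.close();
                            return echoed;
                        }
                        buf.flip();
                        while (buf.hasRemaining()) {
                            ch.write(buf);                // fine for tiny replies in a sketch
                        }
                        echoed += n;
                    }
                }
            }
        }
    }

    /** Runs one client against the server; returns "<reply>:<bytes echoed>". */
    static String demo() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            String[] reply = new String[1];
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
                    s.getOutputStream().write("hi".getBytes(StandardCharsets.UTF_8));
                    reply[0] = new String(s.getInputStream().readNBytes(2), StandardCharsets.UTF_8);
                } catch (IOException e) {
                    reply[0] = "error: " + e;
                }
            });
            client.start();
            int echoed = serveUntilFirstDisconnect(server);
            client.join();
            return reply[0] + ":" + echoed;
        }
    }
}
```

A production server would keep running, use per-connection buffers, and register for OP_WRITE when a write cannot complete.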
 
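15 Network applications for wide-area networks have to take firewalls into account. Firewalls are either transport firewalls or application firewalls. A transport firewall generally blocks access to ports that are not well known and leaves well-known ports open, such as port 80. An application firewall is generally a proxy sitting between server and client, such as an HTTP proxy: http://baike.baidu.com/view/1159398.htm
HTTP tunnelling gets through the firewall. In plain terms: Rose wants to write a love letter to her boyfriend, but her parents (the firewall) forbid it, so Rose wraps the letter up and addresses it to her close friend Lily (the HTTP proxy server, which sits inside the firewall), and Lily passes it on to the boyfriend.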
 
16 Then there is NAT, network address translation: the hosts of a subnet share one public IP address, transparently to the outside world. http://baike.baidu.com/view/16102.htm
 
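17 UDP datagrams are limited in size (payloads are commonly kept to 512 bytes or less so that the datagram is not fragmented, and a datagram can never exceed 64 KB). UDP is unreliable and connectionless, but cheap. Lost datagrams are not retransmitted; retransmission is up to the application, which must then consider whether its messages are idempotent. A UDP datagram is an independent unit of transmission, represented in Java by java.net.DatagramPacket. UDP suits heartbeat traffic. The connect and close operations of DatagramSocket are purely local and have no effect on the network, since the protocol is connectionless.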
For more reliability, implement it in the application: the client maintains a sequence number and waits for the server to answer with that sequence number, and otherwise applies a retransmission strategy.
/*
 * ReliableDatagramSocket.java.
 * Copyright © Esmond Pitt, 1997, 2005. All rights reserved.
 * Permission to use is granted provided this copyright
 * and permission notice is preserved.
 */
import java.io.*;
import java.net.*;
import java.text.*;
import java.util.*;

// All times are expressed in seconds.
// ReliabilityConstants interface, just defines constants.
interface ReliabilityConstants
{
    // Timeout minima/maxima
    public static final int MIN_RETRANSMIT_TIMEOUT = 1;
    public static final int MAX_RETRANSMIT_TIMEOUT = 64;
    // Maximum retransmissions per datagram, suggest 3 or 4.
    public static final int MAX_RETRANSMISSIONS = 4;
}
The RoundTripTimer class manages current and smoothed round-trip timers
and the related timeouts:
// RoundTripTimer class.
class RoundTripTimer implements ReliabilityConstants
{
    float roundTripTime = 0.0f;    // most recent RTT
    float smoothedTripTime = 0.0f; // smoothed RTT
    float deviation = 0.75f;       // smoothed mean deviation
    short retransmissions = 0;     // retransmit count: 0, 1, 2, …
    // current retransmit timeout
    float currentTimeout =
        minmax(calculateRetransmitTimeout());

    /** @return the re-transmission timeout. */
    private int calculateRetransmitTimeout()
    {
        return (int)(smoothedTripTime+4.0*deviation);
    }

    /** @return the bounded retransmission timeout. */
    private float minmax(float rto)
    {
        return Math.min
            (Math.max(rto, MIN_RETRANSMIT_TIMEOUT),
            MAX_RETRANSMIT_TIMEOUT);
    }

    /** Called before each new packet is transmitted. */
    void newPacket()
    {
        retransmissions = 0;
    }

    /**
     * @return the timeout for the packet.
     */
    float currentTimeout()
    {
        return currentTimeout;
    }

    /**
     * Called straight after a successful receive.
     * Calculates the round-trip time, then updates the
     * smoothed round-trip time and the variance (deviation).
     * @param ms time in ms since starting the transmission.
     */
    void stoppedAt(long ms)
    {
        // Calculate the round-trip time for this packet.
        // Divide by a float: ms/1000 would be integer division
        // and would truncate sub-second times to 0.
        roundTripTime = ms/1000.0f;
        // Update our estimators of round-trip time
        // and its mean deviation.
        double delta = roundTripTime - smoothedTripTime;
        smoothedTripTime += delta/8.0;
        deviation += (Math.abs(delta)-deviation)/4.0;
        // Recalculate the current timeout.
        currentTimeout = minmax(calculateRetransmitTimeout());
    }

    /**
     * Called after a timeout has occurred.
     * @return true if it's time to give up,
     * false if we can retransmit.
     */
    boolean isTimeout()
    {
        currentTimeout *= 2; // next retransmit timeout
        retransmissions++;
        return retransmissions > MAX_RETRANSMISSIONS;
    }
} // RoundTripTimer class
The ReliableDatagramSocket class exports a sendReceive method like the ones
we have already seen.
// ReliableDatagramSocket class
public class ReliableDatagramSocket
    extends DatagramSocket
    implements ReliabilityConstants
{
    RoundTripTimer roundTripTimer = new RoundTripTimer();
    private boolean reinit = false;
    private long sendSequenceNo = 0; // send sequence #
    private long recvSequenceNo = 0; // recv sequence #

    /* anonymous initialization for all constructors */
    {
        init();
    }

    /**
     * Construct a ReliableDatagramSocket
     * @param port Local port: receive on any interface/address
     * @exception SocketException can't create the socket
     */
    public ReliableDatagramSocket(int port)
        throws SocketException
    {
        super(port);
    }

    /**
     * Construct a ReliableDatagramSocket
     * @param port Local port
     * @param localAddr local interface address to use
     * @exception SocketException can't create the socket
     */
    public ReliableDatagramSocket
        (int port, InetAddress localAddr) throws SocketException
    {
        super(port, localAddr);
    }

    /**
     * Construct a ReliableDatagramSocket, JDK >= 1.4.
     * @param localAddr local socket address to use
     * @exception SocketException can't create the socket
     */
    public ReliableDatagramSocket(SocketAddress localAddr)
        throws SocketException
    {
        super(localAddr);
    }

    /**
     * Overrides DatagramSocket.connect():
     * Does the connect, then (re-)initializes
     * the statistics for the connection.
     * @param dest Destination address
     * @param port Destination port
     */
    public void connect(InetAddress dest, int port)
    {
        super.connect(dest, port);
        init();
    }

    /**
     * Overrides JDK 1.4 DatagramSocket.connect().
     * Does the connect, then (re-)initializes
     * the statistics for the connection.
     * @param dest Destination address
     * @exception SocketException declared by the overridden method
     */
    public void connect(SocketAddress dest)
        throws SocketException
    {
        super.connect(dest);
        init();
    }

    /** Initialize */
    private void init()
    {
        this.roundTripTimer = new RoundTripTimer();
    }

    /**
     * Send and receive reliably,
     * retrying adaptively with exponential backoff
     * until the response is received or timeout occurs.
     * @param sendPacket outgoing request datagram
     * @param recvPacket incoming reply datagram
     * @exception IOException on any error
     * @exception InterruptedIOException on timeout
     */
    public synchronized void sendReceive
        (DatagramPacket sendPacket, DatagramPacket recvPacket)
        throws IOException, InterruptedIOException
    {
        // re-initialize after timeout
        if (reinit)
        {
            init();
            reinit = false;
        }
        roundTripTimer.newPacket();
        long start = System.currentTimeMillis();
        long sequenceNumber = getSendSequenceNo();
        // Loop until final timeout or some unexpected exception
        for (;;)
        {
            // keep using the same sequenceNumber while retrying
            setSendSequenceNo(sequenceNumber);
            send(sendPacket); // may throw
            int timeout =
                (int)(roundTripTimer.currentTimeout()*1000.0+0.5);
            long soTimeoutStart = System.currentTimeMillis();
            try
            {
                for (;;)
                {
                    // Adjust socket timeout for time already elapsed
                    int soTimeout = timeout-(int)
                        (System.currentTimeMillis()-soTimeoutStart);
                    // A timeout of 0 would block forever and a negative
                    // one is rejected, so treat both as an expired timeout.
                    if (soTimeout <= 0)
                        throw new InterruptedIOException("receive timed out");
                    setSoTimeout(soTimeout);
                    receive(recvPacket);
                    long recvSequenceNumber = getRecvSequenceNo();
                    if (recvSequenceNumber == sequenceNumber)
                    {
                        // Got the correct reply:
                        // stop timer, calculate new RTT values
                        long ms = System.currentTimeMillis()-start;
                        roundTripTimer.stoppedAt(ms);
                        return;
                    }
                }
            }
            catch (InterruptedIOException exc)
            {
                // timeout: retry?
                if (roundTripTimer.isTimeout())
                {
                    reinit = true;
                    // rethrow InterruptedIOException to caller
                    throw exc;
                }
                // else continue
            }
            // may throw other SocketException or IOException
        } // end re-transmit loop
    } // sendReceive()

    /**
     * @return the last received sequence number;
     * used by servers to obtain the reply sequenceNumber.
     */
    public long getRecvSequenceNo()
    {
        return recvSequenceNo;
    }

    /** @return the last sent sequence number */
    private long getSendSequenceNo()
    {
        return sendSequenceNo;
    }

    /**
     * Set the next send sequence number.
     * Used by servers to set the reply
     * sequenceNumber from the received packet:
     *
     * socket.setSendSequenceNo(socket.getRecvSequenceNo());
     *
     * @param sendSequenceNo Next sequence number to send.
     */
    public void setSendSequenceNo(long sendSequenceNo)
    {
        this.sendSequenceNo = sendSequenceNo;
    }

    /**
     * override for DatagramSocket.receive:
     * handles the sequence number.
     * @param packet DatagramPacket
     * @exception IOException I/O error
     */
    public void receive(DatagramPacket packet)
        throws IOException
    {
        super.receive(packet);
        // read sequence number and remove it from the packet
        ByteArrayInputStream bais = new ByteArrayInputStream
            (packet.getData(), packet.getOffset(),
            packet.getLength());
        DataInputStream dis = new DataInputStream(bais);
        recvSequenceNo = dis.readLong();
        byte[] buffer = new byte[dis.available()];
        dis.read(buffer);
        packet.setData(buffer,0,buffer.length);
    }

    /**
     * override for DatagramSocket.send:
     * handles the sequence number.
     * @param packet DatagramPacket
     * @exception IOException I/O error
     */
    public void send(DatagramPacket packet)
        throws IOException
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(baos);
        // Write the sequence number, then the user data.
        dos.writeLong(sendSequenceNo++);
        dos.write
            (packet.getData(), packet.getOffset(),
            packet.getLength());
        dos.flush();
        // Construct a new packet with this new data and send it.
        byte[] data = baos.toByteArray();
        packet = new DatagramPacket
            (data, baos.size(), packet.getAddress(),
            packet.getPort());
        super.send(packet);
    }
} // end of ReliableDatagramSocket class
public class ReliableEchoServer implements Runnable
{
    ReliableDatagramSocket socket;
    byte[] buffer = new byte[1024];
    DatagramPacket recvPacket =
        new DatagramPacket(buffer, buffer.length);

    ReliableEchoServer(int port) throws IOException
    {
        this.socket = new ReliableDatagramSocket(port);
    }

    public void run()
    {
        for (;;)
        {
            try
            {
                // Restore the receive length to the maximum
                recvPacket.setLength(buffer.length);
                socket.receive(recvPacket);
                // Reply must have same seqno as request
                long seqno = socket.getRecvSequenceNo();
                socket.setSendSequenceNo(seqno);
                // Echo the request back as the response
                socket.send(recvPacket);
            }
            catch (IOException exc)
            {
                exc.printStackTrace();
            }
        } // for (;;)
    } // run()
} // class

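UDP supports multicast and broadcast, whereas TCP supports only unicast. Broadcast is a special case of multicast and is best avoided, because it generates more unnecessary network traffic. Multicast is typically used for service discovery, as in Jini lookup. Compared with repeated unicast, multicast reduces overhead, network traffic, and server load, and it is faster. The receivers also get the message at more nearly the same time, which matters in some scenarios.

The drawbacks of multicast are those it inherits from UDP: it is unreliable, it depends on routers, and its security issues are more complex. In addition, the sender does not know which receivers will get a multicast message, nor whether they received it, and the protocol design has to allow for this.

To send multicast messages the sender can use either MulticastSocket or DatagramSocket, but the receiver can only use MulticastSocket.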
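A sketch of the two roles, with the sender on a plain DatagramSocket and the receiver on a MulticastSocket. The group address 230.0.0.1, port 4446, and the class MulticastSketch are arbitrary assumptions for the example, and whether a datagram is actually delivered depends on the multicast configuration of the host and network.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; GROUP and PORT are arbitrary example values.
final class MulticastSketch {
    static final String GROUP = "230.0.0.1";
    static final int PORT = 4446;

    static InetAddress groupAddress() throws IOException {
        return InetAddress.getByName(GROUP);   // literal address, no DNS lookup
    }

    /** Sender: an ordinary DatagramSocket is enough. */
    static void send(String text) throws IOException {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(data, data.length, groupAddress(), PORT));
        }
    }

    /** Receiver: must be a MulticastSocket that has joined the group. Blocks for one datagram. */
    static String receiveOne(NetworkInterface networkInterface) throws IOException {
        InetSocketAddress group = new InetSocketAddress(groupAddress(), PORT);
        try (MulticastSocket socket = new MulticastSocket(PORT)) {
            socket.joinGroup(group, networkInterface);   // null defers to the default interface
            byte[] buffer = new byte[512];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            socket.receive(packet);
            socket.leaveGroup(group, networkInterface);
            return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                    StandardCharsets.UTF_8);
        }
    }
}
```

The sender never learns who joined the group or whether anyone received the datagram, which is the limitation described above.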

 

 

Multicast use cases

 

(a) Software distribution

(b) Time services

(c) Naming services like

(d) Stock-market tickers, race results, and the like

(e) Database replication

(f) Video and audio streaming: video conferencing, movie shows, etc

(g) Multi-player gaming

(h) Distributed resource allocation

(i) Service discovery.

 

 
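18 Designing a server comes down to two questions: how many clients are connected at the same time, and how long each connection lasts. As soon as there is more than one client we have to consider multithreading, and with it how threads are created, run, and destroyed. A server consists of threads that wait for connections and threads that process connections.
The evolution of server models: a single thread that both accepts and processes connections cannot serve several clients at once and is obsolete; creating a thread for every incoming request allows concurrency but can exhaust the server's resources; a thread pool with threshold control protects the server and lets it degrade gracefully.
As for the number of threads in the pool, the usual approach is to pre-create N threads, create up to M additional dynamic threads when peak load arrives, and release the dynamic threads once the peak has passed.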
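One way to get the N-plus-M behaviour from the JDK is a ThreadPoolExecutor with a SynchronousQueue, sketched below. The class name ElasticServerPool and the parameter names are invented; the SynchronousQueue hand-off is a design choice made here so that extra threads are created as soon as all core threads are busy, rather than after a queue fills up.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK.
final class ElasticServerPool {
    /**
     * @param coreThreads  N threads created up front and kept alive
     * @param extraThreads M additional threads allowed during peaks
     * @param idleSeconds  how long an extra thread may sit idle before it is released
     */
    static ThreadPoolExecutor create(int coreThreads, int extraThreads, long idleSeconds) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, coreThreads + extraThreads,
                idleSeconds, TimeUnit.SECONDS,
                new SynchronousQueue<>());    // direct hand-off: no task queue
        pool.prestartAllCoreThreads();        // pre-create the N core threads
        return pool;
    }
}
```

With this configuration, a task submitted while all N + M threads are busy is rejected, so the caller also needs a rejection policy.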
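Connection models divide into one exchange (request-response) per connection and multiple exchanges per connection. The way the connection is released differs between the two.
In code, one exchange per connection:
    public void processSession(Socket socket)
    {
        receive(request);
        // process request and construct reply, not shown …
        send(reply);
        // close connection
        socket.close(); // exception handling not shown
    }

Multiple exchanges per connection:
    void processSession(Socket socket)
    {
        while (receive(request)) // i.e. while not end-of-stream
        {
            // process request and construct reply, not shown …
            send(reply);
        }
        // close connection
        socket.close(); // exception handling not shown
    }
With multiple exchanges per connection, the connection can be released according to what reading the input stream returns, or on reaching EOF.
In summary, the server closes the connection in these cases: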
(a) On receipt of an end-of-stream when reading the connection.
(b) If the request or the client is deemed invalid.
(c) On detection of a read timeout or idle timeout on the connection.
(d) After writing a reply
 
19 Client design generally has to handle connection failures and read timeouts. To reduce the cost of creating connections, clients generally also use a connection pool, as RMI does.
In request-response transactions, messages usually have a header_body_trailer structure. Combine it with gathering and scattering I/O to reduce memory and CPU overhead:
    // Initialization - common to both ends
    static final int HEADER_LENGTH = 16;
    static final int BODY_LENGTH = 480;
    static final int TRAILER_LENGTH = 16;
    ByteBuffer header = ByteBuffer.allocate(HEADER_LENGTH);
    ByteBuffer body = ByteBuffer.allocate(BODY_LENGTH);
    ByteBuffer trailer = ByteBuffer.allocate(TRAILER_LENGTH);
    ByteBuffer[] buffers = new ByteBuffer[] { header, body, trailer };

    // sending end - populate the buffers, not shown
    long count = channel.write(buffers);
    // repeat until all data sent

    // receiving end
    long count = channel.read(buffers);
    // repeat until all data read
 
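When a browser loads a page, the order of the exchanges does not matter, so the client can open several connections at once and fetch data from the server in parallel on several threads.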
 
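20 The JDK offers good support for writing concurrent servers: Executors provides thread pools, java.util.concurrent.ThreadPoolExecutor.DiscardPolicy provides threshold control, and ThreadFactory controls how threads are created.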
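A sketch combining the three pieces: a fixed number of threads from a custom ThreadFactory, a bounded queue, and DiscardPolicy to drop work silently once the queue is full. BoundedServerPool, the "worker-" thread names, and the demo method are invented for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the JDK.
final class BoundedServerPool {
    static ThreadPoolExecutor create(int threads, int queueCapacity) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "worker-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),  // the threshold
                factory,
                new ThreadPoolExecutor.DiscardPolicy());  // overflow is dropped silently
    }

    /** Submits tasks to a 1-thread, 1-slot pool while the first task blocks; returns how many ran. */
    static int demo(int submitted) throws InterruptedException {
        ThreadPoolExecutor pool = create(1, 1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < submitted; i++) {
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                ran.incrementAndGet();
            });
        }
        release.countDown();   // let the running task and the queued task finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ran.get();
    }
}
```

In the demo only two tasks run however many are submitted: one occupies the single thread and one waits in the queue, and the rest are discarded. DiscardPolicy gives no feedback to the submitter, so a server that must report overload would need a different policy.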
 
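21 On the client side, connection pools are the usual technique: with a memcache client, for example, each connection is in at most one request-reply transaction at any moment. Alternatively, several transactions share one connection, as in the tair client, in which case the protocol has to maintain the matching between requests and replies.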
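Matching replies to requests on a shared connection usually means tagging each request with an id and keeping a table of outstanding requests. The sketch below shows only that bookkeeping, with no I/O; ReplyMatcher and its methods are hypothetical and do not reflect how any particular client is implemented.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical correlation table for a multiplexed connection.
final class ReplyMatcher {
    private final AtomicLong nextId = new AtomicLong(1);
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Registers a request; the caller writes the returned id into the request header. */
    long register(CompletableFuture<String> reply) {
        long id = nextId.getAndIncrement();
        pending.put(id, reply);
        return id;
    }

    /** Called by the reader thread when a reply carrying this id arrives. */
    boolean complete(long id, String body) {
        CompletableFuture<String> reply = pending.remove(id);
        return reply != null && reply.complete(body);   // false for unknown or duplicate ids
    }

    int outstanding() {
        return pending.size();
    }
}
```

Replies may arrive in any order; each one completes only the future registered under its id. A real client would also expire entries whose reply never arrives.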
 
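22 The eight fallacies of network programming (a to h), followed by further ones
a The network is reliable
b The network has no latency
c Bandwidth is infinite
d The network is secure
e The network topology does not change
f  There is only one administrator
g Transport cost is zero
h  The network is homogeneous. In fact it is made up of nodes with different bandwidths, and as in the barrel principle, the narrowest one determines the bandwidth.
i Network I/O is like disk I/O. Network I/O fails more easily and is less stable than disk.
j The peer's state is in sync with yours. Unless you receive an ack at the application layer, do not assume the peer has received your data.
k All network failures are detectable.
l  Resources are infinite. In fact the resources involved in network programming, including ports and buffers, are all finite.
m An application can wait for a remote service indefinitely. Every remote call should have a timeout.
n Remote services respond promptly
o There is a single point of failure. In a distributed system, the failure of one host does not normally bring down the whole system, unless there is a central node.
p There is only one resource allocator. Each host's resources can be allocated independently.
q Time is perfectly uniform everywhere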
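 23 When a TCP connection is being closed, the passive side must explicitly call socket.close() as soon as it detects that the connection is half-closed, so that the four-way close completes. Otherwise half-closed sockets (in the CLOSE_WAIT state) pile up and eventually bring the server down.
 24 An application-layer protocol built on TCP must handle partial messages and coalesced messages (the so-called half-packet and sticky-packet problems), because one TCP segment may contain several messages and one message may need several TCP segments to be transmitted.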
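A common remedy is length-prefixed framing: each message is preceded by its length, and the receiver buffers bytes until a whole frame is available. The sketch below assumes a 4-byte big-endian length prefix; FrameDecoder and its methods are hypothetical names, and a real decoder would also enforce a maximum frame length.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder: each frame is a 4-byte big-endian length followed by the payload.
final class FrameDecoder {
    private ByteBuffer pending = ByteBuffer.allocate(1024);   // kept in write mode between calls

    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length).put(payload).array();
    }

    /** Feeds bytes as they came off the socket; returns every message completed so far. */
    List<byte[]> feed(byte[] chunk, int offset, int length) {
        if (pending.remaining() < length) {                    // grow the accumulation buffer
            ByteBuffer bigger = ByteBuffer.allocate(
                    Math.max(pending.capacity() * 2, pending.position() + length));
            pending.flip();
            bigger.put(pending);
            pending = bigger;
        }
        pending.put(chunk, offset, length);
        pending.flip();                                        // switch to read mode
        List<byte[]> messages = new ArrayList<>();
        while (pending.remaining() >= 4) {
            int frameLength = pending.getInt(pending.position()); // peek without consuming
            if (frameLength < 0) {
                throw new IllegalStateException("negative frame length");
            }
            if (pending.remaining() - 4 < frameLength) {
                break;                                         // partial message: wait for more
            }
            pending.getInt();                                  // consume the length prefix
            byte[] message = new byte[frameLength];
            pending.get(message);
            messages.add(message);                             // loop again: coalesced messages
        }
        pending.compact();                                     // keep leftovers, back to write mode
        return messages;
    }
}
```

The same decoder handles both cases: a read that ends in the middle of a frame returns nothing until the rest arrives, and a read that contains several frames returns all of them.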