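An IO operation moves data from disk or from a socket into a region the operating system can work on directly, such as memory or a cache. There are five main modern IO models:
read/write
The CPU locates the file through the given file descriptor and then moves the data over from disk.
It later turned out that this grunt work can be handed to someone else: DMA (Direct Memory Access). With DMA, the CPU issues the system call and the DMA controller moves the data; in the meantime the CPU can do other work or simply wait. Everything discussed about NIO here assumes the system has a DMA device.
With that background, let's first look at what blocking IO is.
When you first learn network programming, it probably goes like this.
(1) Open a browser and search for "network programming"
(2) Click on a blog post titled "xxxx from Beginner to Expert"
(3) Copy the code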
client
package Basic.block;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
public class Client {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the socket, so the server sees end-of-stream
        try (Socket socket = new Socket()) {
            // blocks until the connection is established or an error occurs
            socket.connect(new InetSocketAddress("127.0.0.1", 5555));
            OutputStream outputStream = socket.getOutputStream();
            outputStream.write("Hello,Server".getBytes(StandardCharsets.UTF_8));
            outputStream.flush();
        }
    }
}
server
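package Basic.block;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
public class BlockServer {
    public static void main(String[] args) throws Exception {
        ServerSocket socket = new ServerSocket(5555);
        while (true) {
            // accept() blocks until a client connects; the socket is closed after handling
            try (Socket client = socket.accept()) {
                InputStream inputStream = client.getInputStream();
                byte[] buffer = new byte[1024];
                int len;
                // read() blocks until data arrives and returns -1 (not 0) at end-of-stream
                while ((len = inputStream.read(buffer)) != -1) {
                    // decode only the len bytes actually read, not the whole buffer
                    String msg = new String(buffer, 0, len, StandardCharsets.UTF_8);
                    System.out.println(msg);
                }
            }
        }
    }
}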
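(4) Great, "learned" it. Off to the interview.
(5) Interviewer: Can you explain what NIO is?
You: NIO? What's that? I'm not really familiar with it. Is it the thing I wrote?
Interviewer: OK, that's all for today's interview.
Ha, just kidding. Still, most people learning socket programming really do stop here, when this is only the beginning. Let's analyze what is wrong with this code.
Start with the client:
The client code is quite simple and does just three things: connect, write, and flush. Here is the JDK source of connect: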
/**
* Connects this socket to the server.
*
* @param endpoint the {@code SocketAddress}
* @throws IOException if an error occurs during the connection
* @throws java.nio.channels.IllegalBlockingModeException
* if this socket has an associated channel,
* and the channel is in non-blocking mode
* @throws IllegalArgumentException if endpoint is null or is a
* SocketAddress subclass not supported by this socket
* @since 1.4
* @spec JSR-51
*/
public void connect(SocketAddress endpoint) throws IOException {
connect(endpoint, 0);
}
Let's step one level deeper:
/**
* Connects this socket to the server with a specified timeout value.
* A timeout of zero is interpreted as an infinite timeout. The connection
* will then block until established or an error occurs.
*
* @param endpoint the {@code SocketAddress}
* @param timeout the timeout value to be used in milliseconds.
* @throws IOException if an error occurs during the connection
* @throws SocketTimeoutException if timeout expires before connecting
* @throws java.nio.channels.IllegalBlockingModeException
* if this socket has an associated channel,
* and the channel is in non-blocking mode
* @throws IllegalArgumentException if endpoint is null or is a
* SocketAddress subclass not supported by this socket
* @since 1.4
* @spec JSR-51
*/
public void connect(SocketAddress endpoint, int timeout) throws IOException {
if (endpoint == null)
throw new IllegalArgumentException("connect: The address can't be null");
if (timeout < 0)
throw new IllegalArgumentException("connect: timeout can't be negative");
if (isClosed())
throw new SocketException("Socket is closed");
if (!oldImpl && isConnected())
throw new SocketException("already connected");
if (!(endpoint instanceof InetSocketAddress))
throw new IllegalArgumentException("Unsupported address type");
InetSocketAddress epoint = (InetSocketAddress) endpoint;
InetAddress addr = epoint.getAddress ();
int port = epoint.getPort();
checkAddress(addr, "connect");
SecurityManager security = System.getSecurityManager();
if (security != null) {
if (epoint.isUnresolved())
security.checkConnect(epoint.getHostName(), port);
else
security.checkConnect(addr.getHostAddress(), port);
}
if (!created)
createImpl(true);
if (!oldImpl)
impl.connect(epoint, timeout);
else if (timeout == 0) {
if (epoint.isUnresolved())
impl.connect(addr.getHostName(), port);
else
impl.connect(addr, port);
} else
throw new UnsupportedOperationException("SocketImpl.connect(addr, timeout)");
connected = true;
/*
* If the socket was not bound before the connect, it is now because
* the kernel will have picked an ephemeral port & a local address
*/
bound = true;
}
Note the comment at the top:
/**
* Connects this socket to the server with a specified timeout value.
* A timeout of zero is interpreted as an infinite timeout. The connection
* will then block until established or an error occurs.
*/
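It says that with a timeout of zero, which is what the single-argument connect passes, the call blocks until the connection is established or an error occurs.
This is where the problem starts: if the server is not ready or the network is poor, the client code cannot make any progress.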
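One way to bound that wait is the two-argument connect overload shown in the JDK source above. The following is a minimal sketch, not part of the original example: the class name `ConnectTimeoutDemo`, the helper `canConnect`, and the 1000 ms timeout are all made up for illustration, and the local `ServerSocket` exists only so the demo has something to connect to.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class ConnectTimeoutDemo {
    // Hypothetical helper: true if a TCP connection to host:port is
    // established within timeoutMillis, false on timeout or any other IO error.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // With a positive timeout, connect gives up after timeoutMillis and
            // throws SocketTimeoutException (a subclass of IOException).
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Throwaway listener on an ephemeral port (port 0 lets the OS choose).
        try (ServerSocket server = new ServerSocket(0)) {
            System.out.println(canConnect("127.0.0.1", server.getLocalPort(), 1000));
        }
    }
}
```

The call still blocks the calling thread, only for a bounded time, so this limits the damage rather than removing the blocking behavior.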
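Next is write, which ends up in a system call: the data lives in the JVM heap, is first copied into the socket's buffer, and is then sent out over the network. This step can block as well.
Even in a tiny program like this, almost every step is a blocking operation: you have to wait for one operation to finish before the following logic can continue, which is unreasonable under high concurrency. So how about a simple improvement?
Since blocking is the problem, why not just handle connections concurrently?
Here is an improved server: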
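package Basic.block;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
public class BlockServer {
    public static void main(String[] args) throws Exception {
        ServerSocket socket = new ServerSocket(5555);
        while (true) {
            Socket client = socket.accept();
            // one new thread per connection, so accept() can go straight back to waiting
            new Thread(() -> {
                try {
                    BlockServer.handleConnect(client);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }).start();
        }
    }
    public static void handleConnect(Socket socket) throws Exception {
        // close the connection once the client has finished sending
        try (Socket client = socket) {
            InputStream inputStream = client.getInputStream();
            byte[] buffer = new byte[1024];
            int len;
            // read() returns -1 (not 0) at end-of-stream
            while ((len = inputStream.read(buffer)) != -1) {
                // decode only the len bytes actually read
                String msg = new String(buffer, 0, len, StandardCharsets.UTF_8);
                System.out.println(msg);
            }
        }
    }
}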
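Each socket gets its own handler thread, so one connection no longer blocks the logic that follows.
This solves the blocking problem to a degree, but starting a thread for every socket connection does not scale. (PS: on Linux the maximum number of threads is capped by system configuration, and the number of threads a JVM can actually create depends on how much memory is available for thread stacks.) So let's optimize again and use a thread pool.
First, write the thread pool: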
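package Basic.block;
import java.util.concurrent.*;
public class ThreadPoolUtils {
    private static final int CORE_NUM = Runtime.getRuntime().availableProcessors();
    // core size = 2x CPUs, max size = 4x CPUs, idle extra threads exit after 10s,
    // and up to 1024 tasks can wait in the queue
    private static final ThreadPoolExecutor executor = new ThreadPoolExecutor(CORE_NUM * 2, CORE_NUM * 4,
            10, TimeUnit.SECONDS, new LinkedBlockingQueue<>(1024),
            // a null thread factory makes the constructor throw NullPointerException
            Executors.defaultThreadFactory(), new ThreadPoolExecutor.AbortPolicy());
    public static void execute(Runnable runnable) {
        executor.execute(runnable);
    }
}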
Then the server side:
package Basic.block; // same package as ThreadPoolUtils, so it resolves without an import
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
public class BlockServer {
    public static void main(String[] args) throws Exception {
        ServerSocket socket = new ServerSocket(5555);
        while (true) {
            Socket client = socket.accept();
            // previously: new Thread(() -> handleConnect(client)).start();
            // now the connection is handed to the shared pool instead
            ThreadPoolUtils.execute(() -> BlockServer.handleConnect(client));
        }
    }
    public static void handleConnect(Socket socket) {
        // close the connection once the client has finished sending
        try (Socket client = socket) {
            InputStream inputStream = client.getInputStream();
            byte[] buffer = new byte[1024];
            int len;
            // read() returns -1 (not 0) at end-of-stream
            while ((len = inputStream.read(buffer)) != -1) {
                // decode only the len bytes actually read
                String msg = new String(buffer, 0, len, StandardCharsets.UTF_8);
                System.out.println(msg);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
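Here the multithreading goes through a thread pool. Note that the blocking queue holds 1024 tasks, so the server can accept at most 1024 + 4 * core connections at a time: up to 4 * core are being handled by pool threads and up to 1024 wait in the queue. Anything beyond that is rejected by AbortPolicy with a RejectedExecutionException. The queue can be made larger. This eases the problem of having too many threads to some degree, although every pool thread still blocks on its own connection's read.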
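The capacity arithmetic can be checked with a scaled-down pool. This is a sketch under assumptions, not the pool from the article: the class `PoolCapacityDemo`, the method `acceptedBeforeRejection`, and the tiny sizes (core 1, max 2, queue 2) are invented for illustration, and every task simply parks on a latch to stand in for a handler stuck in a blocking read.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolCapacityDemo {
    // Hypothetical helper: submits never-finishing tasks until the pool
    // rejects one, and returns how many were accepted before that.
    static int acceptedBeforeRejection(int core, int max, int queueCapacity) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 10, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity), Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.AbortPolicy());
        int accepted = 0;
        try {
            while (true) {
                // Each task blocks, like a handler waiting on a silent client.
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException ex) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException saturated) {
            // All threads are busy and the queue is full: AbortPolicy rejected the task.
        } finally {
            release.countDown(); // let the parked tasks finish
            pool.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // max threads (2) + queue capacity (2) tasks fit before rejection
        System.out.println(acceptedBeforeRejection(1, 2, 2));
    }
}
```

In the server above, the rejection would surface in the accept loop, so production code would typically catch RejectedExecutionException there and close the client socket instead of letting the exception end main.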
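Switching topics abruptly, let's talk about mmap, another common piece of IO knowledge. First, a word on user mode and kernel mode. Reading data usually follows this path:
disk -> kernel buffer -> user buffer -> kernel ...
In other words, when data is read from disk, DMA first copies it into a kernel buffer, and the CPU then copies it into the JVM heap. To operate on file data in user mode, the data has to be copied over, which is a waste of resources. mmap is a system call that maps a file from kernel space into user space, so user-mode code can operate on it without that copy.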
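import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
public class MapedByteBuffer {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the file and its channel
        try (RandomAccessFile randomAccessFile = new RandomAccessFile("file1.txt", "rw");
             FileChannel fileChannel = randomAccessFile.getChannel()) {
            // map the first 5 bytes of the file into memory, readable and writable
            MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0, 5);
            // writes go to the mapped region, with no explicit write() call
            mappedByteBuffer.put(0, (byte) 'H');
            mappedByteBuffer.put(3, (byte) '9');
        }
    }
}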
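To see that writes through the mapping really land in the file, a round trip can be sketched as follows. The class `MmapRoundTrip`, the helper `patchThroughMapping`, and the temp-file name are hypothetical; the sketch assumes a Linux-like system, since deleting a file that is still mapped may fail on some platforms.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapRoundTrip {
    // Hypothetical helper: writes `initial` (at least 4 ASCII characters) to a
    // temp file, patches bytes 0 and 3 through a memory mapping, then reads the
    // file back with ordinary file IO and returns its content.
    static String patchThroughMapping(String initial) throws IOException {
        Path file = Files.createTempFile("mmap-demo", ".txt");
        try {
            Files.write(file, initial.getBytes(StandardCharsets.US_ASCII));
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
                mapped.put(0, (byte) 'H');
                mapped.put(3, (byte) '9');
                mapped.force(); // ask the OS to flush the mapped changes to the file
            }
            // Read back through the normal path to confirm the change is in the file.
            return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(patchThroughMapping("hello"));
    }
}
```

Mapping exactly `channel.size()` bytes, rather than a fixed 5, avoids growing the file when it is shorter than the mapped range.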
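mmap solves part of the problem, but for very large files it is not a great solution either, because only part of the data can be mapped into user space at a time, and mapping a lot of data is not efficient. (I have not dug into the reason, but here is a guess: the mapping works much like the operating system's memory paging, so a very large file would cause frequent page faults.) Linux provides the sendfile function, which can send a file directly.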
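import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.channels.FileChannel;
/**
 * file transfer
 */
public class FileChannelDemoTransform {
    public static void main(String[] args) throws Exception {
        File file = new File("IntBufffer.png");
        // try-with-resources closes both streams together with their channels
        try (FileInputStream fileInputStream = new FileInputStream(file);
             FileOutputStream fileOutputStream = new FileOutputStream("copy.png")) {
            FileChannel pngReadChannel = fileInputStream.getChannel();
            FileChannel writeChannel = fileOutputStream.getChannel();
            // fast copy; transferFrom may move fewer bytes than requested,
            // so production code should loop until everything is transferred
            writeChannel.transferFrom(pngReadChannel, 0, pngReadChannel.size());
        }
    }
}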
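This is a simple demo. The transfer methods on FileChannel let the operating system move the bytes without routing them through a user-space buffer; on Linux the JDK typically backs transferTo with sendfile, while the exact mechanism behind transferFrom depends on the JDK version and the channel types.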
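Because a single transfer call may move fewer bytes than requested, a more defensive copy loops until the whole file is sent. The sketch below uses transferTo, the direction usually associated with sendfile on Linux; the class `TransferToCopy`, the method `copy`, and the temp-file names are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferToCopy {
    // Hypothetical helper: copies source to target with transferTo and
    // returns the number of bytes actually transferred.
    static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // transferTo returns how many bytes it moved, possibly fewer than asked
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // no progress (for example, the source shrank); stop instead of spinning
                }
                position += moved;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("transfer-src", ".bin");
        Path target = Files.createTempFile("transfer-dst", ".bin");
        try {
            Files.write(source, new byte[4096]);
            System.out.println(copy(source, target) + " bytes copied");
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(target);
        }
    }
}
```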
That covers some of the common problems in IO. The real topic I wanted to get to is NIO, and the next section will focus on it.