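When performance tuning for Java BIO comes up, most people say to use BufferedInputStream and BufferedOutputStream. The reasoning is that I/O talks to hardware and is slow, so using BufferedInputStream to cut the number of I/O operations greatly improves performance.
Looking at the BufferedInputStream source, the class holds a buffer array: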
protected volatile byte buf[];
The buffer array defaults to 8192 bytes, i.e. 8 KB (plenty of articles online say 8 MB... everyone copies everyone else 0.0):
private static int defaultBufferSize = 8192;
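When you call a read method on BufferedInputStream, it first checks whether the buffer array still holds unread data. If it does not, it calls fill() to load data from the underlying stream (the disk, for a file) into the buffer, then serves the request from the buffer. Before fill() is called there is one check (the eighth line of read1 below, the len >= getBufIfOpen().length && markpos < 0 condition): if the requested length is at least the capacity of the buffer array (its total length, not the number of valid bytes currently cached) and no mark is active, the data is read straight from the underlying stream into the caller's array, and BufferedInputStream's in-memory buffer is skipped entirely.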
private int read1(byte[] b, int off, int len) throws IOException {
    int avail = count - pos;
    if (avail <= 0) {
        /* If the requested length is at least as large as the buffer, and
           if there is no mark/reset activity, do not bother to copy the
           bytes into the local buffer. In this way buffered streams will
           cascade harmlessly. */
        if (len >= getBufIfOpen().length && markpos < 0) {
            return getInIfOpen().read(b, off, len);
        }
        fill();
        avail = count - pos;
        if (avail <= 0) return -1;
    }
    int cnt = (avail < len) ? avail : len;
    System.arraycopy(getBufIfOpen(), pos, b, off, cnt);
    pos += cnt;
    return cnt;
}
private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos < 0)
        pos = 0;            /* no mark: throw away the buffer */
    else if (pos >= buffer.length)  /* no room left in buffer */
        if (markpos > 0) {  /* can throw away early part of the buffer */
            int sz = pos - markpos;
            System.arraycopy(buffer, markpos, buffer, 0, sz);
            pos = sz;
            markpos = 0;
        } else if (buffer.length >= marklimit) {
            markpos = -1;   /* buffer got too big, invalidate mark */
            pos = 0;        /* drop buffer contents */
        } else {            /* grow buffer */
            int nsz = pos * 2;
            if (nsz > marklimit)
                nsz = marklimit;
            byte nbuf[] = new byte[nsz];
            System.arraycopy(buffer, 0, nbuf, 0, pos);
            if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                // Can't replace buf if there was an async close.
                // Note: This would need to be changed if fill()
                // is ever made accessible to multiple threads.
                // But for now, the only way CAS can fail is via close.
                // assert buf == null;
                throw new IOException("Stream closed");
            }
            buffer = nbuf;
        }
    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}
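The key line of fill() is the fourth from the end, the call to getInIfOpen().read(...). BufferedInputStream is a wrapper class, and getInIfOpen() returns the wrapped InputStream. From this you can see that BufferedInputStream is, in essence, just an extra in-memory buffering layer on top of another stream.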
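To make the two paths through read1 visible, here is a minimal sketch that records what length the wrapped stream is actually asked for. RecordingInputStream and underlyingRequests are hypothetical names invented for this illustration, and an in-memory ByteArrayInputStream stands in for a real file. The expected values in the comments assume the default 8192-byte buffer discussed above.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class BufferBypassDemo {

    /** Hypothetical helper: records the length requested by every bulk read on the wrapped stream. */
    static class RecordingInputStream extends FilterInputStream {
        final List<Integer> requestedLengths = new ArrayList<>();

        RecordingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            requestedLengths.add(len);
            return super.read(b, off, len);
        }
    }

    /** Performs one read through a default-sized BufferedInputStream and returns
     *  the lengths that were requested from the underlying stream. */
    static List<Integer> underlyingRequests(int callerArraySize) throws IOException {
        byte[] data = new byte[64 * 1024]; // stand-in for file contents
        RecordingInputStream source = new RecordingInputStream(new ByteArrayInputStream(data));
        try (BufferedInputStream in = new BufferedInputStream(source)) {
            in.read(new byte[callerArraySize]);
        }
        return source.requestedLengths;
    }

    public static void main(String[] args) throws IOException {
        // Small request: fill() loads a whole buffer from the underlying stream.
        System.out.println(underlyingRequests(100));       // [8192]
        // Request at least as large as the buffer: passed straight through.
        System.out.println(underlyingRequests(16 * 1024)); // [16384]
    }
}
```

In the first call the 100-byte request is served out of an 8192-byte fill; in the second the buffer is never touched, which matches the len >= getBufIfOpen().length branch in read1.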
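Conclusion: BufferedInputStream improves I/O performance in only two situations.
1. You need to read small pieces of data (one byte, a few bytes, a few dozen bytes) over and over. The typical example is Java's character streams, Writer and Reader: each piece of text is small, and going to the hardware for every one of them would be very slow, whereas loading a larger block into memory in one go and reading from that is fast. This is also why stream-backed character classes such as InputStreamReader and OutputStreamWriter buffer internally while byte streams do not: byte streams usually have no need to read small pieces repeatedly.
2. You need the BufferedInputStream API itself, such as its mark()/reset() support.
In every other case, calling InputStream's read(byte[]) or read(byte[], int, int) with a large array (8 KB or more) is enough. Wrapping the stream in BufferedInputStream gains nothing there, because reads of that size bypass the buffer anyway. That said, using BufferedInputStream by default usually does no harm either.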
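The conclusion can be checked with a rough sketch that counts how many read calls reach the underlying stream in each scenario. CountingInputStream, singleByteReads and bulkReads are hypothetical names made up for this example, and the 64 KB in-memory source is an assumption standing in for a file. It counts calls only and says nothing about wall-clock time on a real disk.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadCallCountDemo {

    /** Hypothetical helper: counts every read call that reaches the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    static final int SIZE = 64 * 1024; // assumed payload size

    static CountingInputStream newSource() {
        return new CountingInputStream(new ByteArrayInputStream(new byte[SIZE]));
    }

    /** Drains the source one byte at a time, optionally through a BufferedInputStream. */
    static int singleByteReads(boolean buffered) throws IOException {
        CountingInputStream source = newSource();
        try (InputStream in = buffered ? new BufferedInputStream(source) : source) {
            while (in.read() != -1) {
                // discard the byte; only the call count matters here
            }
        }
        return source.calls;
    }

    /** Drains the source with read(byte[]) using a caller-supplied array of chunkSize bytes. */
    static int bulkReads(boolean buffered, int chunkSize) throws IOException {
        CountingInputStream source = newSource();
        byte[] chunk = new byte[chunkSize];
        try (InputStream in = buffered ? new BufferedInputStream(source) : source) {
            while (in.read(chunk) != -1) {
                // discard the data
            }
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(singleByteReads(false));     // 65537: one call per byte, plus the EOF call
        System.out.println(singleByteReads(true));      // 9: eight 8192-byte fills, plus the EOF call
        System.out.println(bulkReads(false, 16 * 1024)); // 5
        System.out.println(bulkReads(true, 16 * 1024));  // 5: buffering changes nothing
    }
}
```

For byte-at-a-time access the wrapper cuts the underlying calls from tens of thousands to a handful, while for 16 KB bulk reads the buffered and unbuffered counts are identical.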