DirectByteBuffer

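     To use off-heap memory, you can use DirectByteBuffer.

     Main use case: products such as Terracotta's BigMemory want data to be accessible inside the same process as the JVM without occupying heap memory (large, long-lived data that takes up too much of the heap causes many pointless Full GCs and hurts performance). DirectByteBuffer serves that purpose.

     As for how it is implemented, it ultimately comes down to JNI.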
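     As a quick illustration of the two kinds of buffer, here is a minimal sketch using only the standard java.nio.ByteBuffer API. The class name DirectVsHeapDemo and the helper describe are made up for this example.

```java
import java.nio.ByteBuffer;

class DirectVsHeapDemo {
    // Returns a short description of where the buffer's storage lives.
    static String describe(ByteBuffer buf) {
        return (buf.isDirect() ? "direct" : "heap") + ", capacity=" + buf.capacity();
    }

    public static void main(String[] args) {
        // Backed by a byte[] on the Java heap.
        ByteBuffer heap = ByteBuffer.allocate(1024);
        // Backed by native memory outside the Java heap.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);

        System.out.println(describe(heap));
        System.out.println(describe(direct));
        System.out.println("heap buffer has backing array: " + heap.hasArray());
    }
}
```

     Both buffers expose the same put/get API; only the location of the underlying storage differs.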

 

    The source code:

    http://www.docjar.org/html/api/java/nio/DirectByteBuffer.java.html

     DirectByteBuffer(int capacity) {

            this(PlatformAddressFactory.alloc(capacity, (byte) 0), capacity, 0);
             address.autoFree();
       }

   Following PlatformAddressFactory down layer by layer eventually leads to the OSMemory class:

http://www.docjar.org/html/api/org/apache/harmony/luni/platform/OSMemory.java.html

    public native long malloc(long length) throws OutOfMemoryError;

 

   This allocates the memory.

   Once you have the buffer, you can call put to store bytes and get to read them back. These calls correspond to:

public native byte getByte(long address);
public native void setByte(long address, byte value);
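   From the caller's side, the put/get pair looks like the following sketch. It uses only standard ByteBuffer calls; the class name PutGetDemo and the method roundTrip are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class PutGetDemo {
    // Writes the text into a fresh direct buffer and reads it back.
    static String roundTrip(String text) {
        byte[] in = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(in.length);
        buf.put(in);   // relative put: advances the position by in.length
        buf.flip();    // limit = old position, position = 0: ready for reading
        byte[] out = new byte[buf.remaining()];
        buf.get(out);  // relative bulk get: copies the remaining bytes
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("assfasf"));

        ByteBuffer buf = ByteBuffer.allocateDirect(4);
        buf.put((byte) 42);
        // Absolute get: reads index 0 without touching the position.
        System.out.println(buf.get(0));
    }
}
```

   Relative operations move the buffer's position, while absolute ones (with an index argument) do not. The test below mixes both.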
   Here is the test:
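import java.nio.ByteBuffer;

import org.junit.Test; // assumes JUnit 4; the original does not show its imports

public class DirectByteBufferTest {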

	@Test
	public void test1() {
		int count = 100000;
		int cap = 1024 * 1024;
		testDirectBuf(count, cap);
		testNonDirectBuf(count, cap);

	}

	private void testDirectBuf(int count, int cap) {
		long st;
		long ed;
		ByteBuffer byteBuf = null;
		st = System.currentTimeMillis();

		for (int i = 0; i < count; i++) {
			byteBuf = allocDirectByteBuffer(cap);

		}
		ed = System.currentTimeMillis();
		System.out.println("alloc directByteBuffer for " + count
				+ " times spends " + (ed - st) + "ms");

		st = System.currentTimeMillis();

		for (int i = 0; i < count; i++) {
			processBuf(byteBuf);
		}
		ed = System.currentTimeMillis();
		System.out.println("directByteBuffer process " + count
				+ " times spends " + (ed - st) + "ms");
	}

	private ByteBuffer testNonDirectBuf(int count, int cap) {
		long st = System.currentTimeMillis();
		ByteBuffer byteBuf = null;
		for (int i = 0; i < count; i++) {
			byteBuf = allocNonDirectByteBuffer(cap);
		}
		long ed = System.currentTimeMillis();
		System.out.println("alloc nonDirectByteBuffer for " + count
				+ " times spends " + (ed - st) + "ms");
		st = System.currentTimeMillis();
		for (int i = 0; i < count; i++) {
			processBuf(byteBuf);

		}
		ed = System.currentTimeMillis();
		System.out.println("nonDirectByteBuffer process " + count
				+ " times spends " + (ed - st) + "ms");
		return byteBuf;
	}

	private ByteBuffer allocNonDirectByteBuffer(int cap) {
		ByteBuffer byteBuf = ByteBuffer.allocate(cap);
		return byteBuf;
	}

	private ByteBuffer allocDirectByteBuffer(int cap) {
		ByteBuffer directBuf = ByteBuffer.allocateDirect(cap);
		return directBuf;
	}

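	private void processBuf(ByteBuffer buf) {
		byte[] bytes = "assfasf".getBytes();
		// Relative put: writes at the current position and advances it by
		// bytes.length, so repeated calls on the same buffer throw
		// BufferOverflowException once the remaining space runs out.
		buf.put(bytes);
		for (int i = 0; i < bytes.length; i++) {
			// Absolute get: reads index i without moving the position.
			byte b = buf.get(i);
			byte[] bytes2 = new byte[] { b };
			// System.out.print(new String(bytes2));
		}
		// System.out.println();
		// System.out.println(buf.capacity());
	}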

}
   With int count = 100000; int cap = 1024 * 1024; the results are:
  alloc directByteBuffer for 100,000 times spends 205,610ms

directByteBuffer process 100000 times spends 271ms

alloc nonDirectByteBuffer for 100000 times spends 35,283ms

nonDirectByteBuffer process 100000 times spends 71ms

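  This test amounts to creating 100,000 ByteBuffers, and the elapsed times differ enormously. When allocating fairly large blocks, direct memory is close to an order of magnitude slower than the heap (205,610 ms versus 35,283 ms, roughly 6x).

  The last ByteBuffer allocated is then used for the put/get test. For access, the direct buffer is also slower, presumably because of the JNI calls: roughly 4x in this run (271 ms versus 71 ms).

  I then fixed the VM parameters. With -XX:MaxDirectMemorySize=1024m -Xmx1024m -Xms1024m: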

alloc directByteBuffer for 100000 times spends 125,305ms

directByteBuffer process 100000 times spends 240ms

alloc nonDirectByteBuffer for 100000 times spends 28,408ms

nonDirectByteBuffer process 100000 times spends 56ms
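
  To see how much direct memory a run actually holds relative to -XX:MaxDirectMemorySize, the JVM's buffer pool MXBeans can be queried. This is only a sketch: the class name DirectMemoryUsage is made up, and the pool name "direct" is an assumption based on how HotSpot names its pools, so the method returns -1 if no such pool exists.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectMemoryUsage {
    // Total capacity in bytes of the direct buffers the JVM currently tracks,
    // or -1 if the JVM exposes no pool named "direct" (assumed HotSpot name).
    static long directCapacityBytes() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getTotalCapacity();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("before: " + directCapacityBytes() + " bytes");
        ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);
        System.out.println("after:  " + directCapacityBytes() + " bytes");
        // Use the buffer so it stays reachable while we measure.
        System.out.println("buffer capacity: " + buf.capacity());
    }
}
```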


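  Next, cap is changed to 1024*10. Because each buffer is now smaller, the test as written above no longer works, so I set up a buffer pool.
  The JUnit VM parameters need to be adjusted for this. If direct memory is too small, the run fails with OutOfMemoryError: Direct buffer memory. If the heap is too small, the error is the Java heap space one instead.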
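
  The buffer pool mentioned above is not shown. One plausible reading, and it is only a guess, is a collection that keeps every allocated buffer reachable so that each one can be processed in turn. With 100,000 buffers of 10 KB each that comes to about 977 MB, which would fit the 1024m limits used below. The sketch uses the invented names BufferPoolSketch, allocatePool and processAll, and a much smaller pool in main so that it runs with default VM settings.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class BufferPoolSketch {
    // Allocates `count` direct buffers of `cap` bytes and keeps them all reachable.
    static List<ByteBuffer> allocatePool(int count, int cap) {
        List<ByteBuffer> pool = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            pool.add(ByteBuffer.allocateDirect(cap));
        }
        return pool;
    }

    // Writes the payload into each buffer once and reads it back byte by byte.
    // Returns the number of bytes read. The payload must fit in each buffer.
    static long processAll(List<ByteBuffer> pool, byte[] payload) {
        long read = 0;
        for (ByteBuffer buf : pool) {
            buf.clear();        // reset position so the put starts at index 0
            buf.put(payload);
            for (int i = 0; i < payload.length; i++) {
                buf.get(i);     // absolute get, does not move the position
                read++;
            }
        }
        return read;
    }

    public static void main(String[] args) {
        List<ByteBuffer> pool = allocatePool(1000, 1024 * 10); // about 10 MB
        long read = processAll(pool, "assfasf".getBytes());
        System.out.println("buffers: " + pool.size() + ", bytes read: " + read);
    }
}
```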
  With -XX:MaxDirectMemorySize=1024m -Xmx1024m:

alloc directByteBuffer for 100000 times spends 1,722ms

directByteBuffer process 100000 times spends 415ms

alloc nonDirectByteBuffer for 100000 times spends 107,563ms

nonDirectByteBuffer process 100000 times spends 4,149ms

 

  With -XX:MaxDirectMemorySize=1024m -Xmx1024m -Xms1024m:

 

alloc directByteBuffer for 100000 times spends 1,177ms

directByteBuffer process 100000 times spends 260ms

alloc nonDirectByteBuffer for 100000 times spends 63,807ms

nonDirectByteBuffer process 100000 times spends 4,523ms

 

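This test creates 100,000 ByteBuffers. When allocating fairly small blocks, allocating direct buffer memory is more than 10 times faster than allocating on the heap.

In this test it is the heap that is dozens of times slower than direct (for example 63,807 ms versus 1,177 ms for allocation).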

 

 

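For the heap buffer, the allocation time differs hugely between the two tests. Possible causes are Full GC, or the operating system running short of memory (because the direct buffers take up too much) and starting to swap.

For test one, I ran the heap buffer test alone, with the direct test commented out. The results: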

 

alloc nonDirectByteBuffer for 100000 times spends 28,141ms

nonDirectByteBuffer process 100000 times spends 237ms

This is very close to the earlier result.

 

For test two, however, the result became:

 

alloc nonDirectByteBuffer for 100000 times spends 2,033ms

nonDirectByteBuffer process 100000 times spends 800ms

Adding -server to test two:

 

alloc nonDirectByteBuffer for 100000 times spends 1849ms

nonDirectByteBuffer process 100000 times spends 771ms

 

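However, this is still noticeably slower than the direct figures from test two (1,177 ms and 260 ms).

The main cause is probably GC.

This is where the value of BigMemory lies.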
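
One way to check the GC hypothesis is to read the collector counters before and after the allocation loop. The sketch below uses the standard java.lang.management API. The class name GcStats and its helper methods are made up for illustration, and the loop size in main is arbitrary.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class GcStats {
    // Sum of collection counts over all collectors (undefined values are skipped).
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Sum of approximate accumulated collection time in milliseconds.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalGcCount();
        long timeBefore = totalGcTimeMillis();
        long allocated = 0;
        for (int i = 0; i < 1_000; i++) {
            // Each heap buffer becomes garbage immediately after this statement.
            allocated += ByteBuffer.allocate(1024 * 1024).capacity();
        }
        System.out.println("allocated " + allocated + " bytes, "
                + (totalGcCount() - countBefore) + " collections, "
                + (totalGcTimeMillis() - timeBefore) + " ms in GC");
    }
}
```

If most of the elapsed time in the heap test shows up as GC time here, that supports the explanation above. If not, swapping or something else is the more likely cause.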

 

 

Next, a test with objects, which requires serializing and deserializing them.

// Needs java.io.* and a static import of org.junit.Assert.assertEquals (assuming JUnit 4).
@Test
	public void test() {
		for(int i=0;i<100;i++){
			System.out.println("===="+i+"=====");
			testOutofHeapCache();

		}
		
	}

	private void testOutofHeapCache() {
		int cap=1000000;
		Foo foo=new Foo();
		foo.setF1(String.valueOf(System.currentTimeMillis()));
		foo.setF2("f2");
		long st=System.currentTimeMillis();
		ByteBuffer directBuf = ByteBuffer.allocateDirect(cap);
		long ed=System.currentTimeMillis();
		System.out.println("allocate cache spends "+(ed-st)+"ms");

		 st=System.currentTimeMillis();
		byte[] bytesFromObject = getBytesFromObject(foo);
		 ed=System.currentTimeMillis();
		System.out.println("serialize spends "+(ed-st)+"ms");
		st=System.currentTimeMillis();

		directBuf.put(bytesFromObject);
		ed=System.currentTimeMillis();
		System.out.println("put cache spends "+(ed-st)+"ms");

		byte[] result=new byte[bytesFromObject.length];
		st=System.currentTimeMillis();
		for(int i=0,size=bytesFromObject.length;i<size;i++){
			result[i]=directBuf.get(i);

		}
		ed=System.currentTimeMillis();
		System.out.println("get cache spends "+(ed-st)+"ms");

		st=System.currentTimeMillis();
		Foo resultFoo=(Foo)this.getObjectFromBytes(result);
		ed=System.currentTimeMillis();
		System.out.println("deserialize spends "+(ed-st)+"ms");
		assertEquals(foo.getF1(),resultFoo.getF1());
		assertEquals(foo.getF2(),resultFoo.getF2());
		directBuf.clear();
	}
	
	public static byte[] getBytesFromObject(Serializable obj) {
        if (obj == null) {
            return null;
        }
        ByteArrayOutputStream bo = new ByteArrayOutputStream();
        ObjectOutputStream oo;
		try {
			oo = new ObjectOutputStream(bo);
			oo.writeObject(obj);
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
       
        return bo.toByteArray();
    }
	
	public static Object getObjectFromBytes(byte[] objBytes) {
        if (objBytes == null || objBytes.length == 0) {
            return null;
        }
        ByteArrayInputStream bi = new ByteArrayInputStream(objBytes);
        ObjectInputStream oi = null;
		try {
			oi = new ObjectInputStream(bi);
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
        try {
        	
			return oi==null?null:oi.readObject();
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (ClassNotFoundException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
        return null;
    }


	static class Foo implements Serializable{
		/**
		 * 
		 */
		private static final long serialVersionUID = 1L;
		private String f1;
		private String f2;
		public String getF1() {
			return f1;
		}
		public void setF1(String f1) {
			this.f1 = f1;
		}
		public String getF2() {
			return f2;
		}
		public void setF2(String f2) {
			this.f2 = f2;
		}
		
	}
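
The byte-by-byte get loop in the test above can also be written with bulk operations. The following is a compact sketch of the same serialize, store off-heap, read back, deserialize cycle. The class name OffHeapRoundTrip and its methods are invented for this example, and wrapping checked exceptions in IllegalStateException is just one possible choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

class OffHeapRoundTrip {
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Serializes obj, parks the bytes in a direct buffer, copies them back
    // with a single bulk get, and deserializes the copy.
    static Object roundTrip(Serializable obj) {
        try {
            byte[] data = serialize(obj);
            ByteBuffer direct = ByteBuffer.allocateDirect(data.length);
            direct.put(data);   // bulk put into off-heap memory
            direct.flip();      // switch from writing to reading
            byte[] copy = new byte[direct.remaining()];
            direct.get(copy);   // bulk get back onto the heap
            return deserialize(copy);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("f2"));
    }
}
```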

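 Conclusions:

1 Serialization and deserialization account for most of the time.

2 After the first iteration, the measured times drop to roughly 0 ms: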

====0=====

allocate cache spends 2ms

serialize spends 11ms

put cache spends 0ms

get cache spends 0ms

deserialize spends 3ms

====1=====

allocate cache spends 1ms

serialize spends 0ms

put cache spends 0ms

get cache spends 0ms

deserialize spends 0ms

====2=====

allocate cache spends 1ms

serialize spends 0ms

put cache spends 0ms

get cache spends 0ms

deserialize spends 1ms

====3=====

allocate cache spends 1ms

serialize spends 0ms

put cache spends 0ms

get cache spends 0ms

deserialize spends 0ms
