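Have you ever wished that converting a Java object to a stream of bytes could be as fast as it is in native C++ code?
If you use the standard Java serialization mechanism, the performance is likely to disappoint you. Java serialization
was designed to serve other needs, not to serialize objects as quickly and compactly as possible.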
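Why do we need fast and compact serialization? Many systems are distributed, which requires state to be
communicated efficiently between service nodes. That state is packaged in objects. Profiling most tuned systems
shows that the cost lies mainly in serializing state to and from the byte buffers used for transfer. Many protocols
and mechanisms exist to address this. You can use an inefficient protocol such as Java serialization, XML or JSON.
Or you can choose a binary protocol, which can be much faster and more efficient, provided you understand the
mechanism well and have the skill to implement it.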
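The example below shows how a simple binary protocol, combined with mechanisms that Java itself provides, can deliver performance comparable to C/C++ code.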
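Three approaches are compared:
1. Java serialization: the object implements the standard Java Serializable interface.
2. Binary via ByteBuffer: the fields are written in binary format using the ByteBuffer API.
3. Binary via Unsafe: the Unsafe class and its methods are used to manipulate memory directly.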
import sun.misc.Unsafe;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.Arrays;

public final class TestSerialisationPerf {
    public static final int REPETITIONS = 1 * 1000 * 1000;

    private static ObjectToBeSerialised ITEM =
        new ObjectToBeSerialised(
            1010L, true, 777, 99,
            new double[]{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0},
            new long[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10});

    public static void main(final String[] arg) throws Exception {
        for (final PerformanceTestCase testCase : testCases) {
            // Run each case five times so that later runs reflect warmed-up code.
            for (int i = 0; i < 5; i++) {
                testCase.performTest();

                System.out.format("%d %s\twrite=%,dns read=%,dns total=%,dns\n",
                    i,
                    testCase.getName(),
                    testCase.getWriteTimeNanos(),
                    testCase.getReadTimeNanos(),
                    testCase.getWriteTimeNanos() + testCase.getReadTimeNanos());

                // Verify that the round trip reproduced the original object.
                if (!ITEM.equals(testCase.getTestOutput())) {
                    throw new IllegalStateException("Objects do not match");
                }

                System.gc();
                Thread.sleep(3000);
            }
        }
    }

    private static final PerformanceTestCase[] testCases = {
        // 1. Standard Java serialization
        new PerformanceTestCase("Serialisation", REPETITIONS, ITEM) {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();

            public void testWrite(ObjectToBeSerialised item) throws Exception {
                for (int i = 0; i < REPETITIONS; i++) {
                    baos.reset();

                    ObjectOutputStream oos = new ObjectOutputStream(baos);
                    oos.writeObject(item);
                    oos.close();
                }
            }

            public ObjectToBeSerialised testRead() throws Exception {
                ObjectToBeSerialised object = null;
                for (int i = 0; i < REPETITIONS; i++) {
                    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
                    ObjectInputStream ois = new ObjectInputStream(bais);
                    object = (ObjectToBeSerialised)ois.readObject();
                }

                return object;
            }
        },

        // 2. Binary encoding via the ByteBuffer API
        new PerformanceTestCase("ByteBuffer", REPETITIONS, ITEM) {
            ByteBuffer byteBuffer = ByteBuffer.allocate(1024);

            public void testWrite(ObjectToBeSerialised item) throws Exception {
                for (int i = 0; i < REPETITIONS; i++) {
                    byteBuffer.clear();
                    item.write(byteBuffer);
                }
            }

            public ObjectToBeSerialised testRead() throws Exception {
                ObjectToBeSerialised object = null;
                for (int i = 0; i < REPETITIONS; i++) {
                    byteBuffer.flip();
                    object = ObjectToBeSerialised.read(byteBuffer);
                }

                return object;
            }
        },

        // 3. Binary encoding via direct memory access with Unsafe
        new PerformanceTestCase("UnsafeMemory", REPETITIONS, ITEM) {
            UnsafeMemory buffer = new UnsafeMemory(new byte[1024]);

            public void testWrite(ObjectToBeSerialised item) throws Exception {
                for (int i = 0; i < REPETITIONS; i++) {
                    buffer.reset();
                    item.write(buffer);
                }
            }

            public ObjectToBeSerialised testRead() throws Exception {
                ObjectToBeSerialised object = null;
                for (int i = 0; i < REPETITIONS; i++) {
                    buffer.reset();
                    object = ObjectToBeSerialised.read(buffer);
                }

                return object;
            }
        },
    };
}

abstract class PerformanceTestCase {
    private final String name;
    private final int repetitions;
    private final ObjectToBeSerialised testInput;
    private ObjectToBeSerialised testOutput;
    private long writeTimeNanos;
    private long readTimeNanos;

    public PerformanceTestCase(final String name, final int repetitions,
                               final ObjectToBeSerialised testInput) {
        this.name = name;
        this.repetitions = repetitions;
        this.testInput = testInput;
    }

    public String getName() {
        return name;
    }

    public ObjectToBeSerialised getTestOutput() {
        return testOutput;
    }

    public long getWriteTimeNanos() {
        return writeTimeNanos;
    }

    public long getReadTimeNanos() {
        return readTimeNanos;
    }

    // Times the write and read phases and stores the average cost per operation.
    public void performTest() throws Exception {
        final long startWriteNanos = System.nanoTime();
        testWrite(testInput);
        writeTimeNanos = (System.nanoTime() - startWriteNanos) / repetitions;

        final long startReadNanos = System.nanoTime();
        testOutput = testRead();
        readTimeNanos = (System.nanoTime() - startReadNanos) / repetitions;
    }

    public abstract void testWrite(ObjectToBeSerialised item) throws Exception;

    public abstract ObjectToBeSerialised testRead() throws Exception;
}

class ObjectToBeSerialised implements Serializable {
    private static final long serialVersionUID = 10275539472837495L;

    private final long sourceId;
    private final boolean special;
    private final int orderCode;
    private final int priority;
    private final double[] prices;
    private final long[] quantities;

    public ObjectToBeSerialised(final long sourceId, final boolean special,
                                final int orderCode, final int priority,
                                final double[] prices, final long[] quantities) {
        this.sourceId = sourceId;
        this.special = special;
        this.orderCode = orderCode;
        this.priority = priority;
        this.prices = prices;
        this.quantities = quantities;
    }

    // ByteBuffer encoding: fields in declaration order, arrays prefixed by their length.
    public void write(final ByteBuffer byteBuffer) {
        byteBuffer.putLong(sourceId);
        byteBuffer.put((byte)(special ? 1 : 0));
        byteBuffer.putInt(orderCode);
        byteBuffer.putInt(priority);

        byteBuffer.putInt(prices.length);
        for (final double price : prices) {
            byteBuffer.putDouble(price);
        }

        byteBuffer.putInt(quantities.length);
        for (final long quantity : quantities) {
            byteBuffer.putLong(quantity);
        }
    }

    public static ObjectToBeSerialised read(final ByteBuffer byteBuffer) {
        final long sourceId = byteBuffer.getLong();
        final boolean special = 0 != byteBuffer.get();
        final int orderCode = byteBuffer.getInt();
        final int priority = byteBuffer.getInt();

        final int pricesSize = byteBuffer.getInt();
        final double[] prices = new double[pricesSize];
        for (int i = 0; i < pricesSize; i++) {
            prices[i] = byteBuffer.getDouble();
        }

        final int quantitiesSize = byteBuffer.getInt();
        final long[] quantities = new long[quantitiesSize];
        for (int i = 0; i < quantitiesSize; i++) {
            quantities[i] = byteBuffer.getLong();
        }

        return new ObjectToBeSerialised(sourceId, special, orderCode,
                                        priority, prices, quantities);
    }

    // Unsafe encoding: same field order, arrays copied as whole memory blocks.
    public void write(final UnsafeMemory buffer) {
        buffer.putLong(sourceId);
        buffer.putBoolean(special);
        buffer.putInt(orderCode);
        buffer.putInt(priority);
        buffer.putDoubleArray(prices);
        buffer.putLongArray(quantities);
    }

    public static ObjectToBeSerialised read(final UnsafeMemory buffer) {
        final long sourceId = buffer.getLong();
        final boolean special = buffer.getBoolean();
        final int orderCode = buffer.getInt();
        final int priority = buffer.getInt();
        final double[] prices = buffer.getDoubleArray();
        final long[] quantities = buffer.getLongArray();

        return new ObjectToBeSerialised(sourceId, special, orderCode,
                                        priority, prices, quantities);
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }

        final ObjectToBeSerialised that = (ObjectToBeSerialised)o;

        if (orderCode != that.orderCode) {
            return false;
        }
        if (priority != that.priority) {
            return false;
        }
        if (sourceId != that.sourceId) {
            return false;
        }
        if (special != that.special) {
            return false;
        }
        if (!Arrays.equals(prices, that.prices)) {
            return false;
        }
        if (!Arrays.equals(quantities, that.quantities)) {
            return false;
        }

        return true;
    }
}

class UnsafeMemory {
    private static final Unsafe unsafe;

    static {
        try {
            // Unsafe has no public constructor, so obtain the singleton reflectively.
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (Unsafe)field.get(null);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    private static final long byteArrayOffset = unsafe.arrayBaseOffset(byte[].class);
    private static final long longArrayOffset = unsafe.arrayBaseOffset(long[].class);
    private static final long doubleArrayOffset = unsafe.arrayBaseOffset(double[].class);

    private static final int SIZE_OF_BOOLEAN = 1;
    private static final int SIZE_OF_INT = 4;
    private static final int SIZE_OF_LONG = 8;

    private int pos = 0;
    private final byte[] buffer;

    public UnsafeMemory(final byte[] buffer) {
        if (null == buffer) {
            throw new NullPointerException("buffer cannot be null");
        }

        this.buffer = buffer;
    }

    public void reset() {
        this.pos = 0;
    }

    public void putBoolean(final boolean value) {
        unsafe.putBoolean(buffer, byteArrayOffset + pos, value);
        pos += SIZE_OF_BOOLEAN;
    }

    public boolean getBoolean() {
        boolean value = unsafe.getBoolean(buffer, byteArrayOffset + pos);
        pos += SIZE_OF_BOOLEAN;

        return value;
    }

    public void putInt(final int value) {
        unsafe.putInt(buffer, byteArrayOffset + pos, value);
        pos += SIZE_OF_INT;
    }

    public int getInt() {
        int value = unsafe.getInt(buffer, byteArrayOffset + pos);
        pos += SIZE_OF_INT;

        return value;
    }

    public void putLong(final long value) {
        unsafe.putLong(buffer, byteArrayOffset + pos, value);
        pos += SIZE_OF_LONG;
    }

    public long getLong() {
        long value = unsafe.getLong(buffer, byteArrayOffset + pos);
        pos += SIZE_OF_LONG;

        return value;
    }

    public void putLongArray(final long[] values) {
        putInt(values.length);

        // Each long occupies 8 bytes, hence the shift by 3.
        long bytesToCopy = values.length << 3;
        unsafe.copyMemory(values, longArrayOffset,
                          buffer, byteArrayOffset + pos,
                          bytesToCopy);
        pos += bytesToCopy;
    }

    public long[] getLongArray() {
        int arraySize = getInt();
        long[] values = new long[arraySize];

        long bytesToCopy = values.length << 3;
        unsafe.copyMemory(buffer, byteArrayOffset + pos,
                          values, longArrayOffset,
                          bytesToCopy);
        pos += bytesToCopy;

        return values;
    }

    public void putDoubleArray(final double[] values) {
        putInt(values.length);

        // Each double occupies 8 bytes, hence the shift by 3.
        long bytesToCopy = values.length << 3;
        unsafe.copyMemory(values, doubleArrayOffset,
                          buffer, byteArrayOffset + pos,
                          bytesToCopy);
        pos += bytesToCopy;
    }

    public double[] getDoubleArray() {
        int arraySize = getInt();
        double[] values = new double[arraySize];

        long bytesToCopy = values.length << 3;
        unsafe.copyMemory(buffer, byteArrayOffset + pos,
                          values, doubleArrayOffset,
                          bytesToCopy);
        pos += bytesToCopy;

        return values;
    }
}
The results:

2.8GHz Nehalem - Java 1.7.0_04
==============================
0 Serialisation  write=2,517ns read=11,570ns total=14,087ns
1 Serialisation  write=2,198ns read=11,122ns total=13,320ns
2 Serialisation  write=2,190ns read=11,011ns total=13,201ns
3 Serialisation  write=2,221ns read=10,972ns total=13,193ns
4 Serialisation  write=2,187ns read=10,817ns total=13,004ns
0 ByteBuffer     write=264ns read=273ns total=537ns
1 ByteBuffer     write=248ns read=243ns total=491ns
2 ByteBuffer     write=262ns read=243ns total=505ns
3 ByteBuffer     write=300ns read=240ns total=540ns
4 ByteBuffer     write=247ns read=243ns total=490ns
0 UnsafeMemory   write=99ns read=84ns total=183ns
1 UnsafeMemory   write=53ns read=82ns total=135ns
2 UnsafeMemory   write=63ns read=66ns total=129ns
3 UnsafeMemory   write=46ns read=63ns total=109ns
4 UnsafeMemory   write=48ns read=58ns total=106ns

2.4GHz Sandy Bridge - Java 1.7.0_04
===================================
0 Serialisation  write=1,940ns read=9,006ns total=10,946ns
1 Serialisation  write=1,674ns read=8,567ns total=10,241ns
2 Serialisation  write=1,666ns read=8,680ns total=10,346ns
3 Serialisation  write=1,666ns read=8,623ns total=10,289ns
4 Serialisation  write=1,715ns read=8,586ns total=10,301ns
0 ByteBuffer     write=199ns read=198ns total=397ns
1 ByteBuffer     write=176ns read=178ns total=354ns
2 ByteBuffer     write=174ns read=174ns total=348ns
3 ByteBuffer     write=172ns read=183ns total=355ns
4 ByteBuffer     write=174ns read=180ns total=354ns
0 UnsafeMemory   write=38ns read=75ns total=113ns
1 UnsafeMemory   write=26ns read=52ns total=78ns
2 UnsafeMemory   write=26ns read=51ns total=77ns
3 UnsafeMemory   write=25ns read=51ns total=76ns
4 UnsafeMemory   write=27ns read=50ns total=77ns
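Analysis:
On my 2.4GHz Sandy Bridge laptop, writing and reading back one small object with Java serialization takes roughly 10,000ns.
With the Unsafe class the same round trip takes well under 100ns (about 77ns in the runs above). To put this in context,
the cost of Java serialization is on a par with a network hop. That is very expensive if your transport is a fast IPC
mechanism within a homogeneous system.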
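There are many reasons why Java serialization is so costly. For example, it writes the fully qualified class name, the field names and version information for each object. ObjectOutputStream also keeps track of the objects written to the stream so that they can be conflated when close() is called. The binary formats need far less space to hold the object than Java serialization does. Holding the data in arrays saves more space still, because field names are skipped for the individual elements. Text-based protocols such as XML or JSON are usually even less efficient than Java serialization. Note also that Java serialization is the standard mechanism underlying RMI.
The most fundamental reason is the number of instructions executed. The Unsafe approach has a large advantage in the JVM, including JVMs from other vendors: the optimizer replaces these calls with assembly instructions that perform the memory operation directly. For primitive types this comes down to a single x86 MOV instruction, which can often complete in a single cycle.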
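The size overhead described above can be observed directly. The following is a minimal sketch, not part of the benchmark: the `Quote` class and the method names are hypothetical, and the exact serialized size depends on the class and field names, so no figure is quoted here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

public final class EncodedSizeExample {
    // Hypothetical message type, used only for this illustration.
    static final class Quote implements Serializable {
        private static final long serialVersionUID = 1L;
        final long sourceId;
        final int orderCode;
        final int priority;

        Quote(final long sourceId, final int orderCode, final int priority) {
            this.sourceId = sourceId;
            this.orderCode = orderCode;
            this.priority = priority;
        }
    }

    public static Quote sample() {
        return new Quote(1010L, 777, 99);
    }

    // Size in bytes of the standard Java serialized form, including class metadata.
    public static int javaSerializedSize(final Serializable object) {
        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.size();
    }

    // Size in bytes of a plain binary encoding: 8 (long) + 4 (int) + 4 (int).
    public static int binarySize(final Quote quote) {
        final ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.putLong(quote.sourceId).putInt(quote.orderCode).putInt(quote.priority);
        return buffer.position();
    }

    public static void main(final String[] args) {
        final Quote quote = sample();
        System.out.println("Java serialization: " + javaSerializedSize(quote) + " bytes");
        System.out.println("Binary encoding:    " + binarySize(quote) + " bytes");
    }
}
```

The binary form carries only the field values, while the serialized form also carries the stream header and the class descriptor.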
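You may ask, "What about Google's Protocol Buffers?" They are certainly useful, and their performance and flexibility are much better than Java serialization. They still do not come close to what Unsafe achieves. Protocol Buffers solve a different kind of problem: they provide elegant self-describing messages intended for services called across languages.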
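On the question of systems with different byte orders: Unsafe writes in the native byte order of the local system. That works well for IPC and between systems of the same type. Systems that use a different byte order will need to convert.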
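As a rough illustration of the byte-order conversion, the sketch below uses the standard ByteBuffer and ByteOrder classes instead of Unsafe. The class and method names are hypothetical. It shows that a value written little-endian and read big-endian comes back byte-swapped, which is exactly what the receiving side must undo.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class ByteOrderExample {
    // Encode a long using an explicitly chosen byte order.
    public static byte[] encode(final long value, final ByteOrder order) {
        final ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES).order(order);
        buffer.putLong(value);
        return buffer.array();
    }

    // Decode a long, interpreting the bytes in the given order.
    public static long decode(final byte[] bytes, final ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }

    public static void main(final String[] args) {
        final long value = 0x0102030405060708L;
        final byte[] littleEndian = encode(value, ByteOrder.LITTLE_ENDIAN);

        // Reading with the wrong order yields the byte-swapped value.
        final long misread = decode(littleEndian, ByteOrder.BIG_ENDIAN);
        System.out.println("misread is byte-swapped: " + (misread == Long.reverseBytes(value)));

        // Converting restores the original value.
        System.out.println("converted back: " + (Long.reverseBytes(misread) == value));
        System.out.println("native order here: " + ByteOrder.nativeOrder());
    }
}
```

One possible design is to agree on a single wire order up front, so that only hosts whose native order differs pay for the swap.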
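How should we handle multiple versions of a class, or determine which class an object belongs to?
Use an integer header value that indicates the class and the version of its implementation.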
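One way such a header could look is sketched below. The layout (type id in the upper 16 bits, version in the lower 16 bits), the constants and the two-version message are all assumptions made for illustration; the article does not prescribe a particular format.

```java
import java.nio.ByteBuffer;

public final class VersionedHeaderExample {
    // Hypothetical identifiers, chosen only for this sketch.
    public static final int TYPE_ORDER = 1;
    public static final int VERSION_1 = 1;
    public static final int VERSION_2 = 2;

    // Pack the type id into the upper 16 bits and the version into the lower 16 bits.
    public static int header(final int typeId, final int version) {
        return (typeId << 16) | (version & 0xFFFF);
    }

    public static int typeOf(final int header) {
        return header >>> 16;
    }

    public static int versionOf(final int header) {
        return header & 0xFFFF;
    }

    // Version 1 carries only sourceId; version 2 appends priority.
    public static void write(final ByteBuffer buffer, final int version,
                             final long sourceId, final int priority) {
        buffer.putInt(header(TYPE_ORDER, version));
        buffer.putLong(sourceId);
        if (version >= VERSION_2) {
            buffer.putInt(priority);
        }
    }

    // The reader inspects the header first and then decodes the matching layout.
    public static String read(final ByteBuffer buffer) {
        final int header = buffer.getInt();
        if (typeOf(header) != TYPE_ORDER) {
            throw new IllegalArgumentException("Unknown type id: " + typeOf(header));
        }
        final long sourceId = buffer.getLong();
        final int priority = versionOf(header) >= VERSION_2 ? buffer.getInt() : 0;
        return "sourceId=" + sourceId + " priority=" + priority;
    }

    public static void main(final String[] args) {
        final ByteBuffer buffer = ByteBuffer.allocate(32);
        write(buffer, VERSION_2, 1010L, 99);
        buffer.flip();
        System.out.println(read(buffer));
    }
}
```

Because the header is read before any field, an older layout can still be decoded and missing fields given a default.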
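A common objection to binary protocols is: how do you make them human-readable and easy to debug? The simplest answer is to develop a tool that reads the binary format.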
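Such a tool can be quite small. The sketch below is a hypothetical dumper for the field layout used by the ByteBuffer encoding in the benchmark (long, byte flag, two ints, then length-prefixed double and long arrays). The class name and output format are assumptions, not an existing utility.

```java
import java.nio.ByteBuffer;

public final class BinaryDumpExample {
    // Render an encoded message as text, field by field, in wire order.
    public static String describe(final ByteBuffer buffer) {
        final StringBuilder sb = new StringBuilder();
        sb.append("sourceId=").append(buffer.getLong());
        sb.append(" special=").append(buffer.get() != 0);
        sb.append(" orderCode=").append(buffer.getInt());
        sb.append(" priority=").append(buffer.getInt());

        final int priceCount = buffer.getInt();
        sb.append(" prices=[");
        for (int i = 0; i < priceCount; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(buffer.getDouble());
        }
        sb.append("]");

        final int quantityCount = buffer.getInt();
        sb.append(" quantities=[");
        for (int i = 0; i < quantityCount; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(buffer.getLong());
        }
        sb.append("]");

        return sb.toString();
    }

    public static void main(final String[] args) {
        // Build a small sample message in the same layout, then dump it.
        final ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.putLong(1010L).put((byte) 1).putInt(777).putInt(99);
        buffer.putInt(2).putDouble(0.1).putDouble(0.2);
        buffer.putInt(2).putLong(1L).putLong(2L);
        buffer.flip();

        System.out.println(describe(buffer));
    }
}
```

The same approach extends to log files or captured network traffic by wrapping the captured bytes with `ByteBuffer.wrap`.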
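Conclusion
By applying the same techniques and working with binary streams efficiently, object serialization in Java can reach performance comparable to native C/C++ code.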