A problem in Hessian 3.2.1 when serializing a 32.5k string

Reposted from my own blog at the company:
http://pt.alibaba-inc.com/wp/experience_929/hessian-big-string-serialize-problems.html

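The site showed a rather odd symptom: in production, some Offer records kept failing to deserialize, yet the failure never appeared in the test environment.

Remote-debugging the production environment showed that Hessian 3.2.1 fails when it handles the 0x33 tag. Stepping into the code revealed the following: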

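When Hessian 3.2.1 handles a large String, it splits it into 32k chunks. The last chunk, which is not a full 32k, is handled in one of three ways:

(1) Chunk size of 1 to 31 bytes:
The length is written in a single byte (used this way, one byte represents a length of at most 31), followed by the data.

(2) Chunk size of 32 to 1023 bytes:
The length is written in two bytes, followed by the data. A length of up to 1023 needs only the low byte plus the lowest 2 bits of the high byte, so rather than waste the rest of the high byte, its upper bits are reused as a flag. The result is a tag in the range 0x30 to 0x33.

(3) Chunk size of 1024 to 32k-1 bytes:
The chunk is marked with the 'S' prefix (the tag that parseChar, shown below, treats as the final chunk), followed by a two-byte length, and is read as a chunk.

The problem is in the second case: Hessian2Input does not restore the length bits that were packed into the tag byte.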
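To make the three cases concrete, here is a minimal sketch of how the header of the last chunk could be laid out. It is not the Hessian source: the class LastChunkHeader and its methods are hypothetical names, and the layout is inferred only from the description above and from the tags that parseChar checks.

```java
public class LastChunkHeader {
    // Hypothetical helper: header bytes written before the data of the final chunk.
    static byte[] encode(int len) {
        if (len < 0 || len > 0xffff) {
            throw new IllegalArgumentException("length out of range: " + len);
        }
        if (len <= 31) {
            // Case 1: the tag byte itself is the length (0x00..0x1f).
            return new byte[] { (byte) len };
        } else if (len <= 1023) {
            // Case 2: the top 2 length bits are packed into the tag (0x30..0x33),
            // the low 8 bits follow in the next byte.
            return new byte[] { (byte) (0x30 + (len >> 8)), (byte) len };
        } else {
            // Case 3: 'S' followed by a two-byte big-endian length.
            return new byte[] { (byte) 'S', (byte) (len >> 8), (byte) len };
        }
    }

    // Formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int[] samples = { 31, 32, 512, 1023, 1024 };
        for (int len : samples) {
            System.out.println(len + " -> " + hex(encode(len)));
        }
        // 31 -> 1f
        // 32 -> 30 20
        // 512 -> 32 00
        // 1023 -> 33 ff
        // 1024 -> 53 04 00
    }
}
```

Under this layout, a 32.5k string ends with a 512-byte chunk whose header starts with 0x32, and a tail of 768 to 1023 bytes starts with 0x33, the tag reported in the exception.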

(1) Exception message (the string at the failing position has been replaced with xxx):
expected string at 0x33 java.lang.String (xxx)
    at com.caucho.hessian.io.Hessian2Input.error(Hessian2Input.java:2714)
    at com.caucho.hessian.io.Hessian2Input.expect(Hessian2Input.java:2685)
    at com.caucho.hessian.io.Hessian2Input.parseChar(Hessian2Input.java:2442)
    at com.caucho.hessian.io.Hessian2Input.readString(Hessian2Input.java:1285)
    at com.caucho.hessian.io.JavaDeserializer$StringFieldDeserializer.deserialize(JavaDeserializer.java:580)
    ... 21 more
    at com.caucho.hessian.io.JavaDeserializer.logDeserializeError(JavaDeserializer.java:671)
    at com.caucho.hessian.io.JavaDeserializer$StringFieldDeserializer.deserialize(JavaDeserializer.java:584)
    at com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:233)
    at com.caucho.hessian.io.JavaDeserializer.readObject(JavaDeserializer.java:157)
    at com.caucho.hessian.io.Hessian2Input.readObjectInstance(Hessian2Input.java:2067)
    at com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:1592)
    at com.caucho.hessian.io.Hessian2Input.readObject(Hessian2Input.java:1576)
    ... 14 more

(2) Problem code (Hessian2Input):
private int parseChar() throws IOException
{
    while (_chunkLength <= 0) {
        if (_isLastChunk)
            return -1;

        int code = _offset < _length ? (_buffer[_offset++] & 0xff) : read();

        switch (code) {
        case BC_STRING_CHUNK:
            _isLastChunk = false;
            _chunkLength = (read() << 8) + read();
            break;

        case 'S':
            _isLastChunk = true;
            _chunkLength = (read() << 8) + read();
            break;

        case 0x00: case 0x01: case 0x02: case 0x03:
        case 0x04: case 0x05: case 0x06: case 0x07:
        case 0x08: case 0x09: case 0x0a: case 0x0b:
        case 0x0c: case 0x0d: case 0x0e: case 0x0f:
        case 0x10: case 0x11: case 0x12: case 0x13:
        case 0x14: case 0x15: case 0x16: case 0x17:
        case 0x18: case 0x19: case 0x1a: case 0x1b:
        case 0x1c: case 0x1d: case 0x1e: case 0x1f:
            _isLastChunk = true;
            _chunkLength = code - 0x00;
            break;

        // The problem: there was no case for a final chunk whose length is
        // between 0x1F and 1K, where the high length bits are packed into the tag.
        // The following four lines are the newly added fix:
        case 0x30: case 0x31: case 0x32: case 0x33:
            _isLastChunk = true;
            _chunkLength = ((code - 0x30) << 8) + read();
            break;

        default:
            throw expect("string", code);
        }
    }

    _chunkLength--;

    return parseUTF8Char();
}
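The arithmetic of the added 0x30 to 0x33 branch can be checked in isolation. The sketch below is a hypothetical standalone helper (LastChunkLength.decode is not part of Hessian); it takes the tag and the bytes that follow it as plain ints and mirrors the three final-chunk branches of the switch above.

```java
public class LastChunkLength {
    // Hypothetical helper: computes the final chunk length from the tag
    // and the length bytes that follow it.
    static int decode(int code, int... rest) {
        if (code >= 0x00 && code <= 0x1f) {
            return code;                               // length stored in the tag itself
        }
        if (code >= 0x30 && code <= 0x33) {
            return ((code - 0x30) << 8) + rest[0];     // the branch added by the fix
        }
        if (code == 'S') {
            return (rest[0] << 8) + rest[1];           // two-byte length
        }
        throw new IllegalArgumentException(
            "expected string at 0x" + Integer.toHexString(code));
    }

    public static void main(String[] args) {
        System.out.println(decode(0x1f));              // 31
        System.out.println(decode(0x32, 0x00));        // 512
        System.out.println(decode(0x33, 0xff));        // 1023
        System.out.println(decode('S', 0x04, 0x00));   // 1024
    }
}
```

Without the middle branch, a tag of 0x33 falls through to the error path, which matches the "expected string at 0x33" message in the exception.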

(3) Test code (the imports and the class name HessianBigStringTest are added here only so that the snippet compiles):
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import com.caucho.hessian.io.Hessian2Input;
import com.caucho.hessian.io.Hessian2Output;
import com.caucho.hessian.io.SerializerFactory;

public class HessianBigStringTest {

    public static void main(String[] args) throws IOException {
        test(1024 * 32);        // OK
        test(1024 * 32 + 1);    // OK
        test(1024 * 32 + 31);   // OK
        test(1024 * 32 + 32);   // ERROR
        test(1024 * 32 + 512);  // ERROR
        test(1024 * 32 + 1023); // ERROR
        test(1024 * 33);        // OK
    }

    public static void test(int size) throws IOException {
        SerializerFactory responseSerializerFactory = new SerializerFactory();

        // Build a string of `size` ASCII characters.
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < size; i++) {
            buf.append('A');
        }
        String str = buf.toString();
        System.out.println("length: " + str.getBytes().length);

        // Serialize the string.
        ByteArrayOutputStream byteBuffer = new ByteArrayOutputStream(2048);
        Hessian2Output hessianOutput = new Hessian2Output(byteBuffer);
        hessianOutput.setSerializerFactory(responseSerializerFactory);
        hessianOutput.writeObject(str);
        hessianOutput.flush();
        byte[] bytes = byteBuffer.toByteArray();

        // Deserialize it again; this is where the ERROR cases fail.
        ByteArrayInputStream input = new ByteArrayInputStream(bytes);
        Hessian2Input hessianInput = new Hessian2Input(input);
        hessianInput.setSerializerFactory(responseSerializerFactory);
        String result = (String) hessianInput.readObject(String.class);
        System.out.println("result: " + result);
    }
}
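The OK/ERROR pattern in the test can be predicted from the string length alone. The sketch below is only a model of that pattern, with hypothetical names (BugRange, hitsUnhandledTag). It assumes every chunk before the last is exactly 32k, that the length is counted as in the all-ASCII test, and that a string fitting in a single chunk is not affected, as the test results suggest.

```java
public class BugRange {
    static final int CHUNK = 0x8000; // 32k, the chunk size described in the article

    // Length of the final chunk after peeling off full 32k chunks (assumed layout).
    static int lastChunkLength(int length) {
        int remaining = length;
        while (remaining > CHUNK) {
            remaining -= CHUNK;
        }
        return remaining;
    }

    // True when the final chunk of a multi-chunk string would get a 0x30..0x33 tag,
    // which the unpatched parseChar does not handle.
    static boolean hitsUnhandledTag(int length) {
        if (length <= CHUNK) {
            return false; // assumption: single-chunk strings are read correctly
        }
        int last = lastChunkLength(length);
        return last >= 32 && last <= 1023;
    }

    public static void main(String[] args) {
        int[] sizes = {
            1024 * 32, 1024 * 32 + 1, 1024 * 32 + 31, 1024 * 32 + 32,
            1024 * 32 + 512, 1024 * 32 + 1023, 1024 * 33
        };
        for (int size : sizes) {
            System.out.println(size + ": " + (hitsUnhandledTag(size) ? "ERROR" : "OK"));
        }
    }
}
```

This would also explain why the failure is hard to reproduce: only strings whose tail after the last full 32k chunk is between 32 and 1023 long fall into the unhandled range.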
