GZIP Compression in Java

When reverse engineering, printing a resulting byte[] as hexadecimal often produces output like this:

1F8B08000000000000002597C712ABBC0E809FE69C ...... (remainder omitted) ......

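The leading bytes show that this is GZIP-compressed data: 1F 8B is the GZIP magic number, and the 08 that follows selects the deflate compression method.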
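
A check of this kind can be sketched in a few lines of Java. The class name GzipMagic and the helpers looksLikeGzip and fromHex are hypothetical names chosen for this illustration; GZIPInputStream.GZIP_MAGIC is the constant the JDK itself exposes for the magic number.

```java
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not part of the original article.
final class GzipMagic {

    // True when the data starts with the GZIP magic number (1F 8B)
    // followed by compression method 08 (deflate).
    static boolean looksLikeGzip(byte[] data) {
        if (data == null || data.length < 3) {
            return false;
        }
        // The magic number is stored little-endian, so GZIP_MAGIC is 0x8B1F.
        int magic = (data[0] & 0xFF) | ((data[1] & 0xFF) << 8);
        return magic == GZIPInputStream.GZIP_MAGIC && (data[2] & 0xFF) == 8;
    }

    // Convert a hex string such as "1F8B08" into bytes (even length assumed).
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeGzip(fromHex("1F8B0800000000000000"))); // prints true
    }
}
```

A match only means the data is probably GZIP; a full decompression attempt is still the real test.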

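In the several apps I have reverse engineered so far, almost all of them use GZIP for their network traffic, compressing or decompressing the request and response data.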


Compression and decompression each use their own I/O stream:

GZIP compression: GZIPOutputStream

GZIP decompression: GZIPInputStream


Sample code:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class Gzip {

	public static void main(String[] args) throws IOException {
		String str = "hello world";

		// Use an explicit charset so the byte count does not depend on the platform default.
		byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
		System.out.println("Length before compression: " + bytes.length);
		byte[] gzipBytes = gzip(bytes);
		System.out.println("Length after compression: " + gzipBytes.length);
		System.out.println("Compressed: " + byteToHexString(gzipBytes));
		byte[] unGzipBytes = unGzip(gzipBytes);
		System.out.println("Decompressed: " + byteToHexString(unGzipBytes));
	}

	// Compress the given bytes into GZIP format.
	public static byte[] gzip(byte[] content) throws IOException {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		// Closing the stream writes the GZIP trailer, so read baos only after the try block.
		try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
			gos.write(content);
		}
		return baos.toByteArray();
	}

	// Decompress GZIP data back into the original bytes.
	public static byte[] unGzip(byte[] content) throws IOException {
		ByteArrayOutputStream baos = new ByteArrayOutputStream();
		try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(content))) {
			byte[] buffer = new byte[1024];
			int n;
			while ((n = gis.read(buffer)) != -1) {
				baos.write(buffer, 0, n);
			}
		}
		return baos.toByteArray();
	}

	// Render bytes as an uppercase hex string, two digits per byte.
	public static String byteToHexString(byte[] bytes) {
		StringBuilder sb = new StringBuilder(bytes.length * 2);
		for (byte b : bytes) {
			String hex = Integer.toHexString(0xFF & b);
			if (hex.length() < 2) {
				sb.append('0');
			}
			sb.append(hex.toUpperCase());
		}
		return sb.toString();
	}

}

Output:

[Image 1: screenshot of the program output]

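The leading 1F8B0800000000000000 is the 10-byte GZIP header: 1F 8B is the magic number, 08 is the deflate method, and the flag, timestamp, extra-flag and OS bytes that follow are all zero in this output. The last 8 bytes are the trailer: a 4-byte CRC-32 of the uncompressed data, then a 4-byte length of the uncompressed data (ISIZE, modulo 2^32), both stored little-endian.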
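
To see the trailer in practice, the following minimal sketch compresses a string and reads the two trailer fields back out. It assumes the standard GZIP layout from RFC 1952 (CRC-32 then ISIZE, each 4 bytes, little-endian); the class GzipTrailer and its method names are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original article.
final class GzipTrailer {

    // Compress bytes with GZIPOutputStream; in-memory I/O errors are rethrown unchecked.
    static byte[] gzip(byte[] content) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // CRC-32 of the uncompressed data: the first 4 of the last 8 bytes, little-endian.
    static long trailerCrc32(byte[] gz) {
        return ByteBuffer.wrap(gz, gz.length - 8, 4)
                .order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
    }

    // ISIZE, the uncompressed length modulo 2^32: the last 4 bytes, little-endian.
    static long trailerISize(byte[] gz) {
        return ByteBuffer.wrap(gz, gz.length - 4, 4)
                .order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] raw = "hello world".getBytes(StandardCharsets.UTF_8);
        byte[] gz = gzip(raw);

        CRC32 crc = new CRC32();
        crc.update(raw);

        System.out.println("CRC-32 matches: " + (trailerCrc32(gz) == crc.getValue()));
        System.out.println("ISIZE: " + trailerISize(gz)); // 11, the length of "hello world"
    }
}
```

Because ISIZE is taken modulo 2^32, it equals the true length only for inputs smaller than 4 GiB.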


Now look at the compressed results for the four strings "a", "abc", "abcde" and "asdfghjk":

[Image 2: screenshot of the compressed results for the four strings]

They all follow essentially the same layout.
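
The comparison can be reproduced with a short sketch that compresses the same four strings and prints the first 10 bytes of each result. The class GzipShortInputs and its helpers are hypothetical names for this illustration. The exact header bytes may differ between JDK versions (the OS byte in particular), so no expected output is shown; within a single run the headers should be identical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original article.
final class GzipShortInputs {

    // Compress the UTF-8 bytes of a string with GZIPOutputStream.
    static byte[] gzip(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(baos)) {
            gos.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // The first 10 bytes of a GZIP stream form its header.
    static byte[] header(byte[] gz) {
        return Arrays.copyOfRange(gz, 0, 10);
    }

    // Uppercase hex rendering, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "abc", "abcde", "asdfghjk"}) {
            byte[] gz = gzip(s);
            System.out.println(s + " -> header " + hex(header(gz))
                    + ", total " + gz.length + " bytes");
        }
    }
}
```

For inputs this short the output is longer than the input, since the header and trailer alone take 18 bytes.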


As a final example, feed in a longer piece of data and look at the result:

---------------------- Data -------------------

"lng=113.902302&cx=d41bgtHhx7iFJPuvzpwy8o7tNKB4l%2BDauZTUvdnDoEjlplqvzoXiGkVpWGf9MpgWiLOwsj5gMdDH%0AZ%2FpwELvq67gex7Y6gtgOs4mu3JJctJ0agflWYuan9qX7aZufh%2FA4E2lsouyvze344TMxzjfpnFMV%0AwNJ%2B8TIUe6qg2vC1osSpQXirubdyOr3j1pJyShr4sogM7zkiJAdrynp0arvp%2Fx64DvnjxoEsThjw%0Aq6Ma7Eb%2FvhXBGq7fEAecLtPZvQCWyFRWk5NqrSCr6KaD90ACcaDuOYZ50%2BUQJDBJ4dovO%2FuVCFnO%0AXqEecWLsJkzLyQeq2CL2u5YUcWXN3WcJwLNRxQEK%2BVr98SNeDAkVA6SCGKYz6re5dCeBJiNf2aZT%0ARhI32h91JVZOvifG2nCCduaAUjxxMa1WVBT7EdGwgalmzo5Jnhop4zIxsiurD9ZR0nheGjnqDD05%0AsgRJzxCGIW%2BWqsrWPb3omJ9dSpSeBWShKN2cn4YVMNUQZChei1ggVtjcmfq0QCLFJVT4JaUmx%2BEL%0ACeN%2Bnd4bakTOGwehwz5QYxWJKey4Bx7wScyFbdCBM4H6tLSIHW3bFumlP4Jrj4cB8FL3g%2BNI2mmq%0A8el5wmFiJ8opoyEHzVYh8uMELV6PkqrEUrblRnG%2BYIRjkNo00ZUW6e%2FqAR%2Fku%2FIjggBWLrvwYbaF%0A6JQELcshV5eBkzC%2FUWZUvnWHx4fQ4rygmAiH0rgJkCRXugTb9b1LvM7Qh0VzqLeMlOhHUeY95Y0n%0ADmVU%2B3SwXDcqXGV7xAYN%2BYtIpaXUUE0Ym4S2t0RFqoi8c6QG78CBwmLbC8iIHV%2Bqed%2BaH7wj%2B%2FgC%0AB6Np%2BBa3nYsxHYO%2Be44k0vb4FMZfjTKcbGbX2oCOR2dxCyevR48%2BnN4TaTy7af17LK00qRKoabBS%0APMBLkLo74Ay2gVuBkRJ2m8WvGDjbjql5ECuXXX3xJqKOgmb8w6TVj%2FULXqzPbwWamInoCYg8Icke%0AAxklN8GBY9Goa%2Fe2oARB7us8wfNDnA6uHBFBqKfUwk08f5TqWEJQnx7DGB1H9NYJnRryAOWnLDK1%0APGRU%2Blxa98Tc1I9WxHjuNptVfbirLfRUkYJ7JHQefrJH0NEGVmafodDfz2minx1veSCn6dQD5X3z%0AkXEixFLLQqg1HTU4QhS53RMiVaJqRfOjFXlkn%2BP36XvqUZuY%2F4QSUHKm4CxYm8Mu2L8Mwh3xouXm%0AhOP9nfxGq3N5n8eRshksxHJap1fS4s7z843hzepKgo7rXspRj%2FqWJNFKW1%2BK8UCjlXm0A7Maoome%0A5QNVa5WgLVuLuCcpZQ7u1QV%2FD%2FL2nPy%2BMhhFya35A%2F0hHULiIZlIiFomiPl6wAststTPu7LU1MyY%0ArjvLV2ImEVCyd4RmfYEmNp5EtI6LnQL66zHQnQrWEz0hfccJLn7drpemlV%2F2bj6qa2MBp9tkODnM%0Ax7jHEVgWVoDdN0Rj0eobgRHZjeuGkO04%2F%2FS1qvAlMHy4ewcnSIgOyYzz6BpHXKXXx9hY2xqj5yp%2F%0A7Tje81R6hSK1BeBdS7Wz1gw7XmYWKiezW9F6XBw%2BQ0L5vR1F%2Bf%2B83v32hS8HDKmz5e8%2FhaVZC1So%0AX5wL2HAZe8wcYfzDYbEHTSjHaz2d5AdxuwdtOu99UZkjNm9rbIkscelofvBcMuDW6MK2ojCd%2FSnS%0AJlAmMQ0nn3sVZYwCBLiXKlee8IqoDgGuUhiA19RUkNYtMg6pk7%2BieLFkKy5z7yPerpy6Jt7PTN59%0ACqYWFP9fyCP44DNU5nL0Z04UTNQqyZl%2Ff7LDVzlXygXxmVdbmgfwo%2FahUpLpQyU5
KO%2BQwipRK4mr%0AGqssLcIvYaJxtATfXqGW1kUzysJFqXKeVY4aXk8DU%2FAsX91XXtBqzcLT%2Fjke1xZNxgKicF3Kte71%0AbkeUJl39n85xVNpJ5Xn%2FRqO%2F1uG69Svj3F3ShHcj4danJUX7b2NDOk5rKbVZSfefZsPJXNCB0lR2%0AQQGuMMYR4F1gkUZElgyidnPr7R4cOa9%2BuGf7txOYGSfXPyH1N8lXEd1A1Iyt1eDwaIkzbH4rGcVQ%0AK%2ByOWmHKeqJOBmWXfBm5ZHxJ0zBtAsbc2%2BFQbttZ17eE7%2BpvOvWiS9Enqy90vUQSaLdzmvQdlGYI%0AYlHijyZuDYCKgPibV1HtU6Iz%2Fes%2FKkGNCR64Y9r0E00zqw8PLSqCjdClWOZYv9K6xClLOApMMUdw%0AdPZ9UUQZ2ihOtPtd%2BP4U9DPIIhlHG4LRu%2B6sL2hMaV7PqbBNYA9EbgP41R2aSRdia4Nc0z95EUZW%0AdBrUcPdUl5FTe8CYobLYolevokSwVstKsLXO0vMWWbgHA2sXzruKVm4pGudizM4%2FL9TIBY8DTIxy%0AAOh2sVZNxM3DjDenWd0xviDByluT7SlU3BmWG27%2BGAPCsLpPdNkjp0dGSaLtf6c1ivQpzUSP0CtX%0ADD53lxukU5B%2FR9XNQpgkJ1OaH5DCu%2BuTr2zH3oUG3O2NVhEuE660c09ABtWG%2FM7u4KxOMzDSjL4%2F%0AgToQ8GDopo8reZWzwgfUvsXFYxPZIP0ZyZ3%2BzI4oBDo3cqbNaotdyxTNq4zDGgRS%2BSZ26xR%2FAzv4%0AmkYUrouEd6toPRSS439Acjv3vc5zWaW5yXmKXwfWSuJ5ETpgGssBaqvErrFJCZwFEJhj91eAE4uT%0AMtwPmugZCINSom8ogkSIey84Y5%2FeiOga7ZSqRDr6e8Vop8wjEEkQz8fpxb3movIQZaDcHVQdi2GJ%0ANLojDQ%2FyyYo3lQExBMLVZJwJiOCS%2BcKSyYGTAL3P4bVdrn2v%2Be%2FddHk3ylI%3D%0A&countrycode=86&type=0&lat=22.552802"

---------------------- Data -------------------

This produces the following result:

Screenshot 1

[Image 3: screenshot of the program output]

Screenshot 2

[Image 4: screenshot of the program output]

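In Screenshot 2, the last 4 bytes (8 hex digits), 500C0000, hold the length. The value is stored little-endian: the leftmost byte is the least significant, and each byte to its right is worth 256 times the previous one.

That is, 500C0000 = 0x0C × 16^2 + 0x50 = 12 × 256 + 80 = 3152,

which is exactly the length of the original data.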
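
The same arithmetic can be written as a small Java sketch. The class LittleEndianSize and the method decodeISize are hypothetical names for this illustration; the input is the 8-hex-digit field exactly as it appears in the hex dump.

```java
// Hypothetical helper class, not part of the original article.
final class LittleEndianSize {

    // Decode a 4-byte little-endian field given as 8 hex digits, e.g. "500C0000".
    static long decodeISize(String hex) {
        long value = 0;
        for (int i = 0; i < 4; i++) {
            int b = Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
            // Byte i contributes b * 256^i: the leftmost byte is the least significant.
            value |= (long) b << (8 * i);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(decodeISize("500C0000")); // prints 3152
    }
}
```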

2371 / 3152 = 75.2%, so the compressed data is 75.2% of the original length.

