Memory Leaks Caused by Java's GZIP Compression Streams

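Let's talk about GZIPOutputStream and GZIPInputStream: the problems that arise when these streams are not closed, and how the GZIP streams allocate and release off-heap memory. Let's do it!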

Background

My project has a utility class, ZipHelper, that compresses and decompresses Strings:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
/**
* Compresses and decompresses strings.
*/
public class ZipHelper {

    // Compress
    public static String compress(String str) throws Exception {
        if (str == null || str.length() == 0) {
            return str;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(out);
        gzip.write(str.getBytes());
        gzip.close();
        return out.toString("ISO-8859-1");
    }

    // Decompress
    public static String uncompress(String str) throws Exception {
        if (str == null || str.length() == 0) {
            return str;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayInputStream in = new ByteArrayInputStream(str.getBytes("ISO-8859-1"));
        GZIPInputStream gunzip = new GZIPInputStream(in);
        byte[] buffer = new byte[1024];
        int n;
        while ((n = gunzip.read(buffer)) >= 0) {
            out.write(buffer, 0, n);
        }
        return out.toString();
    }
}

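Recently our service started using swap space, and we initially suspected a memory leak. Analysis eventually showed that the native method Java_java_util_zip_Inflater_init kept allocating memory without ever releasing it (for the analysis method, see the blog post "Memory Leak Analysis in Practice"). The most likely cause was a stream that was never closed, and the biggest problem with this code is that it does not close its streams in a finally block. So I decided to rework it with the try-with-resources syntactic sugar, and the code ended up like this: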

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/**
 * Created by jacob.
 *
 * Compresses and decompresses strings.
 */
public class ZipHelper {

    /**
     * Compresses a string.
     *
     * @param str the string to compress
     * @return the compressed string
     * @throws Exception if compression fails
     */
    public static String compress(String str) throws Exception {
        if (str == null || str.length() == 0) {
            return str;
        }
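        // ByteArrayOutputStream and ByteArrayInputStream are purely in-memory streams.
        // Their close() methods are empty in the JDK source, so closing them is not required;
        // they are declared in try-with-resources only to keep the code tidy.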
        try (ByteArrayOutputStream out = new ByteArrayOutputStream();
             GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(str.getBytes());
//            gzip.finish();
            return out.toString("ISO-8859-1");
        }
    }

    /**
     * Decompresses a string.
     *
     * @param str the string to decompress
     * @return the decompressed string
     * @throws Exception if decompression fails
     */
    public static String uncompress(String str) throws Exception {
        if (str == null || str.length() == 0) {
            return str;
        }
        try (ByteArrayOutputStream out = new ByteArrayOutputStream();
             ByteArrayInputStream in = new ByteArrayInputStream(str.getBytes("ISO-8859-1"));
             GZIPInputStream gunzip = new GZIPInputStream(in)) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gunzip.read(buffer)) >= 0) {
                out.write(buffer, 0, n);
            }
            return out.toString();
        }
    }
}

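Looks much tidier, doesn't it? But although this code compresses without complaint, decompression fails. At first I thought the decompression code was at fault. It turned out that the compression had never completed, so there was nothing valid to decompress. The error was: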

Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
    at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
    at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at coderbean.ZipHelper.uncompress(ZipHelper.java:52)
    at coderbean.Main.main(Main.java:12)

How could working code suddenly fail to compress? The answer lies in GZIPOutputStream: its close() method calls finish().

/**
* Writes remaining compressed data to the output stream and closes the
* underlying stream.
* @exception IOException if an I/O error has occurred
*/
public void close() throws IOException {
   if (!closed) {
       finish();
       if (usesDefaultDeflater)
           def.end();
       out.close();
       closed = true;
   }
}

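Only the finish() method below writes the remaining compressed data to the underlying output stream. The original code called close() and therefore called finish() indirectly. So what went wrong with our try-with-resources version? The problem is the moment at which close() is executed.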

/**
 * Finishes writing compressed data to the output stream without closing
 * the underlying stream. Use this method when applying multiple filters
 * in succession to the same output stream.
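 * Note: the remaining compressed data is written to the underlying output stream
 * only in this method; the original code called close(), which calls finish() indirectly.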
 * @exception IOException if an I/O error has occurred
 */
public void finish() throws IOException {
    if (!def.finished()) {
        def.finish();
        while (!def.finished()) {
            int len = def.deflate(buf, 0, buf.length);
            if (def.finished() && len <= buf.length - TRAILER_SIZE) {
                // last deflater buffer. Fit trailer at the end
                writeTrailer(buf, len);
                len = len + TRAILER_SIZE;
                out.write(buf, 0, len);
                return;
            }
            if (len > 0)
                out.write(buf, 0, len);
        }
        // if we can't fit the trailer at the end of the last
        // deflater buffer, we write it separately
        byte[] trailer = new byte[TRAILER_SIZE];
        writeTrailer(trailer, 0);
        out.write(trailer);
    }
}
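
To make the effect of finish() visible, here is a minimal sketch that measures how many bytes have reached the underlying ByteArrayOutputStream before and after finish() is called. The class name GzipFinishDemo and the method sizesAroundFinish are made up for illustration and are not part of the original project.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original project.
class GzipFinishDemo {

    // Returns {bytes in the sink before finish(), bytes in the sink after finish()}.
    static int[] sizesAroundFinish(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
            int before = out.size(); // the header plus whatever the deflater has emitted so far
            gzip.finish();           // drains the deflater and writes the 8-byte trailer
            int after = out.size();
            return new int[] {before, after};
        } // close() calls finish() again, which is a no-op once the deflater has finished
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFinish("hello gzip");
        System.out.println("before finish(): " + sizes[0] + " bytes");
        System.out.println("after finish():  " + sizes[1] + " bytes");
    }
}
```

Whatever the input, the second number is larger than the first by at least the size of the trailer, so any copy of the buffer taken before finish() is an incomplete GZIP stream.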

When try-with-resources runs, and under what conditions

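try-with-resources is syntactic sugar added in JDK 7 (essentially borrowed from C#). It closes resources automatically, as long as the class implements the close() method of AutoCloseable.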


package java.lang;

public interface AutoCloseable {
    /**
     * @throws Exception if this resource cannot be closed
     */
    void close() throws Exception;
}

Once a class implements this interface, its instances are closed automatically when the try block finishes:

try (/* initialize resources here */) {
  // do something
} // the resources are closed as the last step, when the block exits
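
The exact timing matters when the try block contains a return statement. The following sketch records the order of events; CloseOrderDemo, Tracked and snapshot are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records the order in which things happen.
class CloseOrderDemo {

    static final List<String> events = new ArrayList<>();

    // A trivial resource that only logs when it is closed.
    static class Tracked implements AutoCloseable {
        private final String name;

        Tracked(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            events.add("close " + name);
        }
    }

    // Stands in for an expression such as out.toString(...).
    static String snapshot() {
        events.add("evaluate return value");
        return "result";
    }

    static String run() {
        events.clear();
        try (Tracked first = new Tracked("first");
             Tracked second = new Tracked("second")) {
            return snapshot(); // evaluated before any close() runs
        }
    }

    public static void main(String[] args) {
        run();
        // prints [evaluate return value, close second, close first]
        System.out.println(events);
    }
}
```

The return expression is evaluated first, and only then are the resources closed, in the reverse order of their declaration.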

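GZIPOutputStream only writes the complete compressed data to the underlying stream once finish() or close() has run. My reworked code never calls finish() first; it relies on the GZIPOutputStream being closed when the try block exits. Because close() runs after out.toString("ISO-8859-1") has been evaluated, the string is built before compression has actually finished. ZipHelper.compress() has no way of noticing this and returns an incomplete compressed string, which then fails during decompression.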
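
One possible fix, sketched below, is to read the buffer only after the GZIPOutputStream has been closed, by declaring the ByteArrayOutputStream outside the try block. The class name SafeZipHelper is hypothetical, and the explicit UTF-8 encoding for the text is an assumption of this sketch (the original code uses the platform default charset).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical replacement for ZipHelper; the name and the UTF-8 choice are assumptions.
class SafeZipHelper {

    static String compress(String str) throws IOException {
        if (str == null || str.isEmpty()) {
            return str;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(str.getBytes(StandardCharsets.UTF_8));
        } // close() runs here, so finish() has written all data and the trailer into out
        // ISO-8859-1 maps every byte to exactly one char, so the bytes survive the String round trip.
        return out.toString("ISO-8859-1");
    }

    static String uncompress(String str) throws IOException {
        if (str == null || str.isEmpty()) {
            return str;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gunzip = new GZIPInputStream(
                new ByteArrayInputStream(str.getBytes(StandardCharsets.ISO_8859_1)))) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = gunzip.read(buffer)) >= 0) {
                out.write(buffer, 0, n);
            }
        } // the GZIPInputStream is closed here, even if read() throws
        return out.toString("UTF-8");
    }

    public static void main(String[] args) throws IOException {
        String original = "hello gzip";
        System.out.println(original.equals(uncompress(compress(original))));
    }
}
```

Calling gzip.finish() before out.toString(...) inside the try block would also work; moving the read after the block simply makes the ordering impossible to get wrong.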

Why this causes an off-heap memory leak

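From the original code we can see that, as long as no exception is thrown, compress() does close its stream. The root of the leak must therefore be uncompress(), which never closes its GZIPInputStream. Following the GZIPInputStream constructor and close() method quickly leads to the answer.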

The diagram below shows the call flow for allocating and releasing off-heap memory; it can be read alongside the code.


[Figure: call flow for allocating and releasing off-heap memory]

To save space I have left out most of the JDK source comments; interested readers can use the diagram as a map to find them in the source.

//java.util.zip.GZIPInputStream.java
public
class GZIPInputStream extends InflaterInputStream {

    public GZIPInputStream(InputStream in) throws IOException {
        this(in, 512); // delegates to the constructor below
    }

    public GZIPInputStream(InputStream in, int size) throws IOException {
        super(in, new Inflater(true), size); // creates a new Inflater
        usesDefaultInflater = true;
        readHeader(in);
    }

    public void close() throws IOException {
        if (!closed) {
            super.close(); // the parent class is java.util.zip.InflaterInputStream
            eos = true;
            closed = true;
        }
    }
}
//java.util.zip.Inflater.java

public
class Inflater {

    public Inflater(boolean nowrap) {
        zsRef = new ZStreamRef(init(nowrap));
    }

    /**
     * Closes the decompressor and discards any unprocessed input.
     * This method should be called when the decompressor is no longer
     * being used, but will also be called automatically by the finalize()
     * method. Once this method is called, the behavior of the Inflater
     * object is undefined.
     */
    public void end() {
        synchronized (zsRef) {
            long addr = zsRef.address();
            zsRef.clear();
            if (addr != 0) {
                end(addr);
                buf = null;
            }
        }
    }

    // These are native methods
    private native static long init(boolean nowrap);
    private native static void end(long addr);
}
//java.util.zip.InflaterInputStream.java

public
class InflaterInputStream extends FilterInputStream {
  /**
   * Closes this input stream and releases any system resources associated
   * with the stream.
   * @exception IOException if an I/O error has occurred
   */
  public void close() throws IOException {
      if (!closed) {
          if (usesDefaultInflater)
              inf.end();
          in.close();
          closed = true;
      }
  }  
}
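
According to the close() source above, inf.end() is called immediately before in.close(), so seeing the underlying stream closed implies that the Inflater was released as well. The sketch below uses that to check that try-with-resources still closes a GZIPInputStream when reading fails part-way. CloseOnFailureDemo and TrackingInput are hypothetical names, and the truncated input is constructed purely for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not JDK or project code.
class CloseOnFailureDemo {

    // Underlying stream that remembers whether close() reached it.
    static class TrackingInput extends ByteArrayInputStream {
        boolean closed = false;

        TrackingInput(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Returns true if reading failed AND the stream chain was still closed.
    static boolean closedDespiteFailure() throws IOException {
        byte[] full = gzip("some text that will be cut off in the middle of the stream");
        // Drop the 8-byte trailer and the last few bytes of compressed data.
        byte[] truncated = Arrays.copyOf(full, full.length - 12);
        TrackingInput source = new TrackingInput(truncated);
        boolean failed = false;
        try (GZIPInputStream gunzip = new GZIPInputStream(source)) {
            byte[] buffer = new byte[1024];
            while (gunzip.read(buffer) >= 0) {
                // discard the output; only the failure matters here
            }
        } catch (IOException e) {
            failed = true; // the input ends before the compressed stream does
        }
        return failed && source.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("closed despite failure: " + closedDespiteFailure());
    }
}
```

The original uncompress() has no such guarantee: it never calls close() at all, so the native memory behind each Inflater is only reclaimed if and when the finalizer runs.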

The implementation of these native methods in OpenJDK

JNIEXPORT jlong JNICALL
Java_java_util_zip_Inflater_init(JNIEnv *env, jclass cls, jboolean nowrap)
{
    // calloc allocates the off-heap memory here
    z_stream *strm = calloc(1, sizeof(z_stream));

    if (strm == NULL) {
        JNU_ThrowOutOfMemoryError(env, 0);
        return jlong_zero;
    } else {
        const char *msg;
        int ret = inflateInit2(strm, nowrap ? -MAX_WBITS : MAX_WBITS);
        switch (ret) {
          case Z_OK:
            return ptr_to_jlong(strm);
          case Z_MEM_ERROR:
            free(strm);
            JNU_ThrowOutOfMemoryError(env, 0);
            return jlong_zero;
          default:
            msg = ((strm->msg != NULL) ? strm->msg :
                   (ret == Z_VERSION_ERROR) ?
                   "zlib returned Z_VERSION_ERROR: "
                   "compile time and runtime zlib implementations differ" :
                   (ret == Z_STREAM_ERROR) ?
                   "inflateInit2 returned Z_STREAM_ERROR" :
                   "unknown error initializing zlib library");
            free(strm);
            JNU_ThrowInternalError(env, msg);
            return jlong_zero;
        }
    }
}

JNIEXPORT void JNICALL
Java_java_util_zip_Inflater_end(JNIEnv *env, jclass cls, jlong addr)
{
    if (inflateEnd(jlong_to_ptr(addr)) == Z_STREAM_ERROR) {
        JNU_ThrowInternalError(env, 0);
    } else {
        free(jlong_to_ptr(addr)); // the off-heap memory is released here
    }
}

References

  • Handling the "java.io.EOFException: Unexpected end of ZLIB input stream" exception
