Compressing and Decompressing Files with Hadoop

This example modifies the Hadoop_WordCount word-count project.

  1. Create the CompressionTest class:
package com.blu.compress;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.junit.Test;

public class CompressionTest {

	/**
	 * Compress a file.
	 * 
	 * @author BLU
	 */
	
	@Test
	public void compress_test() throws Exception {
		
		// Input stream for the file to compress
		FileInputStream fis = new FileInputStream(new File("D:\\data\\money.txt"));
		// Create the Configuration
		Configuration conf = new Configuration();
		// Obtain the CompressionCodecFactory
		CompressionCodecFactory factory = new CompressionCodecFactory(conf);
		// Look up the CompressionCodec by class name; a different class name selects a different compression format
		CompressionCodec  codec = factory.getCodecByClassName("org.apache.hadoop.io.compress.GzipCodec");
		// Output stream for the target file; the codec's default extension (.gz for GzipCodec) is appended
		FileOutputStream fos = new FileOutputStream(new File("D:\\data\\compress_test.txt" + codec.getDefaultExtension()));
		// Wrap the file output stream in a compressing output stream
		CompressionOutputStream cos = codec.createOutputStream(fos);
		// Copy the input through the compressing stream
		IOUtils.copyBytes(fis, cos, conf);
		// Close the streams
		IOUtils.closeStream(cos);
		IOUtils.closeStream(fos);
		IOUtils.closeStream(fis);
	}
	
	/**
	 * Decompress a file.
	 * 
	 * @author BLU
	 */
	
	@Test
	public void decompress_test() throws Exception {
		// The file to decompress
		String path = "D:\\data\\compress_test.txt.gz";
		
		// Create the Configuration
		Configuration conf = new Configuration();
		// Obtain the CompressionCodecFactory
		CompressionCodecFactory factory = new CompressionCodecFactory(conf);
		// Infer the codec from the file extension; null means no registered codec matches
		CompressionCodec codec = factory.getCodec(new Path(path));
		if(codec != null) {
			// Wrap the file input stream in a decompressing input stream
			FileInputStream fis = new FileInputStream(new File(path));
			CompressionInputStream cis = codec.createInputStream(fis);
			// Output stream for the decompressed file
			FileOutputStream fos = new FileOutputStream("D:\\data\\decompress.txt");
			// Copy the decompressed data, then close the streams
			IOUtils.copyBytes(cis, fos, conf);
			IOUtils.closeStream(fos);
			IOUtils.closeStream(cis);
			IOUtils.closeStream(fis);
		}else {
			System.out.println("Unsupported compression format");
		}		
	}
	
}
  2. Run the compress_test() method. It creates the compressed file compress_test.txt.gz in the D:\data directory.
  3. Then run the decompress_test() method. It creates the file decompress.txt.
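
A useful follow-up check is that the round trip is lossless, meaning the decompressed bytes equal the original bytes. The sketch below illustrates that check in memory. It is an assumption-laden stand-in rather than the project's code: it uses the JDK's java.util.zip classes instead of Hadoop's CompressionCodec API so that it runs without Hadoop on the classpath, and the class name GzipRoundTrip and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of the original project.
class GzipRoundTrip {

    // Compress a byte array into the gzip format.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } // closing the gzip stream flushes the remaining data and writes the trailer
        return buffer.toByteArray();
    }

    // Decompress gzip data back into the original bytes.
    static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello hadoop hello gzip".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        // Prints "true" when the round trip preserved every byte.
        System.out.println(Arrays.equals(original, restored));
    }
}
```

The same comparison could be applied to the files on disk by reading money.txt and decompress.txt into byte arrays and comparing them with Arrays.equals.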
