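Recently, during development, I needed to read the contents of files inside a zip archive served at a request URL. My first idea was to download the archive, unzip it in code, and then read the files, but that felt cumbersome. After some searching, I found that the files inside the zip can be read directly, without saving the archive to local disk, using tools that ship with the JDK.

The solution is to read the archive through ZipInputStream.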
ZipInputStream lives in the JDK's java.util.zip package, whereas the streams we use day to day, such as FileInputStream, live in java.io.

urlStr:
http://xxxxx/xxx/xxx.zip
```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public Map<String, String> readData(String urlStr) throws IOException {
    URL url = new URL(urlStr);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Set the connect timeout to 3 seconds
    conn.setConnectTimeout(3 * 1000);
    Map<String, String> jsonMap = new HashMap<>();
    // try-with-resources closes the zip stream and the connection's input stream
    try (ZipInputStream zin = new ZipInputStream(conn.getInputStream(), StandardCharsets.UTF_8)) {
        ZipEntry ze;
        byte[] buffer = new byte[8192];
        // Iterate over the files in the archive
        while ((ze = zin.getNextEntry()) != null) {
            String entryName = ze.getName();
            if (ze.isDirectory() || !entryName.endsWith(".json")) {
                continue;
            }
            // Read until the end of the entry. ze.getSize() can be -1 for streamed
            // entries, and a single read() may return fewer bytes than requested,
            // so the entry is copied in a loop instead of into a pre-sized array.
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            int len;
            while ((len = zin.read(buffer)) != -1) {
                baos.write(buffer, 0, len);
            }
            // Key: entry name without the ".json" suffix; value: file content as UTF-8 text
            String name = entryName.substring(0, entryName.length() - ".json".length());
            jsonMap.put(name, new String(baos.toByteArray(), StandardCharsets.UTF_8));
        }
    }
    return jsonMap;
}
```
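```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
// ExcelReader and ExcelUtil are assumed to be Hutool's cn.hutool.poi.excel classes,
// and @Test to be JUnit's; the original does not show these imports.

@Test
public void test() throws Exception {
    // Open a zip input stream over the file. The charset must be given explicitly
    // when entry names are GBK-encoded (the default is UTF-8); otherwise
    // getNextEntry fails with java.lang.IllegalArgumentException: MALFORMED.
    // try-with-resources closes the zip stream and the underlying file stream.
    try (ZipInputStream zipInputStream = new ZipInputStream(
            new BufferedInputStream(new FileInputStream("D:\\2022-03-28.zip")),
            Charset.forName("GBK"))) {
        // Declare the entry once so getNextEntry is called exactly once per iteration
        ZipEntry ze;
        while ((ze = zipInputStream.getNextEntry()) != null) {
            if (ze.isDirectory() || !ze.getName().endsWith("xls")) {
                continue;
            }
            // Copy the current entry into memory
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int len;
            while ((len = zipInputStream.read(buffer)) > -1) {
                baos.write(buffer, 0, len);
            }
            InputStream stream = new ByteArrayInputStream(baos.toByteArray()); // Excel stream
            // Read the Excel data from the input stream
            ExcelReader excelReader = ExcelUtil.getReader(stream);
            List<List<Object>> list = excelReader.read(2, excelReader.getRowCount());
            for (List<Object> objList : list) {
                objList.get(0);
                objList.get(1);
                // Process the retrieved data here
                // ......
            }
        }
    }
}
```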
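Jar files and zip files are both archive files, and both are compressed. In fact, jar files use the same archiving and compression techniques as zip files, so a jar file is really a specific kind of zip file. (A JAR file is essentially a zip file that contains an optional META-INF directory.) This means:

- You can open a jar file with the same tools you use to open a zip file.
- Jar files are a subset of zip files, so a zip file that follows the jar specification can be used as a jar file.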
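
As a rough illustration of the first point, the sketch below writes a small jar with JarOutputStream and then lists it with plain ZipFile. The class name JarAsZipDemo, the temp-file prefix, and the entry hello.txt are made up for the example and do not come from the original code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical demo class: a jar written by the jar API is readable by the zip API.
final class JarAsZipDemo {

    // Writes a tiny jar (manifest plus one text entry) to a temp file.
    static Path writeSampleJar() throws IOException {
        Path jar = Files.createTempFile("jar-as-zip-demo", ".jar");
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar), manifest)) {
            out.putNextEntry(new JarEntry("hello.txt"));
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // Lists entry names using ZipFile only; no jar-specific class is involved.
    static List<String> listWithZipFile(Path archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeSampleJar();
        try {
            System.out.println(listWithZipFile(jar));
        } finally {
            Files.delete(jar);
        }
    }
}
```

The listing should include both the manifest entry and hello.txt, since ZipFile treats the jar like any other zip archive.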
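```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

private static void readZipFile() {
    // try-with-resources closes the ZipFile, so no explicit close() is needed
    try (ZipFile zipFile = new ZipFile("/data/testzip.zip")) {
        Enumeration<? extends ZipEntry> entries = zipFile.entries();
        while (entries.hasMoreElements()) {
            ZipEntry entry = entries.nextElement();
            System.out.println("fileName:" + entry.getName()); // file name
            InputStream stream = zipFile.getInputStream(entry); // stream over the entry's content
            read(stream);
        }
    } catch (Exception e) {
        e.printStackTrace(); // do not swallow failures silently
    }
}

// Prints the stream line by line. Note that this closes the stream passed in.
private static void read(InputStream in) {
    try (BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
        String con;
        while ((con = br.readLine()) != null) {
            System.out.println(con);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
```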
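```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import org.apache.commons.io.IOUtils; // assumed to be Apache Commons IO; the original omits the import

private static InputStream getInputStream() throws FileNotFoundException {
    File file = new File("/data/testzip.zip");
    return new FileInputStream(file);
}

// Wrong approach
private static void readZipInputStream() throws FileNotFoundException, IOException {
    InputStream zippedIn = getInputStream(); // stream over the zip file
    ZipInputStream zis = new ZipInputStream(zippedIn);
    read(zis); // reads nothing, because getNextEntry has not been called
}

// Correct approach
private static void readZipInputStream2() throws FileNotFoundException, IOException {
    InputStream zipFileInput = getInputStream(); // stream over the zip file
    ZipInputStream zis = new ZipInputStream(zipFileInput);
    ZipEntry entry;
    try {
        while ((entry = zis.getNextEntry()) != null) {
            try {
                final String name = entry.getName();
                System.out.println("fileName:" + name);
                // Reads to the end of the current entry without closing zis
                String content = IOUtils.toString(zis, StandardCharsets.UTF_8);
                System.out.println(content);
            } finally {
                zis.closeEntry(); // close the current entry
            }
        }
    } finally {
        zis.close(); // close the ZipInputStream once, at the very end
    }
}
```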
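Note: IOUtils is used here to read from the stream because the custom read method closes the InputStream passed to it once it finishes. If the zip contains more than one file, reading the second entry would then fail. The ZipInputStream may only be closed at the very end. IOUtils copies the data out and does not close the stream it is given.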
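
The same effect can be had without an extra library. The following is a minimal sketch, assuming Java 9 or later for InputStream.readAllBytes; the class NonClosingEntryReader, its method names, and the sample entries a.txt and b.txt are invented for illustration. The point it tries to show is that the per-entry read never closes the ZipInputStream, so the loop can move on to the next entry.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: reads every entry of a zip stream without closing it mid-way.
final class NonClosingEntryReader {

    // Builds a two-entry zip in memory so the example needs no files on disk.
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            out.putNextEntry(new ZipEntry("a.txt"));
            out.write("first".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
            out.putNextEntry(new ZipEntry("b.txt"));
            out.write("second".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Reads every file entry as UTF-8 text, keyed by entry name.
    static Map<String, String> readAll(InputStream zipped) throws IOException {
        Map<String, String> contents = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // readAllBytes() stops at the end of the current entry and leaves zis open
                contents.put(entry.getName(), new String(zis.readAllBytes(), StandardCharsets.UTF_8));
            }
        } // the only close happens here, after the last entry
        return contents;
    }

    public static void main(String[] args) throws IOException {
        readAll(new ByteArrayInputStream(sampleZip()))
                .forEach((name, text) -> System.out.println(name + " -> " + text));
    }
}
```

Wrapping the ZipInputStream in a reader and closing that reader inside the loop would bring back the problem described in the note, because closing the reader closes the stream underneath it.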
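JarFile works much like ZipFile: use getEntry(String name) or entries() to obtain a ZipEntry or JarEntry (JarEntry extends ZipEntry, so the two can be treated alike), then pass the entry to JarFile.getInputStream(ZipEntry ze) to get an InputStream.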
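```java
import java.io.File;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Stream;

static void test1() {
    String path = "/Users/liuxiao/maven-rep/org/apache/thrift/libthrift/0.9.0/libthrift-0.9.0.jar";
    try (JarFile jarFile = new JarFile(new File(path))) {
        Enumeration<JarEntry> entries = jarFile.entries();
        while (entries.hasMoreElements()) {
            JarEntry entry = entries.nextElement();
            String entryName = entry.getName();
            // Adjust the name to an entry that actually exists in your jar
            if (!entry.isDirectory() && entryName.equals("org/apache/thrift/TBase.java")) {
                System.out.println(entryName); // prints the matched entry name
                read(jarFile.getInputStream(entry));
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    // Using the stream API. The JarFile is declared as its own resource so that
    // it is closed as well; closing only the Stream would leak the open file.
    try (JarFile jarFile = new JarFile(new File(path));
         Stream<JarEntry> stream = jarFile.stream()) {
        stream
            .filter(entry -> !entry.isDirectory() && entry.getName().endsWith(".class"))
            .forEach(entry -> System.out.println(entry.getName()));
    } catch (Exception e) {
        e.printStackTrace();
    }
}
```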
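
When the entry name is known in advance, no loop is needed: JarFile.getJarEntry(String name) (or getEntry) returns the entry directly, or null if the jar has no such entry. A minimal sketch follows, assuming Java 9 or later for readAllBytes. The class JarEntryLookup and the sample entry conf/app.properties are invented for the example, which builds its own throwaway jar instead of relying on the libthrift path used in test1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical demo class: direct entry lookup by name instead of iterating entries().
final class JarEntryLookup {

    // Writes a one-entry jar to a temp file so the example is self-contained.
    static Path writeSampleJar() throws IOException {
        Path jar = Files.createTempFile("jar-entry-lookup", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("conf/app.properties"));
            out.write("key=value".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // Returns the entry's content as UTF-8 text, or null when the entry does not exist.
    static String readEntry(Path jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = jar.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeSampleJar();
        try {
            System.out.println(readEntry(jar, "conf/app.properties"));
            System.out.println(readEntry(jar, "no/such/entry"));
        } finally {
            Files.delete(jar);
        }
    }
}
```

Checking for null before calling getInputStream matters here, because a missing entry is reported through the return value rather than an exception.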
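```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import org.apache.commons.io.IOUtils; // assumed to be Apache Commons IO; the original omits the import

private static InputStream getJarFileInputStream() throws FileNotFoundException {
    File file = new File("/data/mvn_repo/commons-lang/commons-lang/2.1/commons-lang-2.1.jar");
    return new FileInputStream(file);
}

private static void readJarInputStream2() throws FileNotFoundException, IOException {
    InputStream zipFileInput = getJarFileInputStream(); // stream over the jar file
    JarInputStream jis = new JarInputStream(zipFileInput);
    JarEntry entry;
    try {
        while ((entry = jis.getNextJarEntry()) != null) {
            try {
                if (entry.isDirectory()) {
                    continue;
                }
                final String name = entry.getName();
                System.out.println("fileName:" + name);
                // Reads to the end of the current entry without closing jis
                String content = IOUtils.toString(jis, StandardCharsets.UTF_8);
                System.out.println(content);
            } finally {
                jis.closeEntry(); // close the current entry
            }
        }
    } finally {
        jis.close(); // close the JarInputStream once, at the very end
    }
}
```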
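```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

static void test2() throws Exception {
    String filePath = "/Users/liuxiao/maven-rep/org/apache/thrift/libthrift/0.9.0/libthrift-0.9.0.jar";
    String name = "org/apache/thrift/TBase.java";
    // A jar: URL addresses a single entry inside the archive: jar:file:<path>!/<entry>
    URL url = new URL("jar:file:" + filePath + "!/" + name);
    JarURLConnection jarConnection = (JarURLConnection) url.openConnection();
    try (InputStream in = jarConnection.getInputStream();
         BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
        String con;
        while ((con = br.readLine()) != null) {
            System.out.println(con);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
```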
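Summary:

Because zip and jar archives share the same structure, ZipFile and JarFile are used the same way, as are ZipInputStream and JarInputStream.

One point deserves attention. Because of how a zip archive is laid out, a freshly opened ZipInputStream exposes no data. Only after getNextEntry is called is the stream positioned on an entry (one file inside the zip), and only then can that entry's content be read. The wrong approach shown above illustrates this: reading directly from the ZipInputStream returns nothing, while reading after getNextEntry works.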
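
This behaviour is easy to check with a few lines. The sketch below is only an illustration; the class ZipStreamPositioning, the entry note.txt, and its content are made up. It reads one byte before and one byte after getNextEntry from an in-memory zip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: shows that data is only readable once an entry is current.
final class ZipStreamPositioning {

    // Builds a zip with a single entry "note.txt" containing the text "data".
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            out.putNextEntry(new ZipEntry("note.txt"));
            out.write("data".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Returns {byte read before getNextEntry(), byte read after getNextEntry()}.
    static int[] probe(byte[] zipBytes) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            int before = zis.read(); // no current entry yet, so this is -1 (end of stream)
            zis.getNextEntry();      // positions the stream on "note.txt"
            int after = zis.read();  // first byte of the entry's content
            return new int[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] result = probe(sampleZip());
        System.out.println("before getNextEntry: " + result[0]);
        System.out.println("after getNextEntry: " + (char) result[1]);
    }
}
```

The early read does not consume anything from the underlying stream, so calling getNextEntry afterwards still finds the first entry.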
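
----------------

Copyright notice: this is an original article by CSDN blogger "赶路人儿", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/liuxiao723846/article/details/130967940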