quqi99

My Experience Choosing a Compression Algorithm (by quqi99)

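Author: 张华 (Zhang Hua). Published: 2007-08-03 ( http://blog.csdn.net/quqi99 )

Copyright notice: this article may be freely reproduced, provided that the reproduction links back to the original source and includes the author information and this copyright notice.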

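Recently, the files downloaded by our spider needed to be compressed, so I took the opportunity to study a range of compression algorithms. I had three criteria: first, compression ratio; second, speed; third, how easy it is to append, delete, and query entries (that is, to find out what an archive contains without decompressing it, and to extract metadata conveniently).

At first I mainly compared the compression performance of zlib, bzip2, gzip, rar, and zip.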

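For the rar format, compression and decompression are done mainly by invoking an external command (the code is in Attachment 1). That command runs outside the Java virtual machine, and if something goes wrong there, the whole virtual machine may die with it, so I ruled this approach out.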
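As an illustration of the external-command approach, here is a minimal sketch that builds the argument list separately from running it. It assumes Java 7 or later; the class name ExternalTool and both method names are hypothetical, and only the "x -o+ -p-" switches come from Attachment 1. Passing arguments as a list keeps paths with spaces intact, and merging stderr into stdout means a single reader drains everything the child process writes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

final class ExternalTool {

    /** Builds the argument list for "unrar x -o+ -p- archive targetDir". */
    static List<String> unrarCommand(String unrarExe, String archive, String targetDir) {
        // One list element per argument, so a path such as
        // "C:\Program Files\..." is not split at the space.
        return Arrays.asList(unrarExe, "x", "-o+", "-p-", archive, targetDir);
    }

    /** Runs the command and returns its combined stdout/stderr as raw bytes. */
    static byte[] run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so one reader drains both
        Process p = pb.start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("command exited with code " + exit);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<String> cmd = unrarCommand(
                "C:\\Program Files\\WinRAR\\UNRAR.EXE", "f:/test.rar", "f:/test/");
        System.out.println(cmd);
        // run(cmd) would actually start the tool; it requires WinRAR at that path.
    }
}
```

The external tool still runs outside the JVM, so this only narrows the failure modes; it does not remove the dependency on a separately installed program.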

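According to what I read online, bzip2 gives the best results for text, and the org.apache.tools.bzip2 package in ant.jar provides an API for it. But I kept running into trouble using it, so I did not spend more time on it. The code is in Attachment 2 (never tested successfully).

So I started studying zlib. Its Java version is called jzlib, and the code for compressing and decompressing with jzlib is in Attachment 3. It looked quite good, so I planned to use it. Compressing a single file works fine, but compressing a directory the way the java.util.zip package does is quite inconvenient. Only then did it dawn on me: these compression libraries generally concentrate on the algorithm itself, and conveniences such as per-entry compression are the business of the surrounding application.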

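So I began thinking about how to borrow the ideas of the java.util.zip package and wrap the zlib algorithm so that it could compress by directory. In the end I discovered that the compression algorithm underneath java.util.zip is zlib itself. On top of the core algorithm, Sun merely added checksums (CRC32, Adler32), per-entry archive support (ZipEntry), and convenient input and output streams (ZipInputStream, ZipOutputStream).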
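To make that layering concrete, here is a minimal sketch that uses the JDK's own java.util.zip.Deflater, Inflater, and CRC32 directly, without any ZIP container. The class name ZlibRoundTrip and its method names are illustrative and do not come from the original project.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class ZlibRoundTrip {

    /** Compresses data into a zlib stream; level is 0-9, or -1 for the default. */
    static byte[] compress(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish(); // no more input will follow
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end(); // release native zlib resources
        return out.toByteArray();
    }

    /** Decompresses a complete zlib stream held in memory. */
    static byte[] decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported zlib stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Computes the CRC-32 checksum that ZIP stores for each entry. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "zlib zlib zlib zlib zlib zlib".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(data, 9);
        System.out.println(data.length + " -> " + packed.length + " bytes");
        System.out.println("round trip ok: " + Arrays.equals(data, decompress(packed)));
        System.out.println("crc32: " + Long.toHexString(crc32(data)));
    }
}
```

Everything here works on a single byte array; file names, directories, and per-entry metadata are exactly what ZipEntry and the ZIP streams add on top.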

Since java.util.zip already uses zlib, there was no longer any need to work out directory compression ourselves. Even so, things did not go entirely smoothly.
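For reference, a minimal sketch of directory compression with plain java.util.zip, assuming Java 8 or later for Files.walk. The class name DirZipper and the method zipDirectory are hypothetical. The output file should live outside the directory being zipped; otherwise the walk would pick up the archive itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class DirZipper {

    /** Writes every regular file under root into zipFile and returns the entry names. */
    static List<String> zipDirectory(Path root, Path zipFile) {
        List<String> names = new ArrayList<>();
        try (OutputStream fos = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(fos);
             Stream<Path> walk = Files.walk(root)) {
            List<Path> files = walk.filter(Files::isRegularFile)
                                   .sorted()
                                   .collect(Collectors.toList());
            for (Path file : files) {
                // ZIP entry names use '/' as the separator on every platform.
                String name = root.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos); // stream the file content into the current entry
                zos.closeEntry();
                names.add(name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: DirZipper <directory> <output.zip>");
            return;
        }
        System.out.println(zipDirectory(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

The directory structure is carried entirely by the entry names; the compression of each entry is still the same deflate algorithm.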

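First, decompression code written directly against the java.util.zip API does not support Chinese file names. Java strings are Unicode internally, but java.util.zip.ZipInputStream always decodes entry names as UTF-8, while the tools that create archives on a Chinese system typically store names in a local encoding such as GB2312. So when ZipInputStream or ZipOutputStream meets a Chinese file name or path, it cannot handle it. Looking closely at ZipInputStream, I found that the problem lies in this statement in the java.util.zip.ZipInputStream class: ZipEntry e = createZipEntry(getUTF8String(b, 0, len)); It should be changed to: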

ZipEntry e = null;
try {
    if (this.encoding.toUpperCase().equals("UTF-8"))
        e = createZipEntry(getUTF8String(b, 0, len));
    else
        e = createZipEntry(new String(b, 0, len, this.encoding));
} catch (Exception byteE) {
    e = createZipEntry(getUTF8String(b, 0, len));
}

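Fortunately, a web search showed that we do not need to make this change ourselves, because Ant's org.apache.tools.zip package has already done it. The decompression code written with this package is in Attachment 4.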
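On newer JDKs the same effect is available without patching anything: since Java 7, java.util.zip.ZipInputStream and ZipOutputStream have constructors that take a Charset for entry names. That postdates this article, so the sketch below is an alternative under that assumption, not the method described here. The class name ZipNames and its methods are hypothetical, and the GBK charset is assumed to be installed (it is on standard JDK builds).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipNames {

    /** Builds an in-memory ZIP with one empty entry per name, encoding names with cs. */
    static byte[] zipWithNames(List<String> names, Charset cs) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos, cs)) {
            for (String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bos.toByteArray();
    }

    /** Reads the archive sequentially and returns entry names decoded with cs. */
    static List<String> listNames(byte[] zip, Charset cs) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip), cs)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // "\u8bf4\u660e.txt" is the Chinese file name used in Attachment 1, written with escapes
        List<String> names = Arrays.asList("\u8bf4\u660e.txt");
        byte[] zip = zipWithNames(names, gbk);
        System.out.println(listNames(zip, gbk).equals(names));
    }
}
```

The charset passed to the reader has to match the one the archive was written with; reading a GBK-named archive with the default UTF-8 decoder is the failure described above.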

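Then another problem appeared. There are two ways to decompress a file: with ZipFile or with ZipInputStream. As I understood it, ZipFile reads the whole zip file into memory at once, which does not work for large archives, so in that case ZipInputStream has to be used. But the org.apache.tools.zip package happens not to provide a modified ZipInputStream; it only offers the rewritten ZipFile (plus a ZipOutputStream for writing). With java.util.zip.ZipInputStream, files with Chinese names still cannot be decompressed.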

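At this point I found the article 《让 ZipOutputStream 和 ZipInputStream 支持中文》 ("Making ZipOutputStream and ZipInputStream Support Chinese", which can be found on Google). Its method is to modify the JDK source code directly. But I felt that patching the JDK's JAR would make deploying the software troublesome later, so I started looking for another solution.

To avoid touching java.util.zip.ZipInputStream, I simply rewrote the class myself. First, by copy and paste, create a class jcss.search.base.zip.CZipInputStream with exactly the same contents. Then, in that class, replace ZipEntry e = createZipEntry(getUTF8String(b, 0, len)) with the code shown above. The class is in Attachment 5.

In addition, copy java.util.zip.ZipConstants into a class with identical contents, jcss.search.base.zip.ZipConstants (the original interface is package-private, so it cannot be implemented from outside java.util.zip).

Finally, implement a jcss.search.base.zip.ZipEntry class; the code is in Attachment 6.

With that, everything works.

To push the compression ratio further, 7zip is an option. There is a dedicated 7zip SDK that implements the LZMA compression algorithm, and some helpful people have added two stream classes on top of it for convenient access (net.contrapunctus.lzma.LzmaInputStream and net.contrapunctus.lzma.LzmaOutputStream). However, there are no entry classes that wrap per-directory compression.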

Attachment 1:

package jcss.search.base;

/**
 * Compresses and extracts RAR archives by calling the WinRAR command-line tools.
 *
 * @author 张华
 * @time 2007-8-1
 */
public class RarUtil {

    /**
     * Extracts an archive.
     *
     * @param compress      the rar archive
     * @param decompression the directory to extract into
     */
    public void unZip(String compress, String decompression) throws Exception {
        // Pass the arguments as an array: exec(String) splits on whitespace,
        // which would break the "Program Files" path.
        String[] cmd = { "C:\\Program Files\\WinRAR\\UNRAR.EXE", "x", "-o+", "-p-",
                compress, decompression };
        System.out.println(run(cmd));
    }

    /**
     * Creates an archive.
     *
     * @param outputRar   the rar archive to create
     * @param compression the file or directory to compress
     * @throws Exception
     */
    public void zip(String outputRar, String compression) throws Exception {
        // "a" adds files to an archive; "x", used here originally, extracts.
        String[] cmd = { "C:\\Program Files\\WinRAR\\rar.exe", "a", "-t", "-o+", "-p-",
                outputRar, compression };
        System.out.println(run(cmd));
    }

    /** Runs the command and returns its console output, decoded as GBK. */
    private String run(String[] cmd) throws Exception {
        Process p = java.lang.Runtime.getRuntime().exec(cmd);
        java.io.InputStream fis = p.getInputStream();
        java.io.ByteArrayOutputStream bos = new java.io.ByteArrayOutputStream();
        int value = 0;
        while ((value = fis.read()) != -1) {
            bos.write(value);
        }
        fis.close();
        return new String(bos.toByteArray(), "GBK");
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        RarUtil test = new RarUtil();
        String compress = "f:/增加转码过滤器.rar"; // rar archive
        String decompression = "f:/test/"; // extraction directory
        try {
            test.zip("f:/test.rar", "说明.txt");
            // test.unZip(compress, decompression);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Attachment 2:

package jcss.search.base;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.tools.bzip2.CBZip2InputStream;
import org.apache.tools.bzip2.CBZip2OutputStream;

/**
 * BZip2 compression and decompression.
 *
 * @author 张华
 * @time 2007-7-26
 */
public class BZip2Util {

    public static void Bzip2Compress(String in, String to) {
        try {
            File source = new File(in);
            File destination = new File(to);
            FileOutputStream fos = new FileOutputStream(destination);
            // Per the Ant Javadoc, CBZip2OutputStream leaves the two-byte
            // "BZ" magic to the caller.
            fos.write('B');
            fos.write('Z');
            CBZip2OutputStream output = new CBZip2OutputStream(fos);
            final FileInputStream input = new FileInputStream(source);
            copy(input, output);
            input.close();
            output.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void Bzip2Uncompress(String in, String to) {
        try {
            File source = new File(in);
            File destination = new File(to);
            FileOutputStream output = new FileOutputStream(destination);
            FileInputStream fis = new FileInputStream(source);
            // Per the Ant Javadoc, CBZip2InputStream expects the stream to be
            // positioned after the "BZ" magic, so skip those two bytes.
            fis.read();
            fis.read();
            CBZip2InputStream input = new CBZip2InputStream(fis);
            copy(input, output);
            input.close();
            output.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    static void copy(final InputStream input, final OutputStream output)
            throws IOException {
        final byte[] buffer = new byte[8024];
        int n = 0;
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
        }
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        String in = "f://~HlIndex.htm";
        String to = "f://a.bz2";
        String out2 = "b.htm";
        // Bzip2Compress(in, to);
        // Bzip2Uncompress(to, out2);
    }
}

Attachment 3:

package example;

import java.io.File;

import java.io.FileInputStream;

import java.io.FileOutputStream;

import com.jcraft.jzlib.*;

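/**
 * Drawback: cannot compress by directory.
 *
 * @author 张华
 * @time 2007-7-30
 * @description reference http://tianxiagod.spaces.live.com/
 *              http://blog.csdn.net/kong555/archive/2006/03/28/641855.aspx
 */
public class TestJZlib {

    // Length of the file being compressed; needed for both compression and
    // decompression, so it is critical.
    // Make sure compressfile() and uncompressfile() are called with the same value.
    static int resLen = 0;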

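/**
 * Compresses data.
 *
 * @param data the bytes to compress
 * @param type compression level: -1 default, 9 best compression,
 *             0 no compression, 1 fastest
 * @param len  length of the input, used to size the output buffer
 * @return the compressed bytes
 */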

public static byte[] compressfile(byte[] data, int type,int len) {

int err;

// Deflate output can be slightly larger than the input for incompressible data.
int comprLen = len + len / 1000 + 64;

byte[] compr = new byte[comprLen];

ZStream c_stream = new ZStream();

err = c_stream.deflateInit(type);

CHECK_ERR(c_stream, err, "deflateInit");

c_stream.next_in = data;

c_stream.next_in_index = 0;

c_stream.next_out = compr;

c_stream.next_out_index = 0;

while (c_stream.total_in != data.length

&& c_stream.total_out < comprLen) {

c_stream.avail_in = c_stream.avail_out = 1; // 置初值

err = c_stream.deflate(JZlib.Z_NO_FLUSH);

CHECK_ERR(c_stream, err, "deflate");

}

System.out.println("Before compression: " + c_stream.total_in + " bytes");

while (true) {

c_stream.avail_out = 1;

err = c_stream.deflate(JZlib.Z_FINISH);

if (err == JZlib.Z_STREAM_END) {

break;

}

CHECK_ERR(c_stream, err, "deflate");

}

System.out.println("After compression: " + c_stream.total_out + " bytes");

err = c_stream.deflateEnd();

CHECK_ERR(c_stream, err, "deflateEnd");

byte[] zipfile = new byte[(int) c_stream.total_out];

System.arraycopy(compr, 0, zipfile, 0, zipfile.length);

return zipfile;

}

public static byte[] uncompressfile(byte[] data,int len) {

int err;

int uncomprLen = len;

byte[] uncompr = new byte[uncomprLen];

ZStream d_stream = new ZStream();

err = d_stream.inflateInit();

CHECK_ERR(d_stream, err, "inflateInit");

d_stream.next_in = data;

d_stream.next_in_index = 0;

d_stream.next_out = uncompr;

d_stream.next_out_index = 0;

while (d_stream.total_out < uncomprLen

&& d_stream.total_in < uncomprLen) {

d_stream.avail_in = d_stream.avail_out = 1;

err = d_stream.inflate(JZlib.Z_NO_FLUSH);

if (err == JZlib.Z_STREAM_END) {

break;

}

CHECK_ERR(d_stream, err, "inflate");

}

System.out.println("Before decompression: " + d_stream.total_in + " bytes");

System.out.println("After decompression: " + d_stream.total_out + " bytes");

err = d_stream.inflateEnd();

CHECK_ERR(d_stream, err, "inflateEnd");

byte[] unzipfile = new byte[(int) d_stream.total_out];

System.arraycopy(uncompr, 0, unzipfile, 0, unzipfile.length);

return unzipfile;

}

static void CHECK_ERR(ZStream z, int err, String msg) {
    if (err != JZlib.Z_OK) {
        if (z.msg != null) {
            System.out.print(z.msg + " ");
        }
        System.out.println(msg + " error: " + err);
        System.exit(1);
    }
}

static void zip(File input, File output, int compressFactor) {
    if (!input.exists())
        return;
    if (!output.getParentFile().exists())
        output.getParentFile().mkdir();
    try {
        FileInputStream in = new FileInputStream(input);
        FileOutputStream out = new FileOutputStream(output);
        resLen = (int) input.length(); // available() is only an estimate
        byte[] buff = new byte[resLen];
        int off = 0, n;
        // a single read() may return fewer bytes than requested
        while (off < resLen && (n = in.read(buff, off, resLen - off)) != -1)
            off += n;
        byte[] suBuf = compressfile(buff, compressFactor, resLen);
        out.write(suBuf, 0, suBuf.length); // write the compressed file
        in.close();
        out.close();
        System.out.println("Compression finished: " + input.getAbsolutePath());
    } catch (Exception e) {
        e.printStackTrace();
    }
}

static void unZip(File input, File output) {
    if (!input.exists())
        return;
    if (!output.getParentFile().exists())
        output.getParentFile().mkdir();
    try {
        FileInputStream in = new FileInputStream(input);
        FileOutputStream out = new FileOutputStream(output);
        // size the buffer by the compressed file, not by the original length
        byte[] buff = new byte[(int) input.length()];
        int off = 0, n;
        while (off < buff.length && (n = in.read(buff, off, buff.length - off)) != -1)
            off += n;
        byte[] suBuff = uncompressfile(buff, resLen);
        out.write(suBuff, 0, suBuff.length); // write the decompressed file
        in.close();
        out.close();
        System.out.println("Decompression finished: " + input.getAbsolutePath());
    } catch (Exception e) {
        e.printStackTrace();
    }
}

/**
 * @param args
 */
public static void main(String[] args) {
    // compress
    File input = new File("f://搜索引擎原理系统与设计.pdf");
    File output = new File("f://test.bz2");
    zip(input, output, 9);
    // decompress
    File output2 = new File("f://test.jpg");
    unZip(output, output2);
}
}

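Attachment 4:

package jcss.search.base;

/*
 * Compression implemented with org.apache.tools.zip.
 *
 * java.util.zip can also be used, but with Chinese file names the names come
 * out garbled on extraction. The cause is that the encoding used by the
 * archiving software differs from the character set of
 * java.util.zip.ZipInputStream, which is fixed at UTF-8.
 *
 * The commented-out parts are the decompression code.
 */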

import java.io.BufferedInputStream;

import java.io.BufferedOutputStream;

import java.io.File;

import java.io.FileInputStream;

import java.io.FileOutputStream;

import java.io.InputStream;

import java.util.Date;

import java.util.zip.ZipInputStream;

import jcss.search.base.zip.CZipInputStream;

import org.apache.tools.zip.ZipOutputStream;

/**
 * @author 张华
 * @date 2006-5-14
 */

public class ZipUtil {

int count = 0;

static final int BUFFER = 2048;

public void zip(String zipFileName, String inputFile) throws Exception {

zip(zipFileName, new File(inputFile));

}

public void zip(String zipFileName, File inputFile) throws Exception {

ZipOutputStream out = new ZipOutputStream(new FileOutputStream(

new String(zipFileName.getBytes("gb2312"))));

System.out.println("zip start");

zip(out, inputFile, "");

System.out.println("zip done");

out.close();

}

public void zip(ZipOutputStream out, File f, String base) throws Exception {

System.out.println("Zipping " + f.getName());

Date beginDate = new Date();

if (f.isDirectory()) {

File[] fl = f.listFiles();

// out.putNextEntry(new ZipEntry(base + "/"));

out.putNextEntry(new org.apache.tools.zip.ZipEntry(base + "/"));

base = base.length() == 0 ? "" : base + "/";

for (int i = 0; i < fl.length; i++) {

zip(out, fl[i], base + fl[i].getName());

System.out.println(fl[i].getName());

// System.out.println(new

// String(fl[i].getName().getBytes("gb2312")));

}

} else {

// out.putNextEntry(new ZipEntry(base));

out.putNextEntry(new org.apache.tools.zip.ZipEntry(base));

System.out.println(base);

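FileInputStream in = new FileInputStream(f);
byte[] buf = new byte[BUFFER];
int n;
// copy in blocks rather than one byte at a time
while ((n = in.read(buf)) != -1)
    out.write(buf, 0, n);
in.close();
}
Date endDate = new Date();
// elapsed milliseconds: end minus begin (the original had the operands reversed)
long temp = endDate.getTime() - beginDate.getTime();
System.out.println("Elapsed time: " + temp);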

}

private void createDirectory(String directory, String subDirectory) {
    String dir[];
    File fl = new File(directory);
    try {
        // compare string contents with equals(), not ==
        if (subDirectory.equals("") && fl.exists() != true)
            fl.mkdir();
        else if (!subDirectory.equals("")) {
            dir = subDirectory.replace('\\', '/').split("/");
            for (int i = 0; i < dir.length; i++) {
                File subFile = new File(directory + File.separator + dir[i]);
                if (subFile.exists() == false)
                    subFile.mkdir();
                directory += File.separator + dir[i];
            }
        }
    } catch (Exception ex) {
        System.out.println(ex.getMessage());
    }
}

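/**
 * Extracts a small ZIP using ZipFile.
 *
 * ZipInputStream reads the entries of a ZIP file as a sequence, whereas
 * ZipFile uses built-in random file access to read entry contents, so the
 * entries do not have to be read in order.
 * Another basic difference between ZipInputStream and ZipFile is caching.
 * When a file is read through ZipInputStream and FileInputStream, ZIP entries
 * are not cached. If the file is opened with ZipFile(fileName), however, an
 * internal cache is used, so when ZipFile(fileName) is called repeatedly the
 * file is opened only once and the cached value is used on the second open.
 * On UNIX this makes no difference, because every ZIP file opened with
 * ZipFile is memory-mapped, so ZipFile performs better than ZipInputStream.
 * However, if the contents of the same ZIP file change often or are reloaded
 * while the program runs, ZipInputStream is the better choice.
 *
 * @param zipFileName
 * @param outputDirectory
 * @throws Exception
 */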

public void unSmallZip(String zipFileName, String outputDirectory)

throws Exception {

try {

Date beginDate = new Date();

org.apache.tools.zip.ZipFile zipFile = new org.apache.tools.zip.ZipFile(zipFileName);

java.util.Enumeration e = zipFile.getEntries();

org.apache.tools.zip.ZipEntry zipEntry = null;

createDirectory(outputDirectory, "");

while (e.hasMoreElements()) {

zipEntry = (org.apache.tools.zip.ZipEntry) e.nextElement();

String name = null;

if (zipEntry.isDirectory()) {

name = zipEntry.getName();

name = name.substring(0, name.length() - 1);

File f = new File(outputDirectory + File.separator + name);
f.mkdir();
System.out.println("Created directory: " + outputDirectory
        + File.separator + name);
} else {
String fileName = zipEntry.getName();
fileName = fileName.replace('\\', '/');
count++;
System.out.println("Extracting file " + count + ": "
        + zipEntry.getName());

if (fileName.indexOf("/") != -1) {

createDirectory(outputDirectory, fileName.substring(0,

fileName.lastIndexOf("/")));

fileName = fileName.substring(

fileName.lastIndexOf("/") + 1, fileName

.length());

}

File f = new File(outputDirectory + File.separator

+ zipEntry.getName());

f.createNewFile();

InputStream in = zipFile.getInputStream(zipEntry);

FileOutputStream out = new FileOutputStream(f);

byte[] by = new byte[1024];

int c;

while ((c = in.read(by)) != -1) {

out.write(by, 0, c);

}

out.close();
in.close();
}
} // end of while
zipFile.close();
// The archive cannot be deleted here because it is still in use; delete it in the upload code instead
// After extraction, delete the archive
// File zipFileToDel = new File(zipFileName);
// zipFileToDel.delete();
// System.out.println("Deleting file: " + zipFileToDel.getCanonicalPath());
// // remove the extra directory level created by extraction
// delALayerDir(zipFileName, outputDirectory);
Date endDate = new Date();
long temp = endDate.getTime() - beginDate.getTime();
System.out.println("Extraction took: " + temp);
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}

/**
 * Extracts a large ZIP using ZipInputStream (a modified ZipInputStream that
 * supports Chinese file names). The differences between ZipInputStream and
 * ZipFile are described in the comment on unSmallZip.
 *
 * @param zipFileName
 * @param outputDirectory
 * @throws Exception
 */

public void unBigZip(String zipFileName, String outputDirectory)

throws Exception {

try {

Date beginDate = new Date();

//org.apache.tools.zip.ZipFile zipFile = new org.apache.tools.zip.ZipFile(zipFileName);

FileInputStream fis = new FileInputStream(zipFileName);

BufferedOutputStream dest = null;

//CZipInputStream zin = new CZipInputStream(new BufferedInputStream(fis));

CZipInputStream zin = new CZipInputStream(new BufferedInputStream(fis),"gb2312");

//org.apache.tools.zip.ZipEntry entry;

//java.util.zip.ZipEntry entry;

jcss.search.base.zip.ZipEntry entry;

while((entry =zin.getNextEntry()) != null) {

String name = null;

if (entry.isDirectory()) {

name = entry.getName();

name = name.substring(0, name.length() - 1);

File f = new File(outputDirectory + File.separator + name);
f.mkdir();
System.out.println("Created directory: " + outputDirectory + File.separator + name);
} else {
String fileName = entry.getName();
fileName = fileName.replace('\\', '/');
count++;
System.out.println("Extracting file " + count + ": " + entry.getName());

if (fileName.indexOf("/") != -1) {

createDirectory(outputDirectory, fileName.substring(0,fileName.lastIndexOf("/")));

fileName = fileName.substring(fileName.lastIndexOf("/") + 1, fileName.length());

}

File f = new File(outputDirectory + File.separator + entry.getName());

f.createNewFile();

// InputStream in = zipFile.getInputStream(zipEntry);

// FileOutputStream out = new FileOutputStream(f);

// byte[] by = new byte[1024];

// int c;

// while ((c = in.read(by)) != -1) {

// out.write(by, 0, c);

// }

// out.close();

// in.close();

int cnt;

byte data[] = new byte[BUFFER];

FileOutputStream fos = new FileOutputStream(f);

dest = new BufferedOutputStream(fos, BUFFER);

while ((cnt = zin.read(data, 0, BUFFER)) != -1) {

dest.write(data, 0, cnt);

}

dest.flush();
dest.close();
}
} // end of while
zin.close();

// The archive cannot be deleted here because it is still in use; delete it in the upload code instead
// After extraction, delete the archive
// File zipFileToDel = new File(zipFileName);
// zipFileToDel.delete();
// System.out.println("Deleting file: " + zipFileToDel.getCanonicalPath());
// // remove the extra directory level created by extraction
// delALayerDir(zipFileName, outputDirectory);
Date endDate = new Date();
long temp = endDate.getTime() - beginDate.getTime();
System.out.println("Extraction took: " + temp);
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}

/**
 * Removes one directory level.
 *
 * @param zipFileName
 * @param outputDirectory
 */
public void delALayerDir(String zipFileName, String outputDirectory) {
    String[] dir = zipFileName.replace('\\', '/').split("/");
    String fileFullName = dir[dir.length - 1]; // e.g. aa.zip
    int pos = -1;
    pos = fileFullName.indexOf(".");
    String fileName = fileFullName.substring(0, pos); // e.g. aa

String sourceDir = outputDirectory + File.separator + fileName;

try {

copyFile(new File(outputDirectory), new File(sourceDir), new File(

sourceDir));

deleteSourceBaseDir(new File(sourceDir));

} catch (Exception e) {
    e.printStackTrace();
}
}

/**
 * Copies all files under sourceDir into destDir.
 */
public void copyFile(File destDir, File sourceBaseDir, File sourceDir)
        throws Exception {
    File[] lists = sourceDir.listFiles();
    if (lists == null)
        return;
    for (int i = 0; i < lists.length; i++) {
        File f = lists[i];
        if (f.isFile()) {
            FileInputStream fis = new FileInputStream(f);
            String sourceBasePath = sourceBaseDir.getCanonicalPath();
            String fPath = f.getCanonicalPath();
            String drPath = destDir
                    + fPath.substring(fPath.indexOf(sourceBasePath)
                            + sourceBasePath.length());
            FileOutputStream fos = new FileOutputStream(drPath);
            // Copy raw bytes. Going through String and trim(), as the
            // original did, corrupts binary files.
            byte[] b = new byte[2048];
            int n;
            while ((n = fis.read(b)) != -1) {
                fos.write(b, 0, n);
            }
            fis.close();
            fos.flush();
            fos.close();
        } else {
            // create the directory first
            new File(destDir + File.separator + f.getName()).mkdir();
            copyFile(destDir, sourceBaseDir, f); // recurse
        }
    }
}

/**
 * Recursively deletes the files under curFile, removing each directory
 * once it is empty.
 */
public void deleteSourceBaseDir(File curFile) throws Exception {
    File[] lists = curFile.listFiles();
    if (lists == null)
        return;
    File parentFile = null;
    for (int i = 0; i < lists.length; i++) {
        File f = lists[i];
        if (f.isFile()) {
            f.delete();
            // if the parent directory is now empty, everything in it has
            // been deleted, so delete the parent as well
            parentFile = f.getParentFile();
            if (parentFile.list().length == 0)
                parentFile.delete();
        } else {
            deleteSourceBaseDir(f); // recurse
        }
    }
}

public static void main(String[] args) {
    try {
        ZipUtil t = new ZipUtil();
        // t.zip("e://test.zip", "E://news.sina.com.cn//news.sina.com.cn");
        Date beginDate = new Date();
        // t.unZip("e://test.zip", "E://news.sina.com.cn");
        t.unBigZip("e://test.zip", "E://news.sina.com.cn");
        Date endDate = new Date();
        long temp = endDate.getTime() - beginDate.getTime();
        System.out.println("Elapsed time: " + temp);
    } catch (Exception e) {
        e.printStackTrace(System.out);
    }
}
}

Attachment 5:

/*
 * @(#)ZipInputStream.java 1.37 04/06/11
 *
 * SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
 */

package jcss.search.base.zip;

import java.io.InputStream;

import java.io.IOException;

import java.io.EOFException;

import java.io.PushbackInputStream;

import java.util.zip.CRC32;

import java.util.zip.Inflater;

import java.util.zip.InflaterInputStream;

import java.util.zip.ZipException;

/**
 * @author David Connelly
 * @version 1.37, 06/11/04
 */

public class CZipInputStream extends InflaterInputStream implements ZipConstants {

private String encoding = "UTF-8" ;

private ZipEntry entry ;

private CRC32 crc = new CRC32();

private long remaining ;

private byte [] tmpbuf = new byte [512];

private static final int STORED = ZipEntry. STORED ;

private static final int DEFLATED = ZipEntry. DEFLATED ;

private boolean closed = false ;

// this flag is set to true after EOF has reached for

// one entry

private boolean entryEOF = false ;

/**
 * Check to make sure that this stream has not been closed
 */
private void ensureOpen() throws IOException {
    if (closed) {
        throw new IOException("Stream closed");
    }
}

boolean usesDefaultInflater = false;

/**
 * Creates a new ZIP input stream.
 * @param in the actual input stream
 */
public CZipInputStream(InputStream in) {
    super(new PushbackInputStream(in, 512), new Inflater(true), 512);
    usesDefaultInflater = true;
    if (in == null) {
        throw new NullPointerException("in is null");
    }
}

public CZipInputStream(InputStream in,String encoding) {

super ( new PushbackInputStream(in,512), new Inflater( true ),512);

usesDefaultInflater = true ;

if (in == null ) {

throw new NullPointerException( "in is null" );

}

this . encoding =encoding;

}

/**
 * Reads the next ZIP file entry and positions the stream at the
 * beginning of the entry data.
 * @return the next ZIP file entry, or null if there are no more entries
 * @exception ZipException if a ZIP file error has occurred
 * @exception IOException if an I/O error has occurred
 */

public ZipEntry getNextEntry() throws IOException {

ensureOpen();

if ( entry != null ) {

closeEntry();

}

crc .reset();

inf .reset();

if (( entry = readLOC()) == null ) {

return null ;

}

if ( entry . method == STORED ) {

remaining = entry . size ;

}

entryEOF = false ;

return entry ;

}

/**
 * Closes the current ZIP entry and positions the stream for reading the
 * next entry.
 * @exception ZipException if a ZIP file error has occurred
 * @exception IOException if an I/O error has occurred
 */

public void closeEntry() throws IOException {

ensureOpen();

while (read( tmpbuf , 0, tmpbuf . length ) != -1) ;

entryEOF = true ;

}

/**
 * Returns 0 after EOF has reached for the current entry data,
 * otherwise always return 1.
 * <p>
 * Programs should not count on this method to return the actual number
 * of bytes that could be read without blocking.
 *
 * @return 1 before EOF and 0 after EOF has reached for current entry.
 * @exception IOException if an I/O error occurs.
 */
public int available() throws IOException {
    ensureOpen();
    if (entryEOF) {
        return 0;
    } else {
        return 1;
    }
}

/**
 * Reads from the current ZIP entry into an array of bytes. Blocks until
 * some input is available.
 * @param b the buffer into which the data is read
 * @param off the start offset of the data
 * @param len the maximum number of bytes read
 * @return the actual number of bytes read, or -1 if the end of the
 *         entry is reached
 * @exception ZipException if a ZIP file error has occurred
 * @exception IOException if an I/O error has occurred
 */

public int read( byte [] b, int off, int len) throws IOException {

ensureOpen();

if (off < 0 || len < 0 || off > b. length - len) {

throw new IndexOutOfBoundsException();

} else if (len == 0) {

return 0;

}

if ( entry == null ) {

return -1;

}

switch ( entry . method ) {

case DEFLATED :

len = super .read(b, off, len);

if (len == -1) {

readEnd( entry );

entryEOF = true ;

entry = null ;

} else {

crc .update(b, off, len);

}

return len;

case STORED :

if ( remaining <= 0) {

entryEOF = true ;

entry = null ;

return -1;

}

if (len > remaining ) {

len = ( int ) remaining ;

}

len = in .read(b, off, len);

if (len == -1) {

throw new ZipException( "unexpected EOF" );

}

crc .update(b, off, len);

remaining -= len;

return len;

default :

throw new InternalError( "invalid compression method" );

}
}

/**
 * Skips specified number of bytes in the current ZIP entry.
 * @param n the number of bytes to skip
 * @return the actual number of bytes skipped
 * @exception ZipException if a ZIP file error has occurred
 * @exception IOException if an I/O error has occurred
 * @exception IllegalArgumentException if n < 0
 */

public long skip( long n) throws IOException {

if (n < 0) {

throw new IllegalArgumentException( "negative skip length" );

}

ensureOpen();

int max = ( int )Math.min (n, Integer. MAX_VALUE );

int total = 0;

while (total < max) {

int len = max - total;

if (len > tmpbuf . length ) {

len = tmpbuf . length ;

}

len = read( tmpbuf , 0, len);

if (len == -1) {

entryEOF = true ;

break ;

}

total += len;

}

return total;

}

/**
 * Closes this input stream and releases any system resources associated
 * with the stream.
 * @exception IOException if an I/O error has occurred
 */
public void close() throws IOException {
    if (!closed) {
        super.close();
        closed = true;
    }
}

private byte[] b = new byte[256];

/*
 * Reads local file (LOC) header for next entry.
 */

private ZipEntry readLOC() throws IOException {

try {

readFully( tmpbuf , 0, LOCHDR );

} catch (EOFException e) {

return null ;

}

if (get32 ( tmpbuf , 0) != LOCSIG ) {

return null ;

}

// get the entry name and create the ZipEntry first

int len = get16 ( tmpbuf , LOCNAM );

if (len == 0) {

throw new ZipException( "missing entry name" );

}

int blen = b.length;
if (len > blen) {
    // grow the name buffer by doubling until the entry name fits
    do
        blen = blen * 2;
    while (len > blen);
    b = new byte[blen];
}

readFully( b , 0, len);

//ZipEntry e = createZipEntry(getUTF8String(b, 0, len));

ZipEntry e= null ;

try

{

if ( this . encoding .toUpperCase().equals( "UTF-8" ))

e=createZipEntry(getUTF8String ( b , 0, len));

else

e=createZipEntry( new String( b ,0,len, this . encoding ));

}

catch (Exception byteE)

{

e=createZipEntry(getUTF8String ( b , 0, len));

}

// now get the remaining fields for the entry

e. version = get16 ( tmpbuf , LOCVER );

e. flag = get16 ( tmpbuf , LOCFLG );

if ((e. flag & 1) == 1) {

throw new ZipException( "encrypted ZIP entry not supported" );

}

e. method = get16 ( tmpbuf , LOCHOW );

e. time = get32 ( tmpbuf , LOCTIM );

if ((e. flag & 8) == 8) {

/* EXT descriptor present */

if (e. method != DEFLATED ) {

throw new ZipException(

"only DEFLATED entries can have EXT descriptor" );

}

} else {

e. crc = get32 ( tmpbuf , LOCCRC );

e. csize = get32 ( tmpbuf , LOCSIZ );

e. size = get32 ( tmpbuf , LOCLEN );

}

len = get16 ( tmpbuf , LOCEXT );

if (len > 0) {

byte [] bb = new byte [len];

readFully(bb, 0, len);

e. extra = bb;

}

return e;

}

/*
 * Fetches a UTF8-encoded String from the specified byte array.
 */

private static String getUTF8String( byte [] b, int off, int len) {

// First, count the number of characters in the sequence

int count = 0;

int max = off + len;

int i = off;

while (i < max) {

int c = b[i++] & 0xff;

switch (c >> 4) {

case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:

// 0xxxxxxx

count++;

break ;

case 12: case 13:

// 110xxxxx 10xxxxxx

if (( int )(b[i++] & 0xc0) != 0x80) {

throw new IllegalArgumentException();

}

count++;

break ;

case 14:

// 1110xxxx 10xxxxxx 10xxxxxx

if ((( int )(b[i++] & 0xc0) != 0x80) ||

(( int )(b[i++] & 0xc0) != 0x80)) {

throw new IllegalArgumentException();

}

count++;

break ;

default :

// 10xxxxxx, 1111xxxx

throw new IllegalArgumentException();

} // end of switch
} // end of while

if (i != max) {

throw new IllegalArgumentException();

}

// Now decode the characters...

char [] cs = new char [count];

i = 0;

while (off < max) {

int c = b[off++] & 0xff;

switch (c >> 4) {

case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:

// 0xxxxxxx

cs[i++] = ( char )c;

break ;

case 12: case 13:

// 110xxxxx 10xxxxxx

cs[i++] = ( char )(((c & 0x1f) << 6) | (b[off++] & 0x3f));

break ;

case 14:

// 1110xxxx 10xxxxxx 10xxxxxx

int t = (b[off++] & 0x3f) << 6;

cs[i++] = ( char )(((c & 0x0f) << 12) | t | (b[off++] & 0x3f));

break ;

default :

// 10xxxxxx, 1111xxxx

throw new IllegalArgumentException();

} // end of switch
} // end of while

return new String(cs, 0, count);

}

/**
 * Creates a new <code>ZipEntry</code> object for the specified
 * entry name.
 * @param name the ZIP file entry name
 * @return the ZipEntry just created
 */

protected ZipEntry createZipEntry(String name) {

return new ZipEntry(name);

}

/*
 * Reads end of deflated entry as well as EXT descriptor if present.
 */

private void readEnd(ZipEntry e) throws IOException {

int n = inf .getRemaining();

if (n > 0) {

((PushbackInputStream) in ).unread( buf , len - n, n);

}

if ((e. flag & 8) == 8) {

/* EXT descriptor present */

readFully( tmpbuf , 0, EXTHDR );

long sig = get32 ( tmpbuf , 0);

if (sig != EXTSIG ) { // no EXTSIG present

e. crc = sig;

e. csize = get32 ( tmpbuf , EXTSIZ - EXTCRC );

e. size = get32 ( tmpbuf , EXTLEN - EXTCRC );

((PushbackInputStream) in ).unread(

tmpbuf , EXTHDR - EXTCRC - 1, EXTCRC );

} else {

e. crc = get32 ( tmpbuf , EXTCRC );

e. csize = get32 ( tmpbuf , EXTSIZ );

e. size = get32 ( tmpbuf , EXTLEN );

}
} // end of the EXT descriptor branch

if (e. size != inf .getBytesWritten()) {

throw new ZipException(

"invalid entry size (expected " + e. size +

" but got " + inf .getBytesWritten() + " bytes)" );

}

if (e. csize != inf .getBytesRead()) {

throw new ZipException(

"invalid entry compressed size (expected " + e. csize +

" but got " + inf .getBytesRead() + " bytes)" );

}

if (e. crc != crc .getValue()) {

throw new ZipException(

"invalid entry CRC (expected 0x" + Long.toHexString (e. crc ) +

" but got 0x" + Long.toHexString ( crc .getValue()) + ")" );

}
}

/*
 * Reads bytes, blocking until all bytes are read.
 */

private void readFully( byte [] b, int off, int len) throws IOException {

while (len > 0) {

int n = in .read(b, off, len);

if (n == -1) {

throw new EOFException();

}

off += n;

len -= n;

}
}

/*
 * Fetches unsigned 16-bit value from byte array at specified offset.
 * The bytes are assumed to be in Intel (little-endian) byte order.
 */

private static final int get16( byte b[], int off) {

return (b[off] & 0xff) | ((b[off+1] & 0xff) << 8);

}

/*
 * Fetches unsigned 32-bit value from byte array at specified offset.
 * The bytes are assumed to be in Intel (little-endian) byte order.
 */

private static final long get32( byte b[], int off) {

return get16 (b, off) | (( long )get16 (b, off+2) << 16);

}
} // end of class CZipInputStream

附件六：

package jcss.search.base.zip;

/**
 * @author 张华
 * @time 2007-8-3
 * @description
 **/
public class ZipEntry extends org.apache.tools.zip.ZipEntry {
    String name;        // entry name
    long time = -1;     // modification time (in DOS time)
    long crc = -1;      // crc-32 of entry data
    long size = -1;     // uncompressed size of entry data
    long csize = -1;    // compressed size of entry data
    int method = -1;    // compression method
    byte[] extra;       // optional extra field data for entry
    String comment;     // optional comment string for entry

    // The following flags are used only by Zip{Input,Output}Stream
    int flag;           // bit flags
    int version;        // version needed to extract
    long offset;        // offset of loc header

    /**
     * Compression method for uncompressed entries.
     */
    public static final int STORED = 0;

    /**
     * Compression method for compressed (deflated) entries.
     */
    public static final int DEFLATED = 8;

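    // The static block below must stay commented out: initIDs() is a native
    // method that the JDK implements only for java.util.zip.ZipEntry, so this
    // copy in another package has no native implementation to bind to.
    // static {
    //     /* Zip library is loaded from System.initializeSystemClass */
    //     initIDs();
    // }
    // private static native void initIDs();

    public ZipEntry(String name) {
        super(name);
    }
}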
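For comparison with the hand-patched classes in these attachments: since Java 7, which postdates this post, `java.util.zip.ZipOutputStream` and `java.util.zip.ZipInputStream` accept a `Charset` argument that controls how entry names are encoded and decoded. The sketch below is an illustration only, not the approach used in this article. `ZipNameCharsetDemo`, `zipOne` and `firstEntryName` are hypothetical names, and UTF-8 is used here only because every JVM is guaranteed to provide it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical demo class; not part of the original attachments. */
class ZipNameCharsetDemo {

    /** Zips one in-memory entry, encoding its name with the given charset. */
    static byte[] zipOne(String name, byte[] content, Charset cs) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos, cs)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bos.toByteArray();
    }

    /** Returns the name of the first entry, decoded with the given charset. */
    static String firstEntryName(byte[] zip, Charset cs) {
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zip), cs)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // A Chinese directory and file name, written with Unicode escapes so
        // the source compiles regardless of the compiler's file encoding.
        String name = "\u4e2d\u6587/\u6587\u4ef6.txt";
        byte[] content = "demo".getBytes(StandardCharsets.UTF_8);

        byte[] zip = zipOne(name, content, StandardCharsets.UTF_8);
        String readBack = firstEntryName(zip, StandardCharsets.UTF_8);
        System.out.println("name survives round trip: " + name.equals(readBack));
    }
}
```

To read or write archives whose names use a legacy encoding, the same constructors can be given something like `Charset.forName("GBK")`, assuming the runtime ships that charset.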
