a276202460

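Learning Notes (8): Lucene Index Structure, Part 5 (_N.tis, _N.tii)

_N.tis stores the term information for this segment. Because Lucene uses an inverted index, the terms produced by tokenization are stored in the .tis file. Each term's entry includes its document frequency (the number of documents that contain the term), the number of the field the term came from (FieldNum), and related position information.

Structure of the .tis file: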

TermInfoFile (.tis)--> TIVersion, TermCount, IndexInterval, SkipInterval, MaxSkipLevels, TermInfos

TIVersion --> UInt32

TermCount --> UInt64

IndexInterval --> UInt32

SkipInterval --> UInt32

MaxSkipLevels --> UInt32

TermInfos --> <TermInfo> ^TermCount

TermInfo --> <Term, DocFreq, FreqDelta, ProxDelta, SkipDelta>

Term --> <PrefixLength, Suffix, FieldNum>

Suffix --> String

PrefixLength, DocFreq, FreqDelta, ProxDelta, SkipDelta --> VInt
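
Several of the fields above are VInts. As a rough illustration of that encoding (seven data bits per byte, low-order group first, with the high bit set when more bytes follow, as described in the Lucene file format documentation), here is a minimal sketch. The class name VIntSketch and the use of ByteBuffer are assumptions made for this example; this is not Lucene's own reader.

```java
import java.nio.ByteBuffer;

/** Illustrative VInt decoder (hypothetical helper, not a Lucene class). */
final class VIntSketch {

    /** Reads one VInt from the buffer's current position. */
    static int readVInt(ByteBuffer in) {
        byte b = in.get();
        int value = b & 0x7F;                  // low 7 bits of the first byte
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get();                      // high bit was set: another byte follows
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        // 128 takes two bytes (0x80 0x01); 11 fits in one byte (0x0B)
        ByteBuffer in = ByteBuffer.wrap(new byte[] {(byte) 0x80, 0x01, 0x0B});
        System.out.println(readVInt(in)); // 128
        System.out.println(readVInt(in)); // 11
    }
}
```

Small values such as the PrefixLength of 11 seen later in the output therefore occupy a single byte on disk.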

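Reading the contents of the .tis file (the documentation does not spell out every detail of how the file is written; see the org.apache.lucene.index.TermInfosWriter class):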

/****************
 *
 * Create Class: ReadTerminfo.java
 * Author: a276202460
 * Create at: 2010-6-7
 */
package com.rich.lucene.io;

public class ReadTerminfo {

    /**
     * Dumps the header and every TermInfo entry of a .tis file.
     * IndexFileInput is a helper class from this same package, not a Lucene class.
     *
     * @param args unused
     * @throws Exception on any read error
     */
    public static void main(String[] args) throws Exception {
        String indexfile = "D:/lucenetest/indexs/txtindex/index4/_0.tis";
        IndexFileInput input = null;
        try {
            input = new IndexFileInput(indexfile);
            // Fixed-width header fields
            System.out.println("term index version:" + input.readInt());
            long termcount = input.readLong();
            System.out.println("term count:" + termcount);
            System.out.println("term IndexInterval:" + input.readInt());
            int skipInterval = input.readInt();
            System.out.println("term SkipInterval:" + skipInterval);
            System.out.println("term MaxSkipLevels:" + input.readInt());
            for (long i = 0; i < termcount; i++) {
                System.out.println("*****read term info[" + i + "]******");
                System.out.println("the term share prefixlength is :" + input.readVInt());
                // Naive: decodes the suffix bytes on their own as a String
                System.out.println("term's own stuffix is:" + input.readString());
                System.out.println("exists this term's field number is:" + input.readVInt());
                int doccount = input.readVInt();
                System.out.println("the doc count contain this term is:" + doccount);
                System.out.println("the position of this term's TermFreqs within the .frq file is:" + input.readVLong());
                System.out.println("the position of this term's TermPositions within the .prx file is:" + input.readVLong());
                // SkipDelta is only written when DocFreq >= SkipInterval (16 in this index)
                if (doccount >= skipInterval) {
                    System.out.println("the position of this term's SkipData within the .frq file is:" + input.readVInt());
                }
            }
        } finally {
            // Guard against a failed constructor leaving input null
            if (input != null) {
                input.close();
            }
        }
    }
}

Output:

term index version:-4 term count:22 term IndexInterval:128 term SkipInterval:16 term MaxSkipLevels:10 *****read term info[0]****** the term share prefixlength is :0 term's own stuffix is:做 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:0 the position of this term's TermPositions within the .prx file is:0 *****read term info[1]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[2]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[3]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[4]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[5]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[6]****** the term share prefixlength is :1 term's own stuffix is:?? 
exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[7]****** the term share prefixlength is :0 term's own stuffix is:搜 exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[8]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[9]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[10]****** the term share prefixlength is :0 term's own stuffix is:球 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[11]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[12]****** the term share prefixlength is :1 term's own stuffix is:?? 
exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[13]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[14]****** the term share prefixlength is :0 term's own stuffix is:度 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[15]****** the term share prefixlength is :0 term's own stuffix is:搜 exists this term's field number is:0 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[16]****** the term share prefixlength is :1 term's own stuffix is:?? exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[17]****** the term share prefixlength is :0 term's own stuffix is:百 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[18]****** the term share prefixlength is :1 term's own stuffix is:?? 
exists this term's field number is:0 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[19]****** the term share prefixlength is :0 term's own stuffix is:谷 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[20]****** the term share prefixlength is :0 term's own stuffix is:http://www.baidu.com exists this term's field number is:1 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[21]****** the term share prefixlength is :11 term's own stuffix is:g.cn exists this term's field number is:1 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1

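The output looks partly right and partly wrong: why is some of it garbled? If prefixes were compared with Java's String.charAt followed by an equality check, two different single Chinese characters could never share a prefix.

After rereading the documentation and then the source code, I found that Lucene computes the common prefix of two adjacent terms over their UTF-8 bytes.

For example, "做" converted to a UTF-8 byte array is: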

-27
-127
-102

"全" converted to a UTF-8 byte array is:

-27
-123
-88

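Comparing the terms '做' and '全', byte[0] is the same. Since '做' is stored as the first term, its value is written as

[3][-27][-127][-102], which is simply the stored form of a String (a length followed by the bytes). The adjacent term '全' shares byte[0] with it,

so the suffix string of '全' is stored as [2][-123][-88]. It is still written in String format, but those two bytes on their own cannot be decoded as a UTF-8 Chinese character, which is where the garbled output comes from. For pure ASCII text the problem does not appear.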
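
The byte-level comparison described above can be reproduced with a few lines of Java. This is only a sketch of the idea, not Lucene's code: the class name SharedPrefixSketch and the method sharedPrefixLength are made up for illustration, and the characters are written as Unicode escapes (\u505A is 做, \u5168 is 全) so the source compiles regardless of file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative shared-prefix computation over UTF-8 bytes (hypothetical helper). */
final class SharedPrefixSketch {

    /** Number of leading bytes that the two arrays have in common. */
    static int sharedPrefixLength(byte[] previous, byte[] current) {
        int limit = Math.min(previous.length, current.length);
        int i = 0;
        while (i < limit && previous[i] == current[i]) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] previous = "\u505A".getBytes(StandardCharsets.UTF_8); // [-27, -127, -102]
        byte[] current = "\u5168".getBytes(StandardCharsets.UTF_8);  // [-27, -123, -88]

        int prefix = sharedPrefixLength(previous, current);
        byte[] suffix = Arrays.copyOfRange(current, prefix, current.length);

        System.out.println(prefix);                  // 1
        System.out.println(Arrays.toString(suffix)); // [-123, -88]
        // Decoding the suffix alone does not give back the original character
        System.out.println(new String(suffix, StandardCharsets.UTF_8));
    }
}
```

The same method applied to "http://www.baidu.com" and "http://www.g.cn" returns 11, matching the PrefixLength printed for the last term in the output above.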

The code modified accordingly:

/****************
 *
 * Create Class: ReadTerminfo.java
 * Author: a276202460
 * Create at: 2010-6-7
 */
package com.rich.lucene.io;

public class ReadTerminfo {

    /**
     * Dumps a .tis file, rebuilding each term from the shared UTF-8 prefix
     * of the previous term plus its own suffix bytes.
     *
     * @param args unused
     * @throws Exception on any read error
     */
    public static void main(String[] args) throws Exception {
        String indexfile = "D:/lucenetest/indexs/txtindex/index4/_0.tis";
        IndexFileInput input = null;
        try {
            input = new IndexFileInput(indexfile);
            System.out.println("term index version:" + input.readInt());
            long termcount = input.readLong();
            System.out.println("term count:" + termcount);
            System.out.println("term IndexInterval:" + input.readInt());
            int skipInterval = input.readInt();
            System.out.println("term SkipInterval:" + skipInterval);
            System.out.println("term MaxSkipLevels:" + input.readInt());
            int doccount = 0;
            int prefixlength = 0;
            String termvalue = null;
            byte[] lastTermBytes = null; // full UTF-8 bytes of the previous term
            int suffixLength;
            for (long i = 0; i < termcount; i++) {
                System.out.println("*****read term info[" + i + "]******");
                prefixlength = input.readVInt();
                System.out.println("the term share prefixlength is :" + prefixlength);
                // Read the suffix as raw bytes instead of decoding it as a String
                suffixLength = input.readVInt();
                byte[] suffixBytes = new byte[suffixLength];
                input.readBytes(suffixBytes, 0, suffixLength);
                if (prefixlength == 0) {
                    termvalue = new String(suffixBytes, "UTF-8");
                    lastTermBytes = suffixBytes;
                } else {
                    // Full term = shared prefix of the previous term + own suffix
                    byte[] termBytes = new byte[prefixlength + suffixLength];
                    System.arraycopy(lastTermBytes, 0, termBytes, 0, prefixlength);
                    System.arraycopy(suffixBytes, 0, termBytes, prefixlength, suffixLength);
                    termvalue = new String(termBytes, "UTF-8");
                    lastTermBytes = termBytes;
                }
                System.out.println("term's value is:" + termvalue);
                System.out.println("exists this term's field number is:" + input.readVInt());
                doccount = input.readVInt();
                System.out.println("the doc count contain this term is:" + doccount);
                System.out.println("the position of this term's TermFreqs within the .frq file is:" + input.readVLong());
                System.out.println("the position of this term's TermPositions within the .prx file is:" + input.readVLong());
                // SkipDelta is only written when DocFreq >= SkipInterval (16 in this index)
                if (doccount >= skipInterval) {
                    System.out.println("the position of this term's SkipData within the .frq file is:" + input.readVInt());
                }
            }
        } finally {
            // Guard against a failed constructor leaving input null
            if (input != null) {
                input.close();
            }
        }
    }
}

Output:

term index version:-4 term count:22 term IndexInterval:128 term SkipInterval:16 term MaxSkipLevels:10 *****read term info[0]****** the term share prefixlength is :0 term's value is:做 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:0 the position of this term's TermPositions within the .prx file is:0 *****read term info[1]****** the term share prefixlength is :1 term's value is:全 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[2]****** the term share prefixlength is :1 term's value is:内 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[3]****** the term share prefixlength is :1 term's value is:国 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[4]****** the term share prefixlength is :1 term's value is:大 exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[5]****** the term share prefixlength is :1 term's value is:度 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[6]****** the term share prefixlength is :1 term's value is:引 exists this term's field number is:2 the doc count contain this term is:2 the position of this 
term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[7]****** the term share prefixlength is :0 term's value is:搜 exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[8]****** the term share prefixlength is :1 term's value is:擎 exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[9]****** the term share prefixlength is :1 term's value is:最 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[10]****** the term share prefixlength is :0 term's value is:球 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[11]****** the term share prefixlength is :1 term's value is:百 exists this term's field number is:2 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[12]****** the term share prefixlength is :1 term's value is:的 exists this term's field number is:2 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[13]****** the term share prefixlength is :1 term's value is:索 exists this term's field number is:2 the doc count contain this term is:2 the 
position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[14]****** the term share prefixlength is :0 term's value is:度 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:3 the position of this term's TermPositions within the .prx file is:3 *****read term info[15]****** the term share prefixlength is :0 term's value is:搜 exists this term's field number is:0 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[16]****** the term share prefixlength is :1 term's value is:歌 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[17]****** the term share prefixlength is :0 term's value is:百 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[18]****** the term share prefixlength is :1 term's value is:索 exists this term's field number is:0 the doc count contain this term is:2 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[19]****** the term share prefixlength is :0 term's value is:谷 exists this term's field number is:0 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:2 the position of this term's TermPositions within the .prx file is:2 *****read term info[20]****** the term share prefixlength is :0 term's value is:http://www.baidu.com exists this term's field number is:1 the 
doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1 *****read term info[21]****** the term share prefixlength is :11 term's value is:http://www.g.cn exists this term's field number is:1 the doc count contain this term is:1 the position of this term's TermFreqs within the .frq file is:1 the position of this term's TermPositions within the .prx file is:1

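Tis - stores the detailed information for each term

Tii - index file over the term details (a page index into the details: one index entry is created in the .tii file for every 128 terms)

Both files have the same header: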

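TIVersion --> UInt32 (format version number of the file)

TermCount --> UInt64 (number of terms stored in the file. In .tis this is the number of all terms produced by tokenization in this segment, regardless of which field they come from. In .tii it is likewise the number of terms stored in that file, which is not all of them: only the last term of each page is recorded, the first entry is empty, and the last page is not recorded. A page is 128 (IndexInterval) terms.)

IndexInterval --> UInt32 (number of terms stored per page)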

SkipInterval --> UInt32

MaxSkipLevels --> UInt32

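The meaning of SkipInterval and MaxSkipLevels is tied to other index files. I do not know their exact meaning yet, and it barely matters for reading the structure of the .tis and .tii files: the one place it shows up is that SkipDelta is stored only when DocFreq is at least SkipInterval. I will check what these fields mean later, when studying the structure of the .frq and .prx files.

After the header comes the detailed information for each term: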

TermInfo --> <Term, DocFreq, FreqDelta, ProxDelta, SkipDelta>

Term --> <PrefixLength, Suffix, FieldNum>

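PrefixLength: the number of leading bytes (UTF-8) that this term shares with the previous term

Suffix: the part unique to this term. The documentation calls it a String; for English text that is unambiguous, but for Chinese it may be an incomplete character. It is stored as a length followed by the UTF-8 bytes of the suffix.

FieldNum: the number of the field the term comes from

DocFreq: the number of documents that contain this term

FreqDelta: locates this term's data in the .frq file. It is stored as the difference from the previous term's position (zero for the first term). What is stored at that position will have to wait until I look at the .frq file.

ProxDelta and SkipDelta are similar to FreqDelta: they are also position information. Recording a position amounts to keeping a pointer to the data stored there, so this is a form of index as well.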
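
According to the Lucene file format documentation, FreqDelta and ProxDelta are stored as differences from the previous term's position, so absolute file pointers come from a running sum. A minimal sketch of that accumulation follows, using the FreqDelta values printed for terms 0 to 7 in the output above; the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative conversion of per-term deltas into absolute file pointers. */
final class DeltaPointerSketch {

    /** Returns the running sum of the deltas; entry i is the pointer for term i. */
    static long[] toAbsolute(long[] deltas) {
        long[] pointers = new long[deltas.length];
        long position = 0;
        for (int i = 0; i < deltas.length; i++) {
            position += deltas[i];   // each delta is relative to the previous term
            pointers[i] = position;
        }
        return pointers;
    }

    public static void main(String[] args) {
        // FreqDelta values of terms 0..7 as printed by the reader above
        long[] freqDeltas = {0, 1, 1, 1, 1, 2, 1, 3};
        System.out.println(Arrays.toString(toAbsolute(freqDeltas)));
        // prints [0, 1, 2, 3, 4, 6, 7, 10]
    }
}
```

Read this way, the values that the reader labels "position" are increments, which is why so many consecutive terms show the same small number.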

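Content layout of the two files:

[Figure: layout of the .tis and .tii files]

In the figure, when .tis stores its first term, .tii stores an empty term entry.

If .tis holds exactly 128*n terms, the last term of the last page is not recorded in the .tii file. Next I will read the .frq, .prx and .nrm files; once that is done, Lucene's whole search process and the structure built at indexing time should be quite clear.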
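
The paging rule described above can be turned into a small calculation. The sketch below assumes exactly the behaviour stated in this post (one empty entry written with the first term, then one entry per completed page once a further term follows it); the class and method names are hypothetical and it has not been checked against every Lucene version.

```java
/** Illustrative count of .tii entries for a given .tis term count. */
final class IndexEntrySketch {

    /**
     * Number of entries in .tii, counting the leading empty term.
     * An entry is written each time the number of terms already stored
     * in .tis is a multiple of indexInterval when a new term arrives.
     */
    static long indexEntryCount(long termCount, int indexInterval) {
        if (termCount <= 0) {
            return 0;
        }
        return (termCount - 1) / indexInterval + 1;
    }

    public static void main(String[] args) {
        System.out.println(indexEntryCount(22, 128));  // 1: only the empty entry
        System.out.println(indexEntryCount(128, 128)); // 1: last term of the full page not yet indexed
        System.out.println(indexEntryCount(129, 128)); // 2: empty entry plus the 128th term
    }
}
```

For the 22-term segment dumped earlier this gives a single entry, the empty term.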

Everything here is written to the blog as I learn; corrections and criticism are welcome.
