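Lucene is a Java-based full-text search library. It works in two stages, building an index and then querying that index, as illustrated in the accompanying figure.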
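As a rough illustration of the index-then-query flow, the following plain-Java sketch builds a toy inverted index (a map from each word to the documents that contain it) and looks a term up in it. The class name `ToyInvertedIndex` and its methods are made up for this example, and the simple tokenizing only handles ASCII words; Lucene's actual analyzers and index format are considerably more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy inverted index: maps each lower-cased word to the ids of the documents containing it.
// Illustration only; this is not how Lucene stores its index.
class ToyInvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<String, TreeSet<Integer>>();
    private final List<String> documents = new ArrayList<String>();

    // Indexing step: split the text into words and record which document each word came from.
    public int addDocument(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            TreeSet<Integer> ids = postings.get(token);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(token, ids);
            }
            ids.add(docId);
        }
        return docId;
    }

    // Query step: look the term up directly instead of scanning every document.
    public List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase());
        if (ids == null) {
            return Collections.emptyList();
        }
        return new ArrayList<Integer>(ids);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Lucene is a search library");
        index.addDocument("Full-text search in Java");
        System.out.println(index.search("search"));
        System.out.println(index.search("lucene"));
    }
}
```

Running `main` prints `[0, 1]` for `search` and `[0]` for `lucene`: the lookup goes straight to the word's posting list, which is the reason an index is built before any query is run.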
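An index can be built from two kinds of source: database records and files.
Building an index from database query results looks like this: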
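import java.io.File;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
Directory dire = FSDirectory.open(new File("E:\\work_inspur\\lucene\\index")); // where the index is stored
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_46, analyzer);
IndexWriter iw = new IndexWriter(dire, iwc);
// ... connect to the database and run the query; rs is the resulting ResultSet ...
// add each row of the query result to the index
while (rs.next()) {
    Document doc = new Document();
    doc.add(new TextField("mc", rs.getString("swjg_jc"), Store.YES));
    iw.addDocument(doc);
}
iw.commit(); // commit the changes
iw.close(); // close the writer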
The second approach builds the index from file contents:
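Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
Directory dire = FSDirectory.open(new File("E:\\work_inspur\\lucene\\index")); // where the index is stored
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_46, analyzer);
IndexWriter iw = new IndexWriter(dire, iwc);
File root = new File("E:\\work_inspur\\lucene\\test"); // directory holding the source files
showAllFiles(root, iw); // custom helper that walks the directory and indexes each file; its code follows
iw.commit();
iw.close(); // close the writer and release the index lock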
public static void showAllFiles(File dir, IndexWriter iw) throws Exception {
    File[] fs = dir.listFiles();
    if (fs == null) { // dir is not a directory, or it could not be read
        return;
    }
    for (File f : fs) {
        if (f.isDirectory()) {
            try {
                showAllFiles(f, iw); // recurse into the subdirectory
            } catch (Exception e) {
                // keep going with the remaining entries, but report the failure
                System.err.println("Skipping " + f + ": " + e);
            }
        } else {
            Document doc = new Document();
            String content = getContent(f); // getContent reads the file's content, using a FileInputStream
            String name = f.getName();
            String path = f.getAbsolutePath();
            doc.add(new TextField("content", content, Store.YES));
            doc.add(new TextField("name", name, Store.YES));
            doc.add(new TextField("path", path, Store.YES));
            iw.addDocument(doc);
        }
    }
}
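The post does not show `getContent`; it only says that the helper reads a file through a `FileInputStream`. A minimal sketch of such a helper might look like the following, assuming the files are UTF-8 text and small enough to hold in memory. The wrapper class name `FileContentReader` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the original getContent implementation is not shown in the post.
class FileContentReader {
    // Reads the whole file into memory and decodes it as UTF-8 (the encoding is an assumption).
    public static String getContent(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            // Copy the stream chunk by chunk until end of file.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

If the source files use another encoding (GBK is common for Chinese text on Windows), the charset passed to the `String` constructor would need to change accordingly, otherwise the indexed text will be garbled.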
Searching the index:
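Directory dir = FSDirectory.open(new File("E:\\work_inspur\\lucene\\index")); // open the index location
IndexReader ir = DirectoryReader.open(dir); // read the index
IndexSearcher searcher = new IndexSearcher(ir);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
// QueryParser is org.apache.lucene.queryparser.classic.QueryParser; "content" is the default field to search
QueryParser parser = new QueryParser(Version.LUCENE_46, "content", analyzer);
Query query = parser.parse("顶"); // the text to search for
TopDocs topDocs = searcher.search(query, 100); // return at most the 100 best-scoring hits
ScoreDoc[] hits = topDocs.scoreDocs;
List<WMap> list = new ArrayList<WMap>(); // collects the search results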
The stored fields of each hit can then be read back by name and collected as key-value pairs in a map, for example:
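for (ScoreDoc sd : hits) {
    Document d = searcher.doc(sd.doc); // load the stored fields of this hit
    WMap m = new WMap();
    m.set("content", d.get("content"));
    System.out.println(d.get("name") + "------" + d.get("content"));
    list.add(m); // collect the results in the list
}
ir.close(); // release the index reader once the results have been read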
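`WMap` is the author's own class and its source is not part of the post. To compile the loop above, a minimal stand-in could look like this, assuming `WMap` is simply a string-keyed map with `set` and `get` methods; the real class may well differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the post's WMap class, whose source is not shown.
class WMap {
    // LinkedHashMap keeps the fields in the order they were set.
    private final Map<String, Object> values = new LinkedHashMap<String, Object>();

    // Stores a value under the given key, replacing any previous value.
    public void set(String key, Object value) {
        values.put(key, value);
    }

    // Returns the value stored under the key, or null if the key was never set.
    public Object get(String key) {
        return values.get(key);
    }

    @Override
    public String toString() {
        return values.toString();
    }
}
```

A plain `Map<String, String>` would serve equally well if nothing else in the project depends on `WMap`.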
Deleting the index:
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
Directory dire = FSDirectory.open(new File("E:\\work_inspur\\lucene\\index"));
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_46, analyzer);
IndexWriter iw = new IndexWriter(dire, iwc);
iw.deleteAll(); // delete every document in the index
iw.commit();
iw.close(); // close the writer and release the index lock