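As an open-source search toolkit, Lucene gives developers a rich set of query types. The main ones are summarized below.
Type 1: TermQuery. TermQuery is the most basic, atomic query in Lucene. It retrieves the Documents in the index that contain a given term. The code is as follows: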
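import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// The methods below belong to one class; STORE_PATH is defined in that class
// and points to the index directory.
public static void main(String[] args) throws IOException {
    createIndex();
    termQuery();
}

private static void createIndex() throws IOException {
    // The third argument (true) creates a new index, replacing any existing one.
    IndexWriter writer = new IndexWriter(STORE_PATH, new StandardAnalyzer(), true);
    writer.setUseCompoundFile(false);

    Document doc1 = new Document();
    Document doc2 = new Document();

    // Document 1: both fields are stored and tokenized.
    Field field = new Field("PostTitle", "Lucene开发浅谈", Field.Store.YES, Field.Index.TOKENIZED);
    Field field1 = new Field("PostContent", "Lucene是一个开源的搜索工具包", Field.Store.YES, Field.Index.TOKENIZED);
    doc1.add(field);
    doc1.add(field1);

    // Document 2: PostContent is indexed but not stored (Field.Store.NO).
    Field field2 = new Field("PostTitle", "云计算浅谈", Field.Store.YES, Field.Index.TOKENIZED);
    Field field3 = new Field("PostContent", "云计算是一种基于 Web 的服务,它使得普通计算机拥有超级计算机的能力", Field.Store.NO, Field.Index.TOKENIZED);
    doc2.add(field2);
    doc2.add(field3);

    writer.addDocument(doc1);
    writer.addDocument(doc2);
    writer.close();
}

private static void termQuery() throws IOException {
    IndexSearcher searcher = new IndexSearcher(STORE_PATH);
    // StandardAnalyzer lowercases tokens at index time, so the term is written in lowercase.
    Term term = new Term("PostContent", "lucene");
    Query termQuery = new TermQuery(term);
    Hits hits = searcher.search(termQuery);
    System.out.println("TermQuery demo------");
    System.out.println("hits.length()==" + hits.length());
    for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i));
    }
}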
The output is:
TermQuery demo------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>
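Why the query term is the lowercase "lucene" while the indexed text says "Lucene": StandardAnalyzer lowercases tokens when the document is indexed, whereas a TermQuery looks its term up exactly as given, without running it through the analyzer. The sketch below illustrates that idea with a toy inverted index in plain Java. It is not Lucene's implementation; the class `ToyTermIndex`, its `analyze` method (which only lowercases and splits on whitespace) and the English sample texts are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy inverted index (hypothetical, not Lucene code): maps each term to the ids
// of the documents that contain it.
class ToyTermIndex {
    private final Map<String, List<Integer>> postings = new HashMap<String, List<Integer>>();

    // Hypothetical analyzer: lowercases the text and splits it on whitespace.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Index time: the text is analyzed, so only lowercase terms reach the index.
    void add(int docId, String text) {
        for (String token : analyze(text)) {
            List<Integer> ids = postings.get(token);
            if (ids == null) {
                ids = new ArrayList<Integer>();
                postings.put(token, ids);
            }
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
    }

    // Query time: exact lookup of the term as given, with no analysis.
    List<Integer> termQuery(String term) {
        List<Integer> ids = postings.get(term);
        return ids == null ? new ArrayList<Integer>() : ids;
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex();
        index.add(1, "Lucene is an open source search toolkit");
        index.add(2, "Cloud computing is a web based service");
        System.out.println(index.termQuery("lucene")); // prints [1]
        System.out.println(index.termQuery("Lucene")); // prints []
    }
}
```

In this toy model, searching for "Lucene" with a capital L finds nothing, because the index only ever holds the lowercased form.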
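Type 2: BooleanQuery. A Boolean query applies Boolean operations to the results of other queries to produce the final result. The possible combinations of clauses are:
1 MUST, MUST
2 MUST, MUST_NOT
3 MUST, SHOULD
4 MUST_NOT, SHOULD
5 SHOULD, SHOULD
6 MUST_NOT, MUST_NOT
The code is as follows: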
public static void main(String[] args) throws IOException {
    createIndex();
    booleanQuery();
}

private static void booleanQuery() throws IOException {
    IndexSearcher searcher = new IndexSearcher(STORE_PATH);
    Term term1 = new Term("PostTitle", "谈");
    Term term2 = new Term("PostContent", "源");
    TermQuery termquery1 = new TermQuery(term1);
    TermQuery termquery2 = new TermQuery(term2);

    // MUST + MUST
    BooleanQuery query = new BooleanQuery();
    query.add(termquery1, BooleanClause.Occur.MUST);
    query.add(termquery2, BooleanClause.Occur.MUST);
    Hits hits = searcher.search(query);
    System.out.println("BooleanQuery demo with MUST and MUST -------");
    System.out.println("hits.length()==" + hits.length());
    for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i));
    }
    System.out.println("-----------------------------------");

    // MUST + MUST_NOT
    BooleanQuery query1 = new BooleanQuery();
    query1.add(termquery1, BooleanClause.Occur.MUST);
    query1.add(termquery2, BooleanClause.Occur.MUST_NOT);
    Hits hits1 = searcher.search(query1);
    System.out.println("BooleanQuery demo with MUST and MUST_NOT -------");
    System.out.println("hits.length()==" + hits1.length());
    for (int i = 0; i < hits1.length(); i++) {
        System.out.println(hits1.doc(i));
    }
    System.out.println("-----------------------------------");

    // SHOULD + MUST_NOT
    BooleanQuery query2 = new BooleanQuery();
    query2.add(termquery1, BooleanClause.Occur.SHOULD);
    query2.add(termquery2, BooleanClause.Occur.MUST_NOT);
    Hits hits2 = searcher.search(query2);
    System.out.println("BooleanQuery demo with SHOULD and MUST_NOT -------");
    System.out.println("hits.length()==" + hits2.length());
    for (int i = 0; i < hits2.length(); i++) {
        System.out.println(hits2.doc(i));
    }
    System.out.println("-----------------------------------");

    // SHOULD + SHOULD
    Term term3 = new Term("PostTitle", "lucene");
    Term term4 = new Term("PostContent", "云");
    TermQuery termquery3 = new TermQuery(term3);
    TermQuery termquery4 = new TermQuery(term4);
    BooleanQuery query3 = new BooleanQuery();
    query3.add(termquery3, BooleanClause.Occur.SHOULD);
    query3.add(termquery4, BooleanClause.Occur.SHOULD);
    Hits hits3 = searcher.search(query3);
    System.out.println("BooleanQuery demo with SHOULD and SHOULD -------");
    System.out.println("hits.length()==" + hits3.length());
    for (int i = 0; i < hits3.length(); i++) {
        System.out.println(hits3.doc(i));
    }
}
The output is:
BooleanQuery demo with MUST and MUST -------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>
-----------------------------------
BooleanQuery demo with MUST and MUST_NOT -------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>
-----------------------------------
BooleanQuery demo with SHOULD and MUST_NOT -------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>
-----------------------------------
BooleanQuery demo with SHOULD and SHOULD -------
hits.length()==2
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>
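Note that when SHOULD is combined with MUST, the result set is simply that of the MUST clause. When SHOULD is combined with MUST_NOT, the SHOULD clause behaves like MUST, so the query is equivalent to MUST plus MUST_NOT. A query made up only of MUST_NOT clauses returns no documents.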
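The clause combinations can be read as set operations on document ids: MUST clauses intersect, SHOULD clauses are united when no MUST clause is present, and MUST_NOT clauses are subtracted at the end. The following is a minimal sketch of that reading in plain Java. It models only which documents match, not scoring, and it is not how Lucene implements BooleanQuery internally; `ToyBooleanCombiner`, its `Occur` enum and the `docs` helper are hypothetical names. The id sets mirror the two sample documents (1 for the Lucene post, 2 for the cloud computing post).

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy model of Boolean clause combination over sets of document ids
// (hypothetical, matching only; scoring is ignored).
class ToyBooleanCombiner {
    enum Occur { MUST, SHOULD, MUST_NOT }

    private Set<Integer> required = null;                 // intersection of all MUST clauses
    private final Set<Integer> optional = new TreeSet<Integer>();   // union of all SHOULD clauses
    private final Set<Integer> prohibited = new TreeSet<Integer>(); // union of all MUST_NOT clauses

    ToyBooleanCombiner add(Set<Integer> hits, Occur occur) {
        switch (occur) {
            case MUST:
                if (required == null) {
                    required = new TreeSet<Integer>(hits);
                } else {
                    required.retainAll(hits);
                }
                break;
            case SHOULD:
                optional.addAll(hits);
                break;
            case MUST_NOT:
                prohibited.addAll(hits);
                break;
        }
        return this;
    }

    Set<Integer> evaluate() {
        // With a MUST clause, SHOULD clauses do not change which documents match.
        // Without one, a document has to match at least one SHOULD clause.
        Set<Integer> result = new TreeSet<Integer>(required != null ? required : optional);
        result.removeAll(prohibited);
        return result;
    }

    static Set<Integer> docs(Integer... ids) {
        return new TreeSet<Integer>(Arrays.asList(ids));
    }

    public static void main(String[] args) {
        Set<Integer> titleTan = docs(1, 2);   // PostTitle contains "谈"
        Set<Integer> contentYuan = docs(1);   // PostContent contains "源"
        Set<Integer> titleLucene = docs(1);   // PostTitle contains "lucene"
        Set<Integer> contentYun = docs(2);    // PostContent contains "云"

        System.out.println(new ToyBooleanCombiner().add(titleTan, Occur.MUST).add(contentYuan, Occur.MUST).evaluate());       // prints [1]
        System.out.println(new ToyBooleanCombiner().add(titleTan, Occur.MUST).add(contentYuan, Occur.MUST_NOT).evaluate());   // prints [2]
        System.out.println(new ToyBooleanCombiner().add(titleTan, Occur.SHOULD).add(contentYuan, Occur.MUST_NOT).evaluate()); // prints [2]
        System.out.println(new ToyBooleanCombiner().add(titleLucene, Occur.SHOULD).add(contentYun, Occur.SHOULD).evaluate()); // prints [1, 2]
        System.out.println(new ToyBooleanCombiner().add(titleTan, Occur.MUST_NOT).evaluate());                                // prints []
    }
}
```

The first four lines printed correspond to the four hit lists in the demo output, and the last line shows why a query with only MUST_NOT clauses has nothing to return in this model: there is no positive set to subtract from.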
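Type 3: RangeQuery. As the name suggests, a range query searches within a given range, for example finding the users whose ID lies between 10001 and 10005. The code is as follows: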
private static void rangeQuery() throws IOException {
    // Build a fresh index with five documents, one userID each.
    IndexWriter writer = new IndexWriter(STORE_PATH, new StandardAnalyzer(), true);
    Field field1 = new Field("userID", "10001", Field.Store.YES, Field.Index.TOKENIZED);
    Field field2 = new Field("userID", "10002", Field.Store.YES, Field.Index.TOKENIZED);
    Field field3 = new Field("userID", "10003", Field.Store.YES, Field.Index.TOKENIZED);
    Field field4 = new Field("userID", "10004", Field.Store.YES, Field.Index.TOKENIZED);
    Field field5 = new Field("userID", "10005", Field.Store.YES, Field.Index.TOKENIZED);
    Document doc1 = new Document();
    Document doc2 = new Document();
    Document doc3 = new Document();
    Document doc4 = new Document();
    Document doc5 = new Document();
    doc1.add(field1);
    doc2.add(field2);
    doc3.add(field3);
    doc4.add(field4);
    doc5.add(field5);
    writer.addDocument(doc1);
    writer.addDocument(doc2);
    writer.addDocument(doc3);
    writer.addDocument(doc4);
    writer.addDocument(doc5);
    writer.close();

    IndexSearcher searcher = new IndexSearcher(STORE_PATH);
    Term start = new Term("userID", "10001");
    // Upper bound is 10005 so that the range covers all five IDs.
    Term end = new Term("userID", "10005");
    // true = include both boundary terms.
    RangeQuery query = new RangeQuery(start, end, true);
    Hits hits = searcher.search(query);
    System.out.println("RangeQuery demo------");
    System.out.println("hits.length()==" + hits.length());
    for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i));
    }
}
The output is:
RangeQuery demo------
hits.length()==5
Document<stored/uncompressed,indexed,tokenized<userID:10001>>
Document<stored/uncompressed,indexed,tokenized<userID:10002>>
Document<stored/uncompressed,indexed,tokenized<userID:10003>>
Document<stored/uncompressed,indexed,tokenized<userID:10004>>
Document<stored/uncompressed,indexed,tokenized<userID:10005>>
The third argument of the RangeQuery constructor specifies whether the boundary values are included: true gives a closed interval, false gives an open interval.
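The effect of the inclusive flag can be pictured with a sorted set of terms. The sketch below uses `java.util.TreeSet` as a stand-in for the sorted term list of the `userID` field; it is an illustration under that assumption, not Lucene code, and `ToyTermRange` and `range` are hypothetical names. It also shows a side effect of comparing terms as strings: the ordering is lexicographic, so an ID such as "9" does not fall between "10001" and "10005", which is one reason to keep IDs at a fixed width.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy range lookup over a sorted set of string terms (hypothetical, not Lucene code).
class ToyTermRange {
    // One flag controls both ends, like the third RangeQuery constructor argument.
    static NavigableSet<String> range(TreeSet<String> terms, String lower, String upper, boolean inclusive) {
        return terms.subSet(lower, inclusive, upper, inclusive);
    }

    public static void main(String[] args) {
        TreeSet<String> userIds = new TreeSet<String>(
                Arrays.asList("10001", "10002", "10003", "10004", "10005"));

        // Closed interval: both boundary terms are included.
        System.out.println(range(userIds, "10001", "10005", true));  // prints [10001, 10002, 10003, 10004, 10005]
        // Open interval: both boundary terms are excluded.
        System.out.println(range(userIds, "10001", "10005", false)); // prints [10002, 10003, 10004]

        // String comparison is lexicographic: "9" sorts after "10005".
        userIds.add("9");
        System.out.println(range(userIds, "10001", "10005", true).contains("9")); // prints false
    }
}
```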