A Brief Look at Lucene Query Objects (Part 1)

Lucene is an open-source search toolkit that gives developers a rich set of query types. This post summarizes several of them:
 
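Type 1: TermQuery. TermQuery is the most basic, atomic query in Lucene. Developers use it to retrieve the Documents in the index that contain a given term. The code is as follows:
Java code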
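import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// The methods below live in one class; STORE_PATH is a String constant
// of that class holding the path of the index directory.

public static void main(String[] args) throws IOException {
    createIndex();
    termQuery();
}

private static void createIndex() throws IOException {
    // Third argument true: create a new index, replacing any existing one.
    IndexWriter writer = new IndexWriter(STORE_PATH, new StandardAnalyzer(), true);
    writer.setUseCompoundFile(false);

    // Document 1: both fields are stored and tokenized.
    Document doc1 = new Document();
    doc1.add(new Field("PostTitle", "Lucene开发浅谈",
            Field.Store.YES, Field.Index.TOKENIZED));
    doc1.add(new Field("PostContent", "Lucene是一个开源的搜索工具包",
            Field.Store.YES, Field.Index.TOKENIZED));

    // Document 2: PostContent is indexed but not stored (Field.Store.NO),
    // so it is searchable yet absent from the printed Document.
    Document doc2 = new Document();
    doc2.add(new Field("PostTitle", "云计算浅谈",
            Field.Store.YES, Field.Index.TOKENIZED));
    doc2.add(new Field("PostContent",
            "云计算是一种基于 Web 的服务,它使得普通计算机拥有超级计算机的能力",
            Field.Store.NO, Field.Index.TOKENIZED));

    writer.addDocument(doc1);
    writer.addDocument(doc2);
    writer.close();
}

private static void termQuery() throws IOException {
    IndexSearcher searcher = new IndexSearcher(STORE_PATH);
    // A TermQuery is not analyzed, so the term text must equal an indexed
    // token exactly; StandardAnalyzer lowercased "Lucene" to "lucene".
    Term term = new Term("PostContent", "lucene");
    Query termQuery = new TermQuery(term);
    Hits hits = searcher.search(termQuery);
    System.out.println("TermQuery demo------");
    System.out.println("hits.length()==" + hits.length());
    for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i));
    }
    searcher.close();
}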
 
The output is as follows:
TermQuery demo------        
hits.length()==1    
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>    
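
The query above searches for "lucene" in lowercase even though the indexed text reads "Lucene". The sketch below illustrates why that matters: a term lookup is an exact match against the tokens produced at index time. It is a toy model using only the Java standard library, not Lucene code. The class `TermLookupSketch` and its `analyze` method are hypothetical names, and `analyze` only assumes analyzer behavior similar to what the results in this post suggest (Latin words lowercased, each CJK character emitted as its own token).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TermLookupSketch {
    // Toy inverted index: token -> IDs of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Hypothetical stand-in for an analyzer (an assumption, not Lucene's code):
    // lowercases ASCII words and emits every other letter as a single token.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128 && Character.isLetterOrDigit(c)) {
                word.append(Character.toLowerCase(c));
            } else {
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                if (Character.isLetter(c)) {
                    tokens.add(String.valueOf(c));
                }
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    // Index a document: record its ID under every token of its text.
    public void add(int docId, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Exact lookup: the query term itself is NOT analyzed.
    public Set<Integer> lookup(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        TermLookupSketch index = new TermLookupSketch();
        index.add(1, "Lucene是一个开源的搜索工具包");
        index.add(2, "云计算是一种基于 Web 的服务");
        System.out.println(index.lookup("lucene")); // prints [1]
        System.out.println(index.lookup("Lucene")); // prints []
    }
}
```

In this model a query for "Lucene" with a capital L finds nothing, because only the lowercased token was ever stored.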
 
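Type 2: BooleanQuery. A Boolean query combines the results of other queries with Boolean operations to produce the final result. The possible pairings are:
    1 MUST, MUST
    2 MUST, MUST_NOT
    3 MUST, SHOULD
    4 MUST_NOT, SHOULD
    5 SHOULD, SHOULD
    6 MUST_NOT, MUST_NOT
    The code is as follows:
Java code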
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;

public static void main(String[] args) throws IOException {
    createIndex();
    booleanQuery();
}

private static void booleanQuery() throws IOException {
    IndexSearcher searcher = new IndexSearcher(STORE_PATH);

    Term term1 = new Term("PostTitle", "谈");
    Term term2 = new Term("PostContent", "源");
    TermQuery termquery1 = new TermQuery(term1);
    TermQuery termquery2 = new TermQuery(term2);

    // MUST + MUST: both terms are required.
    BooleanQuery query = new BooleanQuery();
    query.add(termquery1, BooleanClause.Occur.MUST);
    query.add(termquery2, BooleanClause.Occur.MUST);
    Hits hits = searcher.search(query);
    System.out.println("BooleanQuery demo with MUST and MUST -------");
    System.out.println("hits.length()==" + hits.length());
    for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i));
    }
    System.out.println("-----------------------------------");

    // MUST + MUST_NOT: the first term is required, the second excluded.
    BooleanQuery query1 = new BooleanQuery();
    query1.add(termquery1, BooleanClause.Occur.MUST);
    query1.add(termquery2, BooleanClause.Occur.MUST_NOT);
    Hits hits1 = searcher.search(query1);
    System.out.println("BooleanQuery demo with MUST and MUST_NOT -------");
    System.out.println("hits.length()==" + hits1.length());
    for (int i = 0; i < hits1.length(); i++) {
        System.out.println(hits1.doc(i));
    }
    System.out.println("-----------------------------------");

    // SHOULD + MUST_NOT
    BooleanQuery query2 = new BooleanQuery();
    query2.add(termquery1, BooleanClause.Occur.SHOULD);
    query2.add(termquery2, BooleanClause.Occur.MUST_NOT);
    Hits hits2 = searcher.search(query2);
    System.out.println("BooleanQuery demo with SHOULD and MUST_NOT -------");
    System.out.println("hits.length()==" + hits2.length());
    for (int i = 0; i < hits2.length(); i++) {
        System.out.println(hits2.doc(i));
    }
    System.out.println("-----------------------------------");

    // SHOULD + SHOULD: a document matches if it contains either term.
    Term term3 = new Term("PostTitle", "lucene");
    Term term4 = new Term("PostContent", "云");
    TermQuery termquery3 = new TermQuery(term3);
    TermQuery termquery4 = new TermQuery(term4);

    BooleanQuery query3 = new BooleanQuery();
    query3.add(termquery3, BooleanClause.Occur.SHOULD);
    query3.add(termquery4, BooleanClause.Occur.SHOULD);
    Hits hits3 = searcher.search(query3);
    System.out.println("BooleanQuery demo with SHOULD and SHOULD -------");
    System.out.println("hits.length()==" + hits3.length());
    for (int i = 0; i < hits3.length(); i++) {
        System.out.println(hits3.doc(i));
    }
}
 
The output is as follows:
BooleanQuery demo with MUST and MUST -------        
hits.length()==1    
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>        
-----------------------------------        
BooleanQuery demo with MUST and MUST_NOT -------        
hits.length()==1    
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>        
-----------------------------------        
BooleanQuery demo with SHOULD and MUST_NOT -------        
hits.length()==1    
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>        
-----------------------------------        
BooleanQuery demo with SHOULD and SHOULD -------        
hits.length()==2    
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>        
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>    
 
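Note that when SHOULD is combined with MUST, the result set is that of the MUST clause alone (the SHOULD clause then only influences scoring); when SHOULD is combined with MUST_NOT, the query behaves like MUST combined with MUST_NOT.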
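
The combination rules can be modeled with plain set operations. The following is a minimal sketch under stated assumptions, not Lucene's implementation: `BooleanSketch`, `Clause` and `evaluate` are hypothetical names, each clause is reduced to the set of document IDs it matches, and scoring is ignored entirely. Documents 1 and 2 stand for the two posts indexed earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class BooleanSketch {
    public enum Occur { MUST, SHOULD, MUST_NOT }

    // A clause here is just the set of matching document IDs plus its Occur flag.
    public static final class Clause {
        final Set<Integer> docs;
        final Occur occur;

        public Clause(Occur occur, Integer... docs) {
            this.occur = occur;
            this.docs = new TreeSet<>(Arrays.asList(docs));
        }
    }

    // Set-based model of the matching rules (scores are not modeled):
    //  - with at least one MUST clause, the candidates are the intersection
    //    of all MUST sets and SHOULD clauses do not change the result set;
    //  - with no MUST clause, the candidates are the union of the SHOULD sets;
    //  - documents matched by any MUST_NOT clause are removed at the end.
    public static Set<Integer> evaluate(List<Clause> clauses) {
        Set<Integer> required = null;
        Set<Integer> optional = new TreeSet<>();
        Set<Integer> prohibited = new TreeSet<>();
        for (Clause c : clauses) {
            switch (c.occur) {
                case MUST:
                    if (required == null) {
                        required = new TreeSet<>(c.docs);
                    } else {
                        required.retainAll(c.docs);
                    }
                    break;
                case SHOULD:
                    optional.addAll(c.docs);
                    break;
                case MUST_NOT:
                    prohibited.addAll(c.docs);
                    break;
            }
        }
        Set<Integer> result = (required != null) ? required : optional;
        result.removeAll(prohibited);
        return result;
    }

    public static void main(String[] args) {
        // PostTitle:谈 matches docs 1 and 2; PostContent:源 matches doc 1 only.
        List<Clause> mustMustNot = new ArrayList<>();
        mustMustNot.add(new Clause(Occur.MUST, 1, 2));
        mustMustNot.add(new Clause(Occur.MUST_NOT, 1));
        System.out.println(evaluate(mustMustNot)); // prints [2]

        List<Clause> shouldMustNot = new ArrayList<>();
        shouldMustNot.add(new Clause(Occur.SHOULD, 1, 2));
        shouldMustNot.add(new Clause(Occur.MUST_NOT, 1));
        System.out.println(evaluate(shouldMustNot)); // prints [2]
    }
}
```

In this model a query made only of MUST_NOT clauses yields an empty set, since there is nothing to subtract from.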
 
Type 3: RangeQuery. As the name suggests, a range query searches within a given range, for example users whose ID lies between 10001 and 10005. The code is as follows:
Java code
import org.apache.lucene.search.RangeQuery;

private static void rangeQuery() throws IOException {
    // Build a fresh index with five documents, one userID each.
    IndexWriter writer = new IndexWriter(STORE_PATH, new StandardAnalyzer(), true);
    Field field1 = new Field("userID", "10001", Field.Store.YES, Field.Index.TOKENIZED);
    Field field2 = new Field("userID", "10002", Field.Store.YES, Field.Index.TOKENIZED);
    Field field3 = new Field("userID", "10003", Field.Store.YES, Field.Index.TOKENIZED);
    Field field4 = new Field("userID", "10004", Field.Store.YES, Field.Index.TOKENIZED);
    Field field5 = new Field("userID", "10005", Field.Store.YES, Field.Index.TOKENIZED);
    Document doc1 = new Document();
    Document doc2 = new Document();
    Document doc3 = new Document();
    Document doc4 = new Document();
    Document doc5 = new Document();
    doc1.add(field1);
    doc2.add(field2);
    doc3.add(field3);
    doc4.add(field4);
    doc5.add(field5);

    writer.addDocument(doc1);
    writer.addDocument(doc2);
    writer.addDocument(doc3);
    writer.addDocument(doc4);
    writer.addDocument(doc5);
    writer.close();

    IndexSearcher searcher = new IndexSearcher(STORE_PATH);
    // Lower and upper bounds of the range. The upper bound is 10005 so that
    // the query covers the 10001-10005 range and returns the five hits shown.
    Term start = new Term("userID", "10001");
    Term end = new Term("userID", "10005");

    // Third argument true: both bounds are included.
    RangeQuery query = new RangeQuery(start, end, true);

    Hits hits = searcher.search(query);
    System.out.println("RangeQuery demo------");
    System.out.println("hits.length()==" + hits.length());
    for (int i = 0; i < hits.length(); i++) {
        System.out.println(hits.doc(i));
    }
    searcher.close();
}
 
The output is as follows:
RangeQuery demo------        
hits.length()==5    
Document<stored/uncompressed,indexed,tokenized<userID:10001>>        
Document<stored/uncompressed,indexed,tokenized<userID:10002>>        
Document<stored/uncompressed,indexed,tokenized<userID:10003>>        
Document<stored/uncompressed,indexed,tokenized<userID:10004>>        
Document<stored/uncompressed,indexed,tokenized<userID:10005>>    
 
The third argument of the RangeQuery constructor specifies whether the boundary values are included: true gives a closed interval, false an open one.
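
The effect of the inclusive flag can be sketched without Lucene. The example below is a simplified model and not the library's code: `RangeSketch` and `range` are hypothetical names. It compares terms as strings, which is how this kind of term range behaves, so it also shows why numeric IDs should all have the same number of digits (or be zero-padded) to sort as expected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RangeSketch {
    // Returns the terms lying between lower and upper, comparing them as
    // strings (lexicographically). With inclusive == true the bounds
    // themselves are kept (closed interval), otherwise they are excluded.
    public static List<String> range(List<String> terms, String lower,
                                     String upper, boolean inclusive) {
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            int vsLower = term.compareTo(lower);
            int vsUpper = term.compareTo(upper);
            boolean inRange = inclusive
                    ? (vsLower >= 0 && vsUpper <= 0)
                    : (vsLower > 0 && vsUpper < 0);
            if (inRange) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("10001", "10002", "10003", "10004", "10005");

        // Closed interval: both bounds are returned.
        System.out.println(range(ids, "10001", "10005", true));
        // prints [10001, 10002, 10003, 10004, 10005]

        // Open interval: the bounds are dropped.
        System.out.println(range(ids, "10001", "10005", false));
        // prints [10002, 10003, 10004]

        // String ordering pitfall: "10" sorts between "1" and "3".
        System.out.println(range(Arrays.asList("2", "10", "30"), "1", "3", true));
        // prints [2, 10]
    }
}
```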
