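A Brief Look at Lucene Query Objects (Part 1)

Lucene is an open-source search toolkit that offers developers a rich set of query types. They are summarized below: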

 

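The first: TermQuery. TermQuery is the most basic atomic query in Lucene. Developers use it to retrieve the Documents in the index that contain a specified term. The code is as follows: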

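// Imports for the Lucene 2.x-era API used by the snippets in this post.
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.RangeQuery;
import org.apache.lucene.search.TermQuery;

// STORE_PATH (the index location) is assumed to be a constant of the enclosing class.
public static void main(String[] args) throws IOException {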

		createIndex();

		termQuery();
		
}


private static void createIndex() throws IOException {
		IndexWriter writer = new IndexWriter(STORE_PATH,
				new StandardAnalyzer(), true);
		writer.setUseCompoundFile(false);
		Document doc1 = new Document();
		Document doc2 = new Document();
		Field field = new Field("PostTitle", "Lucene开发浅谈", Field.Store.YES,
				Field.Index.TOKENIZED);
		Field field1 = new Field("PostContent", "Lucene是一个开源的搜索工具包",
				Field.Store.YES, Field.Index.TOKENIZED);
		doc1.add(field);
		doc1.add(field1);

		Field field2 = new Field("PostTitle", "云计算浅谈", Field.Store.YES,
				Field.Index.TOKENIZED);
		Field field3 = new Field("PostContent",
				"云计算是一种基于 Web 的服务,它使得普通计算机拥有超级计算机的能力", Field.Store.NO,
				Field.Index.TOKENIZED);
		doc2.add(field2);
		doc2.add(field3);
		writer.addDocument(doc1);
		writer.addDocument(doc2);
		writer.close();
}

private static void termQuery() throws IOException {

		IndexSearcher searcher = new IndexSearcher(STORE_PATH);
		Term term = new Term("PostContent", "lucene");
		Query termQuery = new TermQuery(term);
		Hits hits = searcher.search(termQuery);
		System.out.println("TermQuery demo------");
		System.out.println("hits.length()==" + hits.length());
		for (int i = 0; i < hits.length(); i++) {

			System.out.println(hits.doc(i));
		}
}
The output is as follows:
TermQuery demo------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>
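
Note that the query term is the lowercase "lucene" although the indexed text contains "Lucene": StandardAnalyzer lowercases tokens at index time, while TermQuery does not analyze its term and matches it verbatim. The following is a minimal plain-Java sketch of that behavior, not Lucene code. The class TermLookupSketch, its whitespace tokenizer, and its method names are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy inverted index; an illustration, not Lucene's implementation.
final class TermLookupSketch {
    // Maps each term to the ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // "Analyzes" the text (split on whitespace, lowercase) and indexes every token.
    void addDocument(int docId, String text) {
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            String term = token.toLowerCase(Locale.ROOT);
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Exact lookup, like a term query: the term is neither analyzed nor lowercased.
    Set<Integer> termQuery(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    public static void main(String[] args) {
        TermLookupSketch index = new TermLookupSketch();
        index.addDocument(1, "Lucene is an open source search toolkit");
        index.addDocument(2, "Cloud computing is a web based service");
        System.out.println(index.termQuery("lucene")); // [1]
        System.out.println(index.termQuery("Lucene")); // []
    }
}
```

In this sketch the capitalized term finds nothing because only the lowercased form was ever stored, which is why the demo above searches for "lucene".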


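   The second: BooleanQuery. A Boolean query applies Boolean operations to the results of several sub-queries to produce the final result. The possible combinations are:

    1 MUST, MUST

    2 MUST, MUST_NOT

    3 MUST, SHOULD

    4 MUST_NOT, SHOULD

    5 SHOULD, SHOULD

    6 MUST_NOT, MUST_NOT

    The code is as follows: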

 

public static void main(String[] args) throws IOException {

		createIndex();

		booleanQuery();
		
}

private static void booleanQuery() throws IOException {

		IndexSearcher searcher = new IndexSearcher(STORE_PATH);

		Term term1 = new Term("PostTitle", "谈");
		Term term2 = new Term("PostContent", "源");

		TermQuery termquery1 = new TermQuery(term1);
		TermQuery termquery2 = new TermQuery(term2);

		BooleanQuery query = new BooleanQuery();
		query.add(termquery1, BooleanClause.Occur.MUST);
		query.add(termquery2, BooleanClause.Occur.MUST);
		Hits hits = searcher.search(query);
		System.out.println("BooleanQuery demo with MUST and MUST -------");
		System.out.println("hits.length()==" + hits.length());
		for (int i = 0; i < hits.length(); i++) {

			System.out.println(hits.doc(i));
		}
		System.out.println("-----------------------------------");

		BooleanQuery query1 = new BooleanQuery();
		query1.add(termquery1, BooleanClause.Occur.MUST);
		query1.add(termquery2, BooleanClause.Occur.MUST_NOT);
		Hits hits1 = searcher.search(query1);
		System.out.println("BooleanQuery demo with MUST and MUST_NOT -------");
		System.out.println("hits.length()==" + hits1.length());
		for (int i = 0; i < hits1.length(); i++) {

			System.out.println(hits1.doc(i));
		}

		System.out.println("-----------------------------------");

		BooleanQuery query2 = new BooleanQuery();
		query2.add(termquery1, BooleanClause.Occur.SHOULD);
		query2.add(termquery2, BooleanClause.Occur.MUST_NOT);
		Hits hits2 = searcher.search(query2);
		System.out.println("BooleanQuery demo with SHOULD and MUST_NOT -------");
		System.out.println("hits.length()==" + hits2.length());
		for (int i = 0; i < hits2.length(); i++) {

			System.out.println(hits2.doc(i));
		}
		
		System.out.println("-----------------------------------");
		
	    Term term3 = new Term("PostTitle","lucene");
		Term term4 = new Term("PostContent","云");
		TermQuery termquery3 = new TermQuery(term3);
		TermQuery termquery4 = new TermQuery(term4);
		
	    BooleanQuery query3 = new BooleanQuery();
		query3.add(termquery3, BooleanClause.Occur.SHOULD);
		query3.add(termquery4, BooleanClause.Occur.SHOULD);
		Hits hits3 = searcher.search(query3);
		System.out.println("BooleanQuery demo with SHOULD and SHOULD -------");
		System.out.println("hits.length()==" + hits3.length());
		for (int i = 0; i < hits3.length(); i++) {

			System.out.println(hits3.doc(i));
		}
}

 

 The output is as follows:

 

BooleanQuery demo with MUST and MUST -------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>
-----------------------------------
BooleanQuery demo with MUST and MUST_NOT -------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>
-----------------------------------
BooleanQuery demo with SHOULD and MUST_NOT -------
hits.length()==1
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>
-----------------------------------
BooleanQuery demo with SHOULD and SHOULD -------
hits.length()==2
Document<stored/uncompressed,indexed,tokenized<PostTitle:Lucene开发浅谈> stored/uncompressed,indexed,tokenized<PostContent:Lucene是一个开源的搜索工具包>>
Document<stored/uncompressed,indexed,tokenized<PostTitle:云计算浅谈>>

 

 

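When SHOULD is combined with MUST, the result set is that of the MUST clause alone (the SHOULD clause then only influences scoring). When SHOULD is combined with MUST_NOT, it behaves like MUST combined with MUST_NOT.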
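
The combinations can be pictured as set operations on document ids. The sketch below is a simplified plain-Java model of matching only: it ignores scoring and is not how Lucene implements BooleanQuery. The class BooleanSketch and its methods are hypothetical. The sets stand in for the demo's sub-queries, where the title term matches documents 1 and 2 and the content term matches only document 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of Boolean matching on sets of document ids (no scoring).
final class BooleanSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    private final List<Set<Integer>> must = new ArrayList<>();
    private final List<Set<Integer>> should = new ArrayList<>();
    private final List<Set<Integer>> mustNot = new ArrayList<>();

    // Registers the documents matched by one sub-query together with its role.
    BooleanSketch add(Set<Integer> matches, Occur occur) {
        switch (occur) {
            case MUST: must.add(matches); break;
            case SHOULD: should.add(matches); break;
            default: mustNot.add(matches); break;
        }
        return this;
    }

    Set<Integer> search() {
        Set<Integer> result = new TreeSet<>();
        if (!must.isEmpty()) {
            // With MUST clauses present, SHOULD clauses no longer decide matching.
            result.addAll(must.get(0));
            for (Set<Integer> m : must) {
                result.retainAll(m);
            }
        } else {
            // Without MUST clauses, a document has to match at least one SHOULD clause.
            for (Set<Integer> s : should) {
                result.addAll(s);
            }
        }
        // MUST_NOT only removes documents; on its own it selects nothing.
        for (Set<Integer> n : mustNot) {
            result.removeAll(n);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> title = new TreeSet<>(Arrays.asList(1, 2));
        Set<Integer> content = new TreeSet<>(Arrays.asList(1));
        System.out.println(new BooleanSketch().add(title, Occur.MUST).add(content, Occur.MUST).search());         // [1]
        System.out.println(new BooleanSketch().add(title, Occur.MUST).add(content, Occur.MUST_NOT).search());     // [2]
        System.out.println(new BooleanSketch().add(title, Occur.SHOULD).add(content, Occur.MUST_NOT).search());   // [2]
        System.out.println(new BooleanSketch().add(title, Occur.MUST_NOT).add(content, Occur.MUST_NOT).search()); // []
    }
}
```

The last line illustrates why combination 6 is of little practical use: with only MUST_NOT clauses there is nothing to select from, so no document is returned.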

 

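The third: RangeQuery. As the name suggests, a range query searches within a given range, for example finding the users whose ID lies between 10001 and 10005. The code is as follows: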

 

private static void rangeQuery() throws IOException {

		IndexWriter writer = new IndexWriter(STORE_PATH,
				new StandardAnalyzer(), true);
		Field field1 = new Field("userID", "10001", Field.Store.YES,
				Field.Index.TOKENIZED);
		Field field2 = new Field("userID", "10002", Field.Store.YES,
				Field.Index.TOKENIZED);
		Field field3 = new Field("userID", "10003", Field.Store.YES,
				Field.Index.TOKENIZED);
		Field field4 = new Field("userID", "10004", Field.Store.YES,
				Field.Index.TOKENIZED);
		Field field5 = new Field("userID", "10005", Field.Store.YES,
				Field.Index.TOKENIZED);
		Document doc1 = new Document();
		Document doc2 = new Document();
		Document doc3 = new Document();
		Document doc4 = new Document();
		Document doc5 = new Document();
		doc1.add(field1);
		doc2.add(field2);
		doc3.add(field3);
		doc4.add(field4);
		doc5.add(field5);

		writer.addDocument(doc1);
		writer.addDocument(doc2);
		writer.addDocument(doc3);
		writer.addDocument(doc4);
		writer.addDocument(doc5);

		writer.close();

		IndexSearcher searcher = new IndexSearcher(STORE_PATH);
		Term start = new Term("userID", "10001");
		// Upper bound of the 10001-10005 range; all five documents fall inside it.
		Term end = new Term("userID", "10005");

		RangeQuery query = new RangeQuery(start, end, true);

		Hits hits = searcher.search(query);
		System.out.println("RangeQuery demo------");
		System.out.println("hits.length()==" + hits.length());
		for (int i = 0; i < hits.length(); i++) {

			System.out.println(hits.doc(i));
		}
	}

 

 

The output is as follows:

RangeQuery demo------
hits.length()==5
Document<stored/uncompressed,indexed,tokenized<userID:10001>>
Document<stored/uncompressed,indexed,tokenized<userID:10002>>
Document<stored/uncompressed,indexed,tokenized<userID:10003>>
Document<stored/uncompressed,indexed,tokenized<userID:10004>>
Document<stored/uncompressed,indexed,tokenized<userID:10005>>

 The third argument of the RangeQuery constructor specifies whether the boundary values are included: true gives a closed interval, false an open interval.
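
The effect of the inclusive flag can be sketched in plain Java with a sorted set of terms. This is an illustration under assumptions, not Lucene's implementation, and the class RangeSketch and its range method are hypothetical. It also shows a point worth keeping in mind: RangeQuery compares terms as strings, so numeric ranges behave as expected only when all values have the same width, as the five-digit IDs here do.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch of range matching over a sorted term dictionary.
final class RangeSketch {
    // Returns the terms between lower and upper, compared as strings.
    // The single flag controls both bounds, like the constructor's third argument.
    static NavigableSet<String> range(NavigableSet<String> terms, String lower,
                                      String upper, boolean inclusive) {
        return terms.subSet(lower, inclusive, upper, inclusive);
    }

    public static void main(String[] args) {
        NavigableSet<String> ids = new TreeSet<>(
                Arrays.asList("10001", "10002", "10003", "10004", "10005"));
        System.out.println(range(ids, "10001", "10005", true));  // [10001, 10002, 10003, 10004, 10005]
        System.out.println(range(ids, "10001", "10005", false)); // [10002, 10003, 10004]

        // String order is not numeric order: "200" sorts between "10001" and "20000".
        NavigableSet<String> mixed = new TreeSet<>(Arrays.asList("9", "10005", "200"));
        System.out.println(range(mixed, "10001", "20000", true)); // [10005, 200]
    }
}
```

If IDs of different lengths have to be searched by range, padding them to a fixed width before indexing (for example "00200") is one common way to make string order agree with numeric order.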
