Understanding Lucene (3): Understanding the Core Searching Classes

This is the complete text of section 1.6 of the book Lucene in Action.

Understanding the core searching classes
■ IndexSearcher

■ Term

■ Query

■ TermQuery

■ Hits

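IndexSearcher is to searching what IndexWriter is to indexing: it is the central link to the index and exposes several search methods. You can think of it as a class that opens an index in read-only mode. It offers a number of search methods, some of which are implemented in its abstract parent class Searcher; the simplest one takes a single Query argument and returns a Hits object. A typical use: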




 

IndexSearcher is = new IndexSearcher(
    FSDirectory.getDirectory("/tmp/index", false));
Query q = new TermQuery(new Term("contents", "lucene"));
Hits hits = is.search(q);


Chapter 3 covers IndexSearcher in detail, and chapters 5 and 6 cover its more advanced features.

Note: compare this with IndexWriter.

Term

A Term is the basic unit for searching. Like a Field, it consists of a pair of strings: the field name and the field value.

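Note that Term objects are also involved in the indexing process. There, however, they are created by Lucene's internals, so you typically don't need to think about them while indexing. During searching, you can construct Term objects and use them together with TermQuery: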

 

Query q = new TermQuery(new Term("contents", "lucene"));
Hits hits = is.search(q);


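This code tells Lucene to find all documents that contain the word lucene in the field named contents. Because TermQuery derives from the abstract class Query, you can use the Query type on the left side of the statement (polymorphism).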
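To make the idea of a term as a (field name, field value) pair concrete, here is a minimal, self-contained sketch of a toy inverted index. It does not use Lucene and is not how Lucene is implemented: the classes ToyTerm, ToyQuery, ToyTermQuery, and ToyIndex are hypothetical names invented for this illustration, and the "analysis" is just lowercasing and splitting on whitespace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a term: a (field name, field value) pair of strings.
final class ToyTerm {
    final String field;
    final String text;

    ToyTerm(String field, String text) {
        this.field = field;
        this.text = text;
    }

    // Lookup key for the toy index; ':' as separator is good enough for a sketch.
    String key() {
        return field + ":" + text;
    }
}

// Abstract parent type, so callers can hold any query behind one reference type.
abstract class ToyQuery {
    abstract List<Integer> matchingDocs(ToyIndex index);
}

// The simplest query: match the documents that contain one exact term.
final class ToyTermQuery extends ToyQuery {
    private final ToyTerm term;

    ToyTermQuery(ToyTerm term) {
        this.term = term;
    }

    @Override
    List<Integer> matchingDocs(ToyIndex index) {
        return index.docsFor(term);
    }
}

// Toy inverted index: maps each term to the ids of the documents containing it.
final class ToyIndex {
    private final Map<String, List<Integer>> postings = new HashMap<String, List<Integer>>();

    // Lowercases the text and splits it on whitespace; a real analyzer does far more.
    void add(int docId, String field, String text) {
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            String key = new ToyTerm(field, word).key();
            List<Integer> docs = postings.get(key);
            if (docs == null) {
                docs = new ArrayList<Integer>();
                postings.put(key, docs);
            }
            if (!docs.contains(docId)) {
                docs.add(docId);
            }
        }
    }

    // Returns the ids of documents containing the term, or an empty list.
    List<Integer> docsFor(ToyTerm term) {
        List<Integer> docs = postings.get(term.key());
        return docs == null ? Collections.<Integer>emptyList() : docs;
    }
}
```

In this sketch, a query for the term ("contents", "lucene") matches only documents whose contents field contains that word; the same word in a different field does not match, which is why a term needs both parts of the pair.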

Query

Lucene comes with a number of concrete Query subclasses. So far in this chapter we have met only the most basic Lucene Query: TermQuery. Other Query types are BooleanQuery, PhraseQuery, PrefixQuery, PhrasePrefixQuery, RangeQuery, FilteredQuery, and SpanQuery.

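Chapter 3 covers all of them. Query is the common abstract parent class; it contains several utility methods, the most important of which is setBoost(float), described in section 3.5.9.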
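The following sketch illustrates, under heavy simplification, why a boost on the common parent class is useful: any query type can be weighted relative to the others when queries are combined. It does not use Lucene, and its scoring (a plain occurrence count multiplied by the boost) is an assumption made for illustration; Lucene's real scoring formula is considerably more involved. SketchQuery, SketchTermQuery, and SketchAnyOfQuery are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical base class: every query carries a boost that scales its score.
abstract class SketchQuery {
    private float boost = 1.0f;

    void setBoost(float boost) {
        this.boost = boost;
    }

    float getBoost() {
        return boost;
    }

    // Score before the boost is applied; each subclass defines its own.
    abstract float rawScore(Map<String, String> doc);

    final float score(Map<String, String> doc) {
        return rawScore(doc) * boost;
    }
}

// Scores a document by how many times the word occurs in the given field.
final class SketchTermQuery extends SketchQuery {
    private final String field;
    private final String word;

    SketchTermQuery(String field, String word) {
        this.field = field;
        this.word = word.toLowerCase();
    }

    @Override
    float rawScore(Map<String, String> doc) {
        String text = doc.get(field);
        if (text == null) {
            return 0f;
        }
        int count = 0;
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.equals(word)) {
                count++;
            }
        }
        return count;
    }
}

// Combines clauses by summing their already boosted scores.
final class SketchAnyOfQuery extends SketchQuery {
    private final List<SketchQuery> clauses = new ArrayList<SketchQuery>();

    SketchAnyOfQuery add(SketchQuery clause) {
        clauses.add(clause);
        return this;
    }

    @Override
    float rawScore(Map<String, String> doc) {
        float sum = 0f;
        for (SketchQuery clause : clauses) {
            sum += clause.score(doc);
        }
        return sum;
    }
}
```

In this toy model, calling setBoost(2.0f) on a title clause makes one match in the title count as much as two matches in the body, without the combining query having to know which concrete query type each clause is.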

TermQuery

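TermQuery is the most basic type of query supported by Lucene, and it is one of the primitive query types. It matches documents that contain fields with specific values, as the last few paragraphs showed.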

Hits

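The Hits class is a simple container of pointers to ranked search results, that is, the documents that match a given Query. For performance reasons, a Hits instance does not load all matching documents from the index; it loads only a small portion of them at a time. Chapter 3 has more details.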
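The lazy-loading idea described above can be sketched as a small container that knows the ranked document ids up front but fetches the documents themselves only in small batches, on demand. This is a hypothetical illustration, not Lucene's Hits implementation: the class LazyHits, its loader function, and the batch size are all assumptions introduced here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical container: holds ranked doc ids, loads documents batch by batch.
final class LazyHits {
    private final int[] rankedDocIds;          // ids ordered by rank, best first
    private final IntFunction<String> loader;  // fetches one document by id
    private final int batchSize;
    private final Map<Integer, String> cache = new HashMap<>();
    private int loadCount = 0;                 // how many documents were fetched

    LazyHits(int[] rankedDocIds, IntFunction<String> loader, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.rankedDocIds = rankedDocIds.clone();
        this.loader = loader;
        this.batchSize = batchSize;
    }

    // Number of matches; known without loading any document.
    int length() {
        return rankedDocIds.length;
    }

    // Returns the document at the given rank, loading its whole batch if needed.
    String doc(int rank) {
        if (rank < 0 || rank >= rankedDocIds.length) {
            throw new IndexOutOfBoundsException("rank " + rank);
        }
        if (!cache.containsKey(rank)) {
            int start = (rank / batchSize) * batchSize;
            int end = Math.min(start + batchSize, rankedDocIds.length);
            for (int i = start; i < end; i++) {
                cache.put(i, loader.apply(rankedDocIds[i]));
                loadCount++;
            }
        }
        return cache.get(rank);
    }

    int loadCount() {
        return loadCount;
    }
}
```

With this design, a caller that only displays the first page of results pays for loading only the first batch, however many documents matched in total.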
Original text:
IndexSearcher

IndexSearcher is to searching what IndexWriter is to indexing: the central link to the index that exposes several search methods. You can think of IndexSearcher as a class that opens an index in a read-only mode. It offers a number of search methods, some of which are implemented in its abstract parent class Searcher; the simplest takes a single Query object as a parameter and returns a Hits object. A typical use of this method looks like this:

IndexSearcher is = new IndexSearcher(
    FSDirectory.getDirectory("/tmp/index", false));
Query q = new TermQuery(new Term("contents", "lucene"));
Hits hits = is.search(q);

We cover the details of IndexSearcher in chapter 3, along with more advanced information in chapters 5 and 6.

Term

A Term is the basic unit for searching. Similar to the Field object, it consists of a pair of string elements: the name of the field and the value of that field. Note that Term objects are also involved in the indexing process. However, they’re created by Lucene’s internals, so you typically don’t need to think about them while indexing. During searching, you may construct Term objects and use them together with TermQuery:

Query q = new TermQuery(new Term("contents", "lucene"));
Hits hits = is.search(q);

This code instructs Lucene to find all documents that contain the word lucene in a field named contents. Because the TermQuery object is derived from the abstract parent class Query, you can use the Query type on the left side of the statement.



Query 
 

Lucene comes with a number of concrete Query subclasses. So far in this chapter we’ve mentioned only the most basic Lucene Query: TermQuery. Other Query types are BooleanQuery, PhraseQuery, PrefixQuery, PhrasePrefixQuery, RangeQuery, FilteredQuery, and SpanQuery. All of these are covered in chapter 3. Query is the common, abstract parent class. It contains several utility methods, the most interesting of which is setBoost(float), described in section 3.5.9.

TermQuery

TermQuery is the most basic type of query supported by Lucene, and it’s one of the primitive query types. It’s used for matching documents that contain fields with specific values, as you’ve seen in the last few paragraphs.

Hits

The Hits class is a simple container of pointers to ranked search results: documents that match a given query. For performance reasons, Hits instances don’t load from the index all documents that match a query, but only a small portion of them at a time. Chapter 3 describes this in more detail.
