How to reopen an index while searching with Lucene



@2010-8-30 for&ever

 

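An IndexReader is a thread-safe object that corresponds one-to-one to an index directory. Instantiating an IndexReader is expensive, so for searching you normally need only one IndexReader instance per index directory.
When the index is large, the data is usually spread across several directories according to some rule (e.g. indexdir0, indexdir01, indexdir02).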
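The post does not say which rule is used to spread documents over the directories. As a minimal sketch, assuming the rule is "hash of a document key modulo the number of directories", the routing could look like this. ShardRouter, the "indexdir" prefix and the resulting names (indexdir0, indexdir1, ...) are hypothetical and chosen only for illustration.

```java
/** Hypothetical helper (not part of Lucene): picks one of N index directories for a document key. */
final class ShardRouter {
    private final String prefix;
    private final int shardCount;

    ShardRouter(String prefix, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.prefix = prefix;
        this.shardCount = shardCount;
    }

    /** Math.floorMod keeps the result in [0, shardCount) even when hashCode() is negative. */
    int shardOf(String key) {
        return Math.floorMod(key.hashCode(), shardCount);
    }

    /** Directory name for the key, e.g. prefix "indexdir" and shard 1 give "indexdir1". */
    String directoryFor(String key) {
        return prefix + shardOf(key);
    }
}
```

With a layout like this, each directory would get its own long-lived IndexReader, in line with the one-reader-per-directory rule above.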

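When an index directory receives incremental updates, you can use Lucene's reopen method to load only the index segments that changed instead of reloading the whole index, which saves resources.

So how do you use reopen?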

 

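1. Constructing the searcher, option one

Pass a file path string or a Directory to the searcher. The searcher then maintains an internal reader, and that internal reader is closed together with the searcher once the search is finished and the searcher is closed.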

2. Constructing the searcher, option two

Pass a reader to the searcher. In this case the reader is not closed when the search is finished; it is closed only when you call reader.close() yourself.


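Therefore you must construct the searcher from a reader. searcher.getIndexReader() returns the searcher's current reader, and reader.isCurrent() tells you whether the index files have changed. If they have, call reader.reopen() to obtain a new reader, close the current searcher, and create a new searcher from the new reader. Closing a searcher that was built from a reader does not close that reader, so close the old reader yourself once no search is still using it.

A searcher obtained through option one (path string or Directory) cannot be used with reopen: closing that searcher also closes its internal reader, and reopen then fails with an exception saying the reader is already closed.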

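Code:
IndexReader reader = IndexReader.open(...); // arguments elided in the original
IndexSearcher indexSearcher = new IndexSearcher(reader); // note: the searcher is built from a reader
...
...
...
IndexReader oldReader = indexSearcher.getIndexReader();
if (!oldReader.isCurrent()) {
    // reopen() loads only the changed segments; it returns oldReader itself if nothing changed
    IndexReader newReader = oldReader.reopen();
    indexSearcher.close(); // does not close oldReader, because the searcher was built from a reader
    indexSearcher = new IndexSearcher(newReader);
    if (newReader != oldReader) {
        oldReader.close(); // release the old reader once no search is still using it
    }
}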


Relevant descriptions from the Lucene API:

---------------------------------------------------------
A. The close method of IndexSearcher
Note that the underlying IndexReader is not closed, if IndexSearcher was constructed with IndexSearcher(IndexReader r). If the IndexReader was supplied implicitly by specifying a directory, then the IndexReader gets closed.

---------------------------------------------------------
B. The reopen method of IndexReader
Refreshes an IndexReader if the index has changed since this instance was (re)opened.   
Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.   
 
If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.  
Note: The re-opened reader instance and the old instance might share the same resources. For this reason no index modification operations (e. g. deleteDocument(int), setNorm(int, String, byte)) should be performed using one of the readers until the old reader instance is closed. Otherwise, the behavior of the readers is undefined.   
 
You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method:   
 
 IndexReader reader = ...
 ...
 IndexReader newReader = reader.reopen(); // "new" is a Java keyword and cannot be a variable name
 if (newReader != reader) {
   ...     // reader was reopened
   reader.close();
 }
 reader = newReader;
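
The contract quoted above (reopen returns the same instance when nothing changed, otherwise a new instance, and never closes the old one) can be tried out without a Lucene dependency. The following is only a toy model under that assumption: ToyIndex, ToyReader and ReaderHolder are hypothetical stand-ins and not Lucene classes. Only the shape of the refresh logic is meant to carry over.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Stand-in for an index directory: it only tracks a version that changes on every commit. */
final class ToyIndex {
    private final AtomicLong version = new AtomicLong();

    long currentVersion() { return version.get(); }

    void commit() { version.incrementAndGet(); }
}

/** Stand-in for IndexReader: a snapshot of the index at the version it was opened on. */
final class ToyReader {
    private final ToyIndex index;
    private final long openedVersion;
    private boolean closed;

    ToyReader(ToyIndex index) {
        this.index = index;
        this.openedVersion = index.currentVersion();
    }

    boolean isCurrent() { return openedVersion == index.currentVersion(); }

    /** Returns this if nothing changed, otherwise a new snapshot; the old one stays open. */
    ToyReader reopen() {
        return isCurrent() ? this : new ToyReader(index);
    }

    void close() { closed = true; }

    boolean isClosed() { return closed; }
}

/** Holds the single shared reader and swaps it when the index has changed. */
final class ReaderHolder {
    private ToyReader reader;

    ReaderHolder(ToyIndex index) { this.reader = new ToyReader(index); }

    synchronized ToyReader get() { return reader; }

    /** Returns true if a new reader was swapped in. */
    synchronized boolean maybeRefresh() {
        ToyReader fresh = reader.reopen();
        if (fresh == reader) {
            return false;           // index unchanged, keep the current reader
        }
        ToyReader old = reader;
        reader = fresh;
        old.close();                // real code must first wait for searches still using it
        return true;
    }
}
```

Comparing the returned instance with the old one, as the Javadoc suggests, is what keeps a still-current reader from being closed by mistake.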

 

forandever @2010-8-30

 

 

 


 
