org.apache.lucene.analysis.TokenStream
An abstract class. A TokenStream enumerates a sequence of tokens, either from the Fields of a Document or from query text.
TokenStream org.apache.lucene.analysis.Analyzer.tokenStream(String fieldName, Reader reader)
Creates a TokenStream that tokenizes all the text in the provided Reader, using this Analyzer.
void org.apache.lucene.analysis.TokenStream.reset() throws IOException
Resets this stream to the beginning, moving the TokenStream's cursor back to its initial position.
boolean org.apache.lucene.analysis.TokenStream.incrementToken() throws IOException
Consumers (e.g., IndexWriter) use this method to advance the stream to the next token. It returns false once the end of the stream has been reached, and true otherwise.
org.apache.lucene.analysis.tokenattributes.CharTermAttribute
The term text of a token.
<CharTermAttribute> CharTermAttribute org.apache.lucene.util.AttributeSource.getAttribute(Class<CharTermAttribute> attClass)
Returns the instance of the requested Attribute contained in this AttributeSource. The caller must pass in a Class<? extends Attribute> value.
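The methods listed above are normally used together in one consumer loop: obtain the stream, fetch the term attribute once, call reset(), then call incrementToken() repeatedly and read the attribute after each successful call. The sketch below imitates that protocol with plain Java only, so it runs without Lucene on the classpath. WhitespaceStream, TermText, and TokenStreamSketch are hypothetical stand-ins invented for illustration (they are not Lucene classes), and splitting on whitespace is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for CharTermAttribute: holds the text of the current token.
final class TermText {
    private String value = "";
    void set(String v) { value = v; }
    @Override public String toString() { return value; }
}

// Hypothetical stand-in for a TokenStream; it simply splits on whitespace.
final class WhitespaceStream {
    private final String[] words;
    private final TermText term = new TermText(); // one shared instance, reused per token
    private int next = 0;

    WhitespaceStream(String text) {
        String trimmed = text.trim();
        words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Mirrors getAttribute(...): always hands back the same attribute instance.
    TermText getAttribute() { return term; }

    // Mirrors reset(): moves the cursor back to the first token.
    void reset() { next = 0; }

    // Mirrors incrementToken(): advances to the next token, false at end of stream.
    boolean incrementToken() {
        if (next >= words.length) return false;
        term.set(words[next++]);
        return true;
    }
}

final class TokenStreamSketch {
    // The consumer loop: fetch the attribute once, reset, then advance until false.
    static List<String> collectTerms(String text) {
        WhitespaceStream stream = new WhitespaceStream(text);
        TermText term = stream.getAttribute();
        stream.reset();
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(term.toString()); // copy the text; the attribute is overwritten next round
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("the quick brown fox")); // prints [the, quick, brown, fox]
    }
}
```

The point to notice is that the attribute object is fetched once and then overwritten in place on every incrementToken() call, which is why the loop copies the term text out before advancing.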