lucene本身对中文分词有支持,不过支持得不好,其分词方式是机械地将中文按单字切分进行存储,例如:成都信息工程大学,最终分成为:成|都|信|息|工|程|大|学,显然这种分词方式是低效且浪费存储空间的。IK分词是林良益前辈自定义编写的一个专门针对中文分词的分析器,最新版本为2012年的版本,for 4.0之后未做更新,后续版本lucene的接口改变使其不再兼容,所以需要进行修改。
(谷歌不稳定访问不了)国内源码地址:http://git.oschina.net/wltea/IK-Analyzer-2012FF 网盘下载:链接:http://pan.baidu.com/s/1jIt7kGm 密码:hu1g
lucene6.5.0下载地址:https://lucene.apache.org 网盘下载:链接:http://pan.baidu.com/s/1mic8iBe 密码:axca
下载源码之后解压并导入到单独的java project,然后再导入lucene的jar包,如图所示,是我的工程结构
导入后修改四个文件:IKAnalyzer和IKTokenizer以及SWMCQueryBuilder、IKQueryExpressionParser,至于demo中的两个文件可直接删除或进行修改,我进行了修改。修改方式很简单,这里贴出修改的原文,以及修改后工程和源码下载。
修改后的工程地址:链接:http://pan.baidu.com/s/1nuALOql 密码:miyq
编译好的IKAnalyzer的jar包下载地址:http://download.csdn.net/detail/fanpei_moukoy/9796612 ,可直接导入lucene项目进行使用。
IKAnalyzer
/**
* IK 中文分词 版本 6.5.0
* IK Analyzer release 6.5.0
*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* provided by Linliangyi and copyright 2012 by Oolong studio
*
*/
package org.wltea.analyzer.lucene;
import java.io.Reader;
import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.util.IOUtils;
/**
* IK分词器,Lucene Analyzer接口实现
* 兼容Lucene 6.5.0版本 暴走抹茶 2017.3.28
*/
public final class IKAnalyzer extends Analyzer{

	// When true, the segmenter performs "smart" (coarse-grained) splitting;
	// when false it emits the finest-grained segmentation.
	private boolean useSmart;

	/**
	 * Returns whether smart (coarse-grained) segmentation is enabled.
	 */
	public boolean useSmart() {
		return useSmart;
	}

	/**
	 * Enables or disables smart segmentation for tokenizers created afterwards.
	 */
	public void setUseSmart(boolean useSmart) {
		this.useSmart = useSmart;
	}

	/**
	 * Creates an analyzer using the default fine-grained segmentation
	 * (useSmart = false).
	 */
	public IKAnalyzer(){
		this(false);
	}

	/**
	 * @param useSmart when true, the tokenizer performs smart (coarse) splitting
	 */
	public IKAnalyzer(boolean useSmart){
		super();
		this.useSmart = useSmart;
	}

	/**
	 * Builds the token stream components for a field.
	 *
	 * NOTE: since Lucene 4 the text to analyze is supplied later through
	 * Tokenizer#setReader, so {@code fieldName} is NOT content; the
	 * StringReader built here is only a placeholder and is ignored by the
	 * IKTokenizer constructor (which binds to the Tokenizer's own
	 * {@code input} field), hence it is safe to close it immediately.
	 *
	 * Bug fix: the original called {@code new IKTokenizer(reader)}, which
	 * always used useSmart = false and silently ignored this analyzer's
	 * configured useSmart setting.
	 */
	@Override
	protected TokenStreamComponents createComponents(String fieldName) {
		Reader reader = null;
		try{
			reader = new StringReader(fieldName);
			IKTokenizer it = new IKTokenizer(reader, this.useSmart);
			return new Analyzer.TokenStreamComponents(it);
		}finally {
			IOUtils.closeWhileHandlingException(reader);
		}
	}
}
IKTokenizer
/**
* IK 中文分词 版本 6.5.0
* IK Analyzer release 6.5.0
*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* provided by Linliangyi and copyright 2012 by Oolong studio
*
*
*/
package org.wltea.analyzer.lucene;
import java.io.IOException;
import java.io.Reader;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.TypeAttribute;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;
/**
* IK分词器 Lucene Tokenizer适配器类 兼容Lucene 6.5.0版本 暴走抹茶 2017.3.28
*/
public final class IKTokenizer extends Tokenizer {

	// The actual IK segmentation engine.
	private IKSegmenter _IKImplement;

	// Term text attribute.
	private final CharTermAttribute termAtt;
	// Start/end character offset attribute.
	private final OffsetAttribute offsetAtt;
	// Lexeme type attribute (see the type constants in
	// org.wltea.analyzer.core.Lexeme).
	private final TypeAttribute typeAtt;
	// End offset of the last emitted lexeme; used to compute the final
	// offset in end().
	private int endPosition;

	/**
	 * Creates a tokenizer using fine-grained segmentation (useSmart = false).
	 *
	 * @param in unused placeholder; see {@link #IKTokenizer(Reader, boolean)}
	 */
	public IKTokenizer(Reader in) {
		this(in, false);
	}

	/**
	 * Lucene 6.5.0 Tokenizer adapter constructor.
	 *
	 * NOTE: the {@code in} argument is ignored. Since Lucene 4 the input is
	 * supplied via Tokenizer#setReader and exposed as the protected
	 * {@code input} field, which is what the segmenter is bound to here and
	 * re-bound to in {@link #reset()}. The parameter is kept only for source
	 * compatibility with the IK 2012 API.
	 *
	 * @param in       unused, kept for source compatibility
	 * @param useSmart when true, use smart (coarse-grained) segmentation
	 */
	public IKTokenizer(Reader in, boolean useSmart) {
		offsetAtt = addAttribute(OffsetAttribute.class);
		termAtt = addAttribute(CharTermAttribute.class);
		typeAtt = addAttribute(TypeAttribute.class);
		_IKImplement = new IKSegmenter(input, useSmart);
	}

	/*
	 * (non-Javadoc)
	 *
	 * @see org.apache.lucene.analysis.TokenStream#incrementToken()
	 */
	@Override
	public boolean incrementToken() throws IOException {
		// Clear all attributes left over from the previous token.
		clearAttributes();
		Lexeme nextLexeme = _IKImplement.next();
		if (nextLexeme != null) {
			// Copy the Lexeme into the Lucene attributes.
			termAtt.append(nextLexeme.getLexemeText());
			termAtt.setLength(nextLexeme.getLength());
			offsetAtt.setOffset(nextLexeme.getBeginPosition(), nextLexeme.getEndPosition());
			// Remember where the last token ended, for end().
			endPosition = nextLexeme.getEndPosition();
			typeAtt.setType(nextLexeme.getLexemeTypeString());
			// true: more tokens may follow
			return true;
		}
		// false: the token stream is exhausted
		return false;
	}

	/*
	 * (non-Javadoc)
	 *
	 * @see org.apache.lucene.analysis.Tokenizer#reset(java.io.Reader)
	 */
	@Override
	public void reset() throws IOException {
		super.reset();
		// Re-bind the segmenter to the (possibly new) input reader.
		_IKImplement.reset(input);
	}

	@Override
	public final void end() throws IOException {
		// Bug fix: the TokenStream contract requires super.end() so that
		// attributes are put into the end-of-stream state; the original
		// implementation omitted it.
		super.end();
		// Set the final offset to the end of the last emitted lexeme.
		int finalOffset = correctOffset(this.endPosition);
		offsetAtt.setOffset(finalOffset, finalOffset);
	}
}
IKQueryExpressionParser
/**
* IK 中文分词 版本 6.5.0
* IK Analyzer release 6.5.0
*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* provided by Linliangyi and copyright 2012 by Oolong studio
*
*/
package org.wltea.analyzer.query;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Stack;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.BooleanQuery.Builder;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.util.BytesRef;
/**
* IK简易查询表达式解析
* 结合SWMCQuery算法 暴走抹茶 2017.3.28
*
* 表达式例子 :
* (id='1231231' && title:'monkey') || (content:'你好吗' || ulr='www.ik.com') - name:'helloword'
* @author linliangyi
*
*/
public class IKQueryExpressionParser {

	//public static final String LUCENE_SPECIAL_CHAR = "&&||-()':={}[],";

	// Bug fix (whole class): the pasted source had every generic type
	// parameter stripped (the HTML rendering swallowed the angle brackets),
	// so raw Stack.pop()/List.get() returned Object and the class did not
	// compile. The generics are restored below; the parsing logic itself is
	// unchanged.

	// Token list produced by the lexical pass.
	private List<Element> elements = new ArrayList<Element>();

	// Operand stack (sub-queries) of the shunting-yard style parser.
	private Stack<Query> querys = new Stack<Query>();
	// Operator stack.
	private Stack<Element> operates = new Stack<Element>();

	/**
	 * Parses a query expression into a Lucene Query object.
	 *
	 * @param expression expression such as
	 *        (id='1' && title:'x') || content:'y' - name:'z'
	 * @param quickMode passed through to the SWMC query builder
	 * @return Lucene query, or null for a null/blank expression
	 * @throws IllegalStateException on malformed expressions
	 */
	public Query parseExp(String expression , boolean quickMode){
		Query lucenceQuery = null;
		if(expression != null && !"".equals(expression.trim())){
			try{
				// lexical pass: split the expression into Elements
				this.splitElements(expression);
				// syntax pass: build the Query tree
				this.parseSyntax(quickMode);
				if(this.querys.size() == 1){
					lucenceQuery = this.querys.pop();
				}else{
					throw new IllegalStateException("表达式异常: 缺少逻辑操作符 或 括号缺失");
				}
			}finally{
				// always reset parser state so the instance is reusable
				elements.clear();
				querys.clear();
				operates.clear();
			}
		}
		return lucenceQuery;
	}

	/**
	 * Lexical pass: splits the expression into typed Elements.
	 *
	 * Each case follows the same pattern: inside a quoted string
	 * (curretElement.type == '\'') every character is absorbed verbatim;
	 * otherwise the pending element is flushed and a new single-character
	 * operator element is emitted. Type 'F' (the default case) collects
	 * field names / bare text.
	 *
	 * @param expression the raw expression text
	 */
	private void splitElements(String expression){
		if(expression == null){
			return;
		}
		Element curretElement = null;
		char[] expChars = expression.toCharArray();
		for(int i = 0 ; i < expChars.length ; i++){
			switch(expChars[i]){
			case '&' :
				if(curretElement == null){
					curretElement = new Element();
					curretElement.type = '&';
					curretElement.append(expChars[i]);
				}else if(curretElement.type == '&'){
					// second '&' completes the "&&" operator
					curretElement.append(expChars[i]);
					this.elements.add(curretElement);
					curretElement = null;
				}else if(curretElement.type == '\''){
					curretElement.append(expChars[i]);
				}else {
					this.elements.add(curretElement);
					curretElement = new Element();
					curretElement.type = '&';
					curretElement.append(expChars[i]);
				}
				break;
			case '|' :
				if(curretElement == null){
					curretElement = new Element();
					curretElement.type = '|';
					curretElement.append(expChars[i]);
				}else if(curretElement.type == '|'){
					// second '|' completes the "||" operator
					curretElement.append(expChars[i]);
					this.elements.add(curretElement);
					curretElement = null;
				}else if(curretElement.type == '\''){
					curretElement.append(expChars[i]);
				}else {
					this.elements.add(curretElement);
					curretElement = new Element();
					curretElement.type = '|';
					curretElement.append(expChars[i]);
				}
				break;
			case '-' :
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = '-';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case '(' :
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = '(';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case ')' :
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = ')';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case ':' :
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = ':';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case '=' :
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = '=';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case ' ' :
				// whitespace terminates the pending element unless quoted
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
					}else{
						this.elements.add(curretElement);
						curretElement = null;
					}
				}
				break;
			case '\'' :
				// quote characters toggle string mode; the quotes themselves
				// are not stored in the element text
				if(curretElement == null){
					curretElement = new Element();
					curretElement.type = '\'';
				}else if(curretElement.type == '\''){
					this.elements.add(curretElement);
					curretElement = null;
				}else{
					this.elements.add(curretElement);
					curretElement = new Element();
					curretElement.type = '\'';
				}
				break;
			case '[':
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = '[';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case ']':
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = ']';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case '{':
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = '{';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case '}':
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = '}';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			case ',':
				if(curretElement != null){
					if(curretElement.type == '\''){
						curretElement.append(expChars[i]);
						continue;
					}else{
						this.elements.add(curretElement);
					}
				}
				curretElement = new Element();
				curretElement.type = ',';
				curretElement.append(expChars[i]);
				this.elements.add(curretElement);
				curretElement = null;
				break;
			default :
				// 'F' collects field names and other bare text
				if(curretElement == null){
					curretElement = new Element();
					curretElement.type = 'F';
					curretElement.append(expChars[i]);
				}else if(curretElement.type == 'F'){
					curretElement.append(expChars[i]);
				}else if(curretElement.type == '\''){
					curretElement.append(expChars[i]);
				}else{
					this.elements.add(curretElement);
					curretElement = new Element();
					curretElement.type = 'F';
					curretElement.append(expChars[i]);
				}
			}
		}
		// flush the trailing element, if any
		if(curretElement != null){
			this.elements.add(curretElement);
			curretElement = null;
		}
	}

	/**
	 * Syntax pass: consumes the element list with an operator-precedence
	 * (shunting-yard style) algorithm, pushing sub-queries onto
	 * {@link #querys} and operators onto {@link #operates}.
	 */
	private void parseSyntax(boolean quickMode){
		for(int i = 0 ; i < this.elements.size() ; i++){
			Element e = this.elements.get(i);
			if('F' == e.type){
				Element e2 = this.elements.get(i + 1);
				if('=' != e2.type && ':' != e2.type){
					throw new IllegalStateException("表达式异常: = 或 : 号丢失");
				}
				Element e3 = this.elements.get(i + 2);
				// handle '=' (exact term) and ':' (analyzed SWMC) operators
				if('\'' == e3.type){
					i+=2;
					if('=' == e2.type){
						TermQuery tQuery = new TermQuery(new Term(e.toString() , e3.toString()));
						this.querys.push(tQuery);
					}else if(':' == e2.type){
						String keyword = e3.toString();
						//SWMCQuery Here
						Query _SWMCQuery = SWMCQueryBuilder.create(e.toString(), keyword , quickMode);
						this.querys.push(_SWMCQuery);
					}
				}else if('[' == e3.type || '{' == e3.type){
					i+=2;
					// handle [] and {} range expressions: collect elements up
					// to the closing bracket
					LinkedList<Element> eQueue = new LinkedList<Element>();
					eQueue.add(e3);
					for( i++ ; i < this.elements.size() ; i++){
						Element eN = this.elements.get(i);
						eQueue.add(eN);
						if(']' == eN.type || '}' == eN.type){
							break;
						}
					}
					// translate into a TermRangeQuery
					Query rangeQuery = this.toTermRangeQuery(e , eQueue);
					this.querys.push(rangeQuery);
				}else{
					throw new IllegalStateException("表达式异常:匹配值丢失");
				}
			}else if('(' == e.type){
				this.operates.push(e);
			}else if(')' == e.type){
				// reduce until the matching '(' is popped
				boolean doPop = true;
				while(doPop && !this.operates.empty()){
					Element op = this.operates.pop();
					if('(' == op.type){
						doPop = false;
					}else {
						Query q = toBooleanQuery(op);
						this.querys.push(q);
					}
				}
			}else{
				// logical operator: reduce while the stack top has equal or
				// higher precedence, then push
				if(this.operates.isEmpty()){
					this.operates.push(e);
				}else{
					boolean doPeek = true;
					while(doPeek && !this.operates.isEmpty()){
						Element eleOnTop = this.operates.peek();
						if('(' == eleOnTop.type){
							doPeek = false;
							this.operates.push(e);
						}else if(compare(e , eleOnTop) == 1){
							this.operates.push(e);
							doPeek = false;
						}else if(compare(e , eleOnTop) == 0){
							Query q = toBooleanQuery(eleOnTop);
							this.operates.pop();
							this.querys.push(q);
						}else{
							Query q = toBooleanQuery(eleOnTop);
							this.operates.pop();
							this.querys.push(q);
						}
					}
					if(doPeek && this.operates.empty()){
						this.operates.push(e);
					}
				}
			}
		}
		// reduce any remaining operators
		while(!this.operates.isEmpty()){
			Element eleOnTop = this.operates.pop();
			Query q = toBooleanQuery(eleOnTop);
			this.querys.push(q);
		}
	}

	/**
	 * Combines the top two sub-queries with the given logical operator into
	 * a BooleanQuery. Clauses of a nested BooleanQuery with a matching Occur
	 * are flattened into the new query.
	 *
	 * @param op operator element: '&', '|' or '-'
	 * @return combined query, or null when the operand stack is empty
	 */
	private Query toBooleanQuery(Element op){
		if(this.querys.size() == 0){
			return null;
		}
		//BooleanQuery resultQuery = null;
		Builder builder = new Builder();
		if(this.querys.size() == 1){
			// single operand: nothing to combine (note: get(0) peeks the
			// bottom of the stack without popping)
			return this.querys.get(0);
		}
		Query q2 = this.querys.pop();
		Query q1 = this.querys.pop();
		if('&' == op.type){
			if(q1 != null){
				if(q1 instanceof BooleanQuery){
					List<BooleanClause> clauses = ((BooleanQuery)q1).clauses();
					if(clauses.size() > 0
							&& clauses.get(0).getOccur() == Occur.MUST){
						// flatten nested MUST clauses
						for(BooleanClause c : clauses){
							builder.add(c);
						}
					}else{
						builder.add(q1,Occur.MUST);
					}
				}else{
					//q1 instanceof TermQuery
					//q1 instanceof TermRangeQuery
					//q1 instanceof PhraseQuery
					//others
					builder.add(q1,Occur.MUST);
				}
			}
			if(q2 != null){
				if(q2 instanceof BooleanQuery){
					List<BooleanClause> clauses = ((BooleanQuery)q2).clauses();
					if(clauses.size() > 0
							&& clauses.get(0).getOccur() == Occur.MUST){
						for(BooleanClause c : clauses){
							builder.add(c);
						}
					}else{
						builder.add(q2,Occur.MUST);
					}
				}else{
					//q2 instanceof TermQuery
					//q2 instanceof TermRangeQuery
					//q2 instanceof PhraseQuery
					//others
					builder.add(q2,Occur.MUST);
				}
			}
		}else if('|' == op.type){
			if(q1 != null){
				if(q1 instanceof BooleanQuery){
					List<BooleanClause> clauses = ((BooleanQuery)q1).clauses();
					if(clauses.size() > 0
							&& clauses.get(0).getOccur() == Occur.SHOULD){
						// flatten nested SHOULD clauses
						for(BooleanClause c : clauses){
							builder.add(c);
						}
					}else{
						builder.add(q1,Occur.SHOULD);
					}
				}else{
					//q1 instanceof TermQuery
					//q1 instanceof TermRangeQuery
					//q1 instanceof PhraseQuery
					//others
					builder.add(q1,Occur.SHOULD);
				}
			}
			if(q2 != null){
				if(q2 instanceof BooleanQuery){
					List<BooleanClause> clauses = ((BooleanQuery)q2).clauses();
					if(clauses.size() > 0
							&& clauses.get(0).getOccur() == Occur.SHOULD){
						for(BooleanClause c : clauses){
							builder.add(c);
						}
					}else{
						builder.add(q2,Occur.SHOULD);
					}
				}else{
					//q2 instanceof TermQuery
					//q2 instanceof TermRangeQuery
					//q2 instanceof PhraseQuery
					//others
					builder.add(q2,Occur.SHOULD);
				}
			}
		}else if('-' == op.type){
			// exclusion: q1 MUST, q2 MUST_NOT
			if(q1 == null || q2 == null){
				throw new IllegalStateException("表达式异常:SubQuery 个数不匹配");
			}
			if(q1 instanceof BooleanQuery){
				List<BooleanClause> clauses = ((BooleanQuery)q1).clauses();
				if(clauses.size() > 0){
					for(BooleanClause c : clauses){
						builder.add(c);
					}
				}else{
					builder.add(q1,Occur.MUST);
				}
			}else{
				//q1 instanceof TermQuery
				//q1 instanceof TermRangeQuery
				//q1 instanceof PhraseQuery
				//others
				builder.add(q1,Occur.MUST);
			}
			builder.add(q2,Occur.MUST_NOT);
		}
		return builder.build();
	}

	/**
	 * Assembles a TermRangeQuery from a bracketed element sequence such as
	 * ['a','b'], {'a','b'}, [,'b'] or ['a',].
	 *
	 * @param fieldNameEle field name element
	 * @param elements range elements, from opening to closing bracket
	 * @return the TermRangeQuery; a null bound means an open-ended range
	 * @throws IllegalStateException on malformed range syntax
	 */
	private TermRangeQuery toTermRangeQuery(Element fieldNameEle , LinkedList<Element> elements){

		boolean includeFirst = false;
		boolean includeLast = false;
		String firstValue = null;
		String lastValue = null;

		// the first element must be '[' (inclusive) or '{' (exclusive)
		Element first = elements.getFirst();
		if('[' == first.type){
			includeFirst = true;
		}else if('{' == first.type){
			includeFirst = false;
		}else {
			throw new IllegalStateException("表达式异常");
		}
		// the last element must be ']' (inclusive) or '}' (exclusive)
		Element last = elements.getLast();
		if(']' == last.type){
			includeLast = true;
		}else if('}' == last.type){
			includeLast = false;
		}else {
			throw new IllegalStateException("表达式异常, RangeQuery缺少结束括号");
		}
		if(elements.size() < 4 || elements.size() > 5){
			throw new IllegalStateException("表达式异常, RangeQuery 错误");
		}
		// read the middle part: value , value  (either side may be omitted)
		Element e2 = elements.get(1);
		if('\'' == e2.type){
			firstValue = e2.toString();
			//
			Element e3 = elements.get(2);
			if(',' != e3.type){
				throw new IllegalStateException("表达式异常, RangeQuery缺少逗号分隔");
			}
			//
			Element e4 = elements.get(3);
			if('\'' == e4.type){
				lastValue = e4.toString();
			}else if(e4 != last){
				throw new IllegalStateException("表达式异常,RangeQuery格式错误");
			}
		}else if(',' == e2.type){
			firstValue = null;
			//
			Element e3 = elements.get(2);
			if('\'' == e3.type){
				lastValue = e3.toString();
			}else{
				throw new IllegalStateException("表达式异常,RangeQuery格式错误");
			}
		}else {
			throw new IllegalStateException("表达式异常, RangeQuery格式错误");
		}

		// Bug fix: the original passed new BytesRef(null) for open-ended
		// ranges, which throws NullPointerException; TermRangeQuery accepts
		// a null bound to mean "unbounded".
		return new TermRangeQuery(fieldNameEle.toString() ,
				firstValue == null ? null : new BytesRef(firstValue) ,
				lastValue == null ? null : new BytesRef(lastValue) ,
				includeFirst , includeLast);
	}

	/**
	 * Compares operator precedence: '&' > '|' > '-'.
	 *
	 * @return 1 if e1 binds tighter than e2, 0 if equal, -1 if looser
	 */
	private int compare(Element e1 , Element e2){
		if('&' == e1.type){
			if('&' == e2.type){
				return 0;
			}else {
				return 1;
			}
		}else if('|' == e1.type){
			if('&' == e2.type){
				return -1;
			}else if('|' == e2.type){
				return 0;
			}else{
				return 1;
			}
		}else{
			if('-' == e2.type){
				return 0;
			}else{
				return -1;
			}
		}
	}

	/**
	 * Expression element (operator, field name, or field value).
	 * Made a static nested class: it never touches the enclosing instance,
	 * so keeping it non-static would only retain a hidden outer reference.
	 * @author linliangyi
	 * May 20, 2010
	 */
	private static class Element{
		// element type: an operator char, '\'' for quoted value, 'F' for text
		char type = 0;
		StringBuffer eleTextBuff;

		public Element(){
			eleTextBuff = new StringBuffer();
		}

		public void append(char c){
			this.eleTextBuff.append(c);
		}

		@Override
		public String toString(){
			return this.eleTextBuff.toString();
		}
	}

	/**
	 * Ad-hoc demo entry point.
	 */
	public static void main(String[] args){
		IKQueryExpressionParser parser = new IKQueryExpressionParser();
		//String ikQueryExp = "newsTitle:'的两款《魔兽世界》插件Bigfoot和月光宝盒'";
		String ikQueryExp = "(id='ABcdRf' && date:{'20010101','20110101'} && keyword:'魔兽中国') || (content:'KSHT-KSH-A001-18' || ulr='www.ik.com') - name:'林良益'";
		Query result = parser.parseExp(ikQueryExp , true);
		System.out.println(result);
	}
}
SWMCQueryBuilder
/**
* IK 中文分词 版本 6.5.0
* IK Analyzer release 6.5.0
*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
* provided by Linliangyi and copyright 2012 by Oolong studio
*
*/
package org.wltea.analyzer.query;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;
/**
* Single Word Multi Char Query Builder
* IK分词算法专用 暴走抹茶 2017.3.28
* @author linliangyi
*
*/
public class SWMCQueryBuilder {

	/**
	 * Builds a SWMC (Single Word Multi Char) query.
	 *
	 * @param fieldName target field name
	 * @param keywords  raw keyword text, analyzed with IK smart segmentation
	 * @param quickMode when true, prefer the condensed expression built from
	 *                  multi-char lexemes only (if they cover more than half
	 *                  of the input)
	 * @return Lucene Query, or null if nothing could be parsed
	 * @throws IllegalArgumentException if fieldName or keywords is null
	 */
	public static Query create(String fieldName ,String keywords , boolean quickMode){
		if(fieldName == null || keywords == null){
			throw new IllegalArgumentException("参数 fieldName 、 keywords 不能为null.");
		}
		// 1. segment the keywords with IK
		List<Lexeme> lexemes = doAnalyze(keywords);
		// 2. build the SWMC query from the segmentation result
		Query _SWMCQuery = getSWMCQuery(fieldName , lexemes , quickMode);
		return _SWMCQuery;
	}

	/**
	 * Segments the keywords and returns the resulting lexeme list.
	 *
	 * Bug fix: generic type parameters restored — the pasted original used
	 * raw List, so the for-each over lexemes in getSWMCQuery did not compile.
	 *
	 * @param keywords text to segment
	 * @return lexemes in order; possibly partial if an IOException occurred
	 */
	private static List<Lexeme> doAnalyze(String keywords){
		List<Lexeme> lexemes = new ArrayList<Lexeme>();
		IKSegmenter ikSeg = new IKSegmenter(new StringReader(keywords) , true);
		try{
			Lexeme l = null;
			while( (l = ikSeg.next()) != null){
				lexemes.add(l);
			}
		}catch(IOException e){
			// best effort: log and return whatever was segmented so far
			e.printStackTrace();
		}
		return lexemes;
	}

	/**
	 * Builds the SWMC query from the segmentation result.
	 *
	 * @param fieldName target field name
	 * @param lexemes   segmentation result
	 * @param quickMode when true and multi-char lexemes cover >50% of the
	 *                  input, parse the condensed expression instead
	 * @return parsed Query, or null if parsing failed or input was empty
	 */
	private static Query getSWMCQuery(String fieldName , List<Lexeme> lexemes , boolean quickMode){
		// full SWMC query expression
		// (StringBuilder: local, single-threaded — no need for StringBuffer)
		StringBuilder keywordBuffer = new StringBuilder();
		// condensed expression: multi-char lexemes only
		StringBuilder keywordBuffer_Short = new StringBuilder();
		// length of the previous lexeme
		int lastLexemeLength = 0;
		// end position of the previous lexeme
		int lastLexemeEnd = -1;

		int shortCount = 0;
		int totalCount = 0;
		for(Lexeme l : lexemes){
			totalCount += l.getLength();
			// the condensed expression takes only lexemes longer than one char
			if(l.getLength() > 1){
				keywordBuffer_Short.append(' ').append(l.getLexemeText());
				shortCount += l.getLength();
			}

			if(lastLexemeLength == 0){
				keywordBuffer.append(l.getLexemeText());
			}else if(lastLexemeLength == 1 && l.getLength() == 1
					&& lastLexemeEnd == l.getBeginPosition()){
				// adjacent single-char lexemes are concatenated with no space
				keywordBuffer.append(l.getLexemeText());
			}else{
				keywordBuffer.append(' ').append(l.getLexemeText());
			}
			lastLexemeLength = l.getLength();
			lastLexemeEnd = l.getEndPosition();
		}

		// use the Lucene QueryParser to turn the expression into a Query
		QueryParser qp = new QueryParser(fieldName, new StandardAnalyzer());
		qp.setDefaultOperator(QueryParser.AND_OPERATOR);
		qp.setAutoGeneratePhraseQueries(true);
		// totalCount > 0 guards the 0/0 NaN comparison on empty input
		// (NaN > 0.5f was false by accident; now it is explicit)
		if(quickMode && totalCount > 0 && (shortCount * 1.0f / totalCount) > 0.5f){
			try {
				//System.out.println(keywordBuffer.toString());
				Query q = qp.parse(keywordBuffer_Short.toString());
				return q;
			} catch (ParseException e) {
				e.printStackTrace();
			}
		}else{
			if(keywordBuffer.length() > 0){
				try {
					//System.out.println(keywordBuffer.toString());
					Query q = qp.parse(keywordBuffer.toString());
					return q;
				} catch (ParseException e) {
					e.printStackTrace();
				}
			}
		}
		return null;
	}
}