A First Look at Lucene: A Beginner's Lucene Demo

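My company's project needs full-text search, so I had to spend some time studying it. After a few days of reading the documentation, I tried building this demo myself.
The goal is to upload files and search their contents. Each search result shows the file name, the upload time, and a content digest.
Here are the technologies the demo uses:
webwork + freemarker, and Lucene 2.4 (the latest version at the time of writing). dom4j is used as well: the search results are written to an XML file, which is then formatted for output with XSLT.
First, the webwork configuration file, xwork.xml: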
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xwork PUBLIC "-//OpenSymphony Group//XWork 1.1.1//EN" "http://www.opensymphony.com/xwork/xwork-1.1.1.dtd">

<xwork>
    <include file="webwork-default.xml"/>
	<package name="main" extends="webwork-default">
		<action name="main" class="com.fengzhong.webLucene.action.MainAction">
			<result name="success" type="freemarker">
				/WEB-INF/ftl/index.ftl
			</result>
			<result name="error" type="freemarker">
				/WEB-INF/ftl/error.ftl
			</result>
		</action>
		<action name="fileUpload" class="com.fengzhong.webLucene.action.FileUploadAction">
			<result name="success" type="freemarker">
				/WEB-INF/ftl/success.ftl
			</result>
			<result name="input" type="freemarker">
				/WEB-INF/ftl/fileUpload.ftl
			</result>
			<result name="error" type="freemarker">
				/WEB-INF/ftl/error.ftl
			</result>
		</action>
		<action name="search" class="com.fengzhong.webLucene.action.SearchAction">
			<result name="success">
				/WEB-INF/resultSet.xml
			</result>
			<result name="input" type="freemarker">
				/WEB-INF/ftl/search.ftl
			</result>
		</action>
	</package>
</xwork>

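The action that handles file uploads is FileUploadAction. Note that its package here is org.john.webLucene.action, while xwork.xml above refers to com.fengzhong.webLucene.action; in a real project the two must match.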
package org.john.webLucene.action;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.util.Date;

import org.john.webLucene.lucene.IndexUtil;
import org.john.webLucene.utils.DateUtil;

import com.opensymphony.xwork.ActionSupport;

public class FileUploadAction extends ActionSupport{
    private File doc;
    private String digest;
    private String docContentType;
    private String docFileName;
	public String getDocContentType() {
		return docContentType;
	}
	public void setDocContentType(String docContentType) {
		this.docContentType = docContentType;
	}
	public String getDocFileName() {
		return docFileName;
	}
	public void setDocFileName(String docFileName) {
		this.docFileName = docFileName;
	}
	@Override
	public String execute() throws Exception {
		// NOTE: docFileName comes from the client; validate it before using it in a path.
		File data=new File("E:/luceneDemo/data/"+docFileName);
		FileChannel srcChannel=null;
		FileChannel dstChannel=null;
		try {
			// Copy the uploaded temporary file into the data directory.
			srcChannel = new FileInputStream(doc).getChannel();
			dstChannel = new FileOutputStream(data).getChannel();
			dstChannel.transferFrom(srcChannel, 0, srcChannel.size());
		} catch (IOException e) {
			return ERROR;
		} finally {
			// Either channel may still be null if opening its stream failed.
			if(srcChannel!=null){
				srcChannel.close();
			}
			if(dstChannel!=null){
				dstChannel.close();
			}
		}
		if(!data.exists()){
			return ERROR;
		}
		String uploadDate=DateUtil.convertDateToString(new Date());
		// Index the stored copy together with its metadata.
		IndexUtil.createIndex(data,data.getName(), uploadDate, digest);
		return SUCCESS;
	}
	public String getDigest() {
		return digest;
	}
	public void setDigest(String digest) {
		this.digest = digest;
	}
	public File getDoc() {
		return doc;
	}
	public void setDoc(File doc) {
		this.doc = doc;
	}
}
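
The copy step in execute() can also be pulled out into a small helper. The following is only a sketch under my own assumptions: the class name UploadCopySketch and the method copyTo are hypothetical and not part of the demo, and it uses try-with-resources, which needs Java 7 or later (newer than what this demo was written against). It keeps only the last path segment of the client-supplied file name, so a name such as "../../x.txt" cannot end up outside the target directory.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of the original demo.
final class UploadCopySketch {

    // Copies src into targetDir, using only the last path segment of fileName.
    static File copyTo(File src, File targetDir, String fileName) throws IOException {
        String safeName = new File(fileName).getName();
        if (safeName.isEmpty() || safeName.equals(".") || safeName.equals("..")) {
            throw new IOException("invalid file name: " + fileName);
        }
        File dst = new File(targetDir, safeName);
        // try-with-resources closes both channels even when the copy fails.
        try (FileChannel in = new FileInputStream(src).getChannel();
             FileChannel out = new FileOutputStream(dst).getChannel()) {
            long size = in.size();
            long pos = 0;
            // transferFrom may copy fewer bytes than requested, so loop.
            while (pos < size) {
                long n = out.transferFrom(in, pos, size - pos);
                if (n <= 0) {
                    break; // source ended early; stop instead of spinning
                }
                pos += n;
            }
        }
        return dst;
    }
}
```

In the action, this would replace the manual channel handling, for example `File data = UploadCopySketch.copyTo(doc, new File("E:/luceneDemo/data"), docFileName);`.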


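When building a search application with Lucene, the two most important tasks are creating the index and searching it.
Below is IndexUtil, the utility class that builds the index: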
package org.john.webLucene.lucene;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.cn.ChineseAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.LockObtainFailedException;
/**
 * 
 * @author John
 * 
 *
 */
public class IndexUtil {
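	/**
	 * Adds one uploaded file to the index.
	 * @param data  the file to be indexed
	 * @param fileName the file name
	 * @param uploadDate the upload date
	 * @param digest a digest (summary) of the file content
	 * @throws CorruptIndexException
	 * @throws LockObtainFailedException
	 * @throws IOException
	 */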
   public static void createIndex(File data,String fileName,
		              String uploadDate,String digest) 
                throws CorruptIndexException, LockObtainFailedException, IOException{
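	      File indexDir=new File("E:/luceneDemo/indexDir");
	      // listFiles() returns null when the directory does not exist yet,
	      // so treat both "missing" and "empty" as "create a new index".
	      File files[]=indexDir.listFiles();
	      boolean isEmpty=(files==null||files.length==0);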
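	      Analyzer analyzer=new ChineseAnalyzer();
	      // Third argument is "create": true builds a new index, false appends to the existing one.
	      IndexWriter indexWriter=new IndexWriter(indexDir,analyzer,isEmpty);
	      try{
	          Document doc=new Document();
	          // NOTE: FileReader decodes the file with the platform default charset.
	          Reader reader=new FileReader(data);
	          Field f1=new Field("fileName",fileName,Field.Store.YES, Field.Index.TOKENIZED);
	          Field f2=new Field("uploadDate",uploadDate,Field.Store.YES, Field.Index.TOKENIZED);
	          // A Reader-based field is tokenized and indexed but not stored,
	          // so the content itself cannot be read back from search results.
	          Field f3=new Field("content",reader);
	          Field f4=new Field("digest",digest,Field.Store.YES, Field.Index.TOKENIZED);
	          doc.add(f1);
	          doc.add(f2);
	          doc.add(f3);
	          doc.add(f4);
	          indexWriter.addDocument(doc);
	          indexWriter.optimize();
	      }finally{
	          // Always close the writer so the index write lock is released, even on failure.
	          indexWriter.close();
	      }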
   }
}
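
The uploadDate string passed to createIndex comes from DateUtil.convertDateToString, which this post does not show. Below is a minimal sketch of what such a helper might look like. The class name DateUtilSketch and the pattern "yyyy-MM-dd HH:mm:ss" are my assumptions for illustration; the real DateUtil in the project may format dates differently.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical stand-in for the DateUtil class that the post does not show.
final class DateUtilSketch {

    // Assumed pattern; the original project may use a different one.
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Formats the date in the JVM's default time zone.
    static String convertDateToString(Date date) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        return new SimpleDateFormat(PATTERN, Locale.ROOT).format(date);
    }
}
```

A fixed-width pattern like this has the side benefit that the stored strings sort in chronological order.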


Once the index has been built, the contents of the uploaded files can be searched. Below is the Lucene search code.
The SearchUtil utility class:
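import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.cn.ChineseAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.FSDirectory;

public class SearchUtil {
    public static Hits Search(String keyword) throws IOException, ParseException{
    	Hits hits=null;
    	File indexDir=new File("E:/luceneDemo/indexDir");
    	FSDirectory fsDirectory=FSDirectory.getDirectory(indexDir);
    	IndexSearcher indexSearch=new IndexSearcher(fsDirectory);
    	// Use the same analyzer as at index time so query terms match indexed terms.
    	Analyzer analyzer=new ChineseAnalyzer();
    	// Alternative: match a single exact term without parsing the query.
    	/*Term term=new Term("content",keyword);
    	TermQuery query=new TermQuery(term);
    	hits=indexSearch.search(query);*/
    	QueryParser queryParser=new QueryParser("content",analyzer);
    	Query query=queryParser.parse(keyword); 
    	// The searcher is left open because Hits loads documents lazily from it.
    	hits=indexSearch.search(query);
    	return hits;
    }
}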
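
As mentioned at the start, the search results are turned into an XML file and then formatted with XSLT. That code is not shown in this post, so here is a rough sketch of the idea. To stay self-contained it uses the JDK's built-in DOM and javax.xml.transform APIs instead of dom4j, and the element names (results, doc, fileName, uploadDate, digest), the stylesheet, and the class name ResultXmlSketch are all my own assumptions. In the demo, each row would presumably be filled from the stored fields of a hit, for example hits.doc(i).get("fileName").

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch; uses JDK XML APIs rather than dom4j.
final class ResultXmlSketch {

    // Assumed stylesheet: renders every <doc> element as one list item.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "<xsl:template match='/results'><ul><xsl:apply-templates select='doc'/></ul></xsl:template>"
      + "<xsl:template match='doc'><li><xsl:value-of select='fileName'/> ("
      + "<xsl:value-of select='uploadDate'/>): <xsl:value-of select='digest'/></li></xsl:template>"
      + "</xsl:stylesheet>";

    // Builds <results><doc>...</doc>...</results>; each row is {fileName, uploadDate, digest}.
    static Document toXml(List<String[]> rows) throws Exception {
        Document xml = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = xml.createElement("results");
        xml.appendChild(root);
        for (String[] row : rows) {
            Element doc = xml.createElement("doc");
            root.appendChild(doc);
            append(xml, doc, "fileName", row[0]);
            append(xml, doc, "uploadDate", row[1]);
            append(xml, doc, "digest", row[2]);
        }
        return xml;
    }

    // Adds <name>text</name> under parent; the text is escaped on serialization.
    private static void append(Document xml, Element parent, String name, String text) {
        Element child = xml.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    // Applies the stylesheet to the result document and returns the markup.
    static String render(Document xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(xml), new StreamResult(out));
        return out.toString();
    }
}
```

Only stored fields (fileName, uploadDate, digest) can be shown this way; the content field is indexed from a Reader and is therefore not stored in the index.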

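The attachment is the packaged project. Because of the size limit, the libraries it depends on are not included, so please add them yourself.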
