Key Code for Lucene Development

Lucene
1. Create the IndexWriter -- required jars: commons-io.jar || IKAnalyzer3.2.8.jar || lucene-core-3.0.3.jar || lucene-analyzers-3.0.3.jar
      1. IndexWriter indexWriter = new IndexWriter(directory, analyzer, create, maxFieldLength);
            1.1. Directory directory = FSDirectory.open(indexDir);
            1.2. Analyzer analyzer = new IKAnalyzer();
            1.3. boolean create = true; // true: create a new index or overwrite the existing one; false: append to it
            1.4. MaxFieldLength maxFieldLength = MaxFieldLength.LIMITED;
      2. Document document = new Document(); // walk the folder to be searched and create one Document per file
      3. document.add(new Field("fileContent", ParseUtil.parse(file.getAbsolutePath()), Store.YES, Index.ANALYZED));
            3.1. Reading Office files: the POI library -- all jars shipped in poi-3.7
                        Word:  WordExtractor (.doc)        XWPFWordExtractor (.docx)
                        Excel: ExcelExtractor (.xls)       XSSFExcelExtractor (.xlsx)
                        PPT:   PowerPointExtractor (.ppt)  XSLFPowerPointExtractor (.pptx)
                        WordExtractor extractor = new WordExtractor(in);
                        XWPFWordExtractor extractor = new XWPFWordExtractor(OPCPackage.open(in));
            3.2. Reading PDF files: the PDFBox library -- PDFBox-0.7.3.jar || ant.jar || bcmail-jdk14-132.jar || bcprov-jdk14-132.jar || checkstyle-all-4.2.jar || FontBox-0.1.0-dev.jar
                        PDFTextStripper, the PDF text extraction class:
                              public PDFTextStripper()
                              public String getText(PDDocument doc)
                        PDDocument methods for loading a PDF document:
                              public static PDDocument load(String filename)
                              public static PDDocument load(File file)
                              public static PDDocument load(InputStream input)
      4. indexWriter.addDocument(document);
      5. indexWriter.optimize(); // optimize the index
      6. indexWriter.close();
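
Step 2 above walks the folder to be searched and creates one Document per file. The traversal itself might look like the following minimal sketch, which uses only the JDK. The class name FileCollector, the method collectFiles, and the choice to descend into subfolders are assumptions for illustration; the original notes do not say whether subfolders are indexed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class FileCollector {

    /** Recursively collects every regular file under root (hypothetical helper). */
    public static List<File> collectFiles(File root) {
        List<File> result = new ArrayList<File>();
        // listFiles() returns null when root is not a directory or cannot be read.
        File[] children = root.listFiles();
        if (children == null) {
            return result;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                // Assumption: subfolders are searched as well.
                result.addAll(collectFiles(child));
            } else if (child.isFile()) {
                result.add(child);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        File dataDir = new File(args.length > 0 ? args[0] : ".");
        for (File file : collectFiles(dataDir)) {
            // In the indexing flow, this is where one Document per file
            // would be built and passed to indexWriter.addDocument(...).
            System.out.println(file.getAbsolutePath());
        }
    }
}
```

Each file returned here would then go through steps 2 to 4 before optimize() and close() are called once at the end.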


2. Create the IndexSearcher
      1. IndexSearcher indexSearcher = new IndexSearcher(directory, readOnly);
            1.1. Directory directory = FSDirectory.open(indexDir);
            1.2. boolean readOnly = true;
      2. Query query = new MultiFieldQueryParser(Version.LUCENE_30, new String[]{"fileName", "fileContent"}, new IKAnalyzer()).parse("使用");
            2.1. Keyword (term) query
                  Query query = new TermQuery(new Term("fileName", "使用"));
            2.2. Phrase query
                  PhraseQuery phraseQuery = new PhraseQuery();
                  phraseQuery.add(new Term("fileContent","天"));
                  phraseQuery.add(new Term("fileContent","安"));
                  phraseQuery.add(new Term("fileContent","门"));
            2.3. Wildcard (pattern-matching) query
                  WildcardQuery wildcardQuery = new WildcardQuery(new Term("fileContent", "*b*"));
            2.4. Range query
                  TermRangeQuery rangeQuery = new TermRangeQuery("fileSize", NumericUtils.longToPrefixCoded(23L), NumericUtils.longToPrefixCoded(24L), true, true);
            2.5. Multi-condition (boolean) query
                  BooleanQuery booleanQuery =new BooleanQuery();
                  booleanQuery.add(phraseQuery, Occur.MUST);
                  booleanQuery.add(rangeQuery, Occur.MUST);
      3.TopDocs topDocs = indexSearcher.search(query, 3);
      4.ScoreDoc[] scoreDocs = topDocs.scoreDocs;
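            4.1. Iterate over the returned array; each scoreDoc exposes scoreDoc.doc (the document id) and scoreDoc.score (the relevance score)
      5. Document document = indexSearcher.doc(scoreDoc.doc); // fetch the Document by its id
      6. long fileSize = NumericUtils.prefixCodedToLong(document.get("fileSize")); // read a stored value back from the Document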
      7.indexSearcher.close();
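
The range query in 2.4 compares terms as strings, which is why the fileSize bounds are passed through NumericUtils.longToPrefixCoded instead of being written as plain decimal text. The JDK-only sketch below shows the ordering problem. The class RangeOrdering and its zero-padding encoding are hypothetical stand-ins for illustration only; they are not Lucene's actual prefix-coded format.

```java
import java.util.Locale;

public class RangeOrdering {

    /** Stand-in encoding (an assumption): left-pads a non-negative long to 19 digits. */
    public static String pad(long value) {
        return String.format(Locale.ROOT, "%019d", value);
    }

    /** Inclusive lexicographic range check on string terms. */
    public static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Plain decimal strings: "235" sorts between "23" and "24",
        // so a size of 235 would wrongly match the range [23, 24].
        System.out.println(inRange("235", "23", "24"));

        // With an order-preserving encoding, string order follows numeric order,
        // so 235 falls outside the range and 23 falls inside it.
        System.out.println(inRange(pad(235), pad(23), pad(24)));
        System.out.println(inRange(pad(23), pad(23), pad(24)));
    }
}
```

The same reasoning explains step 6: a value indexed in encoded form has to be decoded with NumericUtils.prefixCodedToLong when it is read back.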
