lxwt909

Learning Lucene 5: SpanQuery (Span Queries)

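SpanQuery has quite a few subclasses, so I will cover them all in one post. To understand span queries you first need to understand what a span is. In Lucene a span is defined by the Spans class, whose source is as follows: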

/** Expert: an enumeration of span matches.  Used to implement span searching.
 * Each span represents a range of term positions within a document.  Matches
 * are enumerated in order, by increasing document number, within that by
 * increasing start position and finally by increasing end position. */
public abstract class Spans {
  /** Move to the next match, returning true iff any such exists. */
  public abstract boolean next() throws IOException;

  /** Skips to the first match beyond the current, whose document number is
   * greater than or equal to <i>target</i>.
   * <p>The behavior of this method is <b>undefined</b> when called with
   * <code> target &le; current</code>, or after the iterator has exhausted.
   * Both cases may result in unpredicted behavior.
   * <p>Returns true iff there is such
   * a match.  <p>Behaves as if written: <pre class="prettyprint">
   *   boolean skipTo(int target) {
   *     do {
   *       if (!next())
   *         return false;
   *     } while (target > doc());
   *     return true;
   *   }
   * </pre>
   * Most implementations are considerably more efficient than that.
   */
  public abstract boolean skipTo(int target) throws IOException;

  /** Returns the document number of the current match.  Initially invalid. */
  public abstract int doc();

  /** Returns the start position of the current match.  Initially invalid. */
  public abstract int start();

  /** Returns the end position of the current match.  Initially invalid. */
  public abstract int end();
  
  /**
   * Returns the payload data for the current span.
   * This is invalid until {@link #next()} is called for
   * the first time.
   * This method must not be called more than once after each call
   * of {@link #next()}. However, most payloads are loaded lazily,
   * so if the payload data for the current position is not needed,
   * this method may not be called at all for performance reasons. An ordered
   * SpanQuery does not lazy load, so if you have payloads in your index and
   * you do not want ordered SpanNearQuerys to collect payloads, you can
   * disable collection with a constructor option.<br>
   * <br>
    * Note that the return type is a collection, thus the ordering should not be relied upon.
    * <br/>
   * @lucene.experimental
   *
   * @return a List of byte arrays containing the data of this payload, otherwise null if isPayloadAvailable is false
   * @throws IOException if there is a low-level I/O error
    */
  // TODO: Remove warning after API has been finalized
  public abstract Collection<byte[]> getPayload() throws IOException;

  /**
   * Checks if a payload can be loaded at this position.
   * <p/>
   * Payloads can only be loaded once per call to
   * {@link #next()}.
   *
   * @return true if there is a payload available at this position that can be loaded
   */
  public abstract boolean isPayloadAvailable() throws IOException;
  
  /**
   * Returns the estimated cost of this spans.
   * <p>
   * This is generally an upper bound of the number of documents this iterator
   * might match, but may be a rough heuristic, hardcoded value, or otherwise
   * completely inaccurate.
   */
  public abstract long cost();
}

A span carries the start and end positions of a match within a document, along with an estimated cost and any payload data.
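To make the start/end idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class TermSpansSketch and its termSpans method are hypothetical names, and the only Lucene behaviour it mirrors is that a single-term span covers one position, with the end position being one past the start.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of span enumeration for one term in one document (not Lucene code). */
class TermSpansSketch {
    /** Returns {start, end} pairs for every occurrence of term; end is exclusive. */
    static List<int[]> termSpans(String[] tokens, String term) {
        List<int[]> spans = new ArrayList<>();
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].equals(term)) {
                // a single term covers exactly one position
                spans.add(new int[] {pos, pos + 1});
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = "the quick brown fox quick jumps".split(" ");
        // spans come out in order of increasing start position
        for (int[] span : termSpans(tokens, "quick")) {
            System.out.println("start=" + span[0] + ", end=" + span[1]);
        }
    }
}
```

For the sample sentence this lists two spans for quick, one starting at position 1 and one at position 4.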

The first one to cover is SpanTermQuery. It is used much like TermQuery; the only difference is that SpanTermQuery also gives you the span information of the matched term. Usage:

package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
/**
 * SpanTermQuery usage test
 * @author Lanxiaowei
 *
 */
public class SpanTermQueryTest {
	public static void main(String[] args) throws IOException {
		Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        Document doc = new Document();
        doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick red fox jumps over the sleepy cat", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        
        String queryString = "red";
        SpanQuery query = new SpanTermQuery(new Term("text",queryString));
        
        TopDocs results = searcher.search(query, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(query, scoreDocs[i].doc));
        	int docID = scoreDocs[i].doc;
			Document document = searcher.doc(docID);
			String path = document.get("text");
			System.out.println("text:" + path);
        }
	}
}

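SpanNearQuery: matches terms that occur within a given distance of each other, that is, it looks at how many positions lie between one term and the other. slop is the distance limit: it caps the number of positions allowed between the terms, since two terms separated by an enormous gap can hardly be called near each other. The inOrder parameter also deserves attention. It controls whether matches in reverse order are allowed: TermA and TermB do not have to appear left to right, they may also appear right to left. With inOrder set to true the order matters and the clauses must match in the order given; with false either order is accepted. Note that stop words removed by the analyzer still leave a gap in the position sequence by default, so in practice they do count toward the slop: in the example below, the six positions between quick and dog include the removed "the".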

Understanding slop is important:

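For PhraseQuery the default slop is 0, which means an exact phrase match. Raising slop loosens the match: for example, "one five" matches "one two three four five" only with slop=3; with slop=2 there is no hit. You can think of slop as the number of single-position moves, to the left or to the right, needed to line the query terms up with the document. Note in particular that a sloppy PhraseQuery does not enforce term order: in the example above, "two one" needs a slop of 2 (one moves two positions to the left, which yields "one two"), and "five three one" needs slop=6 to match.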
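The three numbers above can be reproduced with a small calculation. The sketch below is plain Java, not Lucene code, and the class and method names are hypothetical. It assumes each query term occurs exactly once in the document, and it uses the usual description of sloppy phrase matching: for each term, take its document position minus its position in the query, and the required slop is the spread between the largest and smallest of those shifts.

```java
/** Toy calculation of the slop a sloppy PhraseQuery needs (not Lucene code). */
class PhraseSlopSketch {
    /**
     * docPositions[i] is the position in the document of the i-th query term.
     * Assumes every query term occurs exactly once in the document.
     */
    static int requiredSlop(int[] docPositions) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < docPositions.length; i++) {
            // how far term i sits from the place the query expects it
            int shift = docPositions[i] - i;
            min = Math.min(min, shift);
            max = Math.max(max, shift);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Document: "one two three four five" -> positions 0..4
        System.out.println(requiredSlop(new int[] {0, 4}));    // query "one five"
        System.out.println(requiredSlop(new int[] {1, 0}));    // query "two one"
        System.out.println(requiredSlop(new int[] {4, 2, 0})); // query "five three one"
    }
}
```

The three calls print 3, 2 and 6, matching the values quoted in the text.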

There is also a collectPayloads parameter that controls whether payload data is collected; payloads will be covered separately later.

The SpanNearQuery constructor looks like this:

public SpanNearQuery(SpanQuery[] clauses, int slop, boolean inOrder, boolean collectPayloads) {

    // copy clauses array into an ArrayList
    this.clauses = new ArrayList<>(clauses.length);
    for (int i = 0; i < clauses.length; i++) {
      SpanQuery clause = clauses[i];
      if (field == null) {                               // check field
        field = clause.getField();
      } else if (clause.getField() != null && !clause.getField().equals(field)) {
        throw new IllegalArgumentException("Clauses must have same field.");
      }
      this.clauses.add(clause);
    }
    this.collectPayloads = collectPayloads;
    this.slop = slop;
    this.inOrder = inOrder;
  }

SpanNearQuery usage example:

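package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

/**
 * SpanNearQuery test
 * @author Lanxiaowei
 *
 */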
public class SpanNearQueryTest {
	public static void main(String[] args) throws IOException {
		Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        Document doc = new Document();
        doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick red fox jumps over the sleepy cat", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        
        String queryStringStart = "dog";
        String queryStringEnd = "quick";
        SpanQuery queryStart = new SpanTermQuery(new Term("text",queryStringStart));
        SpanQuery queryEnd = new SpanTermQuery(new Term("text",queryStringEnd));
        SpanQuery spanNearQuery = new SpanNearQuery(
            new SpanQuery[] {queryStart,queryEnd}, 6, false, false);
        
        TopDocs results = searcher.search(spanNearQuery, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(query, scoreDocs[i].doc));
        	int docID = scoreDocs[i].doc;
			Document document = searcher.doc(docID);
			String path = document.get("text");
			System.out.println("text:" + path);
        }
	}
}

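In this example there are 6 positions between dog and quick, and the match runs right to left (dog is the first clause but appears after quick in the text), so inOrder is set to false. Setting it to true would return no results.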
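The slop check for this example can be sketched in plain Java. This is a simplified model, not Lucene's implementation: the class and method names are hypothetical, the ordered case is reduced to "each clause must start at or after the end of the previous one", and the positions assume the analyzer drops "the" but keeps its position, so quick sits at position 1 and dog at position 8.

```java
/** Simplified model of the SpanNearQuery slop check (not Lucene code). */
class SpanNearSlopSketch {
    /** spans[i] = {start, end} of clause i, end exclusive. */
    static int slopNeeded(int[][] spans) {
        int minStart = Integer.MAX_VALUE;
        int maxEnd = Integer.MIN_VALUE;
        int totalLength = 0;
        for (int[] s : spans) {
            minStart = Math.min(minStart, s[0]);
            maxEnd = Math.max(maxEnd, s[1]);
            totalLength += s[1] - s[0];
        }
        // positions covered by the whole match that belong to no clause
        return maxEnd - minStart - totalLength;
    }

    static boolean matches(int[][] spans, int slop, boolean inOrder) {
        if (inOrder) {
            for (int i = 1; i < spans.length; i++) {
                if (spans[i][0] < spans[i - 1][1]) {
                    return false; // clause i starts before clause i-1 has ended
                }
            }
        }
        return slopNeeded(spans) <= slop;
    }

    public static void main(String[] args) {
        // clause order as in the query: dog first, then quick
        int[][] dogThenQuick = { {8, 9}, {1, 2} };
        System.out.println(slopNeeded(dogThenQuick));        // gap between the two terms
        System.out.println(matches(dogThenQuick, 6, false)); // unordered, slop 6
        System.out.println(matches(dogThenQuick, 5, false)); // unordered, slop 5
        System.out.println(matches(dogThenQuick, 6, true));  // ordered, slop 6
    }
}
```

Under these assumptions the gap is 6, so the unordered query with slop 6 matches, a slop of 5 would not, and the ordered variant fails because quick comes before dog in the text.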

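SpanNotQuery: when you use SpanNearQuery and TermA or TermB occurs more than once in a document, there can be several ways to get from TermA to TermB. SpanNotQuery lets you require that TermC does not occur between TermA and TermB, which rules out some of those matches and gives finer control. The basic SpanNotQuery constructor looks like this: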

/** Construct a SpanNotQuery matching spans from <code>include</code> which
   * have no overlap with spans from <code>exclude</code>.*/
  public SpanNotQuery(SpanQuery include, SpanQuery exclude) {
     this(include, exclude, 0, 0);
  }

Typically the first argument, include, is a SpanNearQuery, and the second argument, exclude, is the query whose spans rule a match out.

Another overloaded SpanNotQuery constructor:

/** Construct a SpanNotQuery matching spans from <code>include</code> which
   * have no overlap with spans from <code>exclude</code> within 
   * <code>dist</code> tokens of <code>include</code>. */
  public SpanNotQuery(SpanQuery include, SpanQuery exclude, int dist) {
     this(include, exclude, dist, dist);
  }

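It adds a dist parameter. The official description is: Construct a SpanNotQuery matching spans from include which have no overlap with spans from exclude within dist tokens of include. In other words, dist widens the exclusion zone: a match from include is rejected not only when exclude overlaps it, but also when exclude occurs within dist tokens before or after it.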

SpanNotQuery has one more, more general constructor overload:

/** Construct a SpanNotQuery matching spans from <code>include</code> which
   * have no overlap with spans from <code>exclude</code> within 
   * <code>pre</code> tokens before or <code>post</code> tokens of <code>include</code>. */
  public SpanNotQuery(SpanQuery include, SpanQuery exclude, int pre, int post) {
    this.include = include;
    this.exclude = exclude;
    this.pre = (pre >=0) ? pre : 0;
    this.post = (post >= 0) ? post : 0;

    if (include.getField() != null && exclude.getField() != null && !include.getField().equals(exclude.getField()))
      throw new IllegalArgumentException("Clauses must have same field.");
  }

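Here pre is the number of tokens before the include span, and post the number of tokens after it, in which exclude must not occur; the dist overload simply passes the same value for both. That is rather abstract, so here is some sample code: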

package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

/**
 * SpanNotQuery test
 * @author Lanxiaowei
 *
 */
public class SpanNotQueryTest {
	public static void main(String[] args) throws IOException {
		Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        Document doc = new Document();
        doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick red fox jumps over the sleepy cat", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown fox quick gox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown adult slave nice fox winde felt testcase gox quick jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown fox quick jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        
        String queryStringStart = "dog";
        String queryStringEnd = "quick";
        String excludeString = "fox";
        SpanQuery queryStart = new SpanTermQuery(new Term("text",queryStringStart));
        SpanQuery queryEnd = new SpanTermQuery(new Term("text",queryStringEnd));
        SpanQuery excludeQuery = new SpanTermQuery(new Term("text",excludeString));
        SpanQuery spanNearQuery = new SpanNearQuery(
            new SpanQuery[] {queryStart,queryEnd}, 12, false, false);
        
        SpanNotQuery spanNotQuery = new SpanNotQuery(spanNearQuery, excludeQuery, 4,3);
        TopDocs results = searcher.search(spanNotQuery, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(query, scoreDocs[i].doc));
        	int docID = scoreDocs[i].doc;
			Document document = searcher.doc(docID);
			String path = document.get("text");
			System.out.println("text:" + path);
        }
	}
}

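The sample code searches for documents in which no fox occurs between dog and quick, nor within 4 tokens before or 3 tokens after that span. Run it yourself to get a feel for it.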
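The effect of pre and post can be sketched as a simple overlap test in plain Java. This is a toy model with hypothetical names, not Lucene code. The positions used in main are assumptions: they treat "the" as removed but still occupying a position, and they pick the shortest quick-to-dog span in the last two documents that contain both fox and a second quick.

```java
/** Toy model of the SpanNotQuery pre/post window (not Lucene code). */
class SpanNotSketch {
    /**
     * True if the exclude span [excStart, excEnd) overlaps the include span
     * [incStart, incEnd) once that span is widened by pre tokens on the left
     * and post tokens on the right. End positions are exclusive.
     */
    static boolean isExcluded(int incStart, int incEnd,
                              int excStart, int excEnd,
                              int pre, int post) {
        int windowStart = incStart - pre;
        int windowEnd = incEnd + post;
        return excStart < windowEnd && excEnd > windowStart;
    }

    public static void main(String[] args) {
        // "the quick brown fox quick jumps over the lazy dog":
        // include = second quick .. dog = [4, 10), fox = [3, 4)
        System.out.println(isExcluded(4, 10, 3, 4, 0, 0)); // fox is just outside the span
        System.out.println(isExcluded(4, 10, 3, 4, 4, 3)); // pre=4 pulls fox into the window

        // "the quick brown adult slave nice fox winde felt testcase gox quick jumps over the lazy dog":
        // include = second quick .. dog = [11, 17), fox = [6, 7)
        System.out.println(isExcluded(11, 17, 6, 7, 4, 3)); // fox is 5 tokens before the span
    }
}
```

Under these assumptions the calls print false, true and false: with pre=0 the fox right before the span is tolerated, with pre=4 it rejects the match, and a fox five tokens away is far enough.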

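SpanOrQuery, as the name suggests, joins several SpanQuery instances with OR. You could use a BooleanQuery instead of SpanOrQuery, but SpanOrQuery additionally gives you the span information. Its constructor is: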

SpanOrQuery(SpanQuery... clauses)

It accepts any number of SpanQuery objects and ORs them together. Here is a SpanOrQuery example:

package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

/**
 * SpanOrQuery test
 * @author Lanxiaowei
 *
 */
public class SpanOrQueryTest {
	public static void main(String[] args) throws IOException {
		Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        Document doc = new Document();
        doc.add(new TextField("text", "the quick brown fox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick red fox jumps over the sleepy cat", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown fox quick gox jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown adult slave nice fox winde felt testcase gox quick jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown adult sick slave nice fox winde felt testcase fox quick jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "the quick brown fox quick jumps over the lazy dog", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        
        String queryStringStart = "dog";
        String queryStringEnd = "quick";
        String excludeString = "fox";
        String termString = "sick";
        SpanQuery queryStart = new SpanTermQuery(new Term("text",queryStringStart));
        SpanQuery queryEnd = new SpanTermQuery(new Term("text",queryStringEnd));
        SpanQuery excludeQuery = new SpanTermQuery(new Term("text",excludeString));
        SpanQuery spanNearQuery = new SpanNearQuery(
            new SpanQuery[] {queryStart,queryEnd}, 12, false, false);
        
        SpanNotQuery spanNotQuery = new SpanNotQuery(spanNearQuery, excludeQuery, 4,3);
        
        SpanQuery spanTermQuery = new SpanTermQuery(new Term("text",termString));
        
        SpanOrQuery spanOrQuery = new SpanOrQuery(spanNotQuery,spanTermQuery);
        
        TopDocs results = searcher.search(spanOrQuery, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(query, scoreDocs[i].doc));
        	int docID = scoreDocs[i].doc;
			Document document = searcher.doc(docID);
			String path = document.get("text");
			System.out.println("text:" + path);
        }
	}
}

SpanMultiTermQueryWrapper: a query adapter that wraps a MultiTermQuery so that it can be used as a SpanQuery. For usage, here is the example from the official API documentation:

WildcardQuery wildcard = new WildcardQuery(new Term("field", "bro?n"));
 SpanQuery spanWildcard = new SpanMultiTermQueryWrapper<WildcardQuery>(wildcard);

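SpanPositionRangeQuery: restricts matches to those that fall within the position range (start, end), with positions counted from zero. A span is accepted when its start is at least start and its end is at most end. Let the sample code speak: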

package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanMultiTermQueryWrapper;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanPositionRangeQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

/**
 * SpanPositionRangeQuery test
 * @author Lanxiaowei
 *
 */
public class SpanPositionRangeQueryTest {
	public static void main(String[] args) throws IOException {
		Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        Document doc = new Document();
        doc.add(new TextField("text", "quick brown fox", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "jumps over lazy broun dog", Field.Store.YES));
        writer.addDocument(doc);
        
        doc = new Document();
        doc.add(new TextField("text", "jumps over extremely very lazy broxn dog", Field.Store.YES));
        writer.addDocument(doc);
        
        
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        
        FuzzyQuery fq = new FuzzyQuery(new Term("text", "broan"));
        SpanQuery sfq = new SpanMultiTermQueryWrapper<FuzzyQuery>(fq);
        
        SpanPositionRangeQuery spanPositionRangeQuery = new SpanPositionRangeQuery(sfq, 3, 5);
        
        TopDocs results = searcher.search(spanPositionRangeQuery, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(query, scoreDocs[i].doc));
        	int docID = scoreDocs[i].doc;
			Document document = searcher.doc(docID);
			String path = document.get("text");
			System.out.println("text:" + path);
        }
	}
}

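A quick explanation of the code above. First, FuzzyQuery fq = new FuzzyQuery(new Term("text", "broan")); finds documents containing words similar to broan. brown, broun and broxn are each one edit away from broan, so all three documents are candidates at this point. Next we create a SpanMultiTermQueryWrapper to turn the FuzzyQuery into a SpanQuery, and then use SpanPositionRangeQuery to restrict where the matched word may sit: it must fall within the range (3,5). In the first document brown is at position 1, before the range, so that document is rejected. In the third document broxn is at position 5, so its span (5,6) runs past the end of the range and that document is rejected as well. Only the second document, where broun is at position 3, is returned.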
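The position check itself is small enough to sketch in plain Java. This is a toy model with hypothetical names, not Lucene code; it only encodes the rule that a span is accepted when it starts at or after the range start and ends at or before the range end, with span ends being exclusive.

```java
/** Toy model of the SpanPositionRangeQuery acceptance rule (not Lucene code). */
class PositionRangeSketch {
    /** Span is [spanStart, spanEnd) with an exclusive end. */
    static boolean accepts(int spanStart, int spanEnd, int rangeStart, int rangeEnd) {
        return spanStart >= rangeStart && spanEnd <= rangeEnd;
    }

    public static void main(String[] args) {
        // range (3, 5) as in the example above
        System.out.println(accepts(1, 2, 3, 5)); // "quick brown fox": brown at position 1
        System.out.println(accepts(3, 4, 3, 5)); // "jumps over lazy broun dog": broun at 3
        System.out.println(accepts(5, 6, 3, 5)); // "... very lazy broxn dog": broxn at 5
    }
}
```

Only the second call returns true, which is why only the second document comes back. The same helper with rangeStart fixed at 0 models the SpanFirstQuery case.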

SpanPositionRangeQuery has a subclass, SpanFirstQuery. SpanFirstQuery simply fixes the start argument of the SpanPositionRangeQuery constructor at 0, nothing more, so it needs little explanation. Its constructor is:

SpanFirstQuery(SpanQuery match, int end) 
Construct a SpanFirstQuery matching spans in match whose end position is less than or equal to end.

That is why there is only an end and no start: start is always zero.

I will skip a SpanFirstQuery example.

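The last one is FieldMaskingSpanQuery, which is used to query across fields: it makes one field look like another, so that the clauses appear to be in the same field. Span queries such as SpanNearQuery require all clauses to be on the same field and do not support cross-field matching by themselves, which is why FieldMaskingSpanQuery exists. Sample code: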

package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.FieldMaskingSpanQuery;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

/**
 * FieldMaskingSpanQuery test
 * @author Lanxiaowei
 *
 */
public class FieldMaskingSpanQueryTest {
	public static void main(String[] args) throws IOException {
		Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        //teacher1
        Document doc = new Document();

        // TextField indexes positions, which span queries need;
        // every value here is a single lowercase token, so analysis leaves it unchanged
        doc.add(new TextField("teacherid", "1", Field.Store.YES));

        doc.add(new TextField("studentfirstname", "james", Field.Store.YES));
        
        doc.add(new TextField("studentsurname", "jones", Field.Store.YES));

        writer.addDocument(doc);
        
        
        //teacher2
        doc = new Document();

        doc.add(new TextField("teacherid", "2", Field.Store.YES));

        doc.add(new TextField("studentfirstname", "james", Field.Store.YES));

        doc.add(new TextField("studentsurname", "smith", Field.Store.YES));

        doc.add(new TextField("studentfirstname", "sally", Field.Store.YES));

        doc.add(new TextField("studentsurname", "jones", Field.Store.YES));

        writer.addDocument(doc);
        
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        
        SpanQuery q1  = new SpanTermQuery(new Term("studentfirstname", "james"));
        SpanQuery q2  = new SpanTermQuery(new Term("studentsurname", "jones"));
        
        SpanQuery q2m = new FieldMaskingSpanQuery(q2, "studentfirstname");

        Query query = new SpanNearQuery(new SpanQuery[]{q1, q2m}, -1, false);
        TopDocs results = searcher.search(query, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(query, scoreDocs[i].doc));
        	int docID = scoreDocs[i].doc;
			Document document = searcher.doc(docID);
			String teacherid = document.get("teacherid");
			System.out.println("teacherid:" + teacherid);
        }
	}
}
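What the query above is after can be modelled in plain Java. The sketch below is a toy with hypothetical names, not Lucene code. It assumes that the values of a multi-valued field occupy successive positions (0, 1, ...), and that a SpanNearQuery with slop -1 over the first-name clause and the masked surname clause only matches when both terms sit at the same position, that is, when they belong to the same student.

```java
/** Toy model of matching first name and surname at the same position (not Lucene code). */
class FieldMaskingSketch {
    /** firstNames[i] and surnames[i] are assumed to describe the same student. */
    static boolean hasStudent(String[] firstNames, String[] surnames,
                              String first, String last) {
        int n = Math.min(firstNames.length, surnames.length);
        for (int pos = 0; pos < n; pos++) {
            // both terms must occur at the same position in their own field
            if (firstNames[pos].equals(first) && surnames[pos].equals(last)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // teacher 1 teaches james jones
        System.out.println(hasStudent(
            new String[] {"james"}, new String[] {"jones"}, "james", "jones"));
        // teacher 2 teaches james smith and sally jones, but no james jones
        System.out.println(hasStudent(
            new String[] {"james", "sally"}, new String[] {"smith", "jones"}, "james", "jones"));
    }
}
```

Under these assumptions only teacher 1 has a student named james jones. A plain AND of the two terms would also return teacher 2, because james and jones both occur somewhere in that document.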

OK, that is all for SpanQuery. Next up is PhraseQuery.

