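When Lucene computes a document's score, it multiplies the result by coord. So what does coord mean? It reflects how many of the query clauses the document satisfies. For example: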
Document doc = new Document();
doc.add(new Field("title", "search engine", Field.Store.YES, Field.Index.ANALYZED));
doc.add(new Field("content", "good lucene luke lucene search server", Field.Store.YES, Field.Index.ANALYZED));
indexWriter.addDocument(doc);
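When the query is title:search content:lucene,
the title field ("search engine") matches title:search,
and the content field ("good lucene luke lucene search server") matches content:lucene. The document therefore satisfies both query clauses, so overlap = maxOverlap = 2 and coord is 2/2 = 1.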
The formula, as implemented in DefaultSimilarity, is:
public float coord(int overlap, int maxOverlap) {
return overlap / (float)maxOverlap;
}
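To make the formula concrete, here is a minimal standalone sketch that repeats the same arithmetic without depending on Lucene. The class name CoordFormulaSketch is made up for illustration, and the two calls correspond to the example query title:search content:lucene with both clauses matched and with only one matched.

```java
// Standalone sketch: repeats the coord arithmetic quoted above, no Lucene required.
// CoordFormulaSketch is a hypothetical name used only for this illustration.
class CoordFormulaSketch {

    // overlap    = number of query clauses the document matched
    // maxOverlap = total number of (non-prohibited) query clauses
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // Query "title:search content:lucene" has two clauses.
        System.out.println(coord(2, 2)); // both clauses matched: prints 1.0
        System.out.println(coord(1, 2)); // only one clause matched: prints 0.5
    }
}
```

A document that matches only one of the two clauses has its summed score halved, which is how coord rewards documents that match more of the query.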
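Here is how this plays out in the source code.
In BooleanScorer2 (the coordFactor() and init() methods below belong to its inner Coordinator class):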
public float score() throws IOException {
coordinator.initDoc();
float sum = countingSumScorer.score();
return sum * coordinator.coordFactor();
}
float coordFactor() {
return coordFactors[nrMatchers];
}
private float[] coordFactors = null;
void init() { // use after all scorers have been added.
coordFactors = new float[maxCoord + 1];
Similarity sim = getSimilarity();
for (int i = 0; i <= maxCoord; i++) {
coordFactors[i] = sim.coord(i, maxCoord);
}
}
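The init() method above precomputes one coord value per possible match count, so that score() only needs an array lookup. A rough standalone sketch of that lookup table follows. It assumes the default formula overlap / maxOverlap in place of the sim.coord call, assumes maxCoord is at least 1, and uses the hypothetical class name CoordTableSketch.

```java
// Standalone sketch of the precomputed coord table, no Lucene required.
// Assumption: the Similarity in use applies the default overlap / maxOverlap formula.
class CoordTableSketch {

    // Builds coordFactors[i] for i = 0..maxCoord, like the loop in init() above.
    // Assumes maxCoord >= 1 (with maxCoord == 0 the division would yield NaN).
    static float[] buildCoordFactors(int maxCoord) {
        float[] coordFactors = new float[maxCoord + 1];
        for (int i = 0; i <= maxCoord; i++) {
            coordFactors[i] = i / (float) maxCoord;
        }
        return coordFactors;
    }

    public static void main(String[] args) {
        // A two-clause query yields three entries: 0, 1 or 2 clauses matched.
        float[] factors = buildCoordFactors(2);
        for (int nrMatchers = 0; nrMatchers < factors.length; nrMatchers++) {
            System.out.println(nrMatchers + " matched -> " + factors[nrMatchers]);
        }
    }
}
```

The array is indexed by the number of matching clauses, which is why coordFactor() can simply return coordFactors[nrMatchers].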
maxCoord is the total number of clauses in the query.
public void add(final Scorer scorer, boolean required, boolean prohibited) {
if (!prohibited) {
coordinator.maxCoord++;
}
// ... (rest of the method omitted in this excerpt)
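As you can see, each clause that is added increments maxCoord by 1, unless it is a prohibited (MUST_NOT) clause. This gives the total number of query clauses.
And what about the number of clauses a document actually matches?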
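The counting rule can be sketched in isolation as follows. This is only an illustration: the class MaxCoordSketch and the array of prohibited flags are hypothetical stand-ins for the scorers passed to add(), and the three-clause query in main is an invented example.

```java
// Standalone sketch of how maxCoord is counted, no Lucene required.
// Each array element stands for one clause: true means the clause is prohibited (MUST_NOT).
class MaxCoordSketch {

    static int countMaxCoord(boolean[] prohibitedFlags) {
        int maxCoord = 0;
        for (boolean prohibited : prohibitedFlags) {
            if (!prohibited) {
                maxCoord++; // required and optional clauses both count
            }
        }
        return maxCoord;
    }

    public static void main(String[] args) {
        // Hypothetical query: +title:search content:lucene -content:spam
        boolean[] prohibitedFlags = { false, false, true };
        System.out.println(countMaxCoord(prohibitedFlags)); // prints 2
    }
}
```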
return new ConjunctionScorer(defaultSimilarity, requiredScorers) {
private int lastScoredDoc = -1;
public float score() throws IOException {
if (this.doc() >= lastScoredDoc) {
lastScoredDoc = this.doc();
coordinator.nrMatchers += requiredNrMatchers;
}
// ... (rest of score() and of the anonymous class omitted in this excerpt)
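It is determined by requiredNrMatchers, which is added to coordinator.nrMatchers each time a new document is scored. The conjunction scorers themselves are created by calls such as: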
countingConjunctionSumScorer(optionalScorers);
countingConjunctionSumScorer(requiredScorers);
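optionalScorers holds the SHOULD clauses and requiredScorers holds the MUST clauses. So the number of satisfied clauses is the sum of the matched SHOULD clauses and the matched MUST clauses.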
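Putting the pieces together, the final score is the sum of the matching clauses' scores multiplied by nrMatchers / maxCoord. The sketch below is a simplified model under several assumptions: it does not use Lucene, it treats MUST and SHOULD clauses alike, it contains no prohibited clauses, and the per-clause scores are invented numbers. The class name CoordScoreSketch is hypothetical.

```java
// Simplified standalone model of "sum * coordFactor", no Lucene required.
// Assumptions: no MUST_NOT clauses, default coord formula, made-up clause scores.
class CoordScoreSketch {

    // clauseScores[i] is the score clause i would contribute if it matches;
    // matched[i] says whether the document matched clause i.
    static float score(float[] clauseScores, boolean[] matched) {
        int maxCoord = clauseScores.length; // every clause counts, none is prohibited
        int nrMatchers = 0;
        float sum = 0f;
        for (int i = 0; i < clauseScores.length; i++) {
            if (matched[i]) {
                nrMatchers++;
                sum += clauseScores[i];
            }
        }
        float coordFactor = nrMatchers / (float) maxCoord;
        return sum * coordFactor;
    }

    public static void main(String[] args) {
        float[] clauseScores = { 0.5f, 0.25f }; // title:search, content:lucene (invented values)
        System.out.println(score(clauseScores, new boolean[] { true, true }));  // 0.75 * 1.0
        System.out.println(score(clauseScores, new boolean[] { true, false })); // 0.5 * 0.5
    }
}
```

In this model a document matching only title:search loses the content:lucene contribution and is also scaled by 0.5, so partial matches are penalized twice relative to full matches.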