Theory Behind Relevance Scoring

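Lucene (whose scoring mechanism Elasticsearch uses) applies the Boolean model to find matching documents, and a formula called the practical scoring function to calculate relevance. This formula borrows concepts from term frequency/inverse document frequency and the vector space model, but adds further features such as a coordination factor, field-length normalization, and term or query clause boosting.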

Boolean Model

The Boolean model simply applies the AND, OR, and NOT conditions expressed in the query to find all the documents that match. A query for full AND text AND search AND (elasticsearch OR lucene) will include only documents that contain all of the terms full, text, and search, and either elasticsearch or lucene.
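
As a rough illustration of the Boolean model, the query full AND text AND search AND (elasticsearch OR lucene) can be sketched as a set-membership check over each document's terms. This is not how Lucene implements matching (it works from an inverted index); the `matches` helper, the sample documents, and the whitespace tokenization are all assumptions made for this example.

```python
def matches(doc_terms):
    """Hypothetical helper: True if the set of terms satisfies
    full AND text AND search AND (elasticsearch OR lucene)."""
    required = {"full", "text", "search"}   # every one of these must be present
    either = {"elasticsearch", "lucene"}    # at least one of these must be present
    return required <= doc_terms and bool(either & doc_terms)


# Made-up sample documents, keyed by an arbitrary id.
docs = {
    1: "full text search with lucene",
    2: "full text search in a relational database",
    3: "elasticsearch makes search easy",
}

# Naive lowercase/whitespace tokenization; a real analyzer does much more.
matching_ids = [
    doc_id
    for doc_id, body in docs.items()
    if matches(set(body.lower().split()))
]

print(matching_ids)  # [1]
```

Only document 1 matches: document 2 contains neither elasticsearch nor lucene, and document 3 is missing full and text. The Boolean model only decides whether a document is in or out; it says nothing about how well each match ranks.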

Term frequency

Inverse document frequency

Field-length norm

Putting it together

Vector Space Model
