Solr 5.5 (5): A Brief Introduction to Similarity

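1. Similarity changes how Solr scores documents. With Solr's default ranking, I found that the results did not match what users expected. After a few days of study I picked up the basics, so let's get straight to it.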

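In my setup, the Solr client project lives directly inside an existing web project, and everything under solr-5.5.3\server\solr-webapp\webapp has been copied into the project's web folder.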

[Image 1: screenshot from the original post]

I won't cover how to specify the solrHome directory in web.xml.

2. Solr already ships with several similarity implementations

org.apache.solr.search.similarities.BM25SimilarityFactory
org.apache.solr.search.similarities.DefaultSimilarityFactory
org.apache.solr.search.similarities.DFRSimilarityFactory
org.apache.solr.search.similarities.IBSimilarityFactory
org.apache.solr.search.similarities.LMDirichletSimilarityFactory
org.apache.solr.search.similarities.LMJelinekMercerSimilarityFactory
org.apache.solr.search.similarities.SchemaSimilarityFactory
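Then add a global similarity to managed-schema. Use SchemaSimilarityFactory from the list above, i.e. <similarity class="solr.SchemaSimilarityFactory"/> at the top level of the schema, because it is the factory that delegates to similarities declared on individual field types.

This global entry is required. Without it, declaring a similarity directly on a field type fails with: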

Caused by: org.apache.solr.common.SolrException: Can't load schema E:\solr\solr20170714\solrHome\classify\conf\managed-schema: FieldType 'text_ik' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.ClassicSimilarityFactory

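If you rank on a single field, declare a <similarity> element inside that field's fieldType, for example with one of the factory classes listed above.
If you rank on several fields together, I recommend weighting them, e.g. query.set("defType", "edismax") and query.set("qf", "pro_name^100 pro_nStore_description^50"). Note that the parameter values must not start with a space.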
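To make the weighted multi-field query concrete, here is a minimal sketch of the raw request parameters that those two query.set(...) calls correspond to. It uses only the JDK so it can run without SolrJ. The class name EdismaxParams, the search term "phone", and the host, port, and core name in main are assumptions for illustration, not something taken from the original setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the query string for a weighted edismax search.
final class EdismaxParams {

    static String buildQueryString(String userQuery) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("defType", "edismax"); // no leading space in the value
        params.put("qf", "pro_name^100 pro_nStore_description^50");

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Assumed local Solr URL and core name, for illustration only.
        System.out.println("http://localhost:8983/solr/classify/select?"
                + buildQueryString("phone"));
    }
}
```

With these weights, a match in pro_name counts twice as much as a match in pro_nStore_description, which is what lets one field dominate the ranking.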



3. Custom similarity
package cn.com.cacuq.similarity;

import org.apache.lucene.search.similarities.Similarity;
import org.apache.solr.schema.SimilarityFactory;

/**
 * Created by Administrator on 2017/7/31 0031.
 */
public class MySimilarityFactory extends SimilarityFactory {

    // Solr calls this to obtain the Similarity used for scoring.
    @Override
    public Similarity getSimilarity() {
        return new MySimilarity();
    }
}

package cn.com.cacuq.similarity;

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

/**
 * Created by Administrator on 2017/7/31 0031.
 */
public class MySimilarity extends DefaultSimilarity{

    /**
     * freq is the number of times the term occurs in a document.
     * Returning 1.0f means term frequency is ignored.
     */
    @Override
    public float tf(float freq) {
        return 1.0F;
    }

    /**
     * idf weighs the number of matching documents against the total number
     * of documents. Likewise ignored here.
     */
    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0F;
    }

    @Override
    public float sloppyFreq(int distance) {
        return 1.0F;
    }

    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0F;
    }

    /**
     * coord rewards documents that match a larger share of the query terms.
     * Likewise ignored here.
     */
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0F;
    }

    @Override
    public float lengthNorm(FieldInvertState state) {
        return 1.0F;
    }

    // lengthNorm above is constant, so this flag does not change scoring here.
    protected boolean discountOverlaps = false;

    public void setDiscountOverlaps(boolean v) {
        discountOverlaps = v;
    }

    public boolean getDiscountOverlaps() {
        return discountOverlaps;
    }

    @Override
    public String toString() {
        return "MySimilarity";
    }

}

Then change the class attribute of the similarity element inside the fieldType to the custom cn.com.cacuq.similarity.MySimilarityFactory.
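To show why returning 1.0 from every hook changes the ranking, here is a simplified sketch of the per-term score. It is not Lucene's code: it leaves out coord, queryNorm, and the lossy encoding of norms. The classic formulas (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)) are written from my understanding of the Lucene 5.x TF-IDF similarity, and the class and method names are hypothetical.

```java
// Simplified per-term scoring: tf * idf^2 * boost * lengthNorm.
final class ScoreSketch {

    // Classic TF-IDF factors (assumed Lucene 5.x definitions).
    static double classicTf(double freq) {
        return Math.sqrt(freq);
    }

    static double classicIdf(long docFreq, long numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    static double classicLengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Score contribution of one term under the classic factors.
    static double classicTermScore(double freq, long docFreq, long numDocs,
                                   int numTerms, double boost) {
        double idf = classicIdf(docFreq, numDocs);
        return classicTf(freq) * idf * idf * boost * classicLengthNorm(numTerms);
    }

    // Score contribution of one term when tf, idf and lengthNorm all return 1.0,
    // as in MySimilarity: only the boost is left.
    static double flatTermScore(double boost) {
        return 1.0 * 1.0 * 1.0 * boost * 1.0;
    }

    public static void main(String[] args) {
        // Same term and boost, but a short field versus a long field.
        System.out.println(classicTermScore(1, 9, 10, 4, 100));
        System.out.println(classicTermScore(1, 9, 10, 400, 100));
        // With the flat similarity both documents get the same contribution.
        System.out.println(flatTermScore(100));
    }
}
```

Under the classic factors, the same match scores differently depending on field length and term statistics. With the flat similarity, each matching term contributes only its boost, so the qf weights decide the order.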

Results before the change

[Image 2: search results before the change]

Results after the change

[Image 3: search results after the change]

The new results are clearly closer to what users expect. Thanks for reading!
