Solr Basics

1. References
http://blog.csdn.net/awj3584/article/details/16963525

Query parameters


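Name Description
q Query string; required.
fq Filter query. Filter queries are served from the filter query cache, which improves search performance. Results must match both q and fq. For example, q=mm&fq=date_time:[20081001 TO 20091031] finds documents that match the keyword mm and whose date_time lies between 20081001 and 20091031.
fl Field list: the fields to return in the results, separated by spaces or commas.
start Pagination: offset of the first record to return. Defaults to 0.
rows Pagination: number of records to return per page. Defaults to 10.
sort Sort order, in the form sort=<field name>+<desc|asc>[,<field name>+<desc|asc>]… For example, "inStock desc, price asc" sorts by inStock descending, then by price ascending. The default is descending relevance.
df Default search field; normally preset in the configuration.
q.op Overrides defaultOperator in schema.xml (whether whitespace-separated terms are combined with "AND" or "OR"); normally preset in the configuration. The value must be uppercase.
wt Writer type: the response output format. Defaults to "xml". The available formats are defined in solrconfig.xml: xml, json, python, ruby, php, phps, custom.
qt Query type: the query handler to use. Defaults to "standard".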
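explainOther A query identifying a set of documents; when debugQuery=true, the response also explains how those documents score against the main query.
defType Name of the query parser to use.
timeAllowed Query timeout, in milliseconds.
omitHeader Whether to omit the response header from the results. Defaults to "false".
indent Whether to indent the response. Off by default; enable with indent=true or indent=on. It is generally only needed when debugging json, php, phps, or ruby output.
version Version of the XML response format. Avoid setting it and let the server apply its default.
debugQuery Whether to include debug information in the results.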
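
As a rough illustration of how these parameters combine, the sketch below assembles a select URL from q, fq, fl, start, rows, and sort, URL-encoding each value. The class and method names are made up for this example, the id field is an assumption, and the base URL is the localhost address that appears in the highlighting section of these notes. It only builds a string and does not contact Solr.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of SolrJ: builds a /select URL by hand.
final class SelectUrlSketch {

    // Joins the parameters as base?k1=v1&k2=v2, URL-encoding every value.
    static String build(String base, Map<String, String> params) {
        StringJoiner url = new StringJoiner("&", base + "?", "");
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.add(p.getKey() + "=" + encode(p.getValue()));
        }
        return url.toString();
    }

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // The fq example from the table, plus field list, paging, and sorting.
    static String example() {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "mm");
        params.put("fq", "date_time:[20081001 TO 20091031]");
        params.put("fl", "id,date_time"); // "id" is an assumed field name
        params.put("start", "0");
        params.put("rows", "10");
        params.put("sort", "date_time desc");
        return build("http://localhost:8983/solr/collection1/select", params);
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Encoding the values matters here because the range filter contains spaces, a colon, and brackets, none of which can be pasted into a URL unescaped.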


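Using the bf parameter in dismax

Use a parameter that is explicitly a function query, such as bf (boost function) in dismax. Note that bf accepts several function queries separated by spaces, each with an optional weight. Consequently, no single function may contain a space, or the parser may treat it as two functions.

Example: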

q=dismax&bf="ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"
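
The no-spaces rule can be enforced when the bf value is assembled. The following is a minimal sketch under that assumption: the helper name joinBf is hypothetical, and the two functions and weights are the ones from the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a bf value from (function, weight) pairs.
final class BoostFunctionSketch {

    // Joins "function^weight" entries with single spaces and rejects any
    // function that itself contains whitespace, since bf would split it.
    static String joinBf(Map<String, Double> weightedFunctions) {
        StringJoiner bf = new StringJoiner(" ");
        for (Map.Entry<String, Double> e : weightedFunctions.entrySet()) {
            String function = e.getKey();
            if (function.chars().anyMatch(Character::isWhitespace)) {
                throw new IllegalArgumentException("whitespace inside function: " + function);
            }
            bf.add(function + "^" + e.getValue());
        }
        return bf.toString();
    }

    static String example() {
        Map<String, Double> functions = new LinkedHashMap<String, Double>();
        functions.put("ord(popularity)", 0.5);
        functions.put("recip(rord(price),1,1000,1000)", 0.3);
        return joinBf(functions);
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```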




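Highlighting
Solr ships with the highlight component already configured (see SOLR_HOME/conf/solrconfig.xml). Usually a request like this is all that is needed: http://localhost:8983/solr/collection1/select?q=%E4%B8%AD%E5%9B%BD&start=0&rows=1&fl=content+path&wt=xml&indent=true&hl=true&hl.fl=content

Compared with an ordinary request, this one carries two extra parameters: "hl=true" and "hl.fl=content".

"hl=true" turns highlighting on, and "hl.fl=content" tells Solr to highlight the content field. To highlight several fields, list them separated by commas, e.g. "hl.fl=name,name2,name3". By default, the parts of the highlighted text that match the keywords are wrapped in "<em>" and "</em>". The "hl.simple.pre" and "hl.simple.post" parameters set different opening and closing tags.

With SolrJ the approach is essentially the same: the same parameters are set, only wrapped by the SolrJ API. The code looks like this: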

SolrQuery query = new SolrQuery();
query.set("q", "*:*"); // match all documents
query.setHighlight(true); // enable the highlight component
query.addHighlightField("content"); // field to highlight
query.setHighlightSimplePre(PRE_TAG); // opening tag
query.setHighlightSimplePost(POST_TAG); // closing tag
QueryResponse rsp = server.query(query);

//… code that reads the results, as above

// Read the highlighting; id is the unique key of a document from the result list
if (rsp.getHighlighting() != null) {
  // Look up the document's highlight information by its ID
  Map<String, List<String>> map = rsp.getHighlighting().get(id);
  if (map != null && map.get("content") != null) {
    for (String s : map.get("content")) { // highlighted snippets
      System.out.println(s);
    }
  }
}
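
The highlighting structure is a map from document ID to a map from field name to snippets, so the null checks above can be folded into one lookup with a fallback. Below is a minimal sketch using plain java.util maps in place of a real QueryResponse; the helper name firstSnippet and the sample data are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper operating on the same shape that getHighlighting() returns.
final class HighlightLookupSketch {

    // Returns the first highlighted snippet of a field for one document,
    // or the fallback when the document or field has no highlighting.
    static String firstSnippet(Map<String, Map<String, List<String>>> highlighting,
                               String docId, String field, String fallback) {
        if (highlighting == null) {
            return fallback;
        }
        Map<String, List<String>> perDocument = highlighting.get(docId);
        if (perDocument == null) {
            return fallback;
        }
        List<String> snippets = perDocument.get(field);
        if (snippets == null || snippets.isEmpty()) {
            return fallback;
        }
        return snippets.get(0);
    }

    // Sample data: only document "doc1" has a highlighted content field.
    static String example(String docId) {
        Map<String, List<String>> fields =
                Collections.singletonMap("content", Arrays.asList("<em>Solr</em> basics"));
        Map<String, Map<String, List<String>>> highlighting =
                Collections.singletonMap("doc1", fields);
        return firstSnippet(highlighting, docId, "content", "Solr basics");
    }

    public static void main(String[] args) {
        System.out.println(example("doc1"));
        System.out.println(example("doc2"));
    }
}
```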




2. Solr basics
References
http://www.solr.cc/blog/?p=1296

http://wiki.apache.org/solr/Solrj

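Query:
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.params.ModifiableSolrParams;

public class SolrJSearcher {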

    public static void main(String[] args) throws SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://10.155.11.74:8080/solr");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q","cat:electronics");
        params.set("defType", "edismax");
        params.set("start", "0");
        params.set("fq", "book");

        QueryResponse response = solr.query(params);
        SolrDocumentList results = response.getResults();
        for (int i = 0; i < results.size(); i++) {
            System.out.println(results.get(i));
        }
    }
}



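Delete and add:
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrAddData {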

    public static void main(String[] args) throws IOException, SolrServerException {
        SolrServer server = new HttpSolrServer("http://10.155.11.74:8080/solr");

        server.deleteByQuery("*:*");

        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField("id", "id1", 1.0f);
        doc1.addField("name", "doc1", 1.0f);
        doc1.addField("price", 10);

        SolrInputDocument doc2 = new SolrInputDocument();
        doc2.addField("id", "id2", 1.0f);
        doc2.addField("name", "doc2", 1.0f);
        doc2.addField("price", 20);

        // Collection of documents to index
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc1);
        docs.add(doc2);

        server.add(docs);

        server.commit();

    }
}


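3. In practice
Full index rebuild
/**
 * Syncs the full data set to the search engine so that rooms can be searched by channel number and anchor name.
 * @author guoxin
 *
 */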
@Service
public class SynAnchorRoomIndexsJob  {
    private static Logger logger = LoggerFactory.getLogger(SynAnchorRoomIndexsJob.class);

    @Autowired
    UserRoomService userRoomService;

    @Value("${solr.url.s-zb-room}")
    private String solrUrlSerchAnchor;

    /**
     * Runs the sync.
     */
    public void work(){
        logger.info("Begin syn all anchor and room indexs to solr");
        List<UserRoom> dlist=null;
        Collection<SolrInputDocument> coll=null;
        SolrServer solr=null;
        try {
            dlist=userRoomService.getDdshowUserList$Master();
            if( CollectionUtils.isEmpty(dlist)){
                logger.warn("anchor count is 0,please check system");
                return ;
            }
            coll=new ArrayList<SolrInputDocument>();
            solr=new HttpSolrServer(solrUrlSerchAnchor);
            for (UserRoom userRoom:dlist){
                // Build the document
                String nickName=userRoom.getNickName();
                SolrInputDocument document = new SolrInputDocument();
                document.addField("zb_id",userRoom.getOriginUserId());
                document.addField("room_id",userRoom.getRoomId());
                document.addField("zb_name",nickName);
                document.addField("zb_name_bz",nickName);
                // Tokenize the anchor name first, then convert it to pinyin
                document.addField("zb_pyname",ParseUtil.getStringPinYin(ParseUtil.parse(nickName)+" "+nickName));
                coll.add(document);
            }
            // First clear the existing index
            solr.deleteByQuery("*:*");
            // Then add the new documents
            solr.add(coll);
            // Finally commit
            solr.commit();
            logger.info("End syn all anchor and room indexs to solr,room count is {}",coll.size());
        } catch (SolrServerException e) {
            logger.error("Failed to syn all anchor and room indexs to solr",e);
        } catch (IOException e) {
            logger.error("Failed to syn all anchor and room indexs to solr",e);
        }finally {
            if(solr!=null)
                solr.shutdown();
        }
    }
}


Querying the index

package com.youku.ddshow.handler.search;


import com.google.common.collect.Lists;
import org.apache.commons.collections.MapUtils;
import org.apache.commons.lang.ArrayUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.params.CommonParams;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;

import java.util.List;
import java.util.Map;
import java.util.Properties;

/**
 * Created by jeffier on 2014/10/21.
 */
public class SearchHandler implements InitializingBean {

    private Logger logger = LoggerFactory.getLogger(getClass());

    public static final String FIELD_ROOM_ID = "room_id";

    public static final String FIELD_ZB_ID = "zb_id";

    public static final String FIELD_ZB_NAME = "zb_name";

    public static final String FIELD_ZB_PY_NAME = "zb_pyname";

    public static final String FIELD_ZB_NAME_BZ = "zb_name_bz";

    public static final String FIELD_SCORE = "score"; // not an indexed field: the match score computed for the query, used for sorting

    private static final int DEFAULT_PAGESIZE = 10;

    public static final String DEFAULT_PREFIX = "<span>";

    public static final String DEFAULT_SUBFIX = "</span>";

    private static final List<String> HIGHT_LIGHT_FIELDS = Lists.newArrayList(FIELD_ZB_NAME, FIELD_ZB_NAME_BZ);

    private static final List<String> RETURN_FIELDS = Lists.newArrayList(FIELD_ROOM_ID, FIELD_ZB_NAME, FIELD_ZB_ID, FIELD_ZB_PY_NAME);

    @Autowired
    @Qualifier("configProperties")
    protected Properties configProperties;

    private HttpSolrServer server = null;

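    /**
     * Search with the default highlight tags, sorted by relevance.
     *
     * @param input    search keywords
     * @param pageNo   page number, starting at 1
     * @param pageSize number of results per page
     */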
    public SearchResult search(String input, int pageNo, int pageSize) {
        return search(input, pageNo, pageSize, DEFAULT_PREFIX, DEFAULT_SUBFIX, null);
    }

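    /**
     * Search with custom sort conditions.
     *
     * @param input    search keywords
     * @param pageNo
     * @param pageSize
     * @param fields   extra sort conditions; pass null to sort by relevance only
     * @param prefix   highlight opening tag, used together with subfix; highlighting is applied only when both are non-blank
     * @param subfix   highlight closing tag, used together with prefix; highlighting is applied only when both are non-blank
     * @return
     */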
    public SearchResult search(String input, int pageNo, int pageSize, String prefix, String subfix, SortField... fields) {
        if (StringUtils.isBlank(input)) {
            return SearchResult.EMPTY;
        }

        if (pageNo <= 0) {
            pageNo = 1;
        }

        if (pageSize <= 0) {
            pageSize = DEFAULT_PAGESIZE;
        }

        SolrQuery query = buildQuery(input, (pageNo - 1) * pageSize, pageSize, prefix, subfix, fields); // build the query

        try {
            QueryResponse response = server.query(query); // run the query, then parse the result
            return resolveResult(response, prefix, subfix);
        } catch (SolrServerException e) {
            logger.error("query{} error", input, e);
            return SearchResult.EMPTY;
        } catch (Exception e) {
            logger.error("extract search result {} error", input, e);
            return SearchResult.EMPTY;
        }
    }

    /**
     * Parses the search result.
     *
     * @param response
     * @param prefix
     * @param subfix
     * @return
     */
    private SearchResult resolveResult(QueryResponse response, String prefix, String subfix) {
        long numFound = response.getResults().getNumFound();
        List<SearchResult.Result> results = Lists.newArrayList();

        SolrDocumentList documents = response.getResults();
        Map<String, Map<String, List<String>>> highlight = response.getHighlighting();

        if (null != documents && !documents.isEmpty()) {
            int userId = 0, roomId = 0;
            String name = null, highlightName = null;
            for (SolrDocument document : documents) {
                userId = (Integer) document.getFieldValue(FIELD_ZB_ID);
                roomId = (Integer) document.getFieldValue(FIELD_ROOM_ID);
                name = (String) document.getFieldValue(FIELD_ZB_NAME);

                if (StringUtils.isNotBlank(prefix) && StringUtils.isNotBlank(subfix)) { // highlighting
                    if (MapUtils.isEmpty(highlight) || MapUtils.isEmpty(highlight.get(String.valueOf(roomId)))) {
                        highlightName = name;
                    } else {
                        for (String field : HIGHT_LIGHT_FIELDS) {
                            if (highlight.get(String.valueOf(roomId)).containsKey(field)) {
                                highlightName = highlight.get(String.valueOf(roomId)).get(field).get(0);
                                break;
                            }
                        }

                        if (StringUtils.isBlank(highlightName)) {
                            highlightName = name;
                        }
                    }
                } else {
                    highlightName = name;
                }

                results.add(new SearchResult.Result(roomId, userId, name, highlightName));
            }
        }

        return new SearchResult(numFound, results);
    }

    /**
     * Builds the query.
     *
     * @param input
     * @param start
     * @param rows
     * @param prefix
     * @param subfix
     * @param fields
     * @return
     */
    private SolrQuery buildQuery(String input, int start, int rows, String prefix, String subfix, SortField... fields) {
        SolrQuery query = new SolrQuery();
        query.setQuery(input);
        query.setStart(start);
        query.setRows(rows);

        for (String field : RETURN_FIELDS) { // fields to return
            query.addField(field);
        }

        // wt (writer type): response output format, "xml" by default. The formats are defined in solrconfig.xml: xml, json, python, ruby, php, phps, custom.
        query.set(CommonParams.WT, "xml");
        // defType: name of the query parser.
        query.set("defType", "dismax");
        // indent: whether to indent the response; off by default, enabled with indent=true or indent=on. Generally only needed when debugging json, php, phps, or ruby output.
        query.set("indent", false);
        // qf: the fields the keywords are searched in.
        query.set("qf", String.format("%s %s %s %s", FIELD_ROOM_ID, FIELD_ZB_NAME, FIELD_ZB_PY_NAME, FIELD_ZB_NAME_BZ));
        /* query.set("bf", ""); // lets non-searched fields take part in ranking */

        query.addSort(FIELD_SCORE, SolrQuery.ORDER.desc); // sorting: relevance is always the first sort condition
        if (ArrayUtils.isNotEmpty(fields)) { // additional custom sort conditions
            for (SortField field : fields) {
                query.addSort(field.getField(), field.getOrder());
            }
        }

        if (StringUtils.isNotBlank(prefix) && StringUtils.isNotBlank(subfix)) { // highlighting
            query.setHighlight(true).setHighlightSnippets(1);
            query.setHighlightFragsize(50); // maximum fragment length, 100 by default. Too small and the highlight may be cut off; too large and the summary may be too long.

            for (String field : HIGHT_LIGHT_FIELDS) {
                query.addHighlightField(field);
            }
            query.setHighlightSimplePre(prefix); // use the caller's tags rather than a hard-coded "<span>"
            query.setHighlightSimplePost(subfix);
        }

        return query;
    }


    /**
     * Initializes the Solr server client.
     *
     * @throws Exception
     */
    @Override
    public void afterPropertiesSet() throws Exception {
        server = new HttpSolrServer(configProperties.getProperty("search.base.url"));
        server.setMaxRetries(1);
        server.setParser(new XMLResponseParser());
        server.setConnectionTimeout(5000);
        server.setDefaultMaxConnectionsPerHost(100);
        server.setMaxTotalConnections(100);
        server.setFollowRedirects(false);
        server.setAllowCompression(true);
    }
}
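
The search method above turns a 1-based page number into the start offset that Solr expects, after replacing non-positive arguments with defaults. The sketch below isolates that arithmetic; the class name and the totalPages helper are additions for illustration and are not part of SearchHandler.

```java
// Hypothetical helper mirroring the paging arithmetic used in search().
final class PagingSketch {

    static final int DEFAULT_PAGE_SIZE = 10;

    // Converts a 1-based page number into Solr's zero-based start offset.
    static int startOffset(int pageNo, int pageSize) {
        if (pageNo <= 0) {
            pageNo = 1;
        }
        if (pageSize <= 0) {
            pageSize = DEFAULT_PAGE_SIZE;
        }
        return (pageNo - 1) * pageSize;
    }

    // Number of pages needed to show numFound results (rounds up).
    static long totalPages(long numFound, int pageSize) {
        if (pageSize <= 0) {
            pageSize = DEFAULT_PAGE_SIZE;
        }
        return (numFound + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(startOffset(3, 10));
        System.out.println(totalPages(25, 10));
    }
}
```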



4. Suggest feature
References
http://yingbin920.iteye.com/blog/1568769

Choosing the suggest field
root@m1hadoop:/opt/solr/data/solr/collection1/conf# vim schema.xml
<field name="suggestion" type="string" indexed="true" stored="true" termVectors="true" multiValued="true"/>


Configuring the Suggest component
vim solrconfig.xml
<searchComponent class="solr.SpellCheckComponent" name="suggest">
        <str name="queryAnalyzerFieldType">string</str>
        <lst name="spellchecker">
            <str name="name">suggest</str>
            <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
            <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
            <str name="field">suggestion</str>
            <!-- the indexed field to derive suggestions from -->
            <float name="threshold">0.0001</float>
            <str name="spellcheckIndexDir">spellchecker</str>
            <str name="comparatorClass">freq</str>
            <str name="buildOnOptimize">true</str>
            <!--<str name="buildOnCommit">true</str>-->
        </lst>
    </searchComponent>
    <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
        <lst name="defaults">
            <str name="spellcheck">true</str>
            <str name="spellcheck.dictionary">suggest</str>
            <str name="spellcheck.onlyMorePopular">true</str>
            <str name="spellcheck.extendedResults">false</str>
            <str name="spellcheck.count">10</str>
            <str name="spellcheck.collate">true</str>
        </lst>
        <arr name="components">
            <str>suggest</str>
        </arr>
    </requestHandler>


Adding documents with suggestion data to the index

vim /opt/solr/solr-4.10.3/example/exampledocs/solr.xml
<add>
<doc>
  <field name="id">SOLR1000</field>
  <field name="name">Solr, the Enterprise Search Server</field>
  <field name="manu">Apache Software Foundation</field>
  <field name="cat">software</field>
  <field name="cat">search</field>
  <field name="features">Advanced Full-Text Search Capabilities using Lucene</field>
  <field name="features">Optimized for High Volume Web Traffic</field>
  <field name="features">Standards Based Open Interfaces - XML and HTTP</field>
  <field name="features">Comprehensive HTML Administration Interfaces</field>
  <field name="features">Scalability - Efficient Replication to other Solr Search Servers</field>
  <field name="features">Flexible and Adaptable with XML configuration and Schema</field>
  <field name="features">Good unicode support: h&#xE9;llo (hello with an accent over the e)</field>
  <field name="price">0</field>
  <field name="popularity">10</field>
  <field name="inStock">true</field>
  <field name="incubationdate_dt">2006-01-17T00:00:00.000Z</field>
  <field name="suggestion">Advanced Full-Text Search Capabilities using Lucene</field>
  <field name="suggestion">Optimized for High Volume Web Traffic</field>
  <field name="suggestion">Standards Based Open Interfaces - XML and HTTP</field>
  <field name="suggestion">Comprehensive HTML Administration Interfaces</field>
  <field name="suggestion">Scalability - Efficient Replication to other Solr Search Servers</field>
  <field name="suggestion">Flexible and Adaptable with XML configuration and Schema</field>
  <field name="suggestion">Good unicode support: h&#xE9;llo (hello with an accent over the e)</field>
</doc>
</add>


Run the following command to post the data:
root@m1hadoop:/opt/solr/solr-4.10.3/example/exampledocs# java -Durl=http://10.155.11.74:8080/solr/update -Dcommit=yes -jar post.jar *.xml
SimplePostTool version 1.5
Posting files to base url http://10.155.11.74:8080/solr/update using content-type application/xml..
POSTing file gb18030-example.xml
POSTing file hd.xml
POSTing file ipod_other.xml
POSTing file ipod_video.xml
POSTing file manufacturers.xml
POSTing file mem.xml
POSTing file money.xml
POSTing file monitor2.xml
POSTing file monitor.xml
POSTing file mp500.xml
POSTing file sd500.xml
POSTing file solr.xml
POSTing file utf8-example.xml
POSTing file vidcard.xml
14 files indexed.
COMMITting Solr index changes to http://10.155.11.74:8080/solr/update.


Test:
Once the configuration is done, restart Solr (or reload the core) and visit the following link:

[Figure 1: image not preserved]
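
The request itself is not reproduced in the text. As a sketch only, a request to the /suggest handler configured above could be assembled as follows, assuming the handler is reachable under the same base URL as the update command and that the typed prefix is passed in spellcheck.q. The class name and the prefix "Adv" are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a request URL for the /suggest handler.
final class SuggestUrlSketch {

    // solrBase is the Solr root, e.g. the host used by the post.jar command.
    // When build is true, spellcheck.build=true is appended so that the
    // suggester dictionary is built before the lookup.
    static String suggestUrl(String solrBase, String prefix, boolean build) {
        String url;
        try {
            url = solrBase + "/suggest?spellcheck.q=" + URLEncoder.encode(prefix, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
        return build ? url + "&spellcheck.build=true" : url;
    }

    public static void main(String[] args) {
        System.out.println(suggestUrl("http://10.155.11.74:8080/solr", "Adv", true));
    }
}
```

Because buildOnCommit is commented out in the configuration above, passing spellcheck.build=true on the first request is one way to make sure the dictionary exists before suggestions are looked up.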
