1. References
http://blog.csdn.net/awj3584/article/details/16963525
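Query parameters
Name | Description
q | The query string; required.
fq | Filter query. Using a filter query takes advantage of the filter query cache and improves retrieval performance. It restricts the results of q to those that also match fq. For example, q=mm&fq=date_time:[20081001 TO 20091031] finds documents that match the keyword mm and whose date_time lies between 20081001 and 20091031.
fl | Field list. Specifies the fields returned in the results, separated by spaces " " or commas ",".
start | Offset of the first record to return, used for paging; defaults to 0.
rows | Number of records returned per page, used for paging; defaults to 10.
sort | Sort order, in the form sort=<field name>+<desc|asc>[,<field name>+<desc|asc>]… . For example, (inStock desc, price asc) sorts by "inStock" descending, then by "price" ascending. The default is relevance score, descending.
df | The default search field; usually left at the configured default.
q.op | Overrides defaultOperator in schema.xml (whether terms separated by spaces are combined with "AND" or "OR"); usually left at the configured default. The value must be uppercase.
wt | Writer type. Specifies the output format of the response; defaults to "xml". The output formats are defined in solrconfig.xml: xml, json, python, ruby, php, phps, custom.
qt | Query type. Specifies the query handler to use; defaults to "standard".
explainOther | When debugQuery=true, also returns explanations for the documents matched by the query given here.
defType | Sets the name of the query parser.
timeAllowed | Sets the query timeout, in milliseconds.
omitHeader | Whether to omit the response header from the results; defaults to "false".
indent | Whether to indent the response. Off by default; enable with indent=true or indent=on. Generally only needed when debugging json, php, phps, or ruby output.
version | Version of the query syntax. Not recommended; let the server supply the default.
debugQuery | Whether to include debug information in the response.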
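The parameters above are ordinary URL query parameters, so values containing spaces, colons, or brackets need URL encoding. A minimal sketch in plain Java follows; the class name SolrSelectUrlSketch, the method names, and the base URL are hypothetical, and in real code SolrJ does this encoding for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or SolrJ.
final class SolrSelectUrlSketch {

    // Appends "/select" and the URL-encoded parameters to the base URL, in insertion order.
    static String buildSelectUrl(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl).append("/select");
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    // URL-encodes a name or value as UTF-8 (a space becomes '+').
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "mm");
        params.put("fq", "date_time:[20081001 TO 20091031]");
        params.put("start", "0");
        params.put("rows", "10");
        params.put("sort", "inStock desc,price asc");
        params.put("wt", "json");
        // The base URL is an assumption for illustration.
        System.out.println(buildSelectUrl("http://localhost:8983/solr/collection1", params));
    }
}
```

A LinkedHashMap keeps the parameters in the order they were added, which makes the generated URL easy to compare against one typed by hand.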
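Using the bf parameter in dismax
Use a parameter that is explicitly meant for function queries, such as bf (boost function) in dismax. Note: bf accepts multiple function queries separated by spaces, and each of them may carry a weight. So when using bf, make sure that no single function contains a space; otherwise it may be parsed as two functions.
Example:
q=dismax&bf="ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"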
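The whitespace rule for bf can be enforced when the value is assembled. The following is a minimal sketch in plain Java; BoostFunctionSketch and buildBf are hypothetical names, and stripping whitespace is only safe for functions that contain no quoted string literals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or SolrJ.
final class BoostFunctionSketch {

    // Builds a bf value: each entry becomes "function^weight", entries are joined by one space.
    static String buildBf(Map<String, Double> weightedFunctions) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : weightedFunctions.entrySet()) {
            // Remove whitespace inside the function so it cannot be read as two functions.
            String function = e.getKey().replaceAll("\\s+", "");
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(function).append('^').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Double> functions = new LinkedHashMap<String, Double>();
        functions.put("ord(popularity)", 0.5);
        functions.put("recip(rord(price), 1, 1000, 1000)", 0.3); // spaces are stripped
        System.out.println(buildBf(functions));
    }
}
```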
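Highlighting
Solr ships with the highlight component already configured (see SOLR_HOME/conf/solrconfig.xml). Usually a request like the following is all we need: http://localhost:8983/solr/collection1/select?q=%E4%B8%AD%E5%9B%BD&start=0&rows=1&fl=content+path+&wt=xml&indent=true&hl=true&hl.fl=content
Compared with an ordinary request, it carries two extra parameters: "hl=true" and "hl.fl=content".
"hl=true" turns highlighting on, and "hl.fl=content" tells Solr to highlight the content field (to highlight several fields, keep adding fields separated by commas, e.g. "hl.fl=name,name2,name3"). By default, the parts of the highlighted content that match the keywords are wrapped in "<em>" and "</em>". The "hl.simple.pre" and "hl.simple.post" parameters set the opening and closing tags.
With SolrJ the approach is basically the same: the same parameters are set, only wrapped by the SolrJ API. The code is as follows: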
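SolrQuery query = new SolrQuery();
query.set("q", "*:*"); // match all documents; the syntax is field:value, so "*.*" is not valid
query.setHighlight(true); // enable the highlight component
query.addHighlightField("content"); // field to highlight
query.setHighlightSimplePre(PRE_TAG); // opening tag
query.setHighlightSimplePost(POST_TAG); // closing tag
QueryResponse rsp = server.query(query);
// ... the code that reads the results goes here
// Read the highlighting results
if (rsp.getHighlighting() != null) {
    if (rsp.getHighlighting().get(id) != null) { // look up this document's highlighting by the ID taken from the results
        Map<String, List<String>> map = rsp.getHighlighting().get(id); // highlighted snippets, keyed by field name
        if (map.get(name) != null) { // name is the highlighted field, e.g. "content"
            for (String s : map.get(name)) {
                System.out.println(s);
            }
        }
    }
}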
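The highlighting result is a nested map: document ID, then field name, then a list of snippets. The sketch below shows that lookup with a fallback value, using only plain Java collections so it can run without a Solr server. HighlightLookupSketch and firstSnippet are hypothetical names, and the sample data is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the map has the same shape as the one returned by rsp.getHighlighting().
final class HighlightLookupSketch {

    // Returns the first snippet for (docId, field), or the fallback when there is none.
    static String firstSnippet(Map<String, Map<String, List<String>>> highlighting,
                               String docId, String field, String fallback) {
        if (highlighting == null) {
            return fallback;
        }
        Map<String, List<String>> perDocument = highlighting.get(docId);
        if (perDocument == null) {
            return fallback;
        }
        List<String> snippets = perDocument.get(field);
        if (snippets == null || snippets.isEmpty()) {
            return fallback;
        }
        return snippets.get(0);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a real response.
        Map<String, List<String>> perDocument = new HashMap<String, List<String>>();
        perDocument.put("content", Arrays.asList("an <em>example</em> snippet"));
        Map<String, Map<String, List<String>>> highlighting =
                new HashMap<String, Map<String, List<String>>>();
        highlighting.put("doc1", perDocument);

        System.out.println(firstSnippet(highlighting, "doc1", "content", "plain text"));
        System.out.println(firstSnippet(highlighting, "doc2", "content", "plain text"));
    }
}
```

Falling back to the stored field value keeps the display code simple when a document matched on a field that was not highlighted.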
2. Solr basics
References
http://www.solr.cc/blog/?p=1296
http://wiki.apache.org/solr/Solrj
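Querying:
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.params.ModifiableSolrParams;

public class SolrJSearcher {
    public static void main(String[] args) throws SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://10.155.11.74:8080/solr");
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", "cat:electronics"); // main query
        params.set("defType", "edismax");   // query parser
        params.set("start", "0");           // paging offset
        params.set("fq", "book");           // filter query
        QueryResponse response = solr.query(params);
        SolrDocumentList results = response.getResults();
        for (int i = 0; i < results.size(); i++) {
            System.out.println(results.get(i));
        }
    }
}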
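Deleting and adding:
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrAddData {
    public static void main(String[] args) throws IOException, SolrServerException {
        SolrServer server = new HttpSolrServer("http://10.155.11.74:8080/solr");
        server.deleteByQuery("*:*"); // delete every document in the index
        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField("id", "id1", 1.0f);
        doc1.addField("name", "doc1", 1.0f);
        doc1.addField("price", 10);
        SolrInputDocument doc2 = new SolrInputDocument();
        doc2.addField("id", "id2", 1.0f);
        doc2.addField("name", "doc2", 1.0f);
        doc2.addField("price", 20);
        // the collection of documents to index
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc1);
        docs.add(doc2);
        server.add(docs);
        server.commit(); // changes become visible only after the commit
    }
}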
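3. In practice
Full index rebuild
/**
* Synchronizes the full data set to the search engine, so that it can be searched by channel number and podcaster name.
* @author guoxin
*
*/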
@Service
public class SynAnchorRoomIndexsJob {
private static Logger logger = LoggerFactory.getLogger(SynAnchorRoomIndexsJob.class);
@Autowired
UserRoomService userRoomService;
@Value("${solr.url.s-zb-room}")
private String solrUrlSerchAnchor;
/**
* Runs the synchronization.
*/
public void work(){
logger.info("Begin syn all anchor and room indexs to solr");
List<UserRoom> dlist=null;
Collection<SolrInputDocument> coll=null;
SolrServer solr=null;
try {
dlist=userRoomService.getDdshowUserList$Master();
if( CollectionUtils.isEmpty(dlist)){
logger.warn("anchor count is 0,please check system");
return ;
}
coll=new ArrayList<SolrInputDocument>();
solr=new HttpSolrServer(solrUrlSerchAnchor);
for (UserRoom userRoom:dlist){
// build the document
String nickName=userRoom.getNickName();
SolrInputDocument document = new SolrInputDocument();
document.addField("zb_id",userRoom.getOriginUserId());
document.addField("room_id",userRoom.getRoomId());
document.addField("zb_name",nickName);
document.addField("zb_name_bz",nickName);
// tokenize the podcaster name first, then convert it to pinyin
document.addField("zb_pyname",ParseUtil.getStringPinYin(ParseUtil.parse(nickName)+" "+nickName));
coll.add(document);
}
// clear the existing index first
solr.deleteByQuery("*:*");
// then add the new documents
solr.add(coll);
// finally commit
solr.commit();
logger.info("End syn all anchor and room indexs to solr,room count is {}",coll.size());
} catch (SolrServerException e) {
logger.error("Failed to syn all anchor and room indexs to solr",e);
} catch (IOException e) {
logger.error("Failed to syn all anchor and room indexs to solr",e);
}finally {
if(solr!=null)
solr.shutdown();
}
}
}
Querying the index
package com.youku.ddshow.handler.search;
import com.google.common.collect.Lists;
import org.apache.commons.collections.MapUtils;
import org.apache.commons.lang.ArrayUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.params.CommonParams;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import java.util.List;
import java.util.Map;
import java.util.Properties;
/**
* Created by jeffier on 2014/10/21.
*/
public class SearchHandler implements InitializingBean {
private Logger logger = LoggerFactory.getLogger(getClass());
public static final String FIELD_ROOM_ID = "room_id";
public static final String FIELD_ZB_ID = "zb_id";
public static final String FIELD_ZB_NAME = "zb_name";
public static final String FIELD_ZB_PY_NAME = "zb_pyname";
public static final String FIELD_ZB_NAME_BZ = "zb_name_bz";
public static final String FIELD_SCORE = "score"; // not an indexed field; the match score computed by the query, used for sorting
private static final int DEFAULT_PAGESIZE = 10;
public static final String DEFAULT_PREFIX = "<span>";
public static final String DEFAULT_SUBFIX = "</span>";
private static final List<String> HIGHT_LIGHT_FIELDS = Lists.newArrayList(FIELD_ZB_NAME, FIELD_ZB_NAME_BZ);
private static final List<String> RETURN_FIELDS = Lists.newArrayList(FIELD_ROOM_ID, FIELD_ZB_NAME, FIELD_ZB_ID, FIELD_ZB_PY_NAME);
@Autowired
@Qualifier("configProperties")
protected Properties configProperties;
private HttpSolrServer server = null;
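/**
* Search entry point with default sorting; matches are highlighted with the default tags (DEFAULT_PREFIX and DEFAULT_SUBFIX).
*
* @param input
* @param pageNo
* @param pageSize
*/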
public SearchResult search(String input, int pageNo, int pageSize) {
return search(input, pageNo, pageSize, DEFAULT_PREFIX, DEFAULT_SUBFIX, null);
}
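/**
* Searches with custom sort conditions.
*
* @param input search keywords
* @param pageNo
* @param pageSize
* @param fields extra sort fields; pass null to sort by relevance only
* @param prefix highlight prefix, used together with subfix; highlighting is applied only when both are non-blank
* @param subfix highlight suffix, used together with prefix; highlighting is applied only when both are non-blank
* @return
*/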
public SearchResult search(String input, int pageNo, int pageSize, String prefix, String subfix, SortField... fields) {
if (StringUtils.isBlank(input)) {
return SearchResult.EMPTY;
}
if (pageNo <= 0) {
pageNo = 1;
}
if (pageSize <= 0) {
pageSize = DEFAULT_PAGESIZE;
}
SolrQuery query = buildQuery(input, (pageNo - 1) * pageSize, pageSize, prefix, subfix, fields); // build the query
try {
QueryResponse response = server.query(query); // run the query, then parse the search results
return resolveResult(response, prefix, subfix);
} catch (SolrServerException e) {
logger.error("query{} error", input, e);
return SearchResult.EMPTY;
} catch (Exception e) {
logger.error("extract search result {} error", input, e);
return SearchResult.EMPTY;
}
}
/**
* Parses the search results.
*
* @param response
* @param prefix
* @param subfix
* @return
*/
private SearchResult resolveResult(QueryResponse response, String prefix, String subfix) {
long numFound = response.getResults().getNumFound();
List<SearchResult.Result> results = Lists.newArrayList();
SolrDocumentList documents = response.getResults();
Map<String, Map<String, List<String>>> highlight = response.getHighlighting();
if (null != documents && !documents.isEmpty()) {
int userId = 0, roomId = 0;
String name = null, highlightName = null;
for (SolrDocument document : documents) {
userId = (Integer) document.getFieldValue(FIELD_ZB_ID);
roomId = (Integer) document.getFieldValue(FIELD_ROOM_ID);
name = (String) document.getFieldValue(FIELD_ZB_NAME);
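if (StringUtils.isNotBlank(prefix) && StringUtils.isNotBlank(subfix)) { // highlighting
highlightName = null; // reset: the variable is declared outside the loop, so a snippet from the previous document could otherwise be reused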
if (MapUtils.isEmpty(highlight) || MapUtils.isEmpty(highlight.get(String.valueOf(roomId)))) {
highlightName = name;
} else {
for (String field : HIGHT_LIGHT_FIELDS) {
if (highlight.get(String.valueOf(roomId)).containsKey(field)) {
highlightName = highlight.get(String.valueOf(roomId)).get(field).get(0);
break;
}
}
if (StringUtils.isBlank(highlightName)) {
highlightName = name;
}
}
} else {
highlightName = name;
}
results.add(new SearchResult.Result(roomId, userId, name, highlightName));
}
}
return new SearchResult(numFound, results);
}
/**
* Builds the query.
*
* @param input
* @param start
* @param rows
* @param prefix
* @param subfix
* @param fields
* @return
*/
private SolrQuery buildQuery(String input, int start, int rows, String prefix, String subfix, SortField... fields) {
SolrQuery query = new SolrQuery();
query.setQuery(input);
query.setStart(start);
query.setRows(rows);
for (String field : RETURN_FIELDS) { // fields to return
query.addField(field);
}
// Writer type: the output format of the response. Defaults to "xml"; the formats defined in solrconfig.xml are xml, json, python, ruby, php, phps, custom.
query.set(CommonParams.WT, "xml");
// Name of the query parser.
query.set("defType", "dismax");
// Whether to indent the response. Off by default; enable with indent=true or indent=on. Generally only needed when debugging json, php, phps, or ruby output.
query.set("indent", false);
// qf (query fields): the fields in which the keywords are searched.
query.set("qf", String.format("%s %s %s %s", FIELD_ROOM_ID, FIELD_ZB_NAME, FIELD_ZB_PY_NAME, FIELD_ZB_NAME_BZ));
/* query.set("bf", ""); // lets non-searched fields take part in ranking */
query.addSort(FIELD_SCORE, SolrQuery.ORDER.desc); // sorting: the match score is the default first sort condition
if (ArrayUtils.isNotEmpty(fields)) { // other, custom sort conditions
for (SortField field : fields) {
query.addSort(field.getField(), field.getOrder());
}
}
if (StringUtils.isNotBlank(prefix) && StringUtils.isNotBlank(subfix)) { // highlighting
query.setHighlight(true).setHighlightSnippets(1);
query.setHighlightFragsize(50); // maximum length of each fragment, default 100; too small may cut the highlight short, too large makes the summary too long
for (String field : HIGHT_LIGHT_FIELDS) {
query.addHighlightField(field);
}
query.setHighlightSimplePre(prefix); // use the caller-supplied tags rather than a hard-coded "<span>"
query.setHighlightSimplePost(subfix);
}
return query;
}
/**
* Initializes the Solr server client.
*
* @throws Exception
*/
@Override
public void afterPropertiesSet() throws Exception {
server = new HttpSolrServer(configProperties.getProperty("search.base.url"));
server.setMaxRetries(1);
server.setParser(new XMLResponseParser());
server.setConnectionTimeout(5000);
server.setDefaultMaxConnectionsPerHost(100);
server.setMaxTotalConnections(100);
server.setFollowRedirects(false);
server.setAllowCompression(true);
}
}
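The paging logic in search() (normalize pageNo and pageSize, then convert them to the start offset that Solr expects) can be looked at in isolation. A minimal sketch in plain Java, assuming the same defaults as the class above; PagingSketch and startOffset are hypothetical names.

```java
// Hypothetical helper mirroring the page handling in search().
final class PagingSketch {

    static final int DEFAULT_PAGE_SIZE = 10;

    // Converts a 1-based page number into Solr's 0-based "start" offset.
    // Non-positive inputs fall back to page 1 and the default page size.
    static int startOffset(int pageNo, int pageSize) {
        int page = pageNo <= 0 ? 1 : pageNo;
        int size = pageSize <= 0 ? DEFAULT_PAGE_SIZE : pageSize;
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        System.out.println(startOffset(1, 10)); // first page starts at offset 0
        System.out.println(startOffset(3, 20)); // third page of 20 starts at offset 40
    }
}
```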
4. Suggest feature
References
http://yingbin920.iteye.com/blog/1568769
Choosing the suggest field
root@m1hadoop:/opt/solr/data/solr/collection1/conf# vim schema.xml
<field name="suggestion" type="string" indexed="true" stored="true" termVectors="true" multiValued="true"/>
Configuring the Suggest component
vim solrconfig.xml
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<str name="queryAnalyzerFieldType">string</str>
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">suggestion</str>
<!-- the indexed field to derive suggestions from -->
<float name="threshold">0.0001</float>
<str name="spellcheckIndexDir">spellchecker</str>
<str name="comparatorClass">freq</str>
<str name="buildOnOptimize">true</str>
<!--<str name="buildOnCommit">true</str>-->
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
Adding documents that populate the suggestion field
vim /opt/solr/solr-4.10.3/example/exampledocs/solr.xml
<add>
<doc>
<field name="id">SOLR1000</field>
<field name="name">Solr, the Enterprise Search Server</field>
<field name="manu">Apache Software Foundation</field>
<field name="cat">software</field>
<field name="cat">search</field>
<field name="features">Advanced Full-Text Search Capabilities using Lucene</field>
<field name="features">Optimized for High Volume Web Traffic</field>
<field name="features">Standards Based Open Interfaces - XML and HTTP</field>
<field name="features">Comprehensive HTML Administration Interfaces</field>
<field name="features">Scalability - Efficient Replication to other Solr Search Servers</field>
<field name="features">Flexible and Adaptable with XML configuration and Schema</field>
<field name="features">Good unicode support: héllo (hello with an accent over the e)</field>
<field name="price">0</field>
<field name="popularity">10</field>
<field name="inStock">true</field>
<field name="incubationdate_dt">2006-01-17T00:00:00.000Z</field>
<field name="suggestion">Advanced Full-Text Search Capabilities using Lucene</field>
<field name="suggestion">Optimized for High Volume Web Traffic</field>
<field name="suggestion">Standards Based Open Interfaces - XML and HTTP</field>
<field name="suggestion">Comprehensive HTML Administration Interfaces</field>
<field name="suggestion">Scalability - Efficient Replication to other Solr Search Servers</field>
<field name="suggestion">Flexible and Adaptable with XML configuration and Schema</field>
<field name="suggestion">Good unicode support: héllo (hello with an accent over the e)</field>
</doc>
</add>
Run the following command to add the data:
root@m1hadoop:/opt/solr/solr-4.10.3/example/exampledocs# java -Durl=http://10.155.11.74:8080/solr/update -Dcommit=yes -jar post.jar *.xml
SimplePostTool version 1.5
Posting files to base url http://10.155.11.74:8080/solr/update using content-type application/xml..
POSTing file gb18030-example.xml
POSTing file hd.xml
POSTing file ipod_other.xml
POSTing file ipod_video.xml
POSTing file manufacturers.xml
POSTing file mem.xml
POSTing file money.xml
POSTing file monitor2.xml
POSTing file monitor.xml
POSTing file mp500.xml
POSTing file sd500.xml
POSTing file solr.xml
POSTing file utf8-example.xml
POSTing file vidcard.xml
14 files indexed.
COMMITting Solr index changes to http://10.155.11.74:8080/solr/update.
Testing:
Once the configuration is complete, restart Solr (or reload the core), then visit the following link