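After the previous two posts, you should be comfortable using Solr to import data from MySQL and to build incremental (delta) indexes.
(If not, please read these posts first: http://blog.csdn.net/weijonathan/article/details/16962257 , http://blog.csdn.net/weijonathan/article/details/16961299)
Next, let's look at how to read the data we want from Solr. You can also use Solr's web admin interface to verify that your query results are correct.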
Environment setup:
Extract the following JARs from the Solr distribution you downloaded earlier:
/dist:
apache-solr-solrj-*.jar
/dist/solrj-lib:
commons-codec-1.3.jar
commons-httpclient-3.1.jar
commons-io-1.4.jar
jcl-over-slf4j-1.5.5.jar
slf4j-api-1.5.5.jar
/lib:
slf4j-jdk14-1.5.5.jar
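Alternatively, if you manage JARs with Maven, add the following dependency to pull in what you need:
    <dependency>
      <artifactId>solr-solrj</artifactId>
      <groupId>org.apache.solr</groupId>
      <version>1.4.0</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>
If you need EmbeddedSolrServer, also add the core package:
    <dependency>
      <artifactId>solr-core</artifactId>
      <groupId>org.apache.solr</groupId>
      <version>1.4.0</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>
Two more dependencies are required:
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
      <version>2.5</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-simple</artifactId>
      <version>1.5.6</version>
    </dependency>
Note that HttpSolrServer, used in the rest of this post, was introduced in SolrJ 3.6; with version 1.4.0 the equivalent class is CommonsHttpSolrServer, so set the version above to match your Solr installation.
With the environment ready, let's first look at creating a connection with HttpSolrServer: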
    String url = "http://${ip}:${port}";
    /*
     * HttpSolrServer is thread-safe and if you are using the following constructor,
     * you *MUST* re-use the same instance for all requests. If instances are created
     * on the fly, it can cause a connection leak. The recommended practice is to keep
     * a static instance of HttpSolrServer per solr server url and share it for all
     * requests. See https://issues.apache.org/jira/browse/SOLR-861 for more details
     */
    SolrServer server = new HttpSolrServer(url);
You can also set connection properties when creating the connection:
    String url = "http://${ip}:${port}";
    HttpSolrServer server = new HttpSolrServer(url);
    server.setMaxRetries(1); // defaults to 0. > 1 not recommended.
    server.setConnectionTimeout(5000); // 5 seconds to establish TCP
    // Setting the XML response parser is only required for cross
    // version compatibility and only when one side is 1.4.1 or
    // earlier and the other side is 3.1 or later.
    server.setParser(new XMLResponseParser()); // binary parser is used by default
    // The following settings are provided here for completeness.
    // They will not normally be required, and should only be used
    // after consulting javadocs to know whether they are truly required.
    server.setSoTimeout(1000); // socket read timeout
    server.setDefaultMaxConnectionsPerHost(100);
    server.setMaxTotalConnections(100);
    server.setFollowRedirects(false); // defaults to false
    // allowCompression defaults to false.
    // Server side must support gzip or deflate for this to have any effect.
    server.setAllowCompression(true);
Most of us prefer to receive the returned data as entity objects because they are easier to manage, so let's see how SolrJ defines entities.
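Defining an entity in SolrJ is not much different from defining an ordinary bean. The only addition is the @Field annotation, which maps a property to the corresponding Solr field.
    import java.util.List;

    import org.apache.solr.client.solrj.beans.Field;

    public class Item {
        @Field
        String id;

        // Maps to the Solr field named "cat" rather than "categories"
        @Field("cat")
        String[] categories;

        @Field
        List<String> features;
    }
Besides fields, the annotation can also be placed on a setter method:
    @Field("cat")
    public void setCategory(String[] c) {
        this.categories = c;
    }
Adding data: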
First, obtain a SolrServer:
    SolrServer server = new HttpSolrServer("http://${ip}:${port}");
To delete everything in the index:
    server.deleteByQuery("*:*"); // CAUTION: deletes everything!
Use the bean we defined to insert data into Solr:
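    Item item = new Item();
    item.id = "one";
    item.categories = new String[] { "aaa", "bbb", "ccc" };
    server.addBean(item);
    server.commit(); // changes become visible to searches only after a commit
To insert several beans at once, pass a List of beans:
    List<Item> beans = new ArrayList<Item>(); // requires java.util.ArrayList
    // add Item objects to the list
    server.addBeans(beans);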
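To get an intuition for what addBean and addBeans do with the annotated fields, here is a minimal, JDK-only sketch of annotation-driven binding. It is not SolrJ's actual implementation: DocField is a hypothetical annotation standing in for SolrJ's @Field, and a plain Map stands in for the Solr document. It only illustrates the idea that annotated fields are discovered by reflection and copied into a document under the configured field name.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Conceptual sketch only: DocField is a hypothetical stand-in for SolrJ's @Field.
class BeanBindingSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface DocField {
        String value() default ""; // empty means "use the Java field name"
    }

    static class Item {
        @DocField String id;
        @DocField("cat") String[] categories;
        String notIndexed; // no annotation, so it is skipped
    }

    // Copies every annotated, non-null field of the bean into a name -> value map.
    static Map<String, Object> toDocument(Object bean) {
        Map<String, Object> doc = new LinkedHashMap<String, Object>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            DocField ann = f.getAnnotation(DocField.class);
            if (ann == null) {
                continue; // not mapped to a document field
            }
            f.setAccessible(true);
            Object value;
            try {
                value = f.get(bean);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value == null) {
                continue; // unset fields are left out of the document
            }
            String name = ann.value().isEmpty() ? f.getName() : ann.value();
            // Arrays become lists so multi-valued fields are handled uniformly
            doc.put(name, value instanceof Object[] ? Arrays.asList((Object[]) value) : value);
        }
        return doc;
    }

    public static void main(String[] args) {
        Item item = new Item();
        item.id = "one";
        item.categories = new String[] { "aaa", "bbb", "ccc" };
        item.notIndexed = "ignored";
        System.out.println(toDocument(item));
    }
}
```

In this sketch the categories array ends up under the key "cat", mirroring how @Field("cat") renames the field in the Item class above, while the unannotated field never reaches the document.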
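You can also stream all of your documents to the index in a single HTTP request, as shown below. This is the most efficient approach.
    HttpSolrServer server = new HttpSolrServer("http://${ip}:${port}");
    Iterator<SolrInputDocument> iter = new Iterator<SolrInputDocument>() {
        public boolean hasNext() {
            boolean result = false;
            // set the result to true or false to say if you have more documents
            return result;
        }

        public SolrInputDocument next() {
            SolrInputDocument result = null;
            // construct a new document here and set it to result
            return result;
        }

        public void remove() {
            // removal is not supported for a streaming source
            throw new UnsupportedOperationException();
        }
    };
    server.add(iter);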
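The iterator skeleton above leaves hasNext and next empty. As a rough illustration of how they might be filled in, the following JDK-only sketch builds each document lazily from a list of source ids. The names StreamingDocsSketch and DocIterator are hypothetical, and a plain Map stands in for SolrInputDocument so the example runs without SolrJ on the classpath; with SolrJ you would construct a SolrInputDocument in next() instead.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Conceptual sketch: a Map stands in for SolrInputDocument.
class StreamingDocsSketch {

    // Builds one document per source id, and only when next() is called.
    static class DocIterator implements Iterator<Map<String, Object>> {
        private final Iterator<String> ids;
        int built = 0; // number of documents constructed so far

        DocIterator(List<String> sourceIds) {
            this.ids = sourceIds.iterator();
        }

        public boolean hasNext() {
            return ids.hasNext();
        }

        public Map<String, Object> next() {
            if (!ids.hasNext()) {
                throw new NoSuchElementException();
            }
            Map<String, Object> doc = new LinkedHashMap<String, Object>();
            doc.put("id", ids.next());
            built++;
            return doc;
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

    public static void main(String[] args) {
        DocIterator it = new DocIterator(Arrays.asList("one", "two", "three"));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        System.out.println("documents built: " + it.built);
    }
}
```

Because each document is created only when the consumer asks for it, the full set never has to sit in memory at once, which is the main reason to prefer an iterator over a pre-built list for large imports.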