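Solr 3.5 has been released, and it is built on Lucene 3.5. Before doing any custom development on top of Solr, we first need to set up a search server. Once familiar with Solr's basic features, we can customize it to fit the needs of the actual application. Solr provides a plugin mechanism, so we can implement our own components and then register them in Solr's configuration file (solrconfig.xml) to get the desired behavior. The Solr distribution ships with an example configuration that can be deployed directly to a web container and tested from a browser. The sections below describe each way of setting things up in detail, from the easiest to the most involved.
This approach uses the WAR file shipped in the Solr distribution as-is. It is a quick way to get to know Solr, but it falls far short of what real-world development needs.
Follow the steps below to install, configure, and verify:
Step 1: Copy the multicore directory under apache-solr-3.5.0\apache-solr-3.5.0\example to apache-tomcat-6.0.32\conf;
The multicore directory contains Solr's basic configuration. Solr supports multiple instances, that is, several instances can be started to serve different search requests from the front end, and each instance corresponds to one core. The set of cores is declared in multicore\solr.xml, and each core's subdirectory under multicore holds that core's detailed configuration: schema.xml (settings related to Lucene Fields, Analyzers, and so on) and solrconfig.xml (the core configuration of the Solr instance).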
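To make the role of multicore\solr.xml concrete, here is a minimal Python sketch that lists the cores such a file declares. The embedded XML is an assumed sample modeled on the multicore example, not a copy of the file in the distribution, and the helper name list_cores is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed sample modeled on the multicore example; the real
# multicore\solr.xml in your distribution may differ.
SAMPLE_SOLR_XML = """<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
"""


def list_cores(solr_xml_text):
    """Return (name, instanceDir) pairs for every <core> element."""
    root = ET.fromstring(solr_xml_text)
    return [(c.get("name"), c.get("instanceDir")) for c in root.iter("core")]


if __name__ == "__main__":
    for name, instance_dir in list_cores(SAMPLE_SOLR_XML):
        print(name, "->", instance_dir)
```

Each instanceDir is resolved relative to the directory that contains solr.xml, which is why every core has its own subdirectory under multicore.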
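In addition, if solrconfig.xml does not specify an index directory with <dataDir>, the directory apache-tomcat-6.0.32\conf\multicore\data\index is created by default, and the index files are stored there.
Step 2: Copy apache-solr-3.5.0.war from apache-solr-3.5.0\apache-solr-3.5.0\dist to apache-tomcat-6.0.32\webapps;
This needs little explanation: a web application is deployed from a web archive (WAR) file, and the application here is the Solr search application.
Step 3: Configure the Context for the WAR: under apache-tomcat-6.0.32\conf\Catalina\localhost (create the directory by hand if it does not exist), create the file apache-solr-3.5.0.xml;
The content of the Context configuration file apache-solr-3.5.0.xml is as follows: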
<Context docBase="${catalina.home}/webapps/apache-solr-3.5.0.war" debug="0" crossContext="true" >
    <Environment name="solr/home" type="java.lang.String" value="${catalina.home}/conf/multicore" override="true" />
</Context>
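docBase specifies the location of the WAR file. The "solr/home" entry above is critical: when the web container starts, Solr loads its basic configuration and initializes the corresponding component instances, and it looks for that configuration under the path given by "solr/home". In the example above, "solr/home" points to the directory apache-tomcat-6.0.32\conf\multicore.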
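Since a wrong "solr/home" usually shows up only as a startup error, it can help to check the directory before starting Tomcat. The sketch below is one way to do that in Python. It assumes that each core keeps schema.xml and solrconfig.xml in a conf subdirectory, as in the multicore example, and the function name missing_config_files is hypothetical.

```python
import os
import tempfile


def missing_config_files(solr_home, cores):
    """Return the expected config files that are absent under solr_home."""
    expected = [os.path.join(solr_home, "solr.xml")]
    for core in cores:
        for name in ("schema.xml", "solrconfig.xml"):
            # Assumption: each core keeps its files in a conf subdirectory.
            expected.append(os.path.join(solr_home, core, "conf", name))
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real Tomcat install.
    with tempfile.TemporaryDirectory() as home:
        open(os.path.join(home, "solr.xml"), "w").close()
        for path in missing_config_files(home, ["core0", "core1"]):
            print("missing:", path)
```

Against a real installation, pass the directory that "solr/home" points to and the core names declared in solr.xml; an empty result means all expected files are present.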
Step 4: Set the character set for Solr;
Solr uses UTF-8 by default. If your Tomcat does not, Chinese search queries may come out garbled. If you want Tomcat's default port 8080 to serve search requests with UTF-8 as the request character set, modify apache-tomcat-6.0.32\conf\server.xml as follows:
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" URIEncoding="UTF-8" redirectPort="8443" />
Here we added the URIEncoding="UTF-8" attribute.
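The effect of URIEncoding can be illustrated without Tomcat. The Python sketch below decodes the same percent-encoded query term once as UTF-8 and once as ISO-8859-1, which, as far as I know, is the charset Tomcat 6 applies to the URI when URIEncoding is not set. The function name decode_query is hypothetical, and the code only imitates the decoding step, not Tomcat itself.

```python
from urllib.parse import quote, unquote


def decode_query(raw, uri_encoding):
    """Decode a percent-encoded query value with the given charset."""
    return unquote(raw, encoding=uri_encoding)


# A client sends the query term as percent-encoded UTF-8 bytes.
raw = quote("中文", encoding="utf-8")

as_utf8 = decode_query(raw, "utf-8")         # round-trips to the original term
as_latin1 = decode_query(raw, "iso-8859-1")  # one character per byte, garbled

if __name__ == "__main__":
    print(raw)
    print(as_utf8 == "中文", len(as_latin1))
```

With the wrong charset, the two-character term turns into six unrelated characters, so the query that reaches Solr no longer matches the indexed text.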
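If you would rather use a new, unoccupied port, add another Connector to apache-tomcat-6.0.32\conf\server.xml. For example, for port 8888 the configuration is:
<Connector port="8888" protocol="HTTP/1.1" connectionTimeout="20000" URIEncoding="UTF-8" redirectPort="8443" />
Step 5: Verify
After the steps above, start the Tomcat server and enter http://localhost:8080/apache-solr-3.5.0/ in the browser's address bar. You will see the following:
Welcome to Solr!
Admin core0
Admin core1
This means the installation succeeded. You can follow the links to browse Solr's admin interface and the management features it provides.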
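The same check can be scripted. The following Python sketch requests the welcome page and the per-core admin pages and reports whether each one answers. The /core0/admin/ style paths are an assumption based on the links on the welcome page, and the helper names admin_urls and is_reachable are hypothetical.

```python
import urllib.error
import urllib.request


def admin_urls(base_url, cores):
    """Build the per-core admin page URLs linked from the welcome page."""
    base = base_url.rstrip("/")
    return [f"{base}/{core}/admin/" for core in cores]


def is_reachable(url, timeout=5):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    base = "http://localhost:8080/apache-solr-3.5.0/"
    for url in [base] + admin_urls(base, ["core0", "core1"]):
        print(url, "OK" if is_reachable(url) else "unreachable")
```

If you configured the extra Connector on port 8888, change the port in the base URL accordingly.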
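In this approach we no longer use the prebuilt WAR that Solr provides, whose internals we never have to look at. Instead we assemble the application from the contents of the Solr distribution. More precisely, we set Solr up inside a development tool, leaving source-level concerns aside for now. I am used to MyEclipse, and I used the MyEclipse Enterprise Workbench 8.0 IDE.
Follow the steps below:
<env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>E:\Develop\myeclipse\workspace\solr35\multicore</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
In effect, this specifies the directory from which Solr loads the instance configuration and index data after the web container starts.
Note that this hard-codes the setting in web.xml, so web.xml has to be edited every time solr/home changes. An alternative is to pass it as a startup option of the web container, as follows:
-Dsolr.solr.home=E:\Develop\myeclipse\workspace\solr35\multicore
This makes the configuration more flexible and convenient.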
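When both mechanisms are in play, it helps to know which one wins. As far as I understand it, Solr first consults the JNDI entry solr/home, then the system property solr.solr.home, and finally falls back to a solr/ directory relative to the working directory. The Python sketch below only models that order with plain dictionaries. It is an illustration under that assumption, not Solr's actual code, and the name resolve_solr_home is hypothetical.

```python
def resolve_solr_home(jndi_env, system_props, default="solr/"):
    """Model the assumed lookup order: JNDI entry, -D property, default."""
    # Value from <env-entry> in web.xml or <Environment> in the Context file.
    home = jndi_env.get("solr/home")
    if not home:
        # Value from the -Dsolr.solr.home=... startup option.
        home = system_props.get("solr.solr.home")
    return home or default


if __name__ == "__main__":
    props = {"solr.solr.home": r"E:\Develop\myeclipse\workspace\solr35\multicore"}
    print(resolve_solr_home({}, props))
    print(resolve_solr_home({}, {}))
```

Under this model, the startup option takes effect only when the env-entry has been removed from web.xml.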
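With the configuration above, start the Tomcat server and verify by visiting http://localhost:8080/solr35.
The benefit of a source-based setup is that we can easily debug and trace during development, which also helps in understanding how the Solr framework executes. Solr is a framework built on the open-source search library Lucene. By studying Solr's source code you become more familiar with how to build a search application of your own on top of Lucene, and you could even turn Solr into exactly the application you need. Generally, when using Solr to set up a search server, you do not need to know how Lucene implements indexing and full-text retrieval. But when developing and debugging on Solr, for example when tuning search relevance, some understanding of Lucene makes the tuning far more effective.
For the source-based setup, I chose an arrangement in which both the Lucene and the Solr source can be modified, that is, I imported the code of both into the development environment. The code is open source and you can import it any way you like, so I will not go into that here. In short, I imported Solr and Lucene into two separate projects: Lucene Java Project and Solr Web Project. The .classpath files of both projects are pasted below for reference:
The .classpath file of the Lucene Java Project is as follows: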
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
    <classpathentry kind="src" path="src/lucene/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/analyzers/common/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/analyzers/common/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/analyzers/smartcn/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/analyzers/smartcn/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/analyzers/stempel/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/analyzers/stempel/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/benchmark/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/benchmark/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/demo/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/demo/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/facet/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/facet/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/facet/src/examples"/>
    <classpathentry kind="src" path="src/lucene/contrib/grouping/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/grouping/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/highlighter/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/highlighter/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/icu/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/icu/src/tools/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/icu/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/instantiated/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/instantiated/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/join/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/join/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/memory/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/memory/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/misc/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/misc/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/queries/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/queries/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/queryparser/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/queryparser/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/remote/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/remote/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/spatial/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/spatial/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/spellchecker/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/spellchecker/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/xml-query-parser/src/java"/>
    <classpathentry kind="src" path="src/lucene/contrib/xml-query-parser/src/test"/>
    <classpathentry kind="src" path="src/lucene/contrib/xml-query-parser/src/demo/java"/>
    <classpathentry kind="src" path="src/lucene/test-framework/src/java"/>
    <classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-1.6"/>
    <classpathentry kind="con" path="org.eclipse.jdt.USER_LIBRARY/Contributions Dependences"/>
    <classpathentry kind="con" path="org.eclipse.jdt.USER_LIBRARY/Lucene Contrib Dependences"/>
    <classpathentry kind="con" path="org.eclipse.jdt.USER_LIBRARY/JUnit 4.7"/>
    <classpathentry kind="output" path="bin"/>
</classpath>
The .classpath file of the Solr Web Project is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<classpath>
    <classpathentry kind="src" path="src/solr/solrj/src/java"/>
    <classpathentry kind="src" path="src/solr/solrj/src/test"/>
    <classpathentry kind="src" path="src/solr/core/src/java"/>
    <classpathentry kind="src" path="src/solr/core/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/analysis-extras/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/analysis-extras/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/clustering/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/clustering/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/dataimporthandler/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/dataimporthandler/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/dataimporthandler-extras/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/dataimporthandler-extras/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/extraction/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/extraction/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/langid/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/langid/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/uima/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/uima/src/test"/>
    <classpathentry kind="src" path="src/solr/contrib/velocity/src/java"/>
    <classpathentry kind="src" path="src/solr/contrib/velocity/src/test"/>
    <classpathentry kind="src" path="src/solr/test-framework/src/java"/>
    <classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/>
    <classpathentry kind="con" path="melibrary.com.genuitec.eclipse.j2eedt.core.MYECLIPSE_JAVAEE_5_CONTAINER"/>
    <classpathentry kind="con" path="org.eclipse.jdt.USER_LIBRARY/Solr Contrib Dependences"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/apache-solr-noggit-r1099557.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/commons-codec-1.5.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/commons-csv-1.0-SNAPSHOT-r966014.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/commons-fileupload-1.2.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/commons-httpclient-3.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/commons-io-1.4.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/commons-lang-2.4.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/easymock-2.2.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/geronimo-stax-api_1.0_spec-1.0.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/guava-r05.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/jcl-over-slf4j-1.6.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/junit-4.7.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/log4j-over-slf4j-1.6.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/servlet-api-2.4.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/slf4j-api-1.6.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/slf4j-jdk14-1.6.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/wstx-asl-3.2.7.jar"/>
    <classpathentry combineaccessrules="false" kind="src" path="/lucene35"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/core-3.1.1.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/jetty-6.1.26-patched-JETTY-1340.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/jetty-util-6.1.26-patched-JETTY-1340.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/jsp-2.1-glassfish-2.1.v20091210.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/jsp-2.1-jetty-6.1.26.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/jsp-api-2.1-glassfish-2.1.v20091210.jar"/>
    <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/servlet-api-2.5-20081211.jar"/>
    <classpathentry kind="output" path="WebRoot/WEB-INF/classes"/>
</classpath>
With the development environment set up, you can study Solr in more depth.
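If you reuse .classpath files like the ones above, a missing source folder shows up as a build path error in the IDE. The Python sketch below lists the source folders a .classpath declares and reports the ones that do not exist on disk. It treats kind="src" entries whose path starts with "/" (such as /lucene35) as references to other projects and skips them. The function names and the shortened sample are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Shortened sample taken from the Solr Web Project .classpath above.
SAMPLE_CLASSPATH = """<classpath>
  <classpathentry kind="src" path="src/solr/core/src/java"/>
  <classpathentry combineaccessrules="false" kind="src" path="/lucene35"/>
  <classpathentry kind="lib" path="WebRoot/WEB-INF/lib/junit-4.7.jar"/>
  <classpathentry kind="output" path="WebRoot/WEB-INF/classes"/>
</classpath>"""


def source_folders(classpath_text):
    """Return the paths of kind="src" entries that are folders in this project."""
    root = ET.fromstring(classpath_text)
    return [
        entry.get("path")
        for entry in root.iter("classpathentry")
        # A leading "/" marks a reference to another project, not a folder.
        if entry.get("kind") == "src" and not entry.get("path").startswith("/")
    ]


def missing_folders(project_dir, classpath_text):
    """Return declared source folders that do not exist under project_dir."""
    return [
        path
        for path in source_folders(classpath_text)
        if not os.path.isdir(os.path.join(project_dir, path))
    ]


if __name__ == "__main__":
    print(source_folders(SAMPLE_CLASSPATH))
    print(missing_folders(".", SAMPLE_CLASSPATH))
```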
Here are some reference links on learning and developing with Solr that may help in practice.
01. Solr
http://lucene.apache.org/solr/
http://wiki.apache.org/solr/SolrResources
http://lucene.apache.org/solr/features.html
http://lucene.apache.org/solr/tutorial.html
http://wiki.apache.org/solr
https://issues.apache.org/jira/browse/SOLR
http://wiki.apache.org/solr/FAQ
http://wiki.apache.org/solr/SolrRelevancyFAQ
http://wiki.apache.org/solr/SolrRelevancyCookbook
http://khaidoan.wikidot.com/solr
http://yonik.wordpress.com/
02. Solr mailing list
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/
http://lucene.472066.n3.nabble.com/Solr-f472067.html
03. Solr Installation
http://wiki.apache.org/solr/SolrTomcat
http://wiki.apache.org/solr/SolrJetty
http://wiki.apache.org/solr/SolrResin
http://wiki.apache.org/solr/SolrJBoss
http://wiki.apache.org/solr/SolrWebSphere
http://wiki.apache.org/solr/SolrWeblogic
http://wiki.apache.org/solr/SolrGlassfish
http://redmine.synyx.org/projects/opencms-solr-module/wiki/Integrating_Solr_into_an_existing_application
04. Solr Basics
http://wiki.apache.org/solr/SolrTerminology
http://wiki.apache.org/solr/SchemaXml
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/conf/schema.xml?view=markup
http://wiki.apache.org/solr/SolrConfigXml
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/solrconfig.xml
http://wiki.apache.org/solr/UpdateXmlMessages
http://wiki.apache.org/solr/CommonQueryParameters
http://wiki.apache.org/solr/LocalParams
http://wiki.apache.org/solr/SolrQuerySyntax
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
http://wiki.apache.org/solr/SolrInstall
http://wiki.apache.org/solr/mySolr
05. Solr Functions
http://wiki.apache.org/solr/FunctionQuery
http://www.lucidimagination.com/blog/2009/11/20/fun-with-solr-functions/
http://www.supermind.org/blog/756/how-to-write-a-custom-solr-functionquery
http://search.lucidimagination.com/search/out?u=http%3A%2F%2Flucene.472066.n3.nabble.com%2Fhow-do-I-create-custom-function-that-uses-multiple-ValuesSources-tp1402645p1402645.html
06. Solr Payload
http://wiki.apache.org/solr/Payloads
http://lucene.472066.n3.nabble.com/How-to-use-Payloads-with-Solr-td679691.html
http://stephenpope.co.uk/?p=32
http://lucene.472066.n3.nabble.com/synonym-payload-boosting-td563432.html
07. Solr Dismax
http://wiki.apache.org/solr/DisMax
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
http://wiki.apache.org/solr/DisMaxRequestHandler
http://wiki.apache.org/solr/DisMaxQParserPlugin
http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html
http://lucene.472066.n3.nabble.com/Dismax-Boosting-td1906083.html
http://lucene.472066.n3.nabble.com/configure-dismax-requesthandlar-for-boost-a-field-td3137239.html
http://code.google.com/p/solr-boostmax/
http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/
08. Solr Performance
http://wiki.apache.org/solr/SolrPerformanceData
http://wiki.apache.org/solr/SolrPerformanceFactors
http://wiki.apache.org/solr/BenchmarkingSolr
http://code.google.com/p/solrmeter/
09. Solr Cache
http://wiki.apache.org/solr/SolrCaching
http://wiki.apache.org/solr/SolrConfigXml#HTTP_Caching
http://wiki.apache.org/solr/SolrAndHTTPCaches
http://java.dzone.com/news/optimization-%E2%80%93-filter-cache?mz=33057-solr_lucene
10. Solr Faceted Search
http://wiki.apache.org/solr/SolrFacetingOverview
http://wiki.apache.org/solr/SimpleFacetParameters
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr
http://www.lucidimagination.com/devzone/technical-articles/faceted-search-solr
http://stackoverflow.com/questions/3068810/solr-multiple-facet-dates
11. Solr Distributed
http://wiki.apache.org/solr/DistributedSearch
http://wiki.apache.org/solr/CollectionDistribution
http://wiki.apache.org/solr/SolrReplication
http://wiki.apache.org/solr/WritingDistributedSearchComponents
http://wiki.apache.org/solr/SolrCloud
http://wiki.apache.org/solr/SolrCollectionDistributionScripts
http://wiki.apache.org/solr/SolrCollectionDistributionStatusStats
http://wiki.apache.org/solr/SolrCollectionDistributionOperationsOutline
12. Solr Clustering
http://wiki.apache.org/solr/ClusteringComponent
http://www.lucidimagination.com/blog/2009/09/28/solrs-new-clustering-capabilities/
13. Solr Near Realtime Search
http://wiki.apache.org/solr/NearRealtimeSearch
http://www.lucidimagination.com/blog/2011/07/11/benchmarking-the-new-solr-%E2%80%98near-realtime%E2%80%99-improvements/
14. Solr Spellchecker
http://wiki.apache.org/solr/SpellCheckComponent
http://wiki.apache.org/solr/SpellCheckingAnalysis