Solr/Lucene Distributed Search: Integrating Solr with Katta, Step 3

The previous two posts covered installing Katta and ZooKeeper. This one covers the Katta Node.

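Let's return to the solr-katta-plugin project mentioned at the end of step 1. After you import the source, a lot of compile errors show up. The project extends classes from solr-core and solrj, so try changing the access modifier of the affected members in those classes from private to protected.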

For example, the shardHandlerFactory member field of org.apache.solr.handler.component.SearchHandler in solr-core:


protected ShardHandlerFactory shardHandlerFactory = new HttpShardHandlerFactory();
Also borrow from Tomliu's approach in https://issues.apache.org/jira/browse/SOLR-1395, which notes:



the bugs is :
1. solr's ShardDoc.java, ShardFieldSortedHitQueue line 210 :
        final float f1 = e1.score == null ? 0.00f : e1.score;
        final float f2 = e2.score == null ? 0.00f : e2.score;
And so on, until most of the project's errors are resolved.
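The two guarded lines in that excerpt avoid a NullPointerException when a null score is auto-unboxed. Here is a minimal standalone sketch of the same idea, assuming the score is a boxed Float (as the null check implies). The Doc class and the comparator are hypothetical stand-ins for illustration, not Solr's actual ShardDoc or ShardFieldSortedHitQueue.

```java
import java.util.Comparator;

public class NullSafeScoreDemo {

    /** Hypothetical stand-in for a shard document; only the boxed score matters here. */
    static final class Doc {
        final Float score; // may be null when no score was returned

        Doc(Float score) {
            this.score = score;
        }
    }

    /** Treats a missing score as 0.0f, mirroring the guarded lines in the excerpt. */
    static float scoreOrZero(Doc d) {
        return d.score == null ? 0.00f : d.score;
    }

    /** Orders documents by descending score without unboxing a null Float. */
    static final Comparator<Doc> BY_SCORE_DESC =
            (e1, e2) -> Float.compare(scoreOrZero(e2), scoreOrZero(e1));

    public static void main(String[] args) {
        Doc scored = new Doc(1.5f);
        Doc unscored = new Doc(null);
        // Without the guard, unboxing unscored.score would throw a NullPointerException.
        System.out.println(scoreOrZero(unscored));
        System.out.println(BY_SCORE_DESC.compare(scored, unscored) < 0);
    }
}
```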


Download Solr and copy apache-solr-3.6.1/example/solr from it into /data.

The directory structure looks like this:


solr
  --bin/
  --conf/
    --schema.xml
    --solrconfig.xml
    --...
  --data/
    --index/
    --spellchecker/
  --solr.xml
  --README.txt
We treat /data/solr as solr.home. Change ${solr.home}/solr.xml to:



<solr persistent="false">

  <!--
  adminPath: RequestHandler path to manage cores.  
    If 'null' (or absent), cores will not be manageable via request handler
  -->
  <cores adminPath="/admin/cores" defaultCoreName="proxy">
    <core name="proxy" instanceDir="proxy" />
  </cores>
</solr>
The resulting directory structure (create a proxy folder inside solr.home and move conf/ and data/ into it) looks like this:



solr/
  --proxy/
    --bin/
    --conf/
      --schema.xml
      --solrconfig.xml
      --....
    --data/
      --index/
      --spellchecker
  --solr.xml
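If you would rather script the move than do it by hand, the sketch below shows one way, using only java.nio.file. The class and method names are my own for illustration, and the default path is the /data/solr location used in this post. It assumes the subdirectories stay on the same filesystem, so each move is a simple rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CoreLayout {

    /**
     * Creates solrHome/coreName and moves the listed subdirectories into it.
     * Subdirectories that do not exist are skipped.
     */
    static Path moveIntoCore(Path solrHome, String coreName, String... subDirs) throws IOException {
        Path core = Files.createDirectories(solrHome.resolve(coreName));
        for (String name : subDirs) {
            Path source = solrHome.resolve(name);
            if (Files.isDirectory(source)) {
                Files.move(source, core.resolve(name));
            }
        }
        return core;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the solr.home used in this post; pass another path as the first argument.
        Path solrHome = Paths.get(args.length > 0 ? args[0] : "/data/solr");
        moveIntoCore(solrHome, "proxy", "bin", "conf", "data");
    }
}
```

solr.xml stays at the top level of solr.home, so it is deliberately not in the list of moved entries.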
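The steps above cost me a lot of time when I first tried them. Why does it have to be laid out this way? I found the answer in the code, in org.apache.solr.katta.DeployableSolrKattaServer: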



    final public static String serverNameDefault = "proxy";
    final public static String serverNameProperty = "solr.server.name";

    final public static String solrHomeDefault = "solrHome";
    final public static String solrHomeProperty = "solr.home";

    final public static String solrConfigFileDefault = "solr.xml";
    final public static String solrConfigFileProperty = "solr.config.name";

    public static String getServerName() {
        return System.getProperty(serverNameProperty, serverNameDefault);
    }
So the default core name configured in solr.home/solr.xml must be proxy. Otherwise you have to add -Dsolr.server.name=yourname when starting the Node. I leave it unchanged here and use the default.
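To see the fallback behaviour in isolation, here is a small sketch that repeats the same System.getProperty lookup outside the plugin. The class name is hypothetical; the property key and default value are the ones quoted from DeployableSolrKattaServer.

```java
public class ServerNameLookup {

    // Property key and default copied from the constants quoted from DeployableSolrKattaServer.
    static final String SERVER_NAME_PROPERTY = "solr.server.name";
    static final String SERVER_NAME_DEFAULT = "proxy";

    /** Same lookup as getServerName(): the -D value if set, otherwise the default. */
    static String getServerName() {
        return System.getProperty(SERVER_NAME_PROPERTY, SERVER_NAME_DEFAULT);
    }

    public static void main(String[] args) {
        // Prints "proxy" unless the JVM was started with -Dsolr.server.name=...
        System.out.println(getServerName());
    }
}
```

Running it with `java -Dsolr.server.name=yourname ServerNameLookup` prints the override instead, which is why a core with a different name needs that flag.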


Edit solrconfig.xml and add the following (it can go just before <requestHandler name="search" class="solr.SearchHandler">; also remove default="true" from the search handler):


<requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
  </lst>
</requestHandler>
  
OK, now make a copy of the configured solr directory and name it solrhome1. We now have /data/solrhome1.


Now copy katta.node.properties, katta.zk.properties, and log4j.properties from ${katta_install}/conf/ into the test/ directory of the solr-katta-plugin project (test is the test source directory), then edit katta.node.properties:

#node.server.class=net.sf.katta.lib.lucene.LuceneServer
node.server.class=org.apache.solr.katta.DeployableSolrKattaServer
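A quick way to confirm which server class is actually in effect is to parse the file with java.util.Properties, which treats lines starting with # as comments. The sketch below is a standalone check with a hypothetical class name; it parses the two lines from this post as inline text rather than reading Katta's real file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class NodePropertiesCheck {

    /** Parses properties text and returns node.server.class, or null if it is absent. */
    static String nodeServerClass(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        return props.getProperty("node.server.class");
    }

    public static void main(String[] args) throws IOException {
        // The commented-out LuceneServer line is ignored; only the second line counts.
        String text =
                "#node.server.class=net.sf.katta.lib.lucene.LuceneServer\n"
              + "node.server.class=org.apache.solr.katta.DeployableSolrKattaServer\n";
        System.out.println(nodeServerClass(text));
    }
}
```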


Create a launcher class:


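import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

// Katta's command-line entry point (package name assumed from the Katta distribution).
import net.sf.katta.Katta;

public class Launcher {

    public final static DateFormat DATE_FORMAT = new SimpleDateFormat("yy_MM_dd_HH_mm_ss");

    static {
        // Solr settings read by DeployableSolrKattaServer.
        System.setProperty("solr.home", "/data/solr");
        System.setProperty("solr.server.name", "proxy");
        System.setProperty("solr.directoryFactory", "solr.MMapDirectoryFactory");

        // Log location; each run gets its own timestamped log file.
        System.setProperty("katta.log.dir", "/data/logs");
        System.setProperty("katta.log.file", "katta_" + DATE_FORMAT.format(new Date()) + ".log");
    }

    public static void main(String[] args) throws Exception {
        // Same as running "bin/katta startNode -c <server class>" from the terminal.
        Katta.main(new String[]{"startNode", "-c", "org.apache.solr.katta.DeployableSolrKattaServer"});
    }
}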
Run this class, and we can see that it loads Solr:



10-18 14:22:02 [INFO 级别][ core.SolrCore ]类 (83 行) QuerySenderListener done.
10-18 14:22:02 [INFO 级别][ component.SpellCheckComponent ]类 (673 行) Loading spell index for spellchecker: default
10-18 14:22:02 [INFO 级别][ core.SolrCore ]类 (1325 行) [] Registered new searcher Searcher@70f5d656 main
10-18 14:22:02 [INFO 级别][ ipc.Server ]类 (328 行) Starting SocketReader
10-18 14:22:02 [INFO 级别][ node.Node ]类 (226 行) DeployableSolrKattaServer server started on : zhenqin-K45VM:20000
10-18 14:22:02 [INFO 级别][ ipc.Server ]类 (598 行) IPC Server Responder: starting
10-18 14:22:02 [INFO 级别][ ipc.Server ]类 (434 行) IPC Server listener on 20000: starting
10-18 14:22:02 [INFO 级别][ ipc.Server ]类 (1358 行) IPC Server handler 0 on 20000: starting


Then go back to the terminal and run: $ bin/katta addIndex solrhome1 file:///data/solrhome1

Of course, we can also package solrhome1 and put it in HDFS: $ bin/katta addIndex solrhome1 hdfs://localhost:9000/solr/solrhome1.zip


$ bin/katta addIndex solrhome1 file:///data/solrhome1
.....
deployed index 'solrhome1' in 4840 ms
At this point you can see the following output in the Eclipse console:



10-18 14:24:02 [INFO 级别][ node.Node ]类 (279 行) executing ShardDeployOperation:3cca2475:[solrhome1#proxy]
10-18 14:24:02 [INFO 级别][ node.AbstractShardOperation ]类 (55 行) deploy shard 'solrhome1#proxy'
10-18 14:24:02 [INFO 级别][ node.ShardManager ]类 (104 行) install shard 'solrhome1#proxy' from file:/media/Study/data/katta/testIndexA/proxy
10-18 14:24:03 [WARN 级别][ util.NativeCodeLoader ]类 (52 行) Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
zhenqin-K45VM:20000 addShard solrhome1#proxy solrhome:/media/Study/data/katta/katta-shards/zhenqin-K45VM_20000/solrhome1#proxy
10-18 14:24:04 [INFO 级别][ core.SolrResourceLoader ]类 (103 行) new SolrResourceLoader for directory: '/media/Study/data/katta/katta-shards/zhenqin-K45VM_20000/solrhome1#proxy/'
10-18 14:24:04 [INFO 级别][ core.CoreContainer ]类 (448 行) Creating SolrCore 'solrhome1#proxy' using instanceDir: /media/Study/data/katta/katta-shards/zhenqin-K45VM_20000/solrhome1#proxy
10-18 14:24:04 [INFO 级别][ core.SolrResourceLoader ]类 (103 行) new SolrResourceLoader for directory: '/media/Study/data/katta/katta-shards/zhenqin-K45VM_20000/solrhome1#proxy/'
10-18 14:24:05 [WARN 级别][ core.SolrConfig ]类 (148 行) <indexDefaults> and <mainIndex> configuration sections are deprecated (but still work). Please use <indexConfig> instead.
10-18 14:24:05 [INFO 级别][ core.SolrConfig ]类 (162 行) Using Lucene MatchVersion: LUCENE_36
10-18 14:24:05 [INFO 级别][ core.Config ]类 (248 行) Loaded SolrConfig: solrconfig.xml
10-18 14:24:05 [INFO 级别][ schema.IndexSchema ]类 (413 行) Reading Solr Schema
10-18 14:24:05 [INFO 级别][ schema.IndexSchema ]类 (427 行) Schema name=collection1
10-18 14:24:05 [INFO 级别][ node.AbstractShardOperation ]类 (75 行) publish shard 'solrhome1#proxy'
10-18 14:24:05 [INFO 级别][ core.SolrCore ]类 (1325 行) [solrhome1#proxy] Registered new searcher Searcher@61db327f main
10-18 14:24:05 [INFO 级别][ core.SolrCore ]类 (43 行) QuerySenderListener sending requests to Searcher@48b4e8e2 main
10-18 14:24:05 [INFO 级别][ core.SolrCore ]类 (1386 行) [solrhome1#proxy] webapp=null path=null params={event=firstSearcher&q=solr+first+searching..} hits=0 status=0 QTime=6 
10-18 14:24:05 [INFO 级别][ core.SolrCore ]类 (83 行) QuerySenderListener done.
10-18 14:24:05 [INFO 级别][ component.SpellCheckComponent ]类 (673 行) Loading spell index for spellchecker: default
10-18 14:24:05 [INFO 级别][ search.SolrIndexSearcher ]类 (256 行) Closing Searcher@61db327f main
	fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
	filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
	queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=1,evictions=0,size=1,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
	documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
10-18 14:24:05 [INFO 级别][ core.SolrCore ]类 (1325 行) [solrhome1#proxy] Registered new searcher Searcher@48b4e8e2 main
At this point, a single Node has finished loading a single Shard.

step3 is over!


