These commands start up a Solr server and bootstrap a new SolrCloud cluster.
cd example
java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
- -DzkRun causes an embedded ZooKeeper server to be run as part of this Solr server.
- -Dbootstrap_confdir=./solr/collection1/conf Since we don't yet have a config in ZooKeeper, this parameter causes the local configuration directory ./solr/collection1/conf to be uploaded to ZooKeeper as the "myconf" config. The name "myconf" is taken from the "collection.configName" param below.
- -Dcollection.configName=myconf sets the config to use for the new collection. Omitting this param will cause the config name to default to "configuration1".
- -DnumShards=2 sets the number of logical partitions we plan on splitting the index into.
If at any point you wish to start over fresh or experiment with different configurations, you can delete all of the cloud state contained within ZooKeeper by simply deleting the solr/zoo_data directory after shutting down the servers.
For example, to reconfigure the cluster with a different shard count (say, one more shard, numShards=3), delete the solr/zoo_data directory and rerun java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=3 -jar start.jar.
If you don't need to reconfigure anything, later restarts of Jetty only need java -DzkRun -jar start.jar; the long command above is only required the first time, to bootstrap ZooKeeper.
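To confirm the bootstrap worked, you can list what was uploaded to ZooKeeper. A quick sketch, assuming the default embedded ZooKeeper port (jetty.port + 1000, i.e. 9983 here) and the zkcli script that Solr 4.x ships under example/cloud-scripts:
cd example/cloud-scripts
sh zkcli.sh -zkhost localhost:9983 -cmd list
The uploaded files should appear under /configs/myconf; the Cloud screen of the admin UI (http://localhost:8983/solr/#/~cloud) shows the same tree graphically.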
cd exampledocs
java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar ipod_video.xml
java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar monitor.xml
java -Durl=http://localhost:8983/solr/collection1/update -jar post.jar mem.xml
This indexes the documents against the server on port 8983; afterwards the Core Admin screen there shows 5 docs.
java -Durl=http://localhost:7574/solr/collection1/update -jar post.jar monitor2.xml
This indexes a document against the server on port 7574; its Core Admin screen shows 1 doc.
Now open http://localhost:8983/solr/collection1/select?q=*:* in a browser to run a query against the server on port 8983. The result set contains 6 docs, not just the 5 docs held by the 8983 node itself, which shows that distributed search is working: the doc on the 7574 node is returned as well.
Likewise, http://localhost:7574/solr/collection1/select?q=*:* also returns 6 docs, even though we indexed only one doc against the 7574 node.
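If you want to see what each core holds locally, the standard distrib=false parameter turns distributed search off for a single request; a quick way to contrast the two behaviors:
http://localhost:8983/solr/collection1/select?q=*:*&distrib=false
http://localhost:7574/solr/collection1/select?q=*:*&distrib=false
The first should report numFound=5 and the second numFound=1, matching the per-core counts above, while the plain queries return all 6.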
Next, make a copy of each of the two example directories above:
cp -r example exampleB
cp -r example2 example2B
Then start both new servers:
cd exampleB
java -Djetty.port=8900 -DzkHost=localhost:9983 -jar start.jar
cd example2B
java -Djetty.port=7500 -DzkHost=localhost:9983 -jar start.jar
Because we have been telling Solr that we want two logical shards, instances 3 and 4 are automatically assigned as additional replicas of those shards.
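One way to check that the new instances really joined as replicas is to dump the cluster state from ZooKeeper; a sketch using the same zkcli script as before (Solr 4.x keeps the state in /clusterstate.json):
sh zkcli.sh -zkhost localhost:9983 -cmd get /clusterstate.json
Both shard1 and shard2 should now list two replicas each; the admin UI's Cloud graph shows the same picture.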
Now visit http://localhost:7500/solr/collection1/select?q=*:*
Send this query multiple times and observe the logs from the Solr servers. You should be able to observe Solr load balancing the requests across replicas (done via LBHttpSolrServer), using different servers to satisfy each request. There will be a log statement for the top-level request in the server the browser sends the request to, and then a log statement for each sub-request that is merged to produce the complete response.
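A simple way to fire off repeated requests for this experiment, assuming curl is available:
for i in 1 2 3 4 5; do curl -s "http://localhost:7500/solr/collection1/select?q=*:*" > /dev/null; done
Then compare the request logs in the four Solr windows: the sub-requests should rotate across the replicas of each shard.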
To demonstrate fail-over for high availability, press CTRL-C in the window running any one of the Solr servers except the instance running ZooKeeper. (We'll talk about ZooKeeper redundancy in Example C.) Once that server instance terminates, send another query request to any of the remaining servers that are up. You should continue to see the full results.
In short, as long as the Solr server embedding ZooKeeper stays up and at least one server in each shard remains available, queries will still return complete results.
SolrCloud can continue to serve results without interruption as long as at least one server hosts every shard. You can demonstrate this by judiciously shutting down various instances and looking for results. If you have killed all of the servers for a particular shard, requests to other servers will result in a 503 error. To return just the documents that are available in the shards that are still alive (and avoid the error), add the following query parameter: shards.tolerant=true
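For example, with every server of one shard down, a query like the following still returns the documents from the surviving shard instead of a 503:
http://localhost:8983/solr/collection1/select?q=*:*&shards.tolerant=true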
While adding more ZooKeeper nodes will help some with read performance, it will slightly hurt write performance.
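The Example C mentioned above covers ZooKeeper redundancy; as a rough sketch of the idea, each Solr instance is started with -DzkRun plus the full ensemble address list in -DzkHost (the ports here are illustrative, following the jetty.port + 1000 convention):
java -DzkRun -DzkHost=localhost:9983,localhost:8574,localhost:9900 -jar start.jar
With three ZooKeeper nodes, the ensemble keeps accepting writes as long as a majority (two of the three) are up, which is why adding nodes improves fault tolerance and read capacity at a small cost to writes.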