13.1 Getting started with SolrCloud
13.1.1 Starting Solr in cloud mode
Set up a cluster on a single machine: each Solr node is simulated by a separate port.
cd $SOLR_INSTALL/
cp -r example/ shard1/
13.1.2 Motivation behind the SolrCloud architecture
■ Scalability
■ High availability
■ Consistency
■ Simplicity
■ Elasticity
----------------------------------------
■ Scalability
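* Replication improves fault tolerance and lets queries run in parallel.
The goal is linear scalability, but in practice adding resources also adds management overhead, so we can only approach that goal.
A single Solr index holds at most about 2.1 billion documents (Lucene's internal document ID is a signed 32-bit int). Solution: split the index into shards.
Large documents and many fields need more memory and faster disk I/O. Solution: add RAM and faster disks.
Indexing throughput: thousands of documents per second must be indexed. Solution: distributed indexing.
Query volume: use replication to serve queries in parallel.
Query complexity (facets, sorting, etc.): use sharding and replication.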
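The per-index document cap translates into a simple lower bound on the shard count. The sketch below is only that arithmetic: `MAX_DOCS_PER_INDEX` is taken as the largest signed 32-bit int, and `min_shards` is a hypothetical helper name. Real sizing would normally target far fewer documents per shard than the hard ceiling.

```python
# Lucene addresses documents with a signed 32-bit int, so one index
# (one shard) can hold at most roughly 2.1 billion documents.
# Assumption: we use 2**31 - 1 as the theoretical ceiling.
MAX_DOCS_PER_INDEX = 2**31 - 1


def min_shards(total_docs: int, max_docs_per_shard: int = MAX_DOCS_PER_INDEX) -> int:
    """Smallest shard count that keeps every shard at or under the cap."""
    if total_docs < 0 or max_docs_per_shard <= 0:
        raise ValueError("document counts must be non-negative and the cap positive")
    # Integer ceiling division; an empty collection still needs one shard.
    return max(1, -(-total_docs // max_docs_per_shard))
```

For example, `min_shards(5_000_000_000)` gives the minimum number of shards for five billion documents under this ceiling.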
-----------------------------------------------------------------
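■ High availability
Think about it from a business perspective: how much can you spend?
Failover: when a node fails, a healthy node takes over its work.
Data redundancy: on failure there is no need to copy data over to a healthy machine.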
1 Unexpected outages that affect a subset of the nodes in your cluster due to issues
such as hardware faults and loss of network connectivity
2 Planned outages due to upgrades and system maintenance tasks
3 Degraded service due to heavy system load
4 Disasters that take your entire cluster/data center offline
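Solr provides high availability within a single data center; multiple data centers are not yet supported.
Two service architectures: 1. every node handles both indexing and queries; 2. master nodes handle indexing and slave nodes handle queries.
Minimize downtime during upgrades: rolling restart.
Another kind of outage: overload. Queries return too slowly, which users will not tolerate!
Solution: a reliable management system and the ability to add nodes quickly.
Advanced topic: hardware-level optimization, such as RAID.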
--------------------------------------------
■ Consistency
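Per the CAP theorem, availability and consistency cannot both be guaranteed once a network partition occurs.
An update must succeed on all replicas, otherwise the whole operation fails. Solr does not allow queries against different replicas to return different versions of a document.
Solr currently has zero tolerance for inconsistency.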
-----------------------------------------------
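■ Simplicity
* Once the cluster is up, operating it is no more complex than operating a single server.
* Recovering a failed node is simple: it syncs automatically.
ZooKeeper can be treated as a black box: once the initial setup is done, it needs little attention.
-----------------------------------------------
■ Elasticity
The ability to grow the system: split shards into smaller shards and add replicas.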
---------------------------------------------------------------
13.2 Core concepts
13.2.1 Collections vs. cores
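A collection provides the complete service for one schema. It can be made up of multiple cores, and each core hosts one replica of a shard.
Shards are disjoint slices of the index; a replica is a copy of a shard. A shard can have several replicas, one of which is the leader.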
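The collection / shard / replica relationship can be written down as a small data model. This is only a sketch of the concepts in these notes; the class names, field names, and the example node addresses are hypothetical and are not Solr's own API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    core_name: str           # the Solr core that physically hosts this copy
    node: str                # hypothetical host:port of the Solr node
    is_leader: bool = False


@dataclass
class Shard:
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def leader(self) -> Replica:
        """Return the single leader replica of this shard."""
        leaders = [r for r in self.replicas if r.is_leader]
        if len(leaders) != 1:
            raise ValueError(f"{self.name} must have exactly one leader")
        return leaders[0]


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def cores(self) -> List[str]:
        """Every core backing the collection, across all shards."""
        return [r.core_name for s in self.shards for r in s.replicas]


# Example: one collection, two disjoint shards, two replicas each.
demo = Collection("demo", [
    Shard("shard1", [Replica("demo_shard1_replica1", "host1:8983", is_leader=True),
                     Replica("demo_shard1_replica2", "host2:8983")]),
    Shard("shard2", [Replica("demo_shard2_replica1", "host3:8983", is_leader=True),
                     Replica("demo_shard2_replica2", "host4:8983")]),
])
```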
13.2.2 ZooKeeper
■ Centralized configuration storage and distribution
■ Detection and notification when the cluster state changes
■ Shard-leader election
Mature, stable, and widely used.
ZOOKEEPER DATA MODEL
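Data is organized in a hierarchy similar to a file system. Each node in the hierarchy is called a znode and carries basic metadata; a znode stores at most 1 MB of data. ZooKeeper is not meant to be a data storage system and holds only small pieces of metadata.
A central concept is the ephemeral znode, which is kept active by a client connection. If the client loses its connection, the ephemeral znode is deleted automatically.
When a Solr node joins the cluster, a znode is created for it in ZooKeeper; if the node loses its connection, ZooKeeper also notifies the other nodes.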
ZNODE WATCHER
Any client application can register as a watcher on a znode; when the znode changes, ZooKeeper notifies the watcher.
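The interplay of ephemeral znodes and watchers can be illustrated with a tiny in-memory model. This is a toy, not the ZooKeeper client API: `ToyZooKeeper` and its methods are hypothetical names, sessions are plain strings, and watches fire once and are then discarded. The znode path is only an example.

```python
class ToyZooKeeper:
    """In-memory stand-in for the znode tree; not the real ZooKeeper API."""

    def __init__(self):
        self.znodes = {}    # path -> owning session id, or None if persistent
        self.watchers = {}  # path -> callbacks waiting for the next change

    def create(self, path, session=None):
        """Create a znode; passing a session id makes it ephemeral."""
        self.znodes[path] = session

    def watch(self, path, callback):
        """Register a one-shot watcher on a znode."""
        self.watchers.setdefault(path, []).append(callback)

    def close_session(self, session):
        """Simulate a lost client connection: drop its ephemeral znodes."""
        if session is None:
            return
        owned = [p for p, owner in self.znodes.items() if owner == session]
        for path in owned:
            del self.znodes[path]
            self._notify(path, "deleted")

    def _notify(self, path, event):
        # Pop the callbacks so each watch fires at most once.
        for callback in self.watchers.pop(path, []):
            callback(path, event)


zk = ToyZooKeeper()
events = []
zk.create("/live_nodes/node1", session="session-1")   # ephemeral znode
zk.watch("/live_nodes/node1", lambda path, event: events.append((path, event)))
zk.close_session("session-1")                          # node loses its connection
```

After the session closes, `events` holds the deletion notification and the znode is gone from `zk.znodes`.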
PRODUCTION CONFIGURATION
For production, set up a standalone ZooKeeper ensemble made up of 3 nodes.
The zkHost parameter passes the ZooKeeper servers and ports to Solr.
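A zkHost value is a comma-separated list of host:port pairs, optionally followed by a chroot path. The sketch below builds such a string and also shows why an ensemble of 3 is a common minimum: ZooKeeper needs a majority of servers up, so 3 servers survive the loss of 1. The helper names and host names are hypothetical.

```python
def zk_host(servers, chroot=""):
    """Build a zkHost connection string from (host, port) pairs."""
    return ",".join(f"{host}:{port}" for host, port in servers) + chroot


def tolerated_failures(ensemble_size):
    """Servers that can fail while a strict majority is still running."""
    return (ensemble_size - 1) // 2


# Hypothetical three-node ensemble on the default client port 2181.
ensemble = [("zk1", 2181), ("zk2", 2181), ("zk3", 2181)]
connection_string = zk_host(ensemble)
```

An even-sized ensemble buys nothing extra here: 4 servers tolerate the same single failure as 3.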
ZOOKEEPER CLIENT TIMEOUT
The timeout ZooKeeper uses when monitoring the state of a Solr node; the default is 15 seconds.
CENTRALIZED CONFIGURATION STORAGE AND DISTRIBUTION
Both solrconfig and schema are uploaded to ZooKeeper!
13.2.3 Choosing the number of shards and replicas
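Determined by factors such as document count, document size, indexing and query throughput, query complexity, and index growth. Covered in chapter 12 on taking Solr to production.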
13.2.4 Cluster-state management
States such as active, inactive, etc.
13.2.5 Shard-leader election
The shard leader accepts update requests and distributes them to the replicas to keep them in sync. Specifically, the leader:
■ Accepts update requests for the shard
■ Increments the value of the _version_ field on the updated document and enforces optimistic locking
■ Writes the document to its update log
■ Sends the update (in parallel) to all replicas and blocks until a response is received
The shard leader has no additional responsibilities at query time.
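The leader's four update duties can be sketched as a small simulation. This is a simplified model, not Solr's implementation: versions are a plain counter here, the optimistic-locking rule is reduced to "the expected version must equal the current one", and `ShardLeader`, `Replica`, and `VersionConflict` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


class VersionConflict(Exception):
    """Raised when the caller's expected _version_ is stale."""


class Replica:
    def __init__(self):
        self.docs = {}

    def apply(self, doc):
        self.docs[doc["id"]] = dict(doc)
        return True


class ShardLeader:
    def __init__(self, replicas):
        self.replicas = replicas
        self.docs = {}
        self.update_log = []
        self._clock = 0  # assumption: a simple counter stands in for real versions

    def update(self, doc, expected_version=None):
        # 1. Accept the update request for this shard.
        current = self.docs.get(doc["id"])
        # 2. Enforce optimistic locking, then assign a new _version_.
        if expected_version is not None:
            actual = current["_version_"] if current else None
            if actual != expected_version:
                raise VersionConflict(f"expected {expected_version}, found {actual}")
        self._clock += 1
        versioned = dict(doc, _version_=self._clock)
        # 3. Write the document to the update log.
        self.update_log.append(versioned)
        self.docs[doc["id"]] = versioned
        # 4. Send to all replicas in parallel and block until all respond.
        with ThreadPoolExecutor(max_workers=max(1, len(self.replicas))) as pool:
            results = list(pool.map(lambda r: r.apply(versioned), self.replicas))
        if not all(results):
            raise RuntimeError("update failed on at least one replica")
        return versioned["_version_"]
```

A client that read version `v1` and passes `expected_version=v1` succeeds only if nobody else updated the document in between; otherwise it gets a `VersionConflict` and has to re-read.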
13.2.6 Important SolrCloud configuration settings
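solr.xml has a <solrcloud> element.
HOST: the IP and port this node reports to ZooKeeper. In production, prefer a host name: it is more readable and easier to update (just change DNS).
Details: pp. 425-426.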
***********************************************************
13.3 Distributed indexing
From the client's perspective, indexing is unchanged. On the server side, indexing changes substantially.
13.3.1 Document shard assignment
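Document router: decides which shard a document is assigned to.
Solr provides two strategies: compositeId (default) and implicit (not discussed here; routing has to be programmed on the client side, i.e. custom routing).
Each shard is assigned a range of the 32-bit hash space; the space is divided evenly among the shards.
The algorithm hashes the unique document ID and assigns the document to the shard whose range contains the hash.
The computation must be fast and must spread documents fairly across the shards.
The MurmurHash algorithm is used.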
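Hash-range routing can be sketched in a few lines. Two loud assumptions: the sketch uses `zlib.crc32` as a stand-in hash because the Python standard library has no MurmurHash, and it treats the hash space as unsigned 0 to 2^32 - 1. The resulting shard assignments therefore do not match what Solr would compute; only the range-splitting idea carries over. `shard_ranges` and `shard_for` are hypothetical names.

```python
import zlib

HASH_SPACE = 2**32  # size of the 32-bit hash space


def shard_ranges(num_shards):
    """Split the 32-bit hash space into contiguous, near-equal ranges."""
    bounds = [i * HASH_SPACE // num_shards for i in range(num_shards + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(num_shards)]


def shard_for(doc_id, ranges):
    """Return the index of the shard whose range contains the ID's hash."""
    # Stand-in hash: crc32 instead of MurmurHash (assumption, see lead-in).
    h = zlib.crc32(doc_id.encode("utf-8"))
    for index, (low, high) in enumerate(ranges):
        if low <= h <= high:
            return index
    raise AssertionError("ranges must cover the whole hash space")
```

Because the shard depends only on the document ID and the fixed ranges, any node can compute the target shard without coordination.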
13.3.2 Adding documents
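SolrJ provides a new SolrServer implementation, CloudSolrServer, which makes indexing more robust.
CloudSolrServer reads the cluster state from ZooKeeper and therefore knows the shard leaders. Because an update request must first be routed to the leader, CloudSolrServer can send it to the leader directly and save time.
For the detailed steps, skim pp. 430-431.
CloudSolrServer automatically groups a batch of documents by shard, so each group is indexed to the correct shard with high throughput.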
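The batching idea can be shown independently of SolrJ. The sketch below is not CloudSolrServer code (that client is Java); it only illustrates grouping a batch by target leader so that each leader receives one request. `group_by_leader`, the `route` callback, the toy modulo router, and the leader URLs are all hypothetical.

```python
from collections import defaultdict


def group_by_leader(docs, route, leaders):
    """Group documents by the leader URL of the shard they route to.

    route:   function mapping a document ID to a shard index
    leaders: dict mapping shard index to that shard's leader URL
    """
    batches = defaultdict(list)
    for doc in docs:
        batches[leaders[route(doc["id"])]].append(doc)
    return dict(batches)


# Toy router for illustration only: numeric ID modulo the shard count.
leaders = {0: "http://host1:8983/solr", 1: "http://host2:8983/solr"}
docs = [{"id": str(n)} for n in range(1, 5)]
batches = group_by_leader(docs, lambda doc_id: int(doc_id) % 2, leaders)
```

Each entry of `batches` can then be sent as a single bulk request to its leader instead of forwarding documents one by one.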
13.3.3 NRT
In practice this is soft commit; skipped.
13.3.4 Node recovery
■ Peer sync: If the outage was short-lived and the recovering node missed only a few updates, it will recover by pulling updates from the shard leader's update log. The upper limit on missed updates is currently hardcoded to 100. If the number of missed updates exceeds this limit, the recovering node pulls a full index snapshot from the shard leader.
■ Snapshot replication: If a node is offline for an extended period of time such that it becomes too far out of sync with the shard leader, it uses Solr's HTTP-based replication, based on the snapshot of the index.
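The choice between the two recovery modes comes down to one threshold. The sketch below models only that decision; the update log is a plain list of dicts carrying a `_version_`, and `recover` is a hypothetical name, not Solr's recovery code.

```python
PEER_SYNC_LIMIT = 100  # upper limit on missed updates, as stated above


def recover(replica_versions, leader_log):
    """Pick a recovery mode and return (mode, updates_to_apply).

    replica_versions: set of _version_ values the replica already has
    leader_log:       the leader's updates, oldest first
    """
    missed = [u for u in leader_log if u["_version_"] not in replica_versions]
    if len(missed) <= PEER_SYNC_LIMIT:
        # Short outage: replay only the missed updates from the leader's log.
        return "peer_sync", missed
    # Too far behind: copy everything, standing in for a full index snapshot.
    return "snapshot_replication", list(leader_log)
```

A replica that missed exactly 100 updates still uses peer sync; the 101st missed update tips it over to snapshot replication.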
-----------------------------------------------------------------------------
13.4 Distributed search