Elasticsearch Cluster Study Notes

1. Allocation awareness

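Primary shards are not moved, but replica shards are moved to nodes that have a different node.rack value. Allocation awareness is a convenient way to guard against a single point of failure. A common approach is to divide the cluster topology by location, by rack, or even by virtual machine.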

Shard-based allocation

cluster.routing.allocation.awareness.attributes: rack
node.rack: 1
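
To make the intent of the awareness setting concrete, here is a toy Python check (not Elasticsearch code) that reports shards whose copies ended up on the same rack. The node names, the `node_rack` mapping, and the `copies_on_same_rack` helper are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical node -> node.rack value mapping (illustrative only).
node_rack = {"node-a": "1", "node-b": "1", "node-c": "2"}

def copies_on_same_rack(placements, node_rack):
    """Return the ids of shards that have two or more copies on one rack.

    placements: list of (shard_id, node_name) tuples, one per shard copy.
    """
    racks_per_shard = defaultdict(list)
    for shard_id, node in placements:
        racks_per_shard[shard_id].append(node_rack[node])
    # A shard breaks awareness if any rack value appears more than once.
    return sorted(
        shard_id
        for shard_id, racks in racks_per_shard.items()
        if len(set(racks)) < len(racks)
    )

good = [(0, "node-a"), (0, "node-c")]  # primary on rack 1, replica on rack 2
bad = [(0, "node-a"), (0, "node-b")]   # both copies on rack 1

print(copies_on_same_rack(good, node_rack))  # []
print(copies_on_same_rack(bad, node_rack))   # [0]
```

With rack awareness enabled, the cluster aims for the first layout, so losing one rack still leaves a copy of every shard.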

Zone-based allocation

cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: us-east, us-west
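
The effect of forcing zone values can be sketched with a deliberately simplified model: each listed zone may hold at most its even share of a shard's copies, so copies meant for a zone that is down stay unassigned instead of piling up in the surviving zone. The function name and the even-share rule below are assumptions for illustration, not the actual allocation decider logic.

```python
import math

def assignable_copies(total_copies, forced_values, live_zones):
    """Return (assigned, unassigned) copies of one shard in a toy model.

    total_copies: primary plus replicas for the shard.
    forced_values: zone names listed in the force setting.
    live_zones: set of zones that currently have nodes.
    """
    # Assumed rule: each forced zone holds at most an even share of copies.
    per_zone_cap = math.ceil(total_copies / len(forced_values))
    live = [zone for zone in forced_values if zone in live_zones]
    assigned = min(total_copies, per_zone_cap * len(live))
    return assigned, total_copies - assigned

zones = ["us-east", "us-west"]
print(assignable_copies(2, zones, {"us-east", "us-west"}))  # (2, 0)
print(assignable_copies(2, zones, {"us-east"}))             # (1, 1)
```

In the second call only us-east is up, so the replica waits for a us-west node rather than being placed next to its primary.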

2. Monitoring for bottlenecks

Cluster health API request

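curl -XGET 'localhost:9200/_cluster/health?pretty'
The response looks like this:

    {
      "cluster_name" : "elasticiq",
      "status" : "green",  ←--- Cluster status indicator: a convenient overall health indicator for the cluster
      "timed_out" : false,
      "number_of_nodes" : 2,  ←--- Total number of nodes in the cluster
      "number_of_data_nodes" : 2,  ←--- Number of nodes in the cluster that hold data
      "active_primary_shards" : 10,  ←--- Total number of primary shards across all indices in the cluster
      "active_shards" : 10,  ←--- Total number of shards, primaries and replicas, across all indices
      "relocating_shards" : 0,  ←--- Number of shards currently moving between nodes
      "initializing_shards" : 0,  ←--- Number of newly created shards
      "unassigned_shards" : 0  ←--- Number of shards defined in the cluster but not found
    }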
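
As a small illustration of reading these fields, the Python sketch below parses a health response body and flags anything that needs attention. The `summarize_health` helper is hypothetical, and the sample body simply repeats the values from the response above.

```python
import json

def summarize_health(raw):
    """Summarize a _cluster/health JSON response body."""
    health = json.loads(raw)
    # active_shards counts primaries and replicas, so the difference is replicas.
    replicas = health["active_shards"] - health["active_primary_shards"]
    problems = []
    if health["status"] != "green":
        problems.append("status is " + health["status"])
    if health["unassigned_shards"] > 0:
        problems.append(str(health["unassigned_shards"]) + " unassigned shards")
    return {"active_replica_shards": replicas, "problems": problems}

sample = """
{
  "cluster_name": "elasticiq",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 10,
  "active_shards": 10,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0
}
"""

print(summarize_health(sample))
# {'active_replica_shards': 0, 'problems': []}
```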

Slow logs

The slow query log and the slow index log

curl -XPUT 'http://lzz:9201/_settings/' -d '
{
    "index.indexing.slowlog.level": "INFO",
    "index.indexing.slowlog.threshold.index.warn": "10s",
    "index.indexing.slowlog.threshold.index.info": "5s",
    "index.indexing.slowlog.threshold.index.debug": "2s",
    "index.indexing.slowlog.threshold.index.trace": "50ms",
    "index.indexing.slowlog.source": "1000",
    "index.search.slowlog.level": "INFO",
    "index.search.slowlog.threshold.query.warn": "10s",
    "index.search.slowlog.threshold.query.info": "5s",
    "index.search.slowlog.threshold.query.debug": "2s",
    "index.search.slowlog.threshold.query.trace": "50ms",
    "index.search.slowlog.threshold.fetch.warn": "1s",
    "index.search.slowlog.threshold.fetch.info": "800ms",
    "index.search.slowlog.threshold.fetch.debug": "500ms",
    "index.search.slowlog.threshold.fetch.trace": "20ms"
}'
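
To show how the thresholds relate to each other, here is a simplified Python model that maps a query duration to the most severe level whose threshold it exceeds. The `slowlog_level` helper is hypothetical, and the model ignores the separate slowlog.level setting.

```python
# Query-phase thresholds from the settings above, in milliseconds,
# ordered from most to least severe.
QUERY_THRESHOLDS_MS = [
    ("warn", 10_000),
    ("info", 5_000),
    ("debug", 2_000),
    ("trace", 50),
]

def slowlog_level(took_ms, thresholds=QUERY_THRESHOLDS_MS):
    """Return the most severe level whose threshold took_ms exceeds, or None."""
    for level, limit_ms in thresholds:
        if took_ms > limit_ms:
            return level
    return None  # faster than every threshold: nothing is logged

print(slowlog_level(12_000))  # warn
print(slowlog_level(3_000))   # debug
print(slowlog_level(10))      # None
```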

In the logging.yml file you can configure the actual file that receives the log output, along with other logging options.

index_search_slow_log_file:
  type: dailyRollingFile
  file: ${path.logs}/${cluster.name}_index_search_slowlog.log
  datePattern: "'.'yyyy-MM-dd"
  layout:
    type: pattern
    conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"

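3. Thread pools

Each node in the cluster manages its CPU and memory usage through thread pools, and ES uses these thread pools to optimize node performance.
There are two thread pool types, fixed and cache, described below: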

fixed

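The fixed thread pool type keeps a fixed number of threads to handle requests, and requests waiting to be executed are held in a backing queue. In the configuration file, threadpool.bulk.size sets the number of threads, which for the fixed type defaults to 5 times the number of CPU cores, and threadpool.bulk.queue_size sets the capacity of the queue for pending requests.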
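
The interplay between the thread count and the backing queue can be sketched with a small Python model. It is a counting model only, with no real threads; `FixedPoolModel` is a hypothetical name, and the rejection of requests once the queue is full is stated here as an assumption of the model.

```python
from collections import deque

class FixedPoolModel:
    """Toy model of a fixed pool: `size` threads plus a bounded queue."""

    def __init__(self, size, queue_size):
        self.size = size
        self.queue_size = queue_size
        self.running = 0
        self.queue = deque()
        self.rejected = 0

    def submit(self, request):
        if self.running < self.size:
            self.running += 1           # a free thread takes the request
            return "running"
        if len(self.queue) < self.queue_size:
            self.queue.append(request)  # all threads busy: wait in the queue
            return "queued"
        self.rejected += 1              # queue full: the request is rejected
        return "rejected"

    def finish_one(self):
        """One thread finishes and picks up queued work if there is any."""
        if self.queue:
            self.queue.popleft()        # thread stays busy with the next request
        elif self.running > 0:
            self.running -= 1

pool = FixedPoolModel(size=2, queue_size=3)
results = [pool.submit(i) for i in range(7)]
print(results)
# ['running', 'running', 'queued', 'queued', 'queued', 'rejected', 'rejected']
```

A larger queue absorbs longer bursts at the cost of memory and latency, while a larger thread count raises concurrency at the cost of CPU contention.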

cache

The cache thread pool type places no limit on the number of threads: whenever a request is waiting to be executed, a new thread is created.
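
For contrast with the fixed type, the unbounded behavior can be modeled in a few lines of Python. `CachePoolModel` is a hypothetical counting model with no real threads; the reuse of idle threads before creating new ones is an assumption borrowed from how cached pools commonly behave.

```python
class CachePoolModel:
    """Toy model of an unbounded pool that never queues or rejects."""

    def __init__(self):
        self.idle = 0
        self.busy = 0
        self.created = 0

    def submit(self):
        if self.idle > 0:
            self.idle -= 1     # assumed: reuse an idle thread if one exists
        else:
            self.created += 1  # otherwise create a new thread, with no upper limit
        self.busy += 1

    def finish_one(self):
        if self.busy > 0:
            self.busy -= 1
            self.idle += 1

cache_pool = CachePoolModel()
for _ in range(5):
    cache_pool.submit()
print(cache_pool.created)  # 5
```

Because nothing caps the thread count, a burst of requests translates directly into a burst of threads.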