[Elasticsearch Notes] Cross-Cluster Search, Clusters, and APIs

Master Node

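Responsibilities

  • Handles requests such as creating and deleting indices
  • Decides which node each shard is allocated to
  • Maintains and updates the Cluster State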

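Best practices

  • The master is critical, so plan for it not to be a single point of failure
  • Configure multiple master-eligible nodes in the cluster, and have each of them serve only the master role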

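Master Node and Master-Eligible Nodes

  • By default every node starts as a Master-Eligible Node. This can be disabled with node.master: false
  • When the first Master-Eligible Node starts, it elects itself as the master
  • The cluster state holds the essential information about the cluster:
    • Information about all nodes
    • All indices with their Mapping and Setting information
    • Shard routing information
  • Every node keeps a copy of the cluster state
  • Only the master may modify the cluster state and publish it to the other nodes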

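Master election process

  • When the first Master-Eligible Node starts, it elects itself as the master
  • Master-eligible nodes ping each other, and the node with the lowest Node Id becomes the elected master
  • If the elected node is found to be lost, a new master is elected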
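
The election rule above can be illustrated with a small Python sketch. This is a simplified model, not Elasticsearch's actual implementation: the `Node` class, its fields, and the `elect_master` function are hypothetical names introduced only for this example, and a successful ping is modeled as a `reachable` flag.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    node_id: str             # hypothetical ID; compared as a plain string here
    master_eligible: bool
    reachable: bool = True   # assumption: True means the node answered the ping


def elect_master(nodes: Iterable[Node]) -> Optional[Node]:
    """Return the reachable master-eligible node with the lowest node ID."""
    candidates = [n for n in nodes if n.master_eligible and n.reachable]
    return min(candidates, key=lambda n: n.node_id, default=None)


nodes = [
    Node("node-b", master_eligible=True),
    Node("node-a", master_eligible=True),
    Node("node-c", master_eligible=False),
]
print(elect_master(nodes).node_id)  # node-a

# If the elected master is lost, the next-lowest eligible node is chosen.
survivors = [
    Node("node-b", master_eligible=True),
    Node("node-a", master_eligible=True, reachable=False),
]
print(elect_master(survivors).node_id)  # node-b
```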

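Split-brain problem

  • Quorum = (total number of master-eligible nodes / 2) + 1, using integer division. An election can proceed only when the number of master-eligible nodes that can see each other is at least the quorum, and the master must belong to such a group. This prevents split brain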
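
As a worked example of the quorum rule, here is a minimal Python sketch. The function names `quorum` and `can_elect` are made up for illustration, and the sketch assumes integer division and an "at least quorum" comparison.

```python
def quorum(master_eligible_total: int) -> int:
    """Quorum = (total master-eligible nodes // 2) + 1."""
    return master_eligible_total // 2 + 1


def can_elect(visible: int, master_eligible_total: int) -> bool:
    """A partition may elect a master only if it sees at least a quorum."""
    return visible >= quorum(master_eligible_total)


for total in (1, 2, 3, 5):
    print(total, quorum(total))  # 1 1, 2 2, 3 2, 5 3

# A 3-node cluster split into partitions of 2 and 1:
print(can_elect(2, 3))  # True: the larger side keeps or elects a master
print(can_elect(1, 3))  # False: the isolated node cannot become master
```

With two master-eligible nodes the quorum is also 2, so losing either node blocks elections, which is one reason an odd number of master-eligible nodes (three, for example) is commonly chosen.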

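Cross-cluster search

  • When a cluster holds too much metadata, the active master becomes a bottleneck
  • Early cross-cluster search was implemented with the Tribe Node, which had a number of problems
    • It joins each cluster as a Client Node. When a cluster's master changes, the change needs a response from the Tribe Node before it can proceed
    • A Tribe Node does not store the Cluster State, so initialization after a restart is slow
    • When several clusters contain indices with the same name, only a single prefer rule can be configured
  • ES 5.3 introduced the cross-cluster search feature: Cross Cluster Search
    • Any node can act as the federated node
    • It proxies search requests in a lightweight way and forwards them to each cluster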

Start the clusters

bin/elasticsearch -E node.name=cluster0node -E cluster.name=cluster0 -E path.data=cluster0_data -E discovery.type=single-node -E http.port=9200 -E transport.port=9300
bin/elasticsearch -E node.name=cluster1node -E cluster.name=cluster1 -E path.data=cluster1_data -E discovery.type=single-node -E http.port=9201 -E transport.port=9301
bin/elasticsearch -E node.name=cluster2node -E cluster.name=cluster2 -E path.data=cluster2_data -E discovery.type=single-node -E http.port=9202 -E transport.port=9302

## Apply the following settings on each cluster

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster0": {
          "seeds": [
            "127.0.0.1:9300"
          ],
          "transport.ping_schedule": "30s"
        },
        "cluster1": {
          "seeds": [
            "127.0.0.1:9301"
          ],
          "transport.compress": true,
          "skip_unavailable": true
        },
        "cluster2": {
          "seeds": [
            "127.0.0.1:9302"
          ]
        }
      }
    }
  }
}


curl -XPUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{"persistent":{"cluster":{"remote":{"cluster0":{"seeds":["127.0.0.1:9300"],"transport.ping_schedule":"30s"},"cluster1":{"seeds":["127.0.0.1:9301"],"transport.compress":true,"skip_unavailable":true},"cluster2":{"seeds":["127.0.0.1:9302"]}}}}}'

curl -XPUT "http://localhost:9201/_cluster/settings" -H 'Content-Type: application/json' -d'
{"persistent":{"cluster":{"remote":{"cluster0":{"seeds":["127.0.0.1:9300"],"transport.ping_schedule":"30s"},"cluster1":{"seeds":["127.0.0.1:9301"],"transport.compress":true,"skip_unavailable":true},"cluster2":{"seeds":["127.0.0.1:9302"]}}}}}'

curl -XPUT "http://localhost:9202/_cluster/settings" -H 'Content-Type: application/json' -d'
{"persistent":{"cluster":{"remote":{"cluster0":{"seeds":["127.0.0.1:9300"],"transport.ping_schedule":"30s"},"cluster1":{"seeds":["127.0.0.1:9301"],"transport.compress":true,"skip_unavailable":true},"cluster2":{"seeds":["127.0.0.1:9302"]}}}}}'
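
The three curl calls above send the same body to different HTTP ports. A minimal Python sketch of the same idea, using only the standard library, might look like this. The helper names `remote_cluster_settings`, `build_request`, and `apply_settings` are hypothetical, and the ports and seed addresses are simply the ones from the commands above.

```python
import json
import urllib.request
from typing import Any, Dict


def remote_cluster_settings(remotes: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap per-alias remote settings in a persistent cluster settings body."""
    return {"persistent": {"cluster": {"remote": remotes}}}


remotes = {
    "cluster0": {"seeds": ["127.0.0.1:9300"], "transport.ping_schedule": "30s"},
    "cluster1": {
        "seeds": ["127.0.0.1:9301"],
        "transport.compress": True,
        "skip_unavailable": True,
    },
    "cluster2": {"seeds": ["127.0.0.1:9302"]},
}
body = json.dumps(remote_cluster_settings(remotes))


def build_request(http_port: int) -> urllib.request.Request:
    """Build (but do not send) the PUT _cluster/settings request for one cluster."""
    return urllib.request.Request(
        f"http://localhost:{http_port}/_cluster/settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def apply_settings(http_port: int) -> str:
    """Send the request; requires the cluster on that port to be running."""
    with urllib.request.urlopen(build_request(http_port)) as resp:
        return resp.read().decode("utf-8")


for port in (9200, 9201, 9202):
    print(build_request(port).get_method(), build_request(port).full_url)
```

Calling `apply_settings(9200)` and so on would perform the actual update; the loop above only prints the requests so the sketch can be run without a live cluster.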

Index some data

curl -XPOST "http://localhost:9200/users/_doc" -H 'Content-Type: application/json' -d'
{"name":"user1","age":10}'

curl -XPOST "http://localhost:9201/users/_doc" -H 'Content-Type: application/json' -d'
{"name":"user2","age":20}'

curl -XPOST "http://localhost:9202/users/_doc" -H 'Content-Type: application/json' -d'
{"name":"user3","age":30}'

Run a cross-cluster search

GET /users,cluster1:users,cluster2:users/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 40
      }
    }
  }
}
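
In the request path above, a bare index name targets the local cluster and `alias:index` targets a remote cluster registered under that alias. The Python sketch below assembles such a path and query body; `ccs_index_expression` and `age_range_query` are hypothetical helper names used only for illustration.

```python
import json
from typing import Any, Dict, Iterable


def ccs_index_expression(index: str, remote_aliases: Iterable[str],
                         include_local: bool = True) -> str:
    """Join the local index and alias-prefixed remote indices with commas."""
    parts = [index] if include_local else []
    parts += [f"{alias}:{index}" for alias in remote_aliases]
    return ",".join(parts)


def age_range_query(gte: int, lte: int) -> Dict[str, Any]:
    """Range query on the `age` field, matching the request above."""
    return {"query": {"range": {"age": {"gte": gte, "lte": lte}}}}


path = "/" + ccs_index_expression("users", ["cluster1", "cluster2"]) + "/_search"
print(path)  # /users,cluster1:users,cluster2:users/_search
print(json.dumps(age_range_query(20, 40)))
```

With the three documents indexed earlier (ages 10, 20, and 30), this range should match the documents stored in cluster1 and cluster2, while the local document with age 10 falls outside it.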

demo from ES

# If no filters are given, the default is to select all nodes
GET /_nodes
# Explicitly select all nodes
GET /_nodes/_all
# Select just the local node
GET /_nodes/_local
# Select the elected master node
GET /_nodes/_master
# Select nodes by name, which can include wildcards
GET /_nodes/node_name_goes_here
GET /_nodes/node_name_goes_*
# Select nodes by address, which can include wildcards
GET /_nodes/10.0.0.3,10.0.0.4
GET /_nodes/10.0.0.*
# Select nodes by role
GET /_nodes/_all,master:false
GET /_nodes/data:true,ingest:true
GET /_nodes/coordinating_only:true
# Select nodes by custom attribute (e.g. with something like `node.attr.rack: 2` in the configuration file)
GET /_nodes/rack:2
GET /_nodes/ra*:2
GET /_nodes/ra*:2*

GET _cluster/health
GET /_cluster/health/twitter_v1,twitter_v2
GET /_cluster/health?wait_for_status=yellow&timeout=50s
GET /_cluster/health/twitter?level=shards

GET /_cluster/state
GET /_cluster/state/metadata,routing_table/twitter_v1,twitter_v2
GET /_cluster/state/_all/twitter_v1,twitter_v2
GET /_cluster/state/blocks
GET /_cluster/stats?human&pretty
GET /_cluster/stats/nodes/node1,node*,master:false

GET /_cluster/pending_tasks

POST /_cluster/reroute
{
  "commands": [
    {
      "move": {
        "index": "test",
        "shard": 0,
        "from_node": "node1",
        "to_node": "node2"
      }
    },
    {
      "allocate_replica": {
        "index": "test",
        "shard": 1,
        "node": "node3"
      }
    }
  ]
}

#
# order of precedence: transient, persistent, elasticsearch.yml
# It’s best to set all cluster-wide settings with the settings API and use the elasticsearch.yml file only for local configurations.
#
GET /_cluster/settings
GET /_cluster/settings?include_defaults=true
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": "50mb"
  }
}
PUT /_cluster/settings?flat_settings=true
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "20mb"
  }
}
#
# null means default
#
PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": null
  }
}
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.max_bytes_per_sec": null
  }
}
PUT /_cluster/settings
{
    "transient" : {
        "indices.recovery.*" : null
    }
}
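
The precedence order noted in the comments above (transient, then persistent, then elasticsearch.yml) and the effect of setting a value to null can be modeled with a short Python sketch. This is an illustrative model only; `effective_setting` is a hypothetical function and the `"<built-in default>"` string is a placeholder, not a real value.

```python
from typing import Any, Dict

KEY = "indices.recovery.max_bytes_per_sec"


def effective_setting(key: str,
                      transient: Dict[str, Any],
                      persistent: Dict[str, Any],
                      yml: Dict[str, Any],
                      default: Any = "<built-in default>") -> Any:
    """Return the first non-null value: transient, persistent, then yml."""
    for source in (transient, persistent, yml):
        if source.get(key) is not None:
            return source[key]
    return default


transient = {KEY: "20mb"}
persistent = {KEY: "50mb"}
yml: Dict[str, Any] = {}

print(effective_setting(KEY, transient, persistent, yml))  # 20mb

# Setting the transient value to null removes it, so persistent applies.
transient[KEY] = None
print(effective_setting(KEY, transient, persistent, yml))  # 50mb

# Resetting the persistent value as well falls back to the default.
persistent[KEY] = None
print(effective_setting(KEY, transient, persistent, yml))  # <built-in default>
```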

GET /_nodes/stats
GET /_nodes/es7_01,es7_02/stats/process
GET /_nodes/stats/os,process
GET /_nodes/stats/indices
GET /_nodes/172.19.0.2/stats/process
GET /_nodes/stats/indices/fielddata
GET /_nodes/stats/indices/fielddata?level=indices&fields=docs,get
GET /_nodes/stats/indices/fielddata?level=shards&fields=field1,field2
GET /_nodes/stats/indices/fielddata?fields=field*
GET /_nodes/stats?groups=_all
GET /_nodes/stats/indices?groups=foo,bar
GET /_nodes
GET /_nodes/es7_01,es7_02/process
GET /_nodes/process
GET /_nodes/_all/process
GET /_nodes/es7_01,es7_02/jvm,process
GET /_nodes/es7_01,es7_02/info/jvm,process
GET /_nodes/es7_01,es7_02/_all
GET /_nodes/plugins
# ingest - if set, the result will contain details about the available processors per node
GET /_nodes/ingest

GET _nodes/usage
GET _nodes/es7_02/usage

GET /_remote/info

GET _tasks
GET _tasks?nodes=es7_01,es7_02
GET _tasks?nodes=es7_01,es7_02&actions=cluster:*
GET _tasks/P9t-IW47Q0qmIMpGSxXnEw:113339
GET _tasks?actions=*search&detailed
GET _cat/tasks?detailed&v
GET /_nodes/hot_threads
GET /_nodes/es7_01,es7_02/hot_threads
GET /_cluster/allocation/explain
POST /_cluster/voting_config_exclusions/es7_01
DELETE /_cluster/voting_config_exclusions
