Brightoil Petroleum interview (partial)

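  • Which service registry do you use?
  • Do you know how ZooKeeper (zk) leader election works?
  • What is the principle or flow of an ES query? I answered Lucene and the inverted index. Q: What about the layer above that? Reference answer [1]: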
  1. Query phase:
    When a search request is sent to a node, that node becomes the coordinating node. It is the job of this node to broadcast the search request to all involved shards, and to gather their responses into a globally sorted result set that it can return to the client:
    … forwards the search request to a primary or replica copy of every shard in the index. Each shard executes the query locally and adds the results into a local sorted priority queue.
    Each shard returns the doc IDs and sort values of all the docs in its priority queue to the coordinating node, which merges these values into its own priority queue to produce a globally sorted list of results.
  2. Fetch Phase:
    The coordinating node identifies which documents need to be fetched and issues a multi GET request to the relevant shards.
    Once all documents have been fetched, the coordinating node returns the results to the client.
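
The two phases above can be sketched as a small in-memory simulation. This is only an illustration of the query-then-fetch idea, not Elasticsearch's actual implementation: the `SHARDS` data, `query_shard`, and `search` are hypothetical names made up for this example, and each shard is modelled as a plain dictionary.

```python
import heapq

# Hypothetical in-memory "shards": each maps doc ID -> (score, source document).
SHARDS = {
    "shard0": {"a": (0.9, {"title": "doc a"}), "b": (0.2, {"title": "doc b"})},
    "shard1": {"c": (0.7, {"title": "doc c"}), "d": (0.5, {"title": "doc d"})},
}


def query_shard(shard_name, top_n):
    """Query phase on one shard: return only sort values and doc IDs, no bodies."""
    docs = SHARDS[shard_name]
    hits = [(score, shard_name, doc_id) for doc_id, (score, _) in docs.items()]
    # Local priority queue: keep the shard's best top_n hits, highest score first.
    return heapq.nlargest(top_n, hits)


def search(from_, size):
    """Coordinating node: merge per-shard results, then fetch the needed docs."""
    top_n = from_ + size

    # Query phase: ask one copy of every shard, then merge into a global ranking.
    merged = []
    for shard_name in SHARDS:
        merged.extend(query_shard(shard_name, top_n))
    page = heapq.nlargest(top_n, merged)[from_:]

    # Fetch phase: only now read the document bodies, and only for the final page.
    return [
        {"_id": doc_id, "_score": score, "_source": SHARDS[shard][doc_id][1]}
        for score, shard, doc_id in page
    ]


if __name__ == "__main__":
    for hit in search(from_=0, size=2):
        print(hit)
```

Note that each shard has to return `from_ + size` candidates, because the coordinating node cannot know in advance which shard holds the globally best hits; document bodies are read only for the hits that make it onto the final page.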

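A related follow-up question: why does increasing the number of replicas in ES increase search throughput? I used to think that extra replicas only improved availability, and could not see why they would also increase search throughput. Reference answer:
A single request runs its query on only one copy of each shard, but when there are many requests, they are distributed across the different copies, which reduces the load on any single copy. [2]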

Search requests can be handled by a primary shard or by any of its replicas. This is how more replicas (when combined with more hardware) can increase search throughput. A coordinating node will round-robin through all shard copies on subsequent requests in order to spread the load. [1]
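
A rough way to see the effect is to count how many requests each shard copy serves under plain round-robin. This is a simplified model that follows the quoted description only; `route_requests` and the copy names are hypothetical, and the real selection logic in Elasticsearch is more involved and may differ by version.

```python
from collections import Counter
from itertools import cycle


def route_requests(copies, num_requests):
    """Assign each request to the next shard copy in round-robin order."""
    picker = cycle(copies)
    return Counter(next(picker) for _ in range(num_requests))


if __name__ == "__main__":
    # With only the primary, one copy serves every search request.
    print(route_requests(["primary"], 300))
    # With two replicas added, each copy serves about a third of the requests.
    print(route_requests(["primary", "replica1", "replica2"], 300))
```

The per-copy load drops as copies are added, but throughput only improves if those copies sit on additional hardware, as the quoted passage points out.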


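  1. https://www.elastic.co/guide/en/elasticsearch/guide/current/_query_phase.html (Note: the corresponding Chinese translation of this documentation is ambiguous, and can mislead readers into thinking that the coordinating node forwards the query request to every primary shard and every replica shard of the index.)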

  2. "Why multiple ES replicas can improve throughput"
