
node query cache


The query cache stores query results, but only for queries that run in filter context.


Caching of term queries was removed in 5.1.1, because a term query takes about the same time as a filter query.

curl -XPOST 'localhost:9200/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "term" : { "user" : "Kimchy" }
  }
}'


Find matching docs.
The term query looks up the term XHDK-A-1293-#fJ3 in the inverted index and retrieves the list of documents that contain that term. In this case, only document 1 has the term we are looking for.
Build a bitset.
The filter then builds a bitset (an array of 1s and 0s) that describes which documents contain the term. Matching documents receive a 1 bit. In our example, the bitset would be [1,0,0,0]. Internally, this is represented as a "roaring bitmap", which can efficiently encode both sparse and dense sets.
Iterate over the bitset(s).
Once the bitsets are generated for each query, Elasticsearch iterates over the bitsets to find the set of matching documents that satisfy all filtering criteria. The order of execution is decided heuristically, but generally the most sparse bitset is iterated on first (since it excludes the largest number of documents).
Increment the usage counter.
Elasticsearch can cache non-scoring queries for faster access, but it's silly to cache something that is used only rarely. Non-scoring queries are already quite fast due to the inverted index, so we only want to cache queries we know will be used again in the future to prevent resource wastage.
To do this, Elasticsearch tracks the history of query usage on a per-index basis. If a query is used more than a few times in the last 256 queries, it is cached in memory. And when the bitset is cached, caching is omitted on segments that have fewer than 10,000 documents (or less than 3% of the total index size). These small segments tend to disappear quickly anyway and it is a waste to associate a cache with them.
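
The four steps above can be sketched in a few lines of Python. This is a toy model, not Elasticsearch's implementation: the index contents, the names `build_bitset`, `intersect` and `UsageTracker`, and the `min_uses=5` threshold (a stand-in for "more than a few times") are all assumptions made for illustration. Plain lists stand in for roaring bitmaps.

```python
from collections import Counter, deque

# Toy inverted index: term -> ids of the documents that contain it (made-up data).
INVERTED_INDEX = {
    "user:kimchy": [0],
    "status:published": [0, 2, 3],
}
NUM_DOCS = 4


def build_bitset(term):
    """Steps 1 and 2: look the term up and mark each matching doc with a 1."""
    bits = [0] * NUM_DOCS
    for doc_id in INVERTED_INDEX.get(term, []):
        bits[doc_id] = 1
    return bits


def intersect(bitsets):
    """Step 3: walk the sparsest bitset first, checking candidates against the rest."""
    ordered = sorted(bitsets, key=sum)
    sparsest, others = ordered[0], ordered[1:]
    return [
        doc_id
        for doc_id, bit in enumerate(sparsest)
        if bit and all(other[doc_id] for other in others)
    ]


class UsageTracker:
    """Step 4: remember the last `history_size` queries and count repeats."""

    def __init__(self, history_size=256, min_uses=5):
        # min_uses is an assumed value; the text only says "more than a few times".
        self.history = deque(maxlen=history_size)
        self.min_uses = min_uses

    def record(self, query_key):
        self.history.append(query_key)

    def should_cache(self, query_key, segment_docs, index_docs):
        # Small segments are skipped: under 10,000 docs or under 3% of the index.
        if segment_docs < 10_000 or segment_docs < 0.03 * index_docs:
            return False
        return Counter(self.history)[query_key] >= self.min_uses
```

Calling `build_bitset("user:kimchy")` reproduces the `[1,0,0,0]` bitset from the text, and `intersect` starts from that bitset because it has the fewest set bits.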


Frequently used filters will be cached automatically by Elasticsearch, to speed up performance.
Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameters in the bool query, the filter parameter in the constant_score query, or the filter aggregation.

1 range
2 bool/must_not
3 bool/filter
4 constant_score/filter
5 filter aggregation
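
To make the places where filter context applies concrete, here is a minimal Python sketch that only builds request bodies as dictionaries; it does not talk to a cluster. The helper name `filter_context_bodies` and the field names (`title`, `age`, `status`, `date`) are hypothetical.

```python
import json


def filter_context_bodies():
    """Return example request bodies whose clauses run in filter context."""
    return {
        # bool query: clauses under "filter" and "must_not" are not scored.
        "bool": {
            "query": {
                "bool": {
                    "must": [{"match": {"title": "search"}}],  # query context
                    "filter": [{"range": {"age": {"gte": 18}}}],
                    "must_not": [{"term": {"status": "deleted"}}],
                }
            }
        },
        # constant_score query: the wrapped clause is a filter.
        "constant_score": {
            "query": {
                "constant_score": {"filter": {"term": {"user": "Kimchy"}}}
            }
        },
        # filter aggregation: the bucket is defined by a filter clause.
        "filter_aggregation": {
            "size": 0,
            "aggs": {
                "recent": {"filter": {"range": {"date": {"gte": "2017-01-01"}}}}
            },
        },
    }


if __name__ == "__main__":
    for name, body in filter_context_bodies().items():
        print(name)
        print(json.dumps(body, indent=2))
```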

shard request cache

The shard request cache caches the following:

By default, the requests cache will only cache the results of search requests where size=0, so it will not cache hits, but it will cache hits.total, aggregations, and suggestions. Most queries that use now (see Date Math) cannot be cached.

1 hits.total
2 aggregations
3 suggestions


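Cached results are invalidated when the shard refreshes, but only if the data in the shard has actually changed. A longer refresh interval therefore keeps cached entries usable for longer, as long as it stays within your deadline for seeing fresh data. When the cache is full, the least recently used entries are evicted.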
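
The eviction and invalidation behaviour can be modelled with a small LRU structure. This is a sketch under simplifying assumptions: the class name `ShardRequestCache` is made up, capacity is counted in entries rather than bytes, and a refresh simply clears everything.

```python
from collections import OrderedDict


class ShardRequestCache:
    """Toy LRU cache that is emptied whenever the shard refreshes."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # oldest (least recently used) entry first

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, result):
        self.entries[key] = result
        self.entries.move_to_end(key)
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used

    def on_refresh(self):
        # A refresh invalidates every cached result for this shard.
        self.entries.clear()
```

With `max_entries=2`, storing `a` and `b`, reading `a`, and then storing `c` evicts `b`, because `a` was used more recently.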


curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.requests.cache.enable": false
  }
}'


curl -XGET 'localhost:9200/my_index/_search?request_cache=true&pretty' -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "colors"
      }
    }
  }
}'


cache key

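The submitted JSON body is used as the cache key. If the JSON body changes, the cached result cannot be reused. This holds even for the same request when its conditions appear in a different order.
The cache size can be set with indices.requests.cache.size: 2%.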
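
The cache-key behaviour suggests serializing request bodies the same way every time on the client side. The sketch below illustrates the idea; hashing the raw body with SHA-256 is only an assumed stand-in for however the key is really derived, and `cache_key` and `canonical_body` are hypothetical helper names.

```python
import hashlib
import json


def cache_key(raw_body: str) -> str:
    """Derive a key from the exact bytes of the request body (assumed scheme)."""
    return hashlib.sha256(raw_body.encode("utf-8")).hexdigest()


def canonical_body(body: dict) -> str:
    """Serialize with sorted keys so equivalent requests produce identical JSON."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


if __name__ == "__main__":
    a = '{"size": 0, "aggs": {"popular_colors": {"terms": {"field": "colors"}}}}'
    b = '{"aggs": {"popular_colors": {"terms": {"field": "colors"}}}, "size": 0}'
    # Same request, different key order: the keys differ, so the cache misses.
    print(cache_key(a) == cache_key(b))
    # After canonicalization both bodies map to the same key.
    print(cache_key(canonical_body(json.loads(a))) == cache_key(canonical_body(json.loads(b))))
```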

fielddata cache 

fielddata serves the same purpose as doc_values: both let us run aggregations and sorting on top of the inverted index.

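The first time an analyzed field is used in an aggregation, a sort, or a script (only analyzed fields use fielddata; all other fields use doc_values), the fielddata cache for that field is loaded. This cache works at the segment level: when a new segment is opened, the existing entries are not reloaded, and only the fielddata for the new segment is loaded into memory. Once loaded, fielddata stays in memory for the lifetime of its segment. As a consequence, a segment merge triggers loading of fielddata for the larger merged segment.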
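
The per-segment loading behaviour can be pictured with the toy model below. The class `FielddataCache`, its `load_count` counter, and the use of plain lists as "fielddata" are illustrative assumptions, not how Lucene or Elasticsearch store it.

```python
class FielddataCache:
    """Per-segment fielddata cache for a single field (toy model)."""

    def __init__(self):
        self.by_segment = {}  # segment name -> loaded field values
        self.load_count = 0   # number of segment-level loads so far

    def values(self, segments):
        """Return field values for all live segments, loading only missing ones."""
        for name, docs in segments.items():
            if name not in self.by_segment:
                self.by_segment[name] = list(docs)  # stand-in for the real load
                self.load_count += 1
        # Entries live as long as their segment; drop those whose segment is gone.
        for name in list(self.by_segment):
            if name not in segments:
                del self.by_segment[name]
        return [value for name in segments for value in self.by_segment[name]]
```

Adding a segment triggers exactly one extra load, while replacing two segments with a merged one drops both old entries and loads the merged segment on the next access.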

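Fielddata cache settings
2.indices.breaker.fielddata.limit: sets the limit of the fielddata circuit breaker (formula: estimated memory + current memory <= breaker limit). The default is 60% of the JVM heap. When a query tries to load more data into memory than this allows, an exception is thrown, which prevents a JVM OOM.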
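
The breaker formula above amounts to a simple check before each load. The following is a minimal sketch of that check only; `FielddataBreaker`, `CircuitBreakingError` and `reserve` are hypothetical names, and the byte estimates are supplied by the caller rather than computed.

```python
class CircuitBreakingError(Exception):
    """Raised when a load would push fielddata past the configured limit."""


class FielddataBreaker:
    def __init__(self, heap_bytes, limit_percent=60):
        # Default of 60% of the heap, as stated in the text.
        self.limit_bytes = heap_bytes * limit_percent // 100
        self.used_bytes = 0

    def reserve(self, estimated_bytes):
        """Allow the load only if estimated + current <= limit."""
        if self.used_bytes + estimated_bytes > self.limit_bytes:
            raise CircuitBreakingError(
                f"would use {self.used_bytes + estimated_bytes} bytes, "
                f"limit is {self.limit_bytes}"
            )
        self.used_bytes += estimated_bytes
```

The check runs before any memory is reserved, so a rejected load leaves `used_bytes` unchanged.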


