While reading through the elasticsearch logs today, I noticed a lot of errors like the following:
breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [_type] would be larger than limit of [250895009/2.3gb].
Seeing this error, my first reaction was that we had run out of memory. After looking into it, that is indeed the case, except the memory in question is the field data cache. Field data is what es builds when a query sorts or aggregates on a field: the field's values are loaded into memory to speed up the query. When the data volume grows beyond what the allocated memory can hold, this error is thrown.
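For instance, a search like the following, which sorts on a date field and runs a terms aggregation, pulls those fields' values into the field data cache (the `logs` index and the field names are made up for illustration):

```
curl -XGET 'http://localhost:9200/logs/_search?pretty' -d '{
  "sort": [ { "@timestamp": { "order": "desc" } } ],
  "aggs": {
    "by_status": { "terms": { "field": "status" } }
  }
}'
```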
The size of this cache can be controlled with settings; the official docs describe them as follows:
Setting | Description
---|---
`indices.fielddata.cache.size` | The max size of the field data cache, eg `30%` of node heap space, or an absolute value, eg `12GB`. Defaults to unbounded.
`indices.fielddata.cache.expire` | [experimental] This functionality is experimental and may be changed or removed completely in a future release. A time based setting that expires field data after a certain time of inactivity. Defaults to `-1`.
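As a minimal sketch, capping the cache in elasticsearch.yml might look like this (the 40% figure is just an illustration, not a recommendation):

```
# elasticsearch.yml: bound the field data cache at 40% of the node's heap;
# older entries are evicted once the cache fills up
indices.fielddata.cache.size: 40%
```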
Even with the cache size configured, memory can still run short. To avoid an OutOfMemoryError, es provides a mechanism called the Circuit Breaker. My understanding is that when a query runs, es estimates the memory it will need according to certain rules; if that estimate exceeds the configured limit, the query simply fails with an error, preventing an OutOfMemoryError.
Naturally, this is also controlled by settings.
Circuit breakers come in two kinds, the field data circuit breaker and the request circuit breaker. I have only ever run into the first, never the second.
The overall limit, plus the per-breaker settings:

Setting | Description
---|---
`indices.breaker.total.limit` | Starting limit for overall parent breaker, defaults to 70% of JVM heap
`indices.breaker.fielddata.limit` | Limit for fielddata breaker, defaults to 60% of JVM heap
`indices.breaker.fielddata.overhead` | A constant that all field data estimations are multiplied with to determine a final estimation. Defaults to 1.03
`indices.breaker.request.limit` | Limit for request breaker, defaults to 40% of JVM heap
`indices.breaker.request.overhead` | A constant that all request estimations are multiplied with to determine a final estimation. Defaults to 1
Those are the main settings. The defaults are usually sufficient, but if your data volume is large they are worth tuning.
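As a worked example with my own numbers: on an 8gb heap, the default `indices.breaker.fielddata.limit` of 60% puts the fielddata breaker at 8gb x 0.6 = 4.8gb. A query estimated to load 4.7gb of field data is checked as 4.7gb x 1.03 (the overhead constant) ≈ 4.84gb, which exceeds 4.8gb, so the breaker trips and the query fails.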
Besides setting them in elasticsearch.yml, you can also adjust them dynamically:
```
curl -XPUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
    "indices.breaker.fielddata.limit" : "40%"
  }
}'
```
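The cluster settings API also accepts a `transient` block for changes that should not survive a full cluster restart, e.g.:

```
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient" : {
    "indices.breaker.fielddata.limit" : "40%"
  }
}'
```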
In addition, if you want to keep an eye on the nodes, you can check them with the following commands:
```
curl -XGET 'http://localhost:9200/_nodes/stats'
curl -XGET 'http://localhost:9200/_nodes/nodeId1,nodeId2/stats'
```
The official docs describe them as follows:
By default, all stats are returned. You can limit this by combining any of `indices`, `os`, `process`, `jvm`, `network`, `transport`, `http`, `fs`, `breaker` and `thread_pool`:
Stat | Description
---|---
`indices` | Indices stats about size, document count, indexing and deletion times, search times, field cache size, merges and flushes
`fs` | File system information, data path, free disk space, read/write stats
`http` | HTTP connection information
`jvm` | JVM stats, memory pool information, garbage collection, buffer pools
`network` | TCP information
`os` | Operating system stats, load average, cpu, mem, swap
`process` | Process statistics, memory consumption, cpu usage, open file descriptors
`thread_pool` | Statistics about each thread pool, including current size, queue and rejected tasks
`transport` | Transport statistics about sent and received bytes in cluster communication
`breaker` | Statistics about the field data circuit breaker
For example:

```
# return indices and os
curl -XGET 'http://localhost:9200/_nodes/stats/os'

# return just os and process
curl -XGET 'http://localhost:9200/_nodes/stats/os,process'

# specific type endpoint
curl -XGET 'http://localhost:9200/_nodes/stats/process'
curl -XGET 'http://localhost:9200/_nodes/10.0.0.1/stats/process'
```
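Since the breaker is what this post is about, you can also ask for just that section:

```
# return only the circuit breaker statistics
curl -XGET 'http://localhost:9200/_nodes/stats/breaker?pretty'
```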
The `all` flag can be set to return all the stats.
You can get information about field data memory usage at the node level or at the index level.
```
# Node Stats
curl -XGET 'http://localhost:9200/_nodes/stats/indices/?fields=field1,field2&pretty'

# Indices Stat
curl -XGET 'http://localhost:9200/_stats/fielddata/?fields=field1,field2&pretty'

# You can use wildcards for field names
curl -XGET 'http://localhost:9200/_stats/fielddata/?fields=field*&pretty'
curl -XGET 'http://localhost:9200/_nodes/stats/indices/?fields=field*&pretty'
```
You can get statistics about search groups for searches executed on this node.
```
# All groups with all stats
curl -XGET 'http://localhost:9200/_nodes/stats?pretty&groups=_all'

# Some groups from just the indices stats
curl -XGET 'http://localhost:9200/_nodes/stats/indices?pretty&groups=foo,bar'
```
Finally, to wrap up.
Everything discussed above loads field data fully into memory. 1.x introduced a new option, described below:
doc_values: a new feature introduced in Elasticsearch 1.0. For a field with it enabled, fielddata is built on disk at index time, whereas in the past fielddata could only live in memory. As the scope of requests grows, in-memory fielddata easily triggers an OOM or a circuit breaker error:
ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [639015321/609.4mb]]
doc_values only takes effect on fields that are not analyzed (for string fields that means setting `"index": "not_analyzed"`; numeric and date fields are not analyzed to begin with).
Although doc_values lives on disk, the operating system's own VFS cache means performance does not suffer much. According to official benchmarks, after the optimizations in 1.4 it is only about 15% slower than in-memory fielddata. So if your data volume is large, I strongly recommend enabling it.
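As a minimal sketch, a mapping that turns doc_values on could look like this (the `logs` index, `event` type and field names are made up for illustration):

```
curl -XPUT 'http://localhost:9200/logs' -d '{
  "mappings": {
    "event": {
      "properties": {
        "status":     { "type": "string", "index": "not_analyzed", "doc_values": true },
        "@timestamp": { "type": "date", "doc_values": true }
      }
    }
  }
}'
```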