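For Elasticsearch heap sizing, the most common advice is to stay below 32 GB. The reason is a JVM feature called compressed ordinary object pointers (compressed oops); see the OpenJDK Wiki page "Compressed Oops" for details. In short, with compressed oops enabled the JVM can use 32-bit managed pointers to address a heap of up to about 32 GB. The wiki puts it this way: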
Compressed oops represent managed pointers (in many but not all places in the JVM) as 32-bit values which must be scaled by a factor of 8 and added to a 64-bit base address to find the object they refer to. This allows applications to address up to four billion objects (not bytes), or a heap size of up to about 32Gb. At the same time, data structure compactness is competitive with ILP32 mode.
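The wiki calls this address translation process encoding/decoding.
It is not quite that simple, though. Compressed oops take effect for heaps between 4 GB and 32 GB, but they come in two modes, Zero-Based and Non-Zero, which compute the actual memory address differently. In pseudocode: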
Zero-Based Mode:
<wide_oop> = <narrow-oop> << 3
Non-Zero Mode:
if (<narrow-oop> == NULL)
    <wide_oop> = NULL
else
    <wide_oop> = <narrow-oop-base> + (<narrow-oop> << 3)
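Readers familiar with assembly will have noticed that, compared with the single shift of Zero-Based mode, Non-Zero mode adds a null check and an addition to every decode, which is noticeable extra overhead on memory access.
Which strategy the JVM uses when allocating the heap depends on two factors: the heap size and the platform (operating system) it runs on.
First, if the heap is smaller than 4 GB, the JVM tries to place it below the 4 GB address boundary, where compressed oops need no decoding at all. If that fails, or if the heap is larger than 4 GB, the JVM tries to allocate the heap so that Zero-Based mode can be used. If that also fails, it falls back to regular compressed oops with a narrow-oop base (Non-Zero). The wiki describes it as follows: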
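To make the difference concrete, here is a minimal Java sketch of the two decode paths from the pseudocode above. It is only an illustration of the arithmetic, not HotSpot's actual implementation; the class and method names are made up for this example, and the base address is simply the one that appears in the sample log later in this post.

```java
public final class OopDecoding {
    // A shift of 3 matches the 8-byte scaling in the pseudocode.
    static final int SHIFT = 3;

    // Zero-based mode: a single shift yields the full address.
    static long decodeZeroBased(int narrowOop) {
        return Integer.toUnsignedLong(narrowOop) << SHIFT;
    }

    // Non-zero mode: a null check plus an add of the heap base, on top of the shift.
    static long decodeWithBase(long narrowOopBase, int narrowOop) {
        if (narrowOop == 0) {
            return 0L; // NULL stays NULL
        }
        return narrowOopBase + (Integer.toUnsignedLong(narrowOop) << SHIFT);
    }

    public static void main(String[] args) {
        int narrow = 0x10000000; // hypothetical narrow oop value
        System.out.printf("zero-based: 0x%x%n", decodeZeroBased(narrow));
        System.out.printf("with base:  0x%x%n", decodeWithBase(0x0000001000000000L, narrow));
        // The largest 32-bit narrow oop reaches just under 32 GiB (2^32 * 8 bytes).
        System.out.printf("reach:      %d bytes%n", decodeZeroBased(-1) + 8);
    }
}
```

The last line shows where the 32 GB limit comes from: 2^32 possible narrow oops, each scaled by 8 bytes.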
Zero based implementation tries to allocated java heap using different strategies based on the heap size and a platform it runs on.
First, it tries to allocate java heap below 4Gb to use compressed oops without decoding if heap size < 4Gb.
If it fails or heap size > 4Gb it will try to allocate the heap below 32Gb to use zero based compressed oops.
If this also fails it will switch to regular compressed oops with narrow oop base.
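The three-step fallback can be approximated by looking at where the heap ends in the address space. The sketch below is a simplification under that assumption (real HotSpot has further cases, such as the "disjoint base" variant visible in the log output later in this post); the class, enum, and method names are hypothetical. The sample addresses and sizes are taken from that same log output.

```java
public final class CoopsModeSketch {
    static final long MIB = 1L << 20;
    static final long GIB = 1L << 30;

    enum Mode { UNSCALED_32_BIT, ZERO_BASED, NON_ZERO_BASE }

    // Simplified rule: the mode follows from the highest address the heap occupies.
    static Mode classify(long heapBase, long heapSizeBytes) {
        long heapEnd = heapBase + heapSizeBytes;
        if (heapEnd <= 4 * GIB) {
            return Mode.UNSCALED_32_BIT; // 32-bit value is the address, no decoding
        }
        if (heapEnd <= 32 * GIB) {
            return Mode.ZERO_BASED;      // address = narrow oop << 3
        }
        return Mode.NON_ZERO_BASE;       // address = base + (narrow oop << 3)
    }

    public static void main(String[] args) {
        System.out.println(classify(0x00000000c0000000L, 1024 * MIB));
        System.out.println(classify(0x00000006e3000000L, 4560 * MIB));
        System.out.println(classify(0x0000001000000000L, 31744 * MIB));
    }
}
```

Note that the 4560 MB heap at 0x00000006e3000000 ends exactly at the 32 GiB boundary, which is why it still qualifies for zero-based mode in this model.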
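The heap-size factor is easy to explain and calculate. The effect of the operating system can only be determined by testing on the actual machine.
Our approach is to take a server with more than 32 GB of memory, write a Hello World program, and run it with different heap sizes, passing -XX:+PrintCompressedOopsMode (a diagnostic flag on JDK 8, so it also needs -XX:+UnlockDiagnosticVMOptions) or -Xlog:gc+heap+coops=info, depending on the JVM version, to see which compressed oops mode the JVM picks at each heap size.
For example, testing with OpenJDK 12 on CentOS 7.5:
# No decoding needed; 32-bit pointers are used directly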
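A test program along these lines is enough. This is a sketch, not the exact program we used: the class name HelloCoops is made up, and reading UseCompressedOops through HotSpotDiagnosticMXBean assumes a 64-bit HotSpot-based JVM.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HelloCoops {
    // Maximum heap the JVM will use, in MB (reflects -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Assumes a HotSpot JVM; other JVMs may not expose this bean or option.
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        String coops = bean.getVMOption("UseCompressedOops").getValue();

        System.out.println("Hello, world");
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("UseCompressedOops: " + coops);
    }
}
```

Run it once per heap size, for example `java -Xmx30g -Xlog:gc+heap+coops=info HelloCoops`, and compare the "Compressed Oops mode" reported in the JVM log across runs.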
[2021-10-09T09:26:08.032+0000][18738][gc,heap,coops] Heap address: 0x00000000c0000000, size: 1024 MB, Compressed Oops mode: 32-bit
# Zero-based compressed oops
[2021-10-12T03:20:27.061+0000][16788][gc,heap,coops] Heap address: 0x00000006e3000000, size: 4560 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
#Non-zero
[2021-09-27T07:51:11.310+0000][65141][gc,heap,coops] Heap address: 0x0000001000000000, size: 31744 MB, Compressed Oops mode: Non-zero disjoint base: 0x0000001000000000, Oop shift amount: 3
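In our tests in the environment above, the threshold for Zero based Compressed Oops was 30720 MB, i.e. 30 GB.
With the JVM's compressed oops mechanism covered, back to the Elasticsearch setting. For an application like Elasticsearch that accesses memory very frequently, especially under write-heavy workloads, which helps performance more: 1 GB of extra heap, or faster memory access?
When sizing the Elasticsearch heap, we keep falling into the same trap: assuming that, as long as compressed oops are enabled, a bigger heap is always better. More capacity is not necessarily a good thing. In fact, the smaller the heap the better, as long as it meets your requirements: this leaves more physical memory for the Lucene file cache, avoids long GC pauses, and so on.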
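When sweeping many heap sizes, it helps to classify the log lines automatically. The following sketch assumes the unified-logging line format shown in the samples above (JDK 12); the class and method names are hypothetical. It returns the largest heap size, among the lines it is given, that was still reported as zero based.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class CoopsLogScan {
    // Matches e.g. "Heap address: 0x..., size: 4560 MB, Compressed Oops mode: Zero based, ..."
    private static final Pattern LINE = Pattern.compile(
            "Heap address: (0x[0-9a-fA-F]+), size: (\\d+) MB, Compressed Oops mode: ([^,]+)");

    // Returns the largest heap size (MB) reported as "Zero based", or -1 if none.
    static long largestZeroBasedMb(List<String> logLines) {
        long best = -1;
        for (String line : logLines) {
            Matcher m = LINE.matcher(line);
            if (m.find() && m.group(3).trim().equals("Zero based")) {
                best = Math.max(best, Long.parseLong(m.group(2)));
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Heap address: 0x00000000c0000000, size: 1024 MB, Compressed Oops mode: 32-bit",
                "Heap address: 0x00000006e3000000, size: 4560 MB, Compressed Oops mode: Zero based, Oop shift amount: 3",
                "Heap address: 0x0000001000000000, size: 31744 MB, Compressed Oops mode: Non-zero disjoint base: 0x0000001000000000, Oop shift amount: 3");
        System.out.println(largestZeroBasedMb(lines));
    }
}
```

With only the three sample lines the result is just the one zero-based run; finding the real threshold requires feeding in log lines from a finer sweep of -Xmx values.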
What frequently happens though is that our advice surrounding compressed oops is interpreted as advice to set the heap as high as it can go while staying under the compressed oops threshold. Instead though, it’s better to set the heap as low as possible while satisfying your requirements for indexing and query throughput, end-user query response times, yet large enough to have adequate heap space for indexing buffers, and large consumers of heap space like aggregations, and suggesters. The smaller that you can set the heap, the less likely you’ll be subject to detrimental long garbage collection pause, and the more physical memory that will be available for the filesystem cache which continues to be used more and more to great effect by Lucene and Elasticsearch.
References: