Summary notes for now; to be tidied up when time allows
1. Three attributes (the three criteria)
Throughput
Latencies
Footprint: memory used (good space efficiency, i.e. utilization?)
Usually, you sacrifice one in favor of the other two
Not all three are always important
But you do need to know what's important for your application
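Give the GC as much memory as possible, let each collection reclaim as much garbage as it can, and optimize two of the three attributes above
2. Testing method
Measure all three attributes
Test under realistic conditions with a realistic load
The following must be kept consistent: application, third-party libraries, framework, JDK, hardware, OS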
3. GC Choice
General rule of thumb
Latency more critical than throughput → use CMS
Exception: < 1GB heap → ParallelOldGC might be able to meet desired latencies
Latency not as important as throughput → use ParallelOldGC
Our approach presented here
Start with and measure ParallelOldGC
Move to CMS as necessary
4. General approach
a. Measure
b. Tune further
c. Repeat a and b
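5. Footprint
Available memory
How much can a single JVM be given?
With multiple JVMs or other processes (e.g. a database) on the machine, how much can each JVM be given?
Remember to leave some for the OS
Generally, larger young and old generations are both better: less frequent GCs, less pressure, and more objects can be reclaimed
Minor GC time depends mainly on the amount of live objects, not on the size of the young gen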
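6. Initial heap size
Start with ParallelOldGC
Choose a heap size at which the application runs comfortably (easier said than done)
Use the default settings
Use -XX:+PrintCommandLineFlags to see the initial and max heap size
Check the GC log to determine heap usage
On OutOfMemoryError, increase the heap
The goal is to collect baseline data that further tuning can build on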
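As a complement to -XX:+PrintCommandLineFlags and the GC log, the baseline heap settings can also be read from inside the running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name HeapBaselineSketch and the output format are illustrative assumptions, not part of the original notes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;

// Hypothetical helper: reports the heap sizes the JVM actually started with.
final class HeapBaselineSketch {
    /** Returns initial, used, committed and max heap (bytes) plus the JVM arguments. */
    static String describeHeap() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // Flags passed on the command line, e.g. -Xms/-Xmx if they were set.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        // getInit() and getMax() may return -1 when the value is undefined.
        return "init=" + heap.getInit()
             + " used=" + heap.getUsed()
             + " committed=" + heap.getCommitted()
             + " max=" + heap.getMax()
             + " args=" + jvmArgs;
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

This only gives a point-in-time snapshot; the GC log remains the source for heap usage over time.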
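Footprint
7. Calculate Live Data Size (LDS)
Collect the data while the application is in steady state
Trigger a full GC using one of the following:
jconsole/visualvm: click "Perform GC"
jmap -histo:live <pid>
The GC log then shows:
live data size
max perm gen size
worst-case latency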
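The same measurement can be approximated from inside the application: request a full GC and read the heap occupancy right afterwards. This is a rough sketch under stated assumptions: System.gc() is only a request (it is ignored with -XX:+DisableExplicitGC), and the class name LiveDataSizeSketch is made up for illustration. The GC log stays the more reliable source.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: approximates the live data size after a requested full GC.
final class LiveDataSizeSketch {
    /** Heap bytes still in use after asking the JVM for a full collection. */
    static long approxLiveBytes() {
        System.gc(); // a request only; the JVM is free to ignore it
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        long mb = approxLiveBytes() / (1024 * 1024);
        System.out.println("approx. LDS (MB): " + mb);
    }
}
```

Call it only once the application has reached steady state, and take several samples rather than trusting a single one.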
Determine a reasonable heap size; general rules:
Set -Xms and -Xmx to 3x to 4x LDS
Set both -XX:PermSize and -XX:MaxPermSize to around 1.2x to 1.5x the max perm gen size
Young gen should be around 1x to 1.5x LDS
Old gen should be around 2x to 3x LDS
e.g., young gen should be around 1/3 to 1/4 of the heap size (LDS of 512m: -Xmn768m -Xms2g -Xmx2g)
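The sizing rules above are plain arithmetic on the LDS, so they can be sketched as a small helper. The class and method names are hypothetical; the multipliers are the ones from the notes (heap 3x to 4x LDS, young gen 1x to 1.5x LDS), and the result is a starting point to measure, not a final setting.

```java
// Hypothetical helper: derives starting heap flags from a measured LDS.
final class HeapSizingSketch {
    /** Heap = heapFactor x LDS, young gen = youngFactor x LDS, all in MB. */
    static String suggestFlags(long ldsMb, double heapFactor, double youngFactor) {
        long heapMb = Math.round(ldsMb * heapFactor);
        long youngMb = Math.round(ldsMb * youngFactor);
        // -Xms and -Xmx are set to the same value, as the notes recommend.
        return "-Xmn" + youngMb + "m -Xms" + heapMb + "m -Xmx" + heapMb + "m";
    }

    /** The old gen is whatever the young gen leaves over. */
    static long oldGenMb(long ldsMb, double heapFactor, double youngFactor) {
        return Math.round(ldsMb * heapFactor) - Math.round(ldsMb * youngFactor);
    }

    public static void main(String[] args) {
        // LDS of 512 MB with the upper multipliers from the notes.
        System.out.println(suggestFlags(512, 4.0, 1.5)); // -Xmn768m -Xms2048m -Xmx2048m
        System.out.println(oldGenMb(512, 4.0, 1.5));     // 1280, i.e. 2.5x LDS
    }
}
```

With an LDS of 512 MB this reproduces the example from the notes: a 2 GB heap, a 768 MB young gen, and an old gen of 2.5x LDS.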
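But heap size is not the whole story
Look at the footprint of the whole process: top, prstat
Other memory consumers: native libraries, I/O buffers, Java thread stacks
Problems:
The application may not run properly within the memory it has been given
Application-level tuning is then needed
With a heap smaller than 1.5x LDS, GCs may become frequent enough to hurt the application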
Latencies
8. Latencies
How large are the pauses?
Average GC pause time target
Max GC pause time target
How frequently violations can be tolerated
How frequent are the pauses?
GC pause frequency target (will usually be the same as application pause frequency target)
Likely less important than pause time target
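The targets above (average pause, max pause, tolerated violations) can be checked against measured pause times with a few lines of code. This is a minimal sketch; the class name PauseStatsSketch and the sample values are made up for illustration, and the pause times would in practice come from the GC log.

```java
import java.util.Arrays;

// Hypothetical helper: compares measured GC pauses (ms) with latency targets.
final class PauseStatsSketch {
    /** Average pause in ms, or 0 if there are no samples. */
    static double average(double[] pausesMs) {
        return Arrays.stream(pausesMs).average().orElse(0.0);
    }

    /** Longest pause in ms, or 0 if there are no samples. */
    static double max(double[] pausesMs) {
        return Arrays.stream(pausesMs).max().orElse(0.0);
    }

    /** Number of pauses that exceed the max pause time target. */
    static long violations(double[] pausesMs, double targetMs) {
        return Arrays.stream(pausesMs).filter(p -> p > targetMs).count();
    }

    public static void main(String[] args) {
        double[] pauses = {10, 20, 120, 30}; // illustrative sample values
        System.out.println(average(pauses));         // 45.0
        System.out.println(max(pauses));             // 120.0
        System.out.println(violations(pauses, 100)); // 1
    }
}
```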
9. Refine Young Gen Size
Monitor young GC times
The most frequent source of GC-induced latencies
Look at both duration and frequency
If young GCs too long → decrease young gen size
It might decrease young GC times (but not always)
It will make young GCs more frequent
If young GCs too frequent → increase young gen size
It might increase young GC times (but not always)
When changing the young gen size
Try to keep old gen size constant
e.g., increase young gen size by 1g
-Xms2g -Xmx2g -Xmn1g → -Xms3g -Xmx3g -Xmn2g
Old gen size should not be much smaller than 1.5x LDS
A very small young gen can be counter-productive
Very frequent young GCs
Generally, the young gen should not be much smaller than around 10% of the heap
There is no good GC-level fix for GC times that remain too long: change the application, or deploy it across multiple JVMs
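The rule of keeping the old gen constant while resizing the young gen means the heap and the young gen move by the same amount. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the 10% check mirrors the rule of thumb from the notes.

```java
// Hypothetical helper: resizes the young gen while keeping the old gen constant.
final class YoungGenResizeSketch {
    /** Grows (or, with a negative delta, shrinks) heap and young gen by the same amount. */
    static String resize(long xmxMb, long xmnMb, long deltaMb) {
        long newXmx = xmxMb + deltaMb;
        long newXmn = xmnMb + deltaMb;
        // Old gen = heap - young gen, so it is unchanged by construction.
        return "-Xms" + newXmx + "m -Xmx" + newXmx + "m -Xmn" + newXmn + "m";
    }

    /** True if the young gen is below roughly 10% of the heap. */
    static boolean youngGenTooSmall(long xmxMb, long xmnMb) {
        return xmnMb * 10 < xmxMb;
    }

    public static void main(String[] args) {
        // The example from the notes: increase the young gen by 1g.
        System.out.println(resize(2048, 1024, 1024)); // -Xms3072m -Xmx3072m -Xmn2048m
    }
}
```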
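10. Next tuning step: decide on the GC type
Once enough data has been collected:
Young GC and old GC times and frequencies all meet the targets: stay with ParallelOldGC; tuning is done.
Young GC times are OK, but full GCs are too long or too frequent: use CMS.
Young GC times are too long: application-level tuning.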
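The three cases above form a small decision table, sketched below. The class name and the boolean inputs are illustrative assumptions; the returned flags -XX:+UseParallelOldGC and -XX:+UseConcMarkSweepGC are the HotSpot switches for these collectors in the JDK versions these notes target (both have since been removed from newer JDKs).

```java
// Hypothetical helper: encodes the collector decision from the measurements.
final class CollectorChoiceSketch {
    /**
     * @param youngGcOk young GC time and frequency meet the targets
     * @param fullGcOk  full GC time and frequency meet the targets
     */
    static String choose(boolean youngGcOk, boolean fullGcOk) {
        if (!youngGcOk) {
            // Switching collectors will not fix slow young GCs.
            return "application-level tuning";
        }
        if (fullGcOk) {
            return "-XX:+UseParallelOldGC"; // everything meets the targets: done
        }
        return "-XX:+UseConcMarkSweepGC";   // full GCs are the problem: move to CMS
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false)); // -XX:+UseConcMarkSweepGC
    }
}
```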
11. ParallelOld tuning
Throughput
UseAdaptiveSizePolicy
For higher throughput, increase the young and old gen sizes
UseNUMA
NUMA = Non-Uniform Memory Access
Applicable to most SPARC, Opteron, and more recently Intel platforms
-XX:+UseNUMA
Splits the young generation into partitions
Each partition "belongs" to a CPU
Allocates new objects into the partition that belongs to the allocating CPU
Big win for some applications
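12. CMS tuning
GC pause times
GC frequency
Throughput
CMS does not compact; by default, compaction happens only during a full GC
Moving to CMS needs 20-30% more memory (because of fragmentation and longer collection cycles)
Longer young GC times (allocation into the old gen is slower)
Better worst-case latencies
Lower throughput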
Tenuring Threshold
Higher tenuring threshold → promotes fewer objects
Possibly (but not necessarily) longer young GC times
Increases the number of objects reclaimed in the young gen
Better overall efficiency
Lower tenuring threshold → promotes more objects
Possibly (but not necessarily) shorter young GC times
More load / pressure on the old gen
More frequent old GCs
Could make fragmentation more severe
Essential in CMS to minimize this as much as possible
Survivor Size Tuning
Survivors should not overflow
Tune TargetSurvivorRatio
Check via the GC log or the tenuring distribution
If survivors overflow
Increase survivor size using -XX:SurvivorRatio=<ratio>
survivor size = (100 / (<ratio> + 2))% of young gen size
larger <ratio> → smaller survivors
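Or decrease MTT (MaxTenuringThreshold)
Desired survivor size (DSS) = survivor size * TargetSurvivorRatio%
If the live objects to be copied into a survivor space exceed DSS, the tenuring threshold (TT) is lowered (survivor overflow)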
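The survivor formulas above can be worked through numerically. The sketch below assumes sizes in MB and treats TargetSurvivorRatio as a percentage; the class name is hypothetical, and the real JVM also aligns these sizes, so the results are approximations.

```java
// Hypothetical helper: evaluates the survivor sizing formulas from the notes.
final class SurvivorSizingSketch {
    /** Size of one survivor space: young gen / (SurvivorRatio + 2). */
    static long survivorMb(long youngGenMb, int survivorRatio) {
        return youngGenMb / (survivorRatio + 2);
    }

    /** Desired survivor size: survivor size * TargetSurvivorRatio percent. */
    static long desiredSurvivorMb(long survivorMb, int targetSurvivorRatioPercent) {
        return survivorMb * targetSurvivorRatioPercent / 100;
    }

    public static void main(String[] args) {
        long survivor = survivorMb(768, 6);                // 768 / 8 = 96
        System.out.println(survivor);                      // 96
        System.out.println(desiredSurvivorMb(survivor, 50)); // 48
    }
}
```

With -Xmn768m and -XX:SurvivorRatio=6, each survivor space is about 96 MB (12.5% of the young gen), and a TargetSurvivorRatio of 50 gives a desired survivor size of about 48 MB.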
When to start a CMS cycle:
CMSInitiatingOccupancyFraction: the initiating occupancy should be at least 1.5x LDS
Cycle starts too early → unnecessary overhead
Cycle starts too late → will not finish on time
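It also depends on how fast the old gen grows: the faster the growth, the earlier the cycle should start
If cycles are too frequent: increase the old gen size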
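Reading the "at least 1.5x LDS" rule as a lower bound on old gen occupancy, the smallest suitable value for -XX:CMSInitiatingOccupancyFraction can be derived from the LDS and the old gen size. This is a sketch under that assumed interpretation; the class and method names are hypothetical, and the old gen growth rate may still call for a different value.

```java
// Hypothetical helper: lower bound for the CMS initiating occupancy percentage.
final class CmsInitiatingOccupancySketch {
    /** Smallest percentage of the old gen that is at least 1.5x LDS, capped at 100. */
    static int minFractionPercent(long ldsMb, long oldGenMb) {
        // Integer ceiling of (1.5 * LDS / oldGen) * 100.
        long pct = (150 * ldsMb + oldGenMb - 1) / oldGenMb;
        return (int) Math.min(pct, 100);
    }

    /** Formats the corresponding HotSpot flag. */
    static String flag(int percent) {
        return "-XX:CMSInitiatingOccupancyFraction=" + percent;
    }

    public static void main(String[] args) {
        // LDS 512 MB, old gen 1280 MB: 768 / 1280 = 60%.
        System.out.println(flag(minFractionPercent(512, 1280)));
        // -XX:CMSInitiatingOccupancyFraction=60
    }
}
```

A result of 100 means the old gen is smaller than 1.5x LDS, which suggests growing the old gen instead of tuning the threshold.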
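Pause time tuning:
Initial mark: hard to tune
Remark
Depends on how much the objects change during the collection
CMSScavengeBeforeRemark: triggers a young GC first
ParallelRefProcEnabled: useful when there are many Reference / finalizable objects to process
If full GCs occur:
Was there a concurrent mode failure?
Is the perm gen full?
-XX:+ExplicitGCInvokesConcurrent (for full GCs caused by explicit System.gc() calls)
If young GCs are slow under ParallelOld, CMS is very unlikely to make them any better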
13. Some unusual configurations
Use this material as a guide, not as hard rules
Don't be afraid to experiment
We have seen the following in the field
Young gen size = 80% of heap size
Maximize throughput by minimizing young GC frequency
Old gen size = 1.2x LDS
Old gen with extremely low growth rate
Initiating occupancy threshold = 95%
Ditto
14. Before you start
You'll need to have some basic knowledge of
The application's behavior
The application's important requirements
The context in which it will be run