1. There is no free lunch
While the goal may be to improve performance overall, using multiple threads always introduces some performance costs compared to the single-threaded approach. These include the overhead associated with coordinating between threads (locking, signaling, and memory synchronization), increased context switching, thread creation and teardown, and scheduling overhead.
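When kernel CPU usage is high (above 10%), it usually indicates heavy scheduling activity, most likely caused by threads blocking on I/O or on contended locks. (When a thread blocks because it is waiting for a contended lock, the JVM usually suspends the thread and allows it to be switched out.)
CPU utilization can be checked with the vmstat or mpstat commands, for example: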
/home/a/j/nomad2:vmstat 1
 kthr      memory              page              disk        faults       cpu
 r b w    swap    free re mf pi po fr de sr s0 s1 -- --   in   sy   cs us sy  id
 0 0 0 8011960 6610640 15 13  0  0  0  0  0  0  0  0  0  299  118   84  0  0 100
 0 0 0 6795008 5783928  0 10  0  0  0  0  0  0  4  0  0  410  338  214  0  0 100
 0 0 0 6794944 5783864  0  2  0  0  0  0  0  0  0  0  0  389  284  192  0  0 100
 0 0 0 6794944 5783864  0  0  0  0  0  0  0  0  0  0  0  362  249  162  0  0 100
 0 0 0 6794944 5783864  0  0  0  0  0  0  0  0  0  0  0  380  246  172  0  0 100
2. Scalability
Scalability describes the ability to improve throughput or capacity when additional computing resources (such as additional CPUs, memory, storage, or I/O bandwidth) are added.
3. Avoid premature optimization
It is therefore imperative that any performance tuning exercise be accompanied by concrete performance requirements (so you know both when to tune and when to stop tuning) and with a measurement program in place using a realistic configuration and load profile. Measure, don't guess.
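4. Amdahl's law
On a machine with N processors, the maximum achievable speedup is bounded as follows, where F is the fraction of the computation that must be executed serially:
Speedup ≤ 1 / (F + (1 − F) / N)
As N grows without bound, the speedup approaches 1 / F, so a program that is 10% serial can never run more than 10 times faster.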
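To make the bound concrete, here is a minimal Java sketch that evaluates Amdahl's law in its standard form, speedup ≤ 1 / (F + (1 − F) / N). The class name AmdahlSpeedup, the method maxSpeedup, and the sample serial fraction of 10% are illustrative choices, not something taken from the original notes.

```java
public class AmdahlSpeedup {

    // Upper bound on speedup for a serial fraction f (0.0 to 1.0)
    // when running on n processors, per Amdahl's law.
    static double maxSpeedup(double f, int n) {
        if (f < 0.0 || f > 1.0 || n < 1) {
            throw new IllegalArgumentException("need 0 <= f <= 1 and n >= 1");
        }
        return 1.0 / (f + (1.0 - f) / n);
    }

    public static void main(String[] args) {
        double f = 0.10; // assumed serial fraction, for illustration only
        for (int n : new int[] {1, 2, 4, 8, 16, 100}) {
            System.out.printf("N=%3d  speedup <= %.2f%n", n, maxSpeedup(f, n));
        }
    }
}
```

With F = 0.10, ten processors give a bound of roughly 5.3, and no number of processors can push it past 1 / F = 10, which is why shrinking the serial portion often matters more than adding CPUs.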
5. Monitor CPU utilization
If the CPUs are not fully utilized, the usual causes are:
1) Insufficient load.
2) I/O-bound.
3) Externally bound.
4) Lock contention.
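Of the causes listed above, lock contention is one that can be probed from inside the JVM. The following is a rough sketch, not a full diagnostic tool: it uses the standard java.lang.management API (ThreadMXBean and ThreadInfo.getBlockedCount()) to read how often a thread has blocked while entering a monitor. The class name BlockedCountProbe and the forced-contention demo are hypothetical and exist only to illustrate the idea.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedCountProbe {

    private static final Object LOCK = new Object();

    // Number of times the thread has blocked entering a monitor,
    // or -1 if the thread is no longer alive.
    static long blockedCount(Thread t) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(t.getId());
        return info == null ? -1 : info.getBlockedCount();
    }

    // Forces one contended acquisition: the caller holds LOCK while a
    // second thread tries to enter it, then reads that thread's count.
    static long demoContention() {
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                // acquired only after the caller releases LOCK
            }
        }, "waiter");
        long count = -1;
        try {
            synchronized (LOCK) {
                waiter.start();
                // Wait until the waiter is actually blocked on LOCK.
                while (waiter.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(1);
                }
                count = blockedCount(waiter);
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("waiter blocked count: " + demoContention());
    }
}
```

In a real application, one would more likely sample these counts for all live threads over time (or take thread dumps) and look for threads whose blocked count keeps climbing on the same lock.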