If you subscribe to Ruby5, you have probably already heard about this.

Recently Bryan Liles benchmarked a test suite run under Ruby Enterprise Edition (REE):

Before tuning:

410 scenarios (410 passed)

3213 steps (3213 passed) 

9m29.685s

After tuning:

410 scenarios (410 passed) 

3213 steps (3213 passed) 

5m58.661s
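To put the two timings side by side, here is a small Ruby sketch that converts them to seconds and works out the saving. The helper names `duration_to_seconds` and `compare_runs` are made up for this illustration and are not part of Cucumber or REE.

```ruby
# Convert a duration such as "9m29.685s" into seconds.
def duration_to_seconds(text)
  match = text.match(/\A(\d+)m([\d.]+)s\z/)
  raise ArgumentError, "unexpected duration: #{text}" unless match
  match[1].to_i * 60 + match[2].to_f
end

# Compare two runs: percentage of time saved and speedup factor.
def compare_runs(before_text, after_text)
  before = duration_to_seconds(before_text)
  after = duration_to_seconds(after_text)
  { saved_percent: (before - after) / before * 100, speedup: before / after }
end

result = compare_runs("9m29.685s", "5m58.661s")
puts format("time saved: %.1f%%, speedup: %.2fx",
            result[:saved_percent], result[:speedup])
# prints "time saved: 37.0%, speedup: 1.59x"
```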

Why is the difference so large?

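The answer is in the chapter on GC performance tuning in the official REE documentation. By setting just five parameters, we can get the same result: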

export RUBY_HEAP_MIN_SLOTS=1000000

export RUBY_HEAP_SLOTS_INCREMENT=1000000

export RUBY_HEAP_SLOTS_GROWTH_FACTOR=1 

export RUBY_GC_MALLOC_LIMIT=1000000000

export RUBY_HEAP_FREE_MIN=500000

Here is what each of these variables means:

RUBY_HEAP_MIN_SLOTS

This specifies the initial number of heap slots. The default is 10000.

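In other words, this sets how many heap slots Ruby starts with, and the default of 10000 is too small. Our test suite is likely to need far more heap slots, so we set it much higher. The price is that the process uses more memory.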

RUBY_HEAP_SLOTS_INCREMENT

The number of additional heap slots to allocate when Ruby needs to allocate new heap slots for the first time. The default is 10000.

For example, suppose that the default GC settings are in effect, and 10000 Ruby objects exist on the heap (= 10000 used heap slots). When the program creates another object, Ruby will allocate a new heap with 10000 heap slots in it. There are now 20000 heap slots in total, of which 10001 are used and 9999 are unused.

In other words, this is how many heap slots Ruby adds the first time it has to grow the heap.

As the example shows, a new heap is allocated as soon as every existing slot is in use. The default is so low that Ruby may have to allocate new heaps again and again, so the configuration above raises it by a factor of 100.
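The arithmetic in the quoted example can be checked with a tiny model. This is only a sketch of the documented behaviour, not REE's actual allocator, and the method name `heap_after_allocating` is hypothetical. It covers the first growth step only and assumes Ruby 2.0 or later for keyword arguments.

```ruby
# Simplified model: the heap starts with `min_slots` slots and gains
# `increment` more the first time the program runs out of free slots.
# Only the first growth step is modelled here.
def heap_after_allocating(objects, min_slots: 10_000, increment: 10_000)
  total = min_slots
  total += increment if objects > total
  { total: total, used: objects, free: total - objects }
end

state = heap_after_allocating(10_001)
puts "total=#{state[:total]} used=#{state[:used]} free=#{state[:free]}"
# prints "total=20000 used=10001 free=9999"
```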

RUBY_HEAP_SLOTS_GROWTH_FACTOR

Multiplier used for calculating the number of new heap slots to allocate next time Ruby needs new heap slots. The default is 1.8.

Take the program in the last example. Suppose that the program creates 10000 more objects. Upon creating the 10000th object, Ruby needs to allocate another heap. This heap will have 10000 * 1.8 = 18000 heap slots. There are now 20000 + 18000 = 38000 heap slots in total, of which 20001 are used and 17999 are unused.

The next time Ruby needs to allocate a new heap, that heap will have 18000 * 1.8 = 32400 heap slots.

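In other words, every time Ruby has to grow the heap again, the new heap is the size of the previous one multiplied by this factor, so with the default of 1.8 the heap grows exponentially. Setting the factor to 1, as in the configuration above, makes every new heap the same size, so the heap grows linearly.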
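The growth sequence from the quoted example can be reproduced with a short sketch. It is a simplified model under the assumption that each new heap is the previous new heap multiplied by the factor, as the documentation describes. The method names `new_heap_sizes` and `total_slots` are invented for this example, and `Array#sum` assumes Ruby 2.4 or later.

```ruby
# Sizes of the heaps added one after another: the first has `increment`
# slots, each later one is the previous size multiplied by `factor`.
def new_heap_sizes(count, increment: 10_000, factor: 1.8)
  sizes = []
  size = increment
  count.times do
    sizes << size
    size = (size * factor).round
  end
  sizes
end

# Total slots after `count` growth steps, including the initial heap.
def total_slots(count, min_slots: 10_000, increment: 10_000, factor: 1.8)
  min_slots + new_heap_sizes(count, increment: increment, factor: factor).sum
end

puts new_heap_sizes(3).join(", ")             # default factor: 10000, 18000, 32400
puts total_slots(2)                           # 38000, as in the quoted example
puts new_heap_sizes(3, factor: 1).join(", ")  # factor 1: 10000, 10000, 10000
```

With a factor of 1 every step adds the same number of slots, which is the linear growth that the tuned configuration asks for.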

 

RUBY_GC_MALLOC_LIMIT

The amount of C data structures which can be allocated without triggering a garbage collection. If this is set too low, then the garbage collector will be started even if there are empty heap slots available. The default value is 8000000.

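That is, this limits how much can be allocated in C data structures between two garbage collections. If it is set too low, the collector runs even while plenty of heap slots are still free, which is why the configuration above raises it from the default of 8000000 to 1000000000.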

 

RUBY_HEAP_FREE_MIN

The number of heap slots that should be available after a garbage collector run. If fewer heap slots are available, then Ruby will allocate a new heap according to the RUBY_HEAP_SLOTS_INCREMENT and RUBY_HEAP_SLOTS_GROWTH_FACTOR parameters. The default value is 4096.

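That is, this is the minimum number of free heap slots Ruby wants to have after a garbage collection. If the collector leaves fewer than that, Ruby allocates a new heap whose size is governed by RUBY_HEAP_SLOTS_INCREMENT and RUBY_HEAP_SLOTS_GROWTH_FACTOR.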
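The rule can be written down as a one-line check. This is only an illustration of the documented condition; `needs_new_heap?` is a hypothetical name, not a function in REE.

```ruby
# After a GC run, Ruby allocates a new heap when fewer than
# `heap_free_min` slots are free (4096 is the documented default).
def needs_new_heap?(free_slots_after_gc, heap_free_min: 4096)
  free_slots_after_gc < heap_free_min
end

puts needs_new_heap?(3_000)                            # true with the default
puts needs_new_heap?(100_000)                          # false with the default
puts needs_new_heap?(100_000, heap_free_min: 500_000)  # true with the tuned value
```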

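Note that the configuration above should not be used in production. You can, however, use the REE settings published by 37signals and Twitter as a reference when tuning your own application.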
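When trying out settings on your own application, it helps to measure how often the collector actually runs. The sketch below counts GC runs around a block using `GC.count`. It assumes Ruby 1.9 or later, where `GC.count` is available; the REE 1.8.7 interpreter may not provide it. The helper name `gc_runs_during` is made up for this example.

```ruby
# Count how many garbage collections happen while the block runs.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

runs = gc_runs_during do
  # Allocate many short-lived objects to put pressure on the heap.
  200_000.times.map { |i| "object #{i}" }
end
puts "GC ran #{runs} time(s)"
```

Running the same block under different environment settings gives a rough idea of how much collector work a given configuration saves.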

 

37signals' production configuration:

RUBY_HEAP_MIN_SLOTS=600000

RUBY_GC_MALLOC_LIMIT=59000000 

RUBY_HEAP_FREE_MIN=100000

Twitter's production configuration:

RUBY_HEAP_MIN_SLOTS=500000 

RUBY_HEAP_SLOTS_INCREMENT=250000 

RUBY_HEAP_SLOTS_GROWTH_FACTOR=1 

RUBY_GC_MALLOC_LIMIT=50000000

 

Notes on the Twitter configuration:

  • Start with enough memory to hold the application (Ruby’s default is very low, lower than what a Rails application typically needs).

  • Increase it linearly if you need more (Ruby’s default is exponential increase).

  • Only garbage-collect every 50 million malloc calls (Ruby’s default is 6x smaller).
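To see how far the Twitter settings sit from the defaults quoted earlier, the following sketch divides each value by its default. The constant and method names are chosen for this illustration only; the numbers are the ones listed in this article.

```ruby
# Defaults as quoted from the REE documentation above.
DEFAULTS = {
  "RUBY_HEAP_MIN_SLOTS"           => 10_000,
  "RUBY_HEAP_SLOTS_INCREMENT"     => 10_000,
  "RUBY_HEAP_SLOTS_GROWTH_FACTOR" => 1.8,
  "RUBY_GC_MALLOC_LIMIT"          => 8_000_000,
  "RUBY_HEAP_FREE_MIN"            => 4096
}

# Twitter's production values as listed above.
TWITTER = {
  "RUBY_HEAP_MIN_SLOTS"           => 500_000,
  "RUBY_HEAP_SLOTS_INCREMENT"     => 250_000,
  "RUBY_HEAP_SLOTS_GROWTH_FACTOR" => 1,
  "RUBY_GC_MALLOC_LIMIT"          => 50_000_000
}

# Ratio of each setting to its default value.
def ratios_to_default(settings, defaults = DEFAULTS)
  settings.each_with_object({}) do |(name, value), result|
    result[name] = value.to_f / defaults.fetch(name)
  end
end

ratios_to_default(TWITTER).each do |name, ratio|
  puts format("%-30s %.2fx the default", name, ratio)
end
```

The malloc limit comes out at 6.25 times the default, which matches the "6x smaller" remark in the notes.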