This is a community maintained page of speed optimization techniques for use with ActionScript3. Originally started on RockOnFlash.com , the post just grew with new revisions and tests and yet new techniques from the community! Only made sense to put it on a wiki for everyone to maintain ;)
(Morf ) Many of these enhancements are particular to the type of variable declaration you use. So far, in testing Number vs. int , the greatest improvements are gained when using int s. In general (with exceptions), only moderate if any improvements are seen when using Number -declared variables, due to the limited accurate operations.
The following excellent generalizations come from Andre Michelle ’s website (link below):
- Do not overload condition and statements as in AS2
It astounds me that this once-excellent technique for optimizing for speed no produces faster code. Overloading *should* produce faster code by reducing the number of compiled instructions and generating faster ones.
- Do not use Objects, if you know which properties will be finally involved.
- Cast instances, while reading from an Array
- Try to use Integers (int) for all possible cases
- Bitoperators are lightning fast
(Other things we what would like to see in AS3:
- Typed, length-fixed Arrays.
“…this would increase the speed…”
- A faster solution to read/write pixels on a bitmap.)
Dennis Ippel - Some ActionScript 3.0 Optimizations and links
Andre Michelle - General Guidelines for AS3 Optimizations
Tests done to determine if “a += b” really is faster (as it should be) than “a = a + b” have been inconclusive so far. I’d still recommend using the shorthand if you’re accustomed to it, mostly because I’m hoping a new compiler will grant faster speeds for using it.
The good news is that Incrementing (adding 1) with the ++ operator is far faster than “a = a + 1” - in fast, using the ++ four times (in consecutive statements) is still about 50% faster than “a = a + 4”!!
Unfortunately, the same is not true with the decrement (–) operator . So far, tests on that have not shown any consistent speed increase over “a = a - 1”. (sigh)
Here’s the testing code:
import flash.utils.getTimer;
var time : Number = getTimer();
var sumStd : Number = 0;
var sumOpt : Number = 0;
var sumMpt : Number = 0;
var loops : Number;
var thresh : Number = 0.002;
function runStdTest():void {
time = getTimer();
for (var i:int = 0; i < 20000000; i++) {
var test:int = i;
test = test - 1;
}
sumStd += (getTimer() - time);
}
function runOptTest():void {
time = getTimer();
for (var i:int = 0; i < 20000000; i++) {
var test:int = i;
test--;
}
sumOpt += (getTimer() - time);
}
function runEmptyTest():void {
time = getTimer();
for (var i:int = 0; i < 20000000; i++) {
var test:int = i;
}
sumMpt += (getTimer() - time);
}
loops = 10;
for (var i:int = 0; i < loops; i++) {
runStdTest();
runOptTest();
runEmptyTest();
}
trace ("Standard vs. Optimization - Mean over", loops, "loops");
trace ("Standard Test: ", (sumStd /= loops));
trace();
Now, you’ve probably heard the tip about “use multiplication instead of division when dividing by 2”, right?
faster:
var n:Number = value *.5;
slower:
var n:Number = value / 2;
run it with a test:
import flash.utils.getTimer;
var time:Number = getTimer();
function runDivisionTest():void
{
time = getTimer();
for(var i:int=0;i<10000000;i++)
{
var test:Number = i/2;
}
trace("DivisionTest: ", (getTimer()-time));
}
function runMultTest():void
{
time = getTimer();
for(var i:int=0;i<10000000;i++)
{
var test:Number = i*.5;
}
trace("MultTest: ", (getTimer()-time));
}
runDivisionTest();
runMultTest();
traces out:
DivisionTest: 162
MultTest: 110
Alright, so it’s not miles faster, but in a 3D engine like Papervision3D , this becomes the difference between a fast engine and a slow engine real quickly.
Well, there’s one still that’s even faster: Bitwise shift operations
Divide by 2 with a right shift:
trace(10 >> 1);
// traces:
5
Multiply by 2 with a left shift:
trace(10 << 1);
// traces:
20
Now, run the test against the Division and Multiplication tests above:
function runBitTest():void
{
time = getTimer();
for(var i:int=0;i<10000000;i++)
{
var test:int = i >> 1;
}
trace("BitTest: ", (getTimer()-time));
}
runBitTest();
traces out:
DivisionTest: 152
MultTest: 112
BitTest: 63
HOLY CRAP Batman - that’s nearly 1/2 the speed?? or should I say, 1*.5 the speed ;)
(Morf ) Well, kind of - but you may not want to use the bitshift . When the BitTest code above uses the the same (slower) Number declaration for test , the results are slightly different (btw, the iterator i in each testing loop also now has the same declaration (int ) in each code section ):
Division by 2 - Mean over 20 loops
DivisionTest: 83.3
MultTest: 62.9
BitRightTest: 45.85
EmptyCodeTest: 29.1
I’ve also thrown in an empty loop timing for comparison - so a more accurate comparison can be made by subtracting that timing from each of the others. These gave:
Net timing results:
DivisionTest: 54.2
MultTest: 33.8
BitRightTest: 16.75
So both optimizations are still faster with Number declarations. But don’t use BitShifting for Number operations. That’s because the bit shift will round off the result to an integer - AS3 turns your original value into an integer before making the bit shift. Unless that’s what you’re after, use “0.5 *” instead.
So, I was looking at some other stuff with uint and this seems to be a real gem when it comes to Math.floor and Math.ceil operations. I was reading through Flash’s help (I have to say, I really enjoy the new help - I live in there) on uint and realized that, like int, when you pass a number, it strips everything passed the decimal (like Math.floor). So, I thought, what the hey, let’s give er’ a speed test.
Sure enough, it was much faster to use uint(n) than Math.floor(n) - nearly 10x’s as fast
– UPDATE – After Paulius Uza’s comments about using int, I went back and added tests for int with the floor/ceil tests, and sure enough, they’re even faster than using uint. floor’s test wasn’t that drastic, but ceil’s was 1/2 the time of uint’s version
– /UPDATE –
fast:
var test:int = int(1.5);
//test equals 1
slow:
var test:Number = Math.floor(1.5);
//test equals 1
– /UPDATE by lilive –
But take care, this solution is only for positive integer:
var test1:int = int(-1.5);
//test1 equals -1
var test2:Number = Math.floor(-1.5);
//test2 equals -2
Now, I know what yer thinkin’: what about Math.ceil? +1? yes ;)
– /UPDATE by matthewwithanm – Note that this this is not a general replacement for ceil as it will give an incorrect result for integers. [ceil(1) == 1 but int(1) + 1 == 2]
fast:
var test:int = int(1.5)+1;
//test equals 2
slow:
var test:Number = Math.ceil(1.5);
//test equals 2
Time for a test:
function runFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:Number = Math.floor(n);
}
trace("FloorTest: ", (getTimer()-time));
}
function runUintFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:uint = uint(n);
}
trace("UintFloorTest: ", (getTimer()-time));
}
function runIntFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:int = int(n);
}
trace("IntFloorTest: ", (getTimer()-time));
}
function runCeilTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:Number = Math.ceil(n);
}
trace("CeilTest: ", (getTimer()-time));
}
function runUintCeilTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:uint = n == uint(n) ? n : uint(n)+1;
}
trace("UintCeilTest: ", (getTimer()-time));
}
function runIntCeilTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:int = n == int(n) ? n : int(n)+1;
}
trace("IntCeilTest: ", (getTimer()-time));
}
function runBitShiftFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:Number = n >> 0;
}
trace("BitShiftFloorTest: ", (getTimer()-time));
}
runFloorTest();
runUintFloorTest();
runIntFloorTest();
runBitShiftFloorTest();
runUintCeilTest();
runIntCeilTest();
traces out:
FloorTest: 1733
UintFloorTest: 176
*IntFloorTest: 157
BitShiftFloorTest: 178
UintCeilTest: 650
*IntCeilTest: 384
I was thinking about another one that I use sometimes which was using *-1 instead of Math.abs on a number
faster:
var nn:Number = -23
var test:Number= nn < 0 ? nn * -1 : nn;
slower:
var nn:Number = -23
var test:Number = Math.abs(nn);
Lets test!
function runABSTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = -1.5;
var test:Number = Math.abs(n);
}
trace("ABSTest: ", (getTimer()-time));
}
function runABSMultTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = -1.5;
var test:Number = n < 0 ? n * -1 : n;
}
trace("ABSMultTest: ", (getTimer()-time));
}
runABSTest();
runABSMultTest();
traces out:
ABSTest: 1615
*ABSMultTest: 153
(Morf ) So... Taking this a bit further, instead of multiplying by -1, what about just assigning the inverse? The code is updated so n (instead of test ) receives the new Abs value:
function runAbsTest():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = Math.abs (n);
}
sumAbs += (getTimer() - time);
}
function runMm1Test():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = n < 0 ? n * -1 : n;
}
sumMm1 += (getTimer() - time);
}
function runChsTest():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = n < 0 ? -n : n;
}
sumChs += (getTimer() - time);
}
function runEmptyTest():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = n;
}
sumMpt += (getTimer() - time);
}
And what if we just tested to see if n was less than zero before doing anything?
function runIfmTest():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
if (n < 0) n = -n;
}
sumIfm += (getTimer() - time);
}
Traced out:
Abs - Mean over 25 loops
AbsTest: 1572.4
MultByMinus1Test: 121.96
ChsTest: 95.76
IfMinusTest: 76
EmptyTest: 32.48
Net timing results:
AbsTest: 1539.92
MultByMinus1Test: 89.48
ChsTest: 63.28
IfMinusTest: 43.52
Bear in mind the IfMinusTest code assumes the inverse operation must take place every time (n is always < 0) - so the actual implementation would probably run even faster.
In response to a question posted on how to determine distance, someone said “sqrt is an expensive operation”. So I tried it out against the old Babylonian Method that was known (as recently as the late 20th century !) to be quite accurate once you’d run it through a few iterations:
import flash.utils.getTimer;
var time : Number = getTimer();
var sumSqrt : Number = 0;
var sumAppr : Number = 0;
var sumMpt : Number = 0;
var loops : Number;
var thresh : Number = 0.00001;
var a : Number = 0,
b : Number = 0,
c : Number = 0,
w : Number = 0;
var count : int = 0;
w = 147.29;
trace ("Actual value =", Math.sqrt (w)); // This is what AS3 gives us.
trace ("Testing threshold:", thresh);
b = w * 0.25;
do {
count++;
c = w / b;
b = (b + c) * 0.5;
a = b - c;
if (a < 0) a = -a;
} while (a > thresh); // This tests the algorithm.
c = Math.sqrt (w);
trace ("Number of iterations for approximation:", count);
trace ("Approximation:", b);
trace ("Error in approximation:", b - Math.sqrt (w));
trace (" - - - -");
function runSqrtTest():void {
time = getTimer();
for (var i:int = 0; i < 1000000; i++) {
var test:Number = Math.sqrt (w);
}
sumSqrt += (getTimer() - time);
}
function runApprTest():void {
time = getTimer();
for (var i:int = 0; i < 1000000; i++) {
var b :Number = w * 0.25,
a :Number;
c :Number;
do {
c = w / b;
b = (b + c) * 0.5;
a = b - c;
if (a < 0) a = -a;
} while (a > thresh);
}
sumAppr += (getTimer() - time);
}
function runEmptyTest():void {
time = getTimer();
for (var i:int = 0; i < 1000000; i++) {
var n:Number = w;
}
sumMpt += (getTimer() - time);
}
loops = 20;
for (var i:int = 0; i < loops; i++) {
runSqrtTest();
runApprTest();
runEmptyTest();
}
trace ("Math.sqrt - Mean over", loops, "loops");
trace ("Math.sqrt() Test: ", (sumSqrt /= loops));
trace ("Approximation Test: ", (sumAppr /= loops));
trace ("Empty Test: ", (sumMpt /= loops));
trace ("Net timing results:");
trace ("Math.sqrt() Test:", (sumSqrt - sumMpt));
trace ("Approximation Test:", (sumAppr - sumMpt));
I could not believe the results:
Actual value = 12.136309158883519
Testing threshold: 0.00001
Number of iterations for approximation: 6
Approximation: 12.13630915888352
Error in approximation: 1.7763568394002505e-15
- - - -
Math.sqrt - Mean over 20 loops
Math.sqrt() Test: 172.05
Approximation Test: 134.8
Empty Test: 4.05
Net timing results:
Math.sqrt() Test: 168
Approximation Test: 130.75
Okay, what gives? Even my math co-processor on my old 286 (that I still do my taxes on) could do better than that! And when we boost the testing threshold up for one less iteration, we still get 10-digit accuracy, but with nearly a 1/3 speed boost:
Actual value = 12.136309158883519
Testing threshold: 0.002
Number of iterations for approximation: 5
Approximation: 12.136309166280597
Error in approximation: 7.397078505277932e-9
- - - -
Math.sqrt - Mean over 20 loops
Math.sqrt() Test: 167.3
Approximation Test: 114.95
Empty Test: 3.95
Net timing results:
Math.sqrt() Test: 163.35000000000002
Approximation Test: 111
Speed Increase: 32.047750229568415 %
I can’t help feeling like I’m missing something here :T
If I’m not, then this algorithm will be faster than Math.sqrt.
UPDATE by [email protected]: Whoever posted this tip must’ve used an older version of Flash. I ran the Math.sqrt improvement test with Flash 9 v. 115 today on March 26th, 2008. The results were the opposite. I think Adobe optimized their Math.sqrt. Here are the results I got when running your code:
Actual value = 12.136309158883519
Testing threshold: 0.00001
Number of iterations for approximation: 6
Approximation: 12.13630915888352
Error in approximation: 1.7763568394002505e-15
- - - -
Math.sqrt - Mean over 20 loops
Math.sqrt() Test: 36.65
Approximation Test: 60.1
Empty Test: 26.8
Net timing results:
Math.sqrt() Test: 9.849999999999998
Approximation Test: 33.3
(Morf ) UPDATE RESPONSE: That revision of Flash did speed a lot of things up, including Math.sqrt(); I suspect they streamlined calls in general (this process is by far the biggest culprit in this problem). I did go ahead and create faster code that is way way uglier. It is faster by about 20% under Flash 10... but only gives approx .0001 accuracy, which is plenty for Papervision3D applications. In any case, I don’t recommend its use as it will not offer any increase in speed unless used inline (not as any sort of function or method).
Here’s the package used for testing:
package
{
import flash.display.*;
import flash.text.*;
import flash.utils.getTimer;
public class Main extends Sprite
{
public function Main()
{
var time : Number = getTimer();
var sumSqrt : Number = 0;
var sumAppr : Number = 0;
var sumMpt : Number = 0;
var loops : Number;
var resultMsg:TextField = new TextField();
addChild (resultMsg);
var thresh : Number = 0.002;
var a : Number = 0,
b : Number = 0,
c : Number = 0,
ww : Number = 0,
www : Number = 0;
var count : int = 0;
var j : int = 0;
var jj : int = 0;
function runApprTest():void
{
time = getTimer();
for (var ii:int = 0; ii < 20; ii++) {
for (var i:int = 33000; i < 44000; i++) {
var sqNum:Number = i + 0.208;
//--Using the Babylonian Method for computing square roots efficiently requires making the initial fast
// approximation as accurate as possible, insuring that the number of iterations can be minimized while
// perserving a desired level of accuracy.
//
var apprxInit : int;
var apprxRoot : Number;
var initVal : Number;
var invVal : Boolean;
if (sqNum < 1) {
initVal = 1 / sqNum;
invVal = true;
} else {
initVal = sqNum;
invVal = false;
}
apprxInit = initVal;
//--Make first approximation of APPRXROOT: Set the starting approximation APPRXINIT to determine
// its binary degree; in this case, the location of its most significant bit to determine its length
// in bits.
// In AS3, the most efficient method for accomplishing this is by comparing it to integers equal to odd
// powers of 2, then shifting it right half the number of bits of its length. This unfortunately results
// in rather unseemly code. The final treatment is one iteration of the Babylonian Method. "Hey, Morf
// said it's fast and it works"
// Well, yes... but this MUST BE USED AS INLINE CODE FOR ANY SPEED OPTIMIZATION TO BE REALIZED.
//
if (apprxInit > 7) {
if (apprxInit < 32768) {
if (apprxInit < 128) {
if (apprxInit < 32) {
apprxInit >>= 2;
if (apprxInit < 4) apprxInit++;
} else {
apprxInit >>= 3;
}
} else {
if (apprxInit < 2048) {
if (apprxInit < 512) {
apprxInit >>= 4;
} else {
apprxInit >>= 5;
}
} else {
if (apprxInit < 8096) {
apprxInit >>= 6;
} else {
apprxInit >>= 7;
}
}
}
} else {
if (apprxInit < 8388608) {
if (apprxInit < 524288) {
if (apprxInit < 131072) {
apprxInit >>= 8;
} else {
apprxInit >>= 9;
}
} else {
if (apprxInit < 2097152) {
apprxInit >>= 10;
} else {
apprxInit >>= 11;
}
}
} else {
if (apprxInit < 134217728) {
if (apprxInit < 33554432) {
apprxInit >>= 12;
} else {
apprxInit >>= 13;
}
} else {
apprxInit >>= 14; //What are you doing with a number this big anyway? Not bothering with yet another test.
}
}
}
apprxRoot = (apprxInit + initVal / apprxInit) * 0.5;
} else if (apprxInit < 2) apprxRoot = initVal * 0.5 + 0.5
else apprxRoot = initVal * 0.25 + 1;
//
//-- First approximation will be within about 1% at twice the speed and may be sufficient for collision detection computations.
//-- Now perform as many additional approximations as desired; fourth iteration = same precision as Math.sqrt().
//
apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5; // babylonian iteration 2 -- about 20% faster.
// apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5; // babylonian iteration 3 -- seven-digit precision.
// apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5; // babylonian iteration 4 -- full precision.
if (invVal) apprxRoot = 1 / apprxRoot;
}
}
sumAppr += (getTimer() - time);
}
function runSqrtTest():void
{
time = getTimer();
for (var ii:int = 0; ii < 20; ii++) {
for (var i:int = 33000; i < 44000; i++) {
var w:Number = i + 0.208;
var test:Number = Math.sqrt (w);
}
}
sumSqrt += (getTimer() - time);
}
function runEmptyTest():void {
time = getTimer();
for (var ii:int = 0; ii < 20; ii++) {
for (var i:int = 33000; i < 44000; i++) {
var n:Number = i + 0.208;
} }
sumMpt += (getTimer() - time);
}
loops = 50;
for (var i:int = 0; i < loops; i++) {
runSqrtTest();
runApprTest();
runEmptyTest();
}
resultMsg.text = "Math.sqrt - Mean over " + loops + " loops"
+ "/n" + "Math.sqrt() Test: " + (sumSqrt /= loops)
+ "/n" + "Approximation Test: " + (sumAppr /= loops)
+ "/n" + "Empty Test: " + (sumMpt /= loops)
+ "/n" + "Net timing results:"
+ "/n" + "Math.sqrt() Test: " + (sumSqrt - sumMpt)
+ "/n" + "Approximation Test: " + (sumAppr - sumMpt)
+ "/n" + "Speed Increase: " + 100 * (sumSqrt - sumAppr) / (sumSqrt - sumMpt) + "% (anecdotal)";
resultMsg.autoSize = TextFieldAutoSize.LEFT;
}
}
}