Since we released AspectWerkz 1.0, and more generally with every release of any AOP / interceptor framework (AspectWerkz, AspectJ, JBoss AOP, Spring AOP, cglib, dynaop, etc.), the same questions come up: "What is the performance cost of such an approach?" and "How much do I lose per method invocation when an advice / interceptor is applied?".
This is indeed an issue that needs to be addressed carefully, and one that has in fact shaped the design of every sufficiently mature framework.
We are probably all wary of the cost of java.lang.reflect despite its power, and usually, even before evaluating semantics, robustness, and general ease of use, we start with some Hello World benchmark.
We started AWbench for that purpose: a single place to measure the relative performance of AOP / interceptor frameworks, and to measure it on your own.
Beyond the performance comparison, AWbench is a good place to see the semantic differences and relative ease of use of each framework, since all of them are applied to the same rather simple task. A "lines of code" metric will be provided in a future report.
This table provides the figures from the benchmark in nanoseconds per advised method invocation. A plain method invocation costs roughly 5 ns/iteration on the hardware/software used for the benchmark. Note that an advised application does more work than a non-advised one, so you should not compare the non-advised version directly to the advised version. AWbench does not yet provide metrics for a hand-written implementation of the AOP concepts.
The results were obtained with 2 million iterations.
In this table, the first two rows are the most important ones. In a real-world application, it is likely that a before or around advice will interact with the code it advises, and to do so it needs access to runtime (contextual) information such as method parameter values and the target instance. It is also likely that a join point is advised by more than one advice.
Conversely, it is very unlikely to have just a before advice that does nothing, but that case gives a good estimate of the most minimal overhead we can expect.
Note: comparing results when the difference is small (e.g. 15 ns vs 10 ns) may not be meaningful. Before doing so, you should run the benchmark several times and compute an average after discarding the smallest and largest measurements.
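The averaging scheme just described (drop the single smallest and largest runs, average the rest) can be sketched as a small helper. This is an illustrative snippet, not part of AWbench; the class and method names are made up:

```java
import java.util.Arrays;

public class TrimmedMean {
    // Average the measurements after dropping the single smallest and
    // single largest value, as suggested for comparing close results.
    static double trimmedMean(long[] runsNs) {
        if (runsNs.length < 3) {
            throw new IllegalArgumentException("need at least 3 runs");
        }
        long[] sorted = runsNs.clone();
        Arrays.sort(sorted);
        long sum = 0;
        for (int i = 1; i < sorted.length - 1; i++) { // skip min and max
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        long[] runs = {15, 10, 12, 11, 50}; // ns/invocation from 5 runs
        // drops the outliers 10 and 50, averages 11, 12, 15
        System.out.println(trimmedMean(runs));
    }
}
```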
AWBench (ns/invocation) | aspectwerkz | awproxy | aspectwerkz_1_0 | aspectj | jboss | spring | dynaop | cglib | ext:aopalliance | ext:spring | ext:aspectj |
---|---|---|---|---|---|---|---|---|---|---|---|
before, args() target() | 10 | 25 | 606 | 10 | 220 | 355 | 390 | 145 | - | 220 | - |
around x 2, args() target() | 80 | 85 | 651 | 50 | 290 | 436 | 455 | 155 | 465 | 476 | - |
before | 15 | 20 | 520 | 15 | 145 | 275 | 320 | 70 | - | 40 | 10 |
before, static info access | 30 | 30 | 501 | 25 | 175 | 275 | 330 | 70 | - | 35 | - |
before, rtti info access | 50 | 55 | 535 | 50 | 175 | 275 | 335 | 75 | - | 35 | - |
after returning | 10 | 20 | 541 | 10 | 135 | 285 | 315 | 85 | - | 45 | 15 |
after throwing | 3540 | 3870 | 6103 | 3009 | 5032 | - | 6709 | 8127 | - | - | 3460 |
before + after | 20 | 30 | 511 | 20 | 160 | 445 | 345 | 80 | - | 35 | 20 |
before, args() primitives | 10 | 20 | 555 | 10 | 195 | 350 | 375 | 145 | - | 210 | - |
before, args() objects | 5 | 25 | 546 | 10 | 185 | 325 | 345 | 115 | - | 200 | - |
around | 60 | 95 | 470 | 10 | - | 225 | 315 | 75 | - | - | 90 |
around, rtti info access | 70 | 70 | 520 | 50 | 140 | 250 | 340 | 80 | 70 | 70 | - |
around, static info access | 80 | 90 | 486 | 25 | 135 | 245 | 330 | 75 | 80 | 80 | - |
This table provides the figures from the same benchmark, normalized so that AspectWerkz 2.0.RC2-snapshot is the reference in each category.
The "before" line illustrates that for the simplest before advice, AspectWerkz is roughly ten times faster than JBoss AOP 1.0.
AWBench (relative %) | aspectwerkz | awproxy | aspectwerkz_1_0 | aspectj | jboss | spring | dynaop | cglib | ext:aopalliance | ext:spring | ext:aspectj |
---|---|---|---|---|---|---|---|---|---|---|---|
before, args() target() | 1 x | 2.5 x | 60.6 x | 1 x | 22 x | 35.5 x | 39 x | 14.5 x | - | 22 x | - |
around x 2, args() target() | 1 x | 1 x | 8.1 x | 0.6 x | 3.6 x | 5.4 x | 5.6 x | 1.9 x | 5.8 x | 5.9 x | - |
before | 1 x | 1.3 x | 34.6 x | 1 x | 9.6 x | 18.3 x | 21.3 x | 4.6 x | - | 2.6 x | 0.6 x |
before, static info access | 1 x | 1 x | 16.7 x | 0.8 x | 5.8 x | 9.1 x | 11 x | 2.3 x | - | 1.1 x | - |
before, rtti info access | 1 x | 1.1 x | 10.7 x | 1 x | 3.5 x | 5.5 x | 6.7 x | 1.5 x | - | 0.7 x | - |
after returning | 1 x | 2 x | 54.1 x | 1 x | 13.5 x | 28.5 x | 31.5 x | 8.5 x | - | 4.5 x | 1.5 x |
after throwing | 1 x | 1 x | 1.7 x | 0.8 x | 1.4 x | - | 1.8 x | 2.2 x | - | - | 0.9 x |
before + after | 1 x | 1.5 x | 25.5 x | 1 x | 8 x | 22.2 x | 17.2 x | 4 x | - | 1.7 x | 1 x |
before, args() primitives | 1 x | 2 x | 55.5 x | 1 x | 19.5 x | 35 x | 37.5 x | 14.5 x | - | 21 x | - |
before, args() objects | 1 x | 5 x | 109.2 x | 2 x | 37 x | 65 x | 69 x | 23 x | - | 40 x | - |
around | 1 x | 1.5 x | 7.8 x | 0.1 x | - | 3.7 x | 5.2 x | 1.2 x | - | - | 1.5 x |
around, rtti info access | 1 x | 1 x | 7.4 x | 0.7 x | 2 x | 3.5 x | 4.8 x | 1.1 x | 1 x | 1 x | - |
around, static info access | 1 x | 1.1 x | 6 x | 0.3 x | 1.6 x | 3 x | 4.1 x | 0.9 x | 1 x | 1 x | - |
The benchmarks were run on Java HotSpot 1.4.2, Windows 2000 SP4, Pentium M 1.6 GHz, 1 GB RAM.
Notes:
AWbench is a micro-benchmark suite that aims to stay simple. The test application is very small, and AWbench is mainly the glue around it that applies one or more very simple advices / interceptors from the framework of your choice.
AWbench comes with an Ant script so that you can run it on your own box, and you can contribute improvements if you know of any for a particular framework.
So far, AWbench only includes method execution pointcuts, since call-side pointcuts are not supported by proxy-based frameworks (Spring AOP, cglib, dynaop, etc.).
The awbench.method.Execution class is the test application; it contains one method per construct to benchmark. An important point is that bytecode-based AOP can provide much better performance for before and after advice, as well as much faster access to contextual information.
Indeed, proxy-based frameworks are very likely to use reflection to give the user access to intercepted method parameters at runtime from within an advice, while bytecode-based AOP can use more advanced constructs to provide access at the speed of statically compiled code.
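The reflective path that proxy-based frameworks typically take can be contrasted with a direct call in a few lines of plain Java. This is an illustrative sketch (the class and field names are made up, not AWbench code): the reflective variant boxes the int argument and dispatches through java.lang.reflect.Method, which is where the extra per-invocation cost comes from:

```java
import java.lang.reflect.Method;

public class ReflectiveVsDirect {
    static int field;

    static void before(int i) { field++; }

    public static void main(String[] args) throws Exception {
        // Direct call: the argument is passed with its static type, no boxing.
        before(42);

        // Reflective call, as a proxy-based interceptor chain would do it:
        // the int argument is boxed to Integer, and Method.invoke dispatches
        // dynamically, adding overhead on every invocation.
        Method m = ReflectiveVsDirect.class.getDeclaredMethod("before", int.class);
        Object[] boxedArgs = {Integer.valueOf(42)};
        m.invoke(null, boxedArgs);

        System.out.println(field); // both paths incremented the field
    }
}
```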
The current scope is thus:
For method execution pointcuts:
Construct | Contextual information access | Notes |
---|---|---|
before advice | none | |
before advice | static information (method signature etc) | |
before advice | contextual information accessed reflectively | Likely to require casting and unboxing of primitives |
before advice | contextual information accessed with explicit framework capabilities | Only supported by AspectJ and AspectWerkz 2.x |
after advice | none | |
after returning advice | return value | |
after throwing advice | exception instance | |
before + after advice | none | |
around advice | optimized | AspectJ and AspectWerkz 2.x provide specific optimizations (thisJoinPointStaticPart vs thisJoinPoint) |
around advice | non optimized | |
2 around advice | contextual information | |
By contextual information access, we mean access from within the advice to the target instance and to the intercepted method arguments.
A pseudo code block is thus likely to be:
class awbench.method.Execution {
    int m_field;
    void before(int i) {
        // very simple test application method body - does not vary
        m_field++;
    }
}

class XXframework {
    // might be optimized by some frameworks using lazy instantiation
    .. interceptWithoutContextualInformationAccess(..) {
        // very simple advice body - does not vary
        awbench.Run.staticIntField++;
    }
    .. interceptWithReflectiveContextualInformationAccess(.., XXInvocation invocation, ..) {
        // very simple advice body - does not vary
        awbench.Run.staticIntField++;
        // reflective access to target instance and intercepted method argument
        Execution target = (Execution) invocation.getTarget();
        int arg0 = ((Integer) invocation.getArgument_0()).intValue();
    }
    // may not be supported by all frameworks
    .. interceptWithDirectContextualInformationAccess(.., Execution executionTargetInstance, int arg0Intercepted, ..) {
        // very simple advice body - does not vary
        awbench.Run.staticIntField++;
        // direct access to target instance and intercepted method argument through rich framework facilities
        Execution target = executionTargetInstance; // no casting
        int arg0 = arg0Intercepted; // no casting
    }
}
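For comparison, the hand-written, non-AOP equivalent of a simple before advice (the baseline AWbench does not yet measure) is just a delegating subclass or decorator. This is a hypothetical sketch, not AWbench code:

```java
// Hand-written equivalent of a "before" advice: a plain delegating
// subclass runs the advice body, then proceeds to the original method.
public class HandWrittenBefore {
    static int adviceCounter;

    static class Execution {
        int m_field;
        void before(int i) { m_field++; }
    }

    static class AdvisedExecution extends Execution {
        @Override
        void before(int i) {
            adviceCounter++;   // advice body, inlined by hand
            super.before(i);   // proceed to the original method
        }
    }

    public static void main(String[] args) {
        Execution target = new AdvisedExecution();
        target.before(0);
        // both the advice and the original method body ran once
        System.out.println(target.m_field + " " + adviceCounter);
    }
}
```

The cost of this version is an ordinary virtual dispatch, which gives an intuition for the floor that the frameworks above are being compared against.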
The following are included in AWbench:
Bytecode based frameworks
Framework | URL |
---|---|
AspectWerkz 1.0 | http://aspectwerkz.codehaus.org |
AspectWerkz 2.x | http://aspectwerkz.codehaus.org |
AspectJ (1.2) | http://eclipse.org/aspectj/ |
JBoss AOP (1.0) | http://www.jboss.org/developers/projects/jboss/aop |
Proxy based frameworks
Framework | URL |
---|---|
Spring AOP (1.1.1) | http://www.springframework.org/ |
cglib proxy (2.0.2) | http://cglib.sourceforge.net/ |
dynaop (1.0 beta) | https://dynaop.dev.java.net/ |
Moreover, AWbench includes the AspectWerkz Extensible Aspect Container, which allows running any aspect / interceptor framework within the AspectWerkz 2.x runtime:
AspectWerkz Extensible Aspect Container running | Notes |
---|---|
AspectJ | |
AOP Alliance | http://aopalliance.sourceforge.net/ |
Spring AOP | |
AWbench is extensible. Refer to the How to contribute? section (below) for more info on how to add your framework to the bench.
AWBench is released under LGPL.
There will never be a binary distribution of it, but the source can be checked out:
cvs -d :pserver:[email protected]:/home/projects/aspectwerkz/scm login
cvs -z3 -d :pserver:[email protected]:/home/projects/aspectwerkz/scm co awbench
Once checked out, you can run the benchmark using several different Ant targets:
ant run
ant run:aspectwerkz
ant run:aspectj
ant run:jboss
ant run:ext:aspectj
ant run:ext:spring
ant run:ext:aopalliance
ant run:cglib
ant run:spring
ant run:dynaop
ant run:all
If you notice some optimizations for one of the implementations that respect the requirements, we will add the fix to AWbench and update the results accordingly.
If you are willing to write a non-AOP, non-proxy-based version of this benchmark, so that a comparison between the AOP approach and regular OO design patterns becomes possible, send us an email.
The current implementation does not cover fine-grained deployment models like perInstance / perTarget, whose underlying implementations are unlikely to be neutral with respect to performance results.
From: http://docs.codehaus.org/display/AW/AOP+Benchmark