The goal of this article is to show how to achieve near-linear scalability of the jBPM workflow engine by tuning its configuration and setting it up on a JBoss cluster with a distributed TreeCache. Readers will be guided through all the steps required to cluster jBPM efficiently – from cluster setup to fine-tuning the jBPM configuration – and provided with performance test results as well as tips and tricks for achieving maximum performance.
jBPM is a powerful workflow engine – robust, extensible and fast. However, what are the possibilities if we need more performance than one server can offer? Clustering is the solution that immediately springs to mind. But is it quickly and easily feasible and, more importantly, does it yield expected results? This article takes you through the process of setting up a JBoss cluster together with a distributed TreeCache and tuning jBPM to deliver its full performance. You will also learn what scale of improvement you can expect over standard configuration.
jBPM is a workflow engine enabling its users to easily manage various kinds of business processes. It is based around a process definition, which can consist of activities. Similar or related activities can be grouped in scopes. Activities can be triggered either manually or scheduled for later execution. That's where JobExecutors come into play. They poll the database and check whether there are any jobs available for execution. Whenever a job is available and its execution time is due, JobExecutor puts a lock on it and executes any actions connected with the job. Such a timed execution is called automatic escalation and it is the main focus of this article.
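The polling step described above can be pictured with a small, self-contained sketch. This is not jBPM's own code: the `Job` class, its fields and the `findDueJobs` helper are hypothetical names chosen for illustration, and an in-memory list stands in for the job table in the database.

```java
import java.util.ArrayList;
import java.util.List;

final class DueJobPoller {

    /** Hypothetical, simplified job row: an id, a due time and an optional lock owner. */
    static final class Job {
        final long id;
        final long dueAtMillis;
        final String lockOwner; // null while no executor has claimed the job

        Job(long id, long dueAtMillis, String lockOwner) {
            this.id = id;
            this.dueAtMillis = dueAtMillis;
            this.lockOwner = lockOwner;
        }
    }

    /** Returns the ids of jobs that are due at 'nowMillis' and not yet locked. */
    static List<Long> findDueJobs(List<Job> jobs, long nowMillis) {
        List<Long> due = new ArrayList<>();
        for (Job job : jobs) {
            if (job.lockOwner == null && job.dueAtMillis <= nowMillis) {
                due.add(job.id);
            }
        }
        return due;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job(1, 1_000, null));         // due and unlocked
        jobs.add(new Job(2, 5_000, null));         // not due yet
        jobs.add(new Job(3, 500, "executor-a"));   // due, but already locked
        System.out.println(findDueJobs(jobs, 2_000)); // prints [1]
    }
}
```

In the real engine the same selection is done by a database query, so every node in a cluster sees the same candidate jobs.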
We have developed a business process management application based on a stack of cutting-edge technologies, including jBPM. The main business unit in our system bears the name Call. Our client focused mainly on the scalability and performance of the workflow engine, so a lot of effort went into testing and tuning the clustered jBPM configuration. My goal was to cluster and tune automatic Call escalations to meet the customer's requirements with regard to installation size and usage profile.
The test environment consists of 4 separate machines running the application, all connecting to an Oracle 10g database. The server instances run Ubuntu Server 8.10 Linux and JBoss 4.2.3GA; each has a quad-core Intel Q9300 processor running at 2.50GHz and 4GB of RAM. The JBoss instances communicate with each other using JGroups. Library versions are summarized at the end of this article.
In order to avoid serious problems with jBPM job ownership and execution further down the road, the following steps need to be taken:
JBoss clustering is easy to set up: it is enough either to use the "all" configuration (for development and testing purposes) or to enrich your own configuration with the following files from "all":
We have to be sure that we see messages related to our new setup. The following lines in jboss-log4j.xml will allow us to verify that the cluster is working correctly without cluttering our console with too verbose output:
We can now run JBoss with ./run.sh -c
12:48:23,783 INFO jgroups.JChannel| JGroups version: 2.4.1 SP-4
12:48:24,098 INFO protocols.FRAG2| frag_size=60000, overhead=200, new frag_size=59800
12:48:24,102 INFO protocols.FRAG2| received CONFIG event: {bind_addr=/10.10.1.43}
12:48:24,187 INFO HAPartition.DefaultPartition| Initializing
12:48:24,232 INFO STDOUT|
-------------------------------------------------------
GMS: address is 10.10.1.43:46941
-------------------------------------------------------
12:48:26,335 INFO HAPartition.DefaultPartition| Number of cluster members: 4
12:48:26,335 INFO HAPartition.DefaultPartition| Other members: 3
12:48:26,335 INFO HAPartition.DefaultPartition| Fetching state (will wait for 30000 milliseconds):
12:48:26,393 INFO HAPartition.DefaultPartition| state was retrieved successfully (in 58 milliseconds)
12:48:26,449 INFO jndi.HANamingService| Started ha-jndi bootstrap jnpPort=1100, backlog=50, bindAddress=/10.10.1.43
12:48:26,457 INFO mingService$AutomaticDiscovery| Listening on /10.10.1.43:1102, group=230.0.0.4, HA-JNDI address=10.10.1.43:1100
As you can see, a cluster of 4 nodes has been formed. Now we can deploy the application. JBoss provides a farming mechanism, which can be used to deploy an application to all cluster members in one move. Simply drop your EAR or WAR in server/
jBPM 3.2 is designed to work seamlessly in a cluster. All application instances are independent, effectively knowing nothing of each other. Job distribution happens through use of database-level locks and job ownership includes information about thread name and host.
When a job is ready to be executed and not yet locked, the JobExecutors will try to put a lock on it. Because optimistic locking is used, only one of them will succeed. The others receive an optimistic locking exception (Hibernate's StaleObjectStateException) and pause for a defined time before acquiring new jobs.
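The "only one wins" behaviour can be reproduced in miniature. The sketch below is an illustration under stated assumptions, not jBPM or Hibernate code: `JobRow`, `tryLock` and `race` are hypothetical names, and an atomic compare-and-set on a version counter stands in for the versioned `UPDATE` that the database performs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class OptimisticJobLock {

    /** Hypothetical job row guarded by a version counter, as in optimistic locking. */
    static final class JobRow {
        private final AtomicInteger version = new AtomicInteger(0);
        private volatile String lockOwner;

        int readVersion() { return version.get(); }

        /**
         * Succeeds only if the row is unchanged since 'expectedVersion' was read,
         * mimicking an "UPDATE ... WHERE version = ?" that touches one row or none.
         */
        boolean tryLock(String owner, int expectedVersion) {
            if (version.compareAndSet(expectedVersion, expectedVersion + 1)) {
                lockOwner = owner;
                return true;
            }
            return false; // stale version: another executor got there first
        }

        String lockOwner() { return lockOwner; }
    }

    /** Lets 'executors' threads race for one job and returns how many acquired it. */
    static int race(JobRow job, int executors) {
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[executors];
        int seenVersion = job.readVersion(); // every executor saw the same version
        for (int i = 0; i < executors; i++) {
            String owner = "executor-" + i;
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (job.tryLock(owner, seenVersion)) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        JobRow job = new JobRow();
        // Exactly one executor wins; which one varies from run to run.
        System.out.println("winners: " + race(job, 4) + ", owner: " + job.lockOwner());
    }
}
```

The losing threads here simply return `false`; in the setup described in this article they are the ones that receive the exception and go to sleep, which is why the pause length matters so much for throughput.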
"The second level cache is an important aspect of the JBoss jBPM implementation. If it weren't for this cache, JBoss jBPM could have a serious drawback in comparison to the other techniques to implement a BPM engine."
- jBPM jPDL User Guide
jBPM entity classes are cached by default. Their cache configuration, however, is prepared for local deployment only: it uses the nonstrict-read-write strategy, which is unsupported by JBoss Cache. How can you change the strategy to transactional? You have to override the Hibernate mappings of the jBPM entities. You can do this by manually defining a list of mapping resources in your SessionFactory bean.
classpath*:com/yourcompany/jbpm/**/*.hbm.xml
classpath*:org/jbpm/bytes/ByteArray.hbm.xml
classpath*:org/jbpm/context/def/ContextDefinition.hbm.xml
classpath*:org/jbpm/context/exe/ContextInstance.hbm.xml
classpath*:org/jbpm/context/exe/TokenVariableMap.hbm.xml
classpath*:org/jbpm/context/exe/VariableInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/ByteArrayInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/DateInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/DoubleInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/HibernateLongInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/HibernateStringInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/JcrNodeInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/LongInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/NullInstance.hbm.xml
classpath*:org/jbpm/context/exe/variableinstance/StringInstance.hbm.xml
classpath*:org/jbpm/context/log/VariableLog.hbm.xml
classpath*:org/jbpm/context/log/VariableCreateLog.hbm.xml
classpath*:org/jbpm/context/log/VariableDeleteLog.hbm.xml
classpath*:org/jbpm/context/log/VariableUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/ByteArrayUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/DateUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/DoubleUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/HibernateLongUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/HibernateStringUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/LongUpdateLog.hbm.xml
classpath*:org/jbpm/context/log/variableinstance/StringUpdateLog.hbm.xml
classpath*:org/jbpm/db/hibernate.queries.hbm.xml
classpath*:org/jbpm/graph/action/MailAction.hbm.xml
classpath*:org/jbpm/graph/exe/Comment.hbm.xml
classpath*:org/jbpm/graph/exe/ProcessInstance.hbm.xml
classpath*:org/jbpm/graph/exe/Token.hbm.xml
classpath*:org/jbpm/graph/exe/RuntimeAction.hbm.xml
classpath*:org/jbpm/graph/log/ActionLog.hbm.xml
classpath*:org/jbpm/graph/log/NodeLog.hbm.xml
classpath*:org/jbpm/graph/log/ProcessInstanceCreateLog.hbm.xml
classpath*:org/jbpm/graph/log/ProcessInstanceEndLog.hbm.xml
classpath*:org/jbpm/graph/log/ProcessStateLog.hbm.xml
classpath*:org/jbpm/graph/log/SignalLog.hbm.xml
classpath*:org/jbpm/graph/log/TokenCreateLog.hbm.xml
classpath*:org/jbpm/graph/log/TokenEndLog.hbm.xml
classpath*:org/jbpm/graph/log/TransitionLog.hbm.xml
classpath*:org/jbpm/graph/node/StartState.hbm.xml
classpath*:org/jbpm/graph/node/EndState.hbm.xml
classpath*:org/jbpm/graph/node/Fork.hbm.xml
classpath*:org/jbpm/graph/node/Join.hbm.xml
classpath*:org/jbpm/graph/node/State.hbm.xml
classpath*:org/jbpm/graph/node/MailNode.hbm.xml
classpath*:org/jbpm/job/ExecuteActionJob.hbm.xml
classpath*:org/jbpm/job/ExecuteNodeJob.hbm.xml
classpath*:org/jbpm/job/Job.hbm.xml
classpath*:org/jbpm/job/Timer.hbm.xml
classpath*:org/jbpm/logging/log/ProcessLog.hbm.xml
classpath*:org/jbpm/logging/log/MessageLog.hbm.xml
classpath*:org/jbpm/logging/log/CompositeLog.hbm.xml
classpath*:org/jbpm/module/exe/ModuleInstance.hbm.xml
classpath*:org/jbpm/scheduler/def/CreateTimerAction.hbm.xml
classpath*:org/jbpm/scheduler/def/CancelTimerAction.hbm.xml
classpath*:org/jbpm/taskmgmt/exe/TaskMgmtInstance.hbm.xml
classpath*:org/jbpm/taskmgmt/exe/TaskInstance.hbm.xml
classpath*:org/jbpm/taskmgmt/exe/PooledActor.hbm.xml
classpath*:org/jbpm/taskmgmt/exe/SwimlaneInstance.hbm.xml
classpath*:org/jbpm/taskmgmt/log/TaskLog.hbm.xml
classpath*:org/jbpm/taskmgmt/log/TaskCreateLog.hbm.xml
classpath*:org/jbpm/taskmgmt/log/TaskAssignLog.hbm.xml
classpath*:org/jbpm/taskmgmt/log/TaskEndLog.hbm.xml
classpath*:org/jbpm/taskmgmt/log/SwimlaneLog.hbm.xml
classpath*:org/jbpm/taskmgmt/log/SwimlaneCreateLog.hbm.xml
classpath*:org/jbpm/taskmgmt/log/SwimlaneAssignLog.hbm.xml
The list consists of all mapped jBPM entities except the cached ones (those are commented out). To keep the mapping complete, copy the commented-out hbm.xml mapping files to your project and change their caching strategy definition from:
<cache usage="nonstrict-read-write"/>
to:
<cache usage="transactional"/>
Keep these files in the package defined in the first line of the mappingResources list, and they will be picked up automatically.
Please note that the transactional caching strategy will work only with a JTA transaction manager in your application! You have to define Hibernate property hibernate.transaction.manager_lookup_class to point to your transaction manager lookup class.
Preparation of TreeCache for use in your application takes a few, rather lengthy steps. But fear not, after all it is not very complicated. First, you have to create a cache MBean and save its definition in a file named jboss-service.xml:
jboss:service=Naming
jboss:service=TransactionManager
org.jboss.cache.JBossTransactionManagerLookup
OPTIMISTIC
REPEATABLE_READ
REPL_ASYNC
true
Cache-Cluster
5000 10000 15000
org.jboss.cache.eviction.LRUPolicy
5 5000 1000 120 5000 1000 5 4
false false
org.jboss.cache.loader.JDBCCacheLoader
cache.jdbc.table.name=jbosscache
cache.jdbc.table.create=true
cache.jdbc.table.drop=true
cache.jdbc.table.primarykey=jbosscache_pk
cache.jdbc.fqn.column=fqn
cache.jdbc.fqn.type=varchar(255)
cache.jdbc.node.column=node
cache.jdbc.node.type=blob
cache.jdbc.parent.column=parent
cache.jdbc.driver=oracle.jdbc.driver.OracleDriver
cache.jdbc.url=jdbc:oracle:thin:@[hostname]:1521:orcl
cache.jdbc.user=[username]
cache.jdbc.password=[password]
true false false false
Then package this file into a SAR archive (which is a ZIP essentially, only with a different extension) with the following structure:
jbosscache.sar/
  - META-INF/
    - jboss-service.xml
You have to package the SAR at your EAR's root level and add the following lines in your META-INF/jboss-app.xml:
jbosscache.sar
Should you encounter any serialization problems during startup or later use, you can switch back from JBoss serialization to standard Java serialization by adding the following JVM option:
-Dserialization.jboss=false
Your application should depend on the following artifacts in order to be able to use JBoss Cache as its cache provider (assuming you use Maven for building):
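<dependency>
  <groupId>org.jboss.cluster</groupId>
  <artifactId>hibernate-jbc-cacheprovider</artifactId>
  <version>1.0.1.GA</version>
  <exclusions>
    <exclusion><groupId>hibernate</groupId><artifactId>hibernate3</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-common</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-jmx</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-system</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-j2ee</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-transaction</artifactId></exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.hibernate</groupId>
  <artifactId>hibernate-jbosscache</artifactId>
  <version>3.3.1.GA</version>
  <exclusions>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-cache</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-system</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-common</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-minimal</artifactId></exclusion>
    <exclusion><groupId>jboss</groupId><artifactId>jboss-j2se</artifactId></exclusion>
    <exclusion><groupId>concurrent</groupId><artifactId>concurrent</artifactId></exclusion>
    <exclusion><groupId>jgroups</groupId><artifactId>jgroups-all</artifactId></exclusion>
  </exclusions>
</dependency>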
Note: the exclusions are here to prevent version mismatches with the libraries already included in our project or provided by JBoss itself. You may have to adjust them manually for your application.
If you don't use Maven, you have to download the mentioned libraries manually and include them on your classpath.
Now it's time to let Hibernate know about our cache. A few options are enough:
hibernate.cache.provider_class=org.jboss.hibernate.jbc.cacheprovider.JmxBoundTreeCacheProvider
hibernate.treecache.mbean.object_name=jboss.cache:service=TreeCache
hibernate.cache.use_second_level_cache=true
hibernate.cache.use_query_cache=false
hibernate.transaction.manager_lookup_class=
Having done all this, you can cache your entity classes by marking them with the @Cache annotation. Remember that only read-only and transactional strategies are supported by clustered TreeCache.
Your newly created cache can be monitored in two ways: via the Hibernate statistics module or via the TreeCache JMX MBean, which we have already created.
To use Hibernate statistics, an additional dependency is needed in your POM:
<dependency>
  <groupId>org.hibernate</groupId>
  <artifactId>hibernate-jmx</artifactId>
  <version>3.3.1.GA</version>
</dependency>
In order to enable statistic gathering and exporting, the following has to be put in your Spring context file:
true
where "hibernateSessionFactory" is the ID of the session factory Spring bean. With this change, the Hibernate statistics module is available via JMX.
You can monitor the Fully Qualified Names of cached entities (labelled Second level cache regions) and the put, hit and miss counts to verify that the cache is working as expected. After running for a while, a correctly cached jBPM should show a very high hit/miss ratio, as in this screenshot from JConsole:
jBPM in its default configuration scales well but provides only a fraction of its potential performance. The following graph shows how the escalation times fall as further nodes are added. The scenario I tested consists of 1000 Calls, each automatically escalated twice, which results in 2000 escalations in total. Each escalation results in a database update. All results are illustrative and subject to some fluctuation under different testing conditions.
We are seeking to achieve near-linear scalability. Linear scalability, relative to server resources, means that with a constant load, performance improves at a constant rate relative to additional resources.
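To make the comparison with "theoretical linear" values concrete, here is a minimal sketch of the arithmetic involved. The class and method names are my own, and the numbers in `main` are invented purely for illustration; they are not the measurements from this article.

```java
import java.util.Locale;

final class Scalability {

    /** Time expected on 'nodes' machines if the speed-up were perfectly linear. */
    static double idealTime(double singleNodeTime, int nodes) {
        return singleNodeTime / nodes;
    }

    /** Measured acceleration relative to a single node. */
    static double speedup(double singleNodeTime, double measuredTime) {
        return singleNodeTime / measuredTime;
    }

    /** Fraction of the ideal speed-up actually achieved; 1.0 means perfectly linear. */
    static double efficiency(double singleNodeTime, double measuredTime, int nodes) {
        return speedup(singleNodeTime, measuredTime) / nodes;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 400 s on one node, 125 s on four nodes.
        double t1 = 400.0;
        double t4 = 125.0;
        System.out.println(String.format(Locale.ROOT, "ideal time:  %.1f s", idealTime(t1, 4)));      // 100.0 s
        System.out.println(String.format(Locale.ROOT, "speed-up:    %.1f x", speedup(t1, t4)));        // 3.2 x
        System.out.println(String.format(Locale.ROOT, "efficiency:  %.2f", efficiency(t1, t4, 4)));    // 0.80
    }
}
```

An efficiency close to 1.0 at every cluster size is what "near-linear" means in the rest of this article.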
The left chart shows comparison of real escalation time to theoretical time, based on linear acceleration. The right chart compares real acceleration to theoretical linear acceleration.
These results were collected using the standard jBPM configuration (1 JobExecutor thread, a 10-second idleInterval and a 1-hour maxIdleInterval) and are meant only to show how jBPM scales in its default setup. The result is not bad, but the acceleration factor could be higher.
Now, let's play with the configuration a little. jBPM has two options of interest to us: idleInterval and maxIdleInterval. When an exception is thrown in the JobExecutor, it pauses for the period defined in idleInterval, which is then doubled until it reaches maxIdleInterval. Unfortunately for us, a StaleObjectStateException is thrown each time an optimistic locking clash is detected, and this happens quite often when many concurrent JobExecutors try to acquire a job. Reducing both values is crucial to achieving a high concurrency rate. Here are the results of reducing idleInterval to 500 milliseconds and maxIdleInterval to 1000 milliseconds:
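Before looking at the charts, it helps to see how quickly the pauses grow. The sketch below only models the rule as described above (start at idleInterval, double after each exception, cap at maxIdleInterval); `IdleBackoff` and `pauses` are hypothetical names, and details such as when jBPM resets the interval are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

final class IdleBackoff {

    /**
     * Pause lengths (in milliseconds) for 'failures' consecutive exceptions:
     * the first pause is idleInterval, each following one is doubled,
     * and no pause exceeds maxIdleInterval.
     */
    static List<Long> pauses(long idleInterval, long maxIdleInterval, int failures) {
        List<Long> result = new ArrayList<>();
        long current = Math.min(idleInterval, maxIdleInterval);
        for (int i = 0; i < failures; i++) {
            result.add(current);
            current = Math.min(current * 2, maxIdleInterval);
        }
        return result;
    }

    public static void main(String[] args) {
        // Default values quoted in this article: 10 s idleInterval, 1 h maxIdleInterval.
        System.out.println(pauses(10_000, 3_600_000, 4)); // prints [10000, 20000, 40000, 80000]
        // Tuned values: 500 ms idleInterval, 1000 ms maxIdleInterval.
        System.out.println(pauses(500, 1_000, 4));        // prints [500, 1000, 1000, 1000]
    }
}
```

With the defaults, an executor that loses four locking races in a row has slept for two and a half minutes in total; with the tuned values it has slept for three and a half seconds.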
The left chart shows comparison of real escalation time to theoretical time, based on linear acceleration. The right chart compares real acceleration to theoretical linear acceleration.
You can see that the acceleration curve is very close to the optimum now. Let's see if we can shift the whole time curve downwards.
In order to increase throughput of the workflow engine you can change the number of JobExecutor threads per machine. I have performed the same performance tests as previously but this time on 4 nodes only, increasing the number of threads and experimenting with cache on or off. Caching increased the throughput by 15-23%, but the most significant gain comes from increasing the thread number:
This finally gave us truly high performance. Increasing the number of threads to 20 and enabling the cache raised throughput by 340% compared with the standard configuration.
We can see that the saturation point is reached at 20 threads per machine; increasing this value further does not yield significantly better results. Caching would probably make a bigger difference if the database were under constant load from other parts of the application.
You have seen a thorough study of jBPM clustering and tuning. The verdict is that jBPM is a very efficient workflow engine; it only requires turning the right knobs to get the most out of it. By adding 3 servers and tweaking the jBPM configuration, we were able to increase throughput more than 16 times compared with a 1-server environment with the default setup. If you need to increase the throughput of your workflow, I suggest making the modifications in the following order:
One thing to remember, though: the database will always become a bottleneck at some point. After all, jBPM is mostly based on Hibernate. Therefore, if your efforts don't bring the expected results, think about tuning or clustering your database.
jBPM-jPDL: 3.2.2
Hibernate: 3.3.1
JBoss: 4.2.3GA
TreeCache: 1.4.1SP9
Hibernate JBossCache provider: 1.0.1GA
Szymon Zeslawski currently works as a Senior Developer for Consol Consulting & Solutions, developing cutting-edge business solutions. A graduate of AGH Technical University in Krakow, Poland, he has devoted the technical side of his life to Java since 2002. His areas of expertise, gathered during 3 years of professional experience, cover various web, mobile and enterprise technologies with a focus on ergonomics and performance. When not at the computer, he stretches his mind and body training Chuo Jiao or chases ghosts from behind the steering wheel.