1.8.3 appears to have addressed this issue with a single application server. However, we're seeing the issues when load-balancing our application (multiple Tomcat servers).
I have verified that our Quartz config contains
org.quartz.jobStore.isClustered = true
The stack trace that we're seeing regularly is below. I'm working on getting a thread dump. Code:
2010-09-10 11:21:39,199 ERROR [ErrorLogger] : An error occured while firing trigger 'DEFAULT.Publish Request Check'
org.quartz.JobPersistenceException: Couldn't update states of blocked triggers: Lock wait timeout exceeded; try restarting transaction [See nested exception: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction]
at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggerFired(JobStoreSupport.java:2925)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$38.execute(JobStoreSupport.java:2846)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3763)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggerFired(JobStoreSupport.java:2840)
at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:320)
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1056)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:957)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3376)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3308)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1837)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1961)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2543)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1737)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2022)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1940)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1925)
at org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:101)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.updateTriggerStatesForJobFromOtherState(StdJDBCDelegate.java:1695)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggerFired(JobStoreSupport.java:2918)
... 4 more
Lock timeout exceptions aren't necessarily deadlocks, they could simply be true timeouts due to lock contention.
I see that you're using mysql. Make sure you have row-level locking enabled (and indexes on the tables). Also make sure that your setup is configured to use Read-Committed tx isolation level.
Thanks for the info. We are using InnoDB tables which use row-locking by default, as far as I know. There are definitely indexes on the table.
to our quartz config and this seems to have made a significant difference. Previously I was getting those errors every few minutes without fail. So far, I haven't seen any.
Is this something you would recommend us doing for all of our supported db vendors or just MySQL? We support MySQL 4.1/5.0, SQL Server 2k5/2k8, and Oracle 10g.
Should we be using this setting in non-load-balanced environments as well?