An Investigation of Distributed Transaction Problems in a Redis Cluster Environment

 

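  Recently, the logs of a Java project deployed on a PaaS platform showed that every button click in the front end threw our custom "failed to release Redis lock" exception. Tracing the code showed that the cause was in our custom Redis distributed lock (the RedisLock utility class): the lock-release method was meant to use a transaction built from watch(), multi(), and exec() as an optimistic lock (Redis transactions have no rollback). It turned out that these calls cannot be used directly; in the PaaS platform's Redis Cluster environment (3 master nodes and 3 slave nodes) they raise a "transactions not supported" exception.

Lock-release code: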

	/**
	 * Releases the lock.
	 * @param lockName   name of the lock
	 * @param identifier identifier of the lock holder
	 * @return true if the lock was held by this identifier and has been released
	 */
	public boolean releaseLock(String lockName, String identifier) {

		if (StringUtils.isEmpty(identifier)) return false;

		RedisConnectionFactory connectionFactory = redisTemplate.getConnectionFactory();
		RedisConnection redisConnection = connectionFactory.getConnection();
		byte[] lockKey = (LOCK_PREFIX + lockName).getBytes();
		boolean releaseFlag = false;
		try {
			while (true) {
				// Watch the lock key before starting the transaction
				redisConnection.watch(lockKey);
				byte[] valueBytes = redisConnection.get(lockKey);

				// A null value means the lock does not exist or has already been released
				if (valueBytes == null) {
					redisConnection.unwatch();
					break;
				}

				// Delete the key only if the stored value matches our identifier
				if (identifier.equals(new String(valueBytes))) {
					redisConnection.multi();
					redisConnection.del(lockKey);
					List<Object> results = redisConnection.exec();
					if (results == null) {
						// The watched key changed before EXEC; retry
						continue;
					}
					releaseFlag = true;
				}
				redisConnection.unwatch();
				break;
			}
		} catch (Exception e) {
			// Stop on error; retrying inside the loop would spin forever if the call is unsupported
			log.warn("Exception while releasing lock", e);
		} finally {
			RedisConnectionUtils.releaseConnection(redisConnection, connectionFactory);
		}
		return releaseFlag;
	}
  

Screenshot of the exception message:

[Image 1: screenshot of the exception message]

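I spent about an hour on the analysis. My first guess was that the problem was related to the @Transactional annotation on the locking method: that the Redis calls could not share a Spring transaction, and that what was "not supported" was the automatic rollback of database changes when a Redis get, set, or lock operation failed. So I commented out @Transactional, redeployed the service, and tested again; that was not the cause. A web search then turned up this Stack Overflow answer (https://stackoverflow.com/questions/42088324/is-there-any-redis-client-java-prefered-which-supports-transactions-on-redis-c/42091605#42091605), whose explanation is fairly authoritative: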

 In Redis Cluster, a particular node is a master for one or more hash-slots, that’s the partitioning scheme to shard data amongst multiple nodes. One hash-slot, calculated from the keys used in the command, lives on one node. Commands with multiple keys are limited to yield to the same hash-slot. Otherwise, they are rejected. Such constellations are called cross-slot.
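To make the hash-slot idea in the quote concrete, here is a minimal sketch of how a key maps to a slot, following the rule in the Redis Cluster specification: CRC16 (XMODEM variant) of the key, modulo 16384, with only the text inside the first non-empty {...} hash tag being hashed. The class and method names (HashSlotSketch, crc16, slot) are made up for illustration and are not part of any Redis client.

```java
import java.nio.charset.StandardCharsets;

public class HashSlotSketch {

    /** CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000. */
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the value within 16 bits
            }
        }
        return crc;
    }

    /** Slot for a key; only the content of the first non-empty {...} tag is hashed. */
    static int slot(String key) {
        int start = key.indexOf('{');
        if (start >= 0) {
            int end = key.indexOf('}', start + 1);
            if (end > start + 1) {
                key = key.substring(start + 1, end);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        // Test vector from the Redis Cluster specification: CRC16("123456789") = 0x31C3
        System.out.println(slot("123456789")); // prints 12739
        // Keys sharing a hash tag land in the same slot, so multi-key commands on them are allowed
        System.out.println(slot("{user1000}.following") == slot("{user1000}.followers")); // prints true
    }
}
```

Keys that must be touched together in one command can share a hash tag so that they fall into the same slot; a lock key on its own is always a single-slot operation.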

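So cluster mode indeed does not support distributed transactions by default! In other words, a command is confined to the node that owns its key's hash slot and cannot span slots, so without extra configuration the client cannot simply issue WATCH, MULTI, and EXEC in a cluster environment. For the time being the only option was to comment out the watch(), multi(), and exec() calls, at the cost of the check and the delete no longer being atomic. On a single node the same calls work fine. My guess is that the difference comes from how the project's pom dependency spring-boot-starter-data-redis (or spring-data-redis) partitions and handles data in cluster mode.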

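I then went through the source code again, this time that of Spring Data Redis, and found two points of interest in RedisTemplate, its main implementation class: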

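1) There is a property enableTransactionSupport = false. Is this the switch that turns transaction support on? Not quite: the flag does not refer to Redis transactions as such but to Spring transactions, meaning that the Redis operations take effect only if the surrounding database transaction succeeds, because Spring's default transactions are all DB-based.

2) The following method is where the read and write operations actually run, once the connection has been bound: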

	public <T> T execute(SessionCallback<T> session) {
		Assert.isTrue(initialized, "template not initialized; call afterPropertiesSet() before using it");
		Assert.notNull(session, "Callback object must not be null");

		RedisConnectionFactory factory = getConnectionFactory();
		// bind connection
		RedisConnectionUtils.bindConnection(factory, enableTransactionSupport);
		try {
			return session.execute(this);
		} finally {
			RedisConnectionUtils.unbindConnection(factory);
		}
	}

What is SessionCallback? Here is its definition:

/**
 * Callback executing all operations against a surrogate 'session' (basically against the same underlying Redis
 * connection). Allows 'transactions' to take place through the use of multi/discard/exec/watch/unwatch commands.
 * 
 * @author Costin Leau
 */
public interface SessionCallback<T> {

	/**
	 * Executes all the given operations inside the same session.
	 * 
	 * @param operations Redis operations
	 * @return return value
	 */
	<K, V> T execute(RedisOperations<K, V> operations) throws DataAccessException;
}

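So in spring-boot-starter-data-redis (or spring-data-redis), transactions are supported through RedisTemplate's SessionCallback (otherwise they do not take effect): all commands in the callback run against the same underlying connection, and the value returned by exec(), with no exception thrown, tells whether the transaction succeeded. My code did not run watch, multi, and exec inside a SessionCallback; it called them directly on a redisConnection object that it had obtained by itself.

A rough approach, exercised here from several concurrent threads: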

    @Test
    public void testRedisTrans() throws InterruptedException, ExecutionException {
        String key = "test-trans-1";
        ValueOperations strOps = redisTemplate.opsForValue();
        strOps.set(key, "hello");
        ExecutorService pool = Executors.newCachedThreadPool();
        List<Callable<Object>> tasks = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int idx = i;
            tasks.add(new Callable<Object>() {
                @Override
                public Object call() throws Exception {
                    // Run watch/multi/exec inside one SessionCallback so they share a connection
                    return redisTemplate.execute(new SessionCallback() {
                        @Override
                        public Object execute(RedisOperations operations) throws DataAccessException {
                            operations.watch(key);
                            String origin = (String) operations.opsForValue().get(key);
                            operations.multi();
                            operations.opsForValue().set(key, origin + idx);
                            Object result = operations.exec();
                            System.out.println("set value:" + origin + idx + ",result:" + result);
                            return result;
                        }
                    });
                }
            });
        }
        List<Future<Object>> futures = pool.invokeAll(tasks);
        for (Future<Object> f : futures) {
            System.out.println(f.get());
        }
        pool.shutdown();
        pool.awaitTermination(1000, TimeUnit.MILLISECONDS);
    }

 
  

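Result (a null result means exec() was aborted because the watched key had changed; only one thread's transaction went through):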

set value:hello2,result:null
set value:hello3,result:[]
set value:hello1,result:null
set value:hello4,result:null
set value:hello0,result:null

Checking the stored value:
127.0.0.1:6379> get test-trans-1
"\"hello3\""
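The pattern in the output, one winner and several null results, is what optimistic locking with WATCH is expected to produce. The sketch below is a small in-memory model of that behavior, not Redis or Spring code: MiniStore and Session are hypothetical classes in which each key carries a version number, a session remembers the versions it watched, and exec() applies the queued writes only if those versions are unchanged. It also suggests why watch, multi, and exec have to share one connection: the watched state lives in the session.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OptimisticTxSketch {

    /** Hypothetical in-memory stand-in for a single Redis node. */
    static class MiniStore {
        private final Map<String, String> data = new HashMap<>();
        private final Map<String, Long> versions = new HashMap<>();

        synchronized String get(String key) { return data.get(key); }

        synchronized void set(String key, String value) {
            data.put(key, value);
            versions.merge(key, 1L, Long::sum); // every write bumps the key's version
        }

        synchronized long version(String key) { return versions.getOrDefault(key, 0L); }

        /** Applies the queued writes only if no watched key changed; null means aborted. */
        synchronized List<Object> execIfUnchanged(Map<String, Long> watched, List<String[]> queued) {
            for (Map.Entry<String, Long> w : watched.entrySet()) {
                if (version(w.getKey()) != w.getValue()) {
                    return null;
                }
            }
            List<Object> results = new ArrayList<>();
            for (String[] kv : queued) {
                set(kv[0], kv[1]);
                results.add("OK");
            }
            return results;
        }
    }

    /** Hypothetical per-connection state: watched versions and queued commands. */
    static class Session {
        private final MiniStore store;
        private final Map<String, Long> watched = new HashMap<>();
        private final List<String[]> queued = new ArrayList<>();

        Session(MiniStore store) { this.store = store; }

        void watch(String key) { watched.put(key, store.version(key)); }

        String get(String key) { return store.get(key); }

        void queueSet(String key, String value) { queued.add(new String[] {key, value}); }

        List<Object> exec() {
            List<Object> results = store.execIfUnchanged(watched, queued);
            watched.clear();
            queued.clear();
            return results;
        }
    }

    public static void main(String[] args) {
        MiniStore store = new MiniStore();
        store.set("test-trans-1", "hello");

        Session a = new Session(store);
        Session b = new Session(store);
        a.watch("test-trans-1");
        b.watch("test-trans-1");
        a.queueSet("test-trans-1", a.get("test-trans-1") + "0");
        b.queueSet("test-trans-1", b.get("test-trans-1") + "1");

        System.out.println(a.exec());                  // prints [OK]
        System.out.println(b.exec());                  // prints null: the key changed after b watched it
        System.out.println(store.get("test-trans-1")); // prints hello0
    }
}
```

A caller that needs its write to land would watch the key again and retry after a null result, which is what the `continue` branch in the lock-release method does.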
