A "big" bug caused by JDK6's Random and System.nanoTime()

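While working on a NATS-based PaaS project, I ran into a strange problem. The scenario: service A sends a message, service B processes it and replies, and service A handles the reply data once it arrives.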

Service A ---> request ---> nats server dispatch ---> Service B ... ---> send reply ---> nats server dispatch ---> Service A

Running on my MacBook + JDK, there was roughly a 50% chance that two threads in service A would receive the same reply message.

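Below is the code service A uses to send a message. inbox is a unique reply id (subject) created on the fly for each request. If duplicate replies are being received, the inbox values being created must not be unique.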

	@Override
	public Request request(final String subject, String message, long timeout, TimeUnit unit, final Integer maxReplies, MessageHandler... messageHandlers) {
		assertNatsOpen();
		if (message == null) {
			throw new IllegalArgumentException("Message can NOT be null.");
		}
		final String inbox = createInbox();
		System.out.println(Thread.currentThread()+ " inbox == "+inbox);
		final Subscription subscription = subscribe(inbox, maxReplies);
		for (MessageHandler handler : messageHandlers) {
			subscription.addMessageHandler(handler);
		}

		final ScheduledExecutorService scheduler = (channel == null) ? eventLoopGroup.next() : channel.eventLoop();
		scheduler.schedule(new Runnable() {
			@Override
			public void run() {
				subscription.close();
			}
		}, timeout, unit);

		final ClientPublishFrame publishFrame = new ClientPublishFrame(subject, message, inbox);
		publish(publishFrame);
		return new Request() {
			@Override
			public void close() {
				subscription.close();
			}

			@Override
			public String getSubject() {
				return subject;
			}

			@Override
			public int getReceivedReplies() {
				return subscription.getReceivedMessages();
			}

			@Override
			public Integer getMaxReplies() {
				return maxReplies;
			}
		};
	}

	public static String createInbox() {
		
		byte[] bytes = new byte[16];
		new Random().nextBytes(bytes);
		return "_INBOX." + new BigInteger(bytes).abs().toString(16);
	}
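
createInbox() uses the Random class to generate the unique identifier. Random itself is thread-safe, as its source shows. The point is that if two threads end up with the same pseudo-random inbox value, their Random instances must have been given the same seed. This is how JDK6 chooses the seed: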

    public Random() { this(++seedUniquifier + System.nanoTime()); }
    private static volatile long seedUniquifier = 8682522807148012L;
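
To illustrate why identical inboxes point to identical seeds, here is a minimal sketch. inboxFromSeed is a hypothetical helper, not part of the NATS client: it repeats the body of createInbox() but takes the seed as a parameter, and 42L is an arbitrary value chosen for the demonstration.

```java
import java.math.BigInteger;
import java.util.Random;

class SameSeedDemo {

    // Hypothetical helper: same logic as createInbox(), with an explicit seed.
    static String inboxFromSeed(long seed) {
        byte[] bytes = new byte[16];
        new Random(seed).nextBytes(bytes);
        return "_INBOX." + new BigInteger(bytes).abs().toString(16);
    }

    public static void main(String[] args) {
        long seed = 42L; // arbitrary seed, for illustration only
        String first = inboxFromSeed(seed);
        String second = inboxFromSeed(seed);
        System.out.println(first);
        System.out.println(second);
        // Random is deterministic for a given seed, so both inboxes match.
        System.out.println("identical: " + first.equals(second));
    }
}
```

Since Random is a deterministic generator, the only way two fresh instances can produce the same 16 bytes on their first call is (almost certainly) that they started from the same seed.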


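To pin down the error, I replaced the original class with a new one that prints the seed it is constructed with:

    import java.util.Random;

    public class RandomReplacer extends Random {

        private static volatile long seedUniquifierV2 = 8682522807148012L;

        RandomReplacer() {
            // Same seeding expression as the JDK6 Random() constructor.
            this(++seedUniquifierV2 + System.nanoTime());
        }

        // The seed is passed as a parameter instead of being stored in a shared
        // static field, so another thread cannot overwrite it before it is printed.
        private RandomReplacer(long seed) {
            super(seed);
            System.out.println(Thread.currentThread() + "seed is " + seed);
        }
    }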

Result:

Thread[DefaultDeployExecutor-1,5,DefaultDeployExecutor]seed is 1435269911120252014

Thread[DefaultDeployExecutor-0,5,DefaultDeployExecutor]seed is 1435269911120252014

Thread[DefaultDeployExecutor-1,5,DefaultDeployExecutor] inbox == _INBOX.50f8f44519702c14d2c4ee88332022ab

Thread[DefaultDeployExecutor-0,5,DefaultDeployExecutor] inbox == _INBOX.50f8f44519702c14d2c4ee88332022ab

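This confirmed the guess: the seeds are identical. But System.nanoTime() returns nanoseconds. No matter how the two threads are scheduled, how can that value come out the same?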

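It really can be the same. System.nanoTime() is a native method that depends on the underlying system implementation (its Javadoc promises nanosecond precision, not nanosecond resolution), and on my Mac it did not actually resolve individual nanoseconds. A quick experiment: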

class Test{
public static void main(String [] args){

  System.out.println(System.nanoTime());
}
}

Output:

1426594734801287000

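The last three digits are always 0, so the clock effectively ticks in microseconds. Two threads can therefore read the same nanoTime() value, and because ++seedUniquifier on a volatile field is not an atomic read-modify-write, they can also end up with the same seedUniquifier. When both happen, the two Random() instances are initialized with the same seed and produce the same pseudo-random inbox value.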
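
A single reading says little about the clock, so the experiment above can be extended into a small probe. The following is only a sketch: the class and method names are made up for this post, and the numbers it prints depend entirely on the OS, the hardware, and the JDK build.

```java
class NanoTimeGranularity {

    // Smallest positive difference seen between consecutive nanoTime() readings.
    // Returns Long.MAX_VALUE if the clock never advanced during sampling.
    static long smallestStep(int samples) {
        long smallest = Long.MAX_VALUE;
        long previous = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            long now = System.nanoTime();
            long delta = now - previous;
            if (delta > 0 && delta < smallest) {
                smallest = delta;
            }
            previous = now;
        }
        return smallest;
    }

    // True if every sampled reading is a whole multiple of 1000 ns,
    // i.e. the last three decimal digits were always 0.
    static boolean alwaysWholeMicroseconds(int samples) {
        for (int i = 0; i < samples; i++) {
            if (System.nanoTime() % 1000 != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int samples = 1000000;
        System.out.println("smallest observed step (ns): " + smallestStep(samples));
        System.out.println("always whole microseconds:   " + alwaysWholeMicroseconds(samples));
    }
}
```

On a platform behaving like the one described here, the second line would be expected to print true and the smallest step would be a multiple of 1000. On platforms with a finer clock both results will differ.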


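JDK7 addresses the problem as follows. The uniquifier is now advanced atomically with compareAndSet, and it is combined with System.nanoTime() by XOR instead of addition: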

    /**
     * Creates a new random number generator. This constructor sets
     * the seed of the random number generator to a value very likely
     * to be distinct from any other invocation of this constructor.
     */
    public Random() {
        this(seedUniquifier() ^ System.nanoTime());
    }

    private static long seedUniquifier() {
        // L'Ecuyer, "Tables of Linear Congruential Generators of
        // Different Sizes and Good Lattice Structure", 1999
        for (;;) {
            long current = seedUniquifier.get();
            long next = current * 181783497276652981L;
            if (seedUniquifier.compareAndSet(current, next))
                return next;
        }
    }

    private static final AtomicLong seedUniquifier
        = new AtomicLong(8682522807148012L);
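
For code that has to keep running on JDK6, one possible workaround is to stop creating a new Random() for every inbox. The sketch below is an assumption on my part, not the fix used by the NATS client: InboxIds is a hypothetical class name, and it draws the bytes from a single shared SecureRandom, which is safe to call from several threads and does not take its seed from System.nanoTime() alone.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

class InboxIds {

    // One generator shared by all threads instead of one Random per call.
    private static final SecureRandom RANDOM = new SecureRandom();

    static String createInbox() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        // Signum 1 interprets the bytes as a non-negative number,
        // which plays the role of abs() in the original code.
        return "_INBOX." + new BigInteger(1, bytes).toString(16);
    }

    public static void main(String[] args) {
        System.out.println(createInbox());
        System.out.println(createInbox());
    }
}
```

Because every caller shares the same generator state, two threads draw different positions in one stream rather than restarting identical streams from equal seeds. The trade-off is that SecureRandom is slower than Random, which may or may not matter at the request rates involved.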


