The Deferred class in the suasync package: API and usage example

public final class Deferred<T> extends Object

A thread-safe implementation of a deferred result for easy asynchronous processing.

This implementation is based on Twisted's Python Deferred API.

  • Deferred Reference
  • A tutorial (Deferred in depth)
  • Source code of defer.py

This API is a simple and elegant way of managing asynchronous and dynamic "pipelines" (processing chains) without having to explicitly define a finite state machine.

 

The tl;dr version

We're all busy and don't always have time to RTFM in detail. Please pay special attention to the invariants you must respect. Other than that, here's an executive summary of what Deferred offers:

  • A Deferred is like a Future with a dynamic Callback chain associated to it.
  • When the deferred result becomes available, the callback chain gets triggered.
  • The result of one callback in the chain is passed on to the next.
  • When a callback returns another Deferred, the next callback in the chain doesn't get executed until that other Deferred's result becomes available.
  • There are actually two callback chains. One is for normal processing, the other is for error handling / recovery. A Callback that handles errors is called an "errback".
  • Deferred is an important building block for writing easy-to-use asynchronous APIs in a thread-safe fashion.

Understanding the concept of Deferred

The idea is that a Deferred represents a result that's not yet available. An asynchronous operation (I/O, RPC, whatever) has been started and will hand its result (be it successful or not) to the Deferred in the future. The key difference between a Deferred and a Future is that a Deferred has a callback chain associated to it, whereas with just a Future you need to get the result manually at some point, which poses problems such as: How do you know when the result is available? What if the result itself depends on another future?

When you start an asynchronous operation, you typically want to be called back when the operation completes. If the operation was successful, you want your callback to use its result to carry on what you were doing at the time you started the asynchronous operation. If there was an error, you want to trigger some error handling code.

But there's more to a Deferred than a single callback. You can add an arbitrary number of callbacks, which effectively allows you to easily build complex processing pipelines in a really simple and elegant way.
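The difference between asking for a result and being called back with it can be sketched with the JDK alone. This is only an illustration under assumptions: java.util.concurrent.CompletableFuture stands in for Deferred (thenApply and thenAccept play the role of added callbacks), and the class name CallbackVsPollingSketch is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Sketch only: CompletableFuture is a stand-in for Deferred, not the suasync API. */
final class CallbackVsPollingSketch {
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        CompletableFuture<Integer> pending = new CompletableFuture<>();

        // Future style: the caller has to ask, and the answer is not there yet.
        log.add("done before completion? " + pending.isDone());

        // Callback style: describe what to do with the result, then carry on.
        pending.thenApply(n -> n * 2)                         // first callback transforms the result
               .thenAccept(n -> log.add("callback got " + n)); // second callback consumes it
        log.add("callbacks registered");

        // Later, the asynchronous operation hands over its result.
        pending.complete(21);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Nothing in this sketch blocks: the two callbacks run as part of the call to complete.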

Understanding the callback chain

Let's take a typical example. You're writing a client library for others to use your simple remote storage service. When your users call the get method in your library, you want to retrieve some piece of data from your remote service and hand it back to the user, but you want to do so in an asynchronous fashion.

When the user of your client library invokes get, you assemble a request and send it out to the remote server through a socket. Before sending it to the socket, you create a Deferred and you store it somewhere, for example in a map, to keep an association between the request and this Deferred. You then return this Deferred to the user; this is how they will access the deferred result as soon as the RPC completes.

Sooner or later, the RPC will complete (successfully or not), and your socket will become readable (or maybe closed, in the event of a failure). Let's assume for now that everything works as expected, and thus the socket is readable, so you read the response from the socket. At this point you extract the result of the remote get call, and you hand it out to the Deferred you created for this request (remember, you had to store it somewhere, so you could give it the deferred result once you have it). The Deferred then stores this result and triggers any callback that may have been added to it. The expectation is that the user of your client library, after calling your get method, will add a Callback to the Deferred you gave them. This way, when the deferred result becomes available, you'll call it with the result in argument.

So far what we've explained is nothing more than a Future with a callback associated to it. But there's more to Deferred than just this. Let's assume now that someone else wants to build a caching layer on top of your client library, to avoid repeatedly getting the same value over and over again through the network. Users who want to use the cache will invoke get on the caching library instead of directly calling your client library.

Let's assume that the caching library already has a result cached for a get call. It will create a Deferred, and immediately hand it the cached result, and it will return this Deferred to the user. The user will add a Callback to it, which will be immediately invoked since the deferred result is already available. So the entire get call completed virtually instantaneously and entirely from the same thread. There was no context switch (no other thread involved, no I/O and whatnot), nothing ever blocked, everything just happened really quickly.

Now let's assume that the caching library has a cache miss and needs to do a remote get call using the original client library described earlier. The RPC is sent out to the remote server and the client library returns a Deferred to the caching library. This is where things become exciting. The caching library can then add its own callback to the Deferred before returning it to the user. This callback will take the result that came back from the remote server, add it to the cache and return it. As usual, the user then adds their own callback to process the result. So now the Deferred has 2 callbacks associated to it:

              1st callback       2nd callback
 
   Deferred:  add to cache  -->  user callback
 

When the RPC completes, the original client library will de-serialize the result from the wire and hand it out to the Deferred. The first callback will be invoked, which will add the result to the cache of the caching library. Then whatever the first callback returns will be passed on to the second callback. It turns out that the caching callback returns the get response unchanged, so that will be passed on to the user callback.

Now it's very important to understand that the first callback could have returned another arbitrary value, and that's what would have been passed to the second callback. This may sound weird at first but it's actually the key behind Deferred.
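The cache hit and cache miss paths described above can be sketched as follows. This is a rough illustration, not the suasync API: CompletableFuture stands in for Deferred, and CachingClientSketch, remoteGet, cachedGet and onResponse are hypothetical names. The "RPC" is simulated by completing the stored future by hand.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: CompletableFuture stands in for Deferred; all names here are hypothetical. */
final class CachingClientSketch {
    private final Map<String, CompletableFuture<String>> inFlight = new ConcurrentHashMap<>();
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    /** Low-level get: remembers the pending result so the response can be handed to it later. */
    CompletableFuture<String> remoteGet(String key) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        inFlight.put(key, pending);
        return pending;
    }

    /** Stands in for the socket becoming readable with the response for this key. */
    void onResponse(String key, String value) {
        inFlight.remove(key).complete(value);
    }

    /** Caching get: a hit returns an already-completed result, a miss adds the caching callback. */
    CompletableFuture<String> cachedGet(String key) {
        String hit = cache.get(key);
        if (hit != null) {
            return CompletableFuture.completedFuture(hit);
        }
        return remoteGet(key).thenApply(value -> {
            cache.put(key, value);   // 1st callback: add to cache...
            return value;            // ...and pass the value on unchanged
        });
    }

    static String demo() {
        CachingClientSketch client = new CachingClientSketch();
        StringBuilder seen = new StringBuilder();
        client.cachedGet("k").thenAccept(v -> seen.append("miss:").append(v));  // 2nd callback: the user's
        client.onResponse("k", "v1");                                           // the RPC completes
        client.cachedGet("k").thenAccept(v -> seen.append(" hit:").append(v));  // invoked immediately
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Had the caching callback returned something other than value, the user's callback would have received that instead, which is the point made in the last paragraph.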

To illustrate why, let's complicate things a bit more. Let's assume the remote service that serves those get requests is a fairly simple and low-level storage service (think memcached), so it only works with byte arrays and doesn't care what the contents are. So the original client library is only de-serializing the byte array from the network and handing that byte array to the Deferred.

Now you're writing a higher-level library that uses this storage system to store some of your custom objects. So when you get the byte array from the server, you need to further de-serialize it into some kind of an object. Users of your higher-level library don't care about what kind of remote storage system you use, the only thing they care about is getting those objects asynchronously. Your higher-level library is built on top of the original low-level library that does the RPC communication.

When the users of the higher-level library call get, you call get on the lower-level library, which issues an RPC call and returns a Deferred to the higher-level library. The higher-level library then adds a first callback to further de-serialize the byte array into an object. Then the user of the higher-level library adds their own callback that does something with that object. So now we have something that looks like this:

              1st callback                    2nd callback
 
   Deferred:  de-serialize to an object  -->  user callback
 

When the result comes in from the network, the byte array is de-serialized from the socket. The first callback is invoked and its argument is the initial result, the byte array. So the first callback further de-serializes it into some object that it returns. The second callback is then invoked and its argument is the result of the previous callback, that is the de-serialized object.
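A minimal sketch of this two-callback chain, where the first callback changes the type of the current result. It again assumes CompletableFuture as a stand-in for Deferred; the User class and the UTF-8 "wire format" are invented for the example. One difference worth keeping in mind: each thenApply returns a new future, whereas the text above describes a single Deferred carrying the whole chain.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

/** Sketch only: CompletableFuture stands in for Deferred; names are hypothetical. */
final class DeserializeChainSketch {
    /** Hypothetical custom object stored in the remote service. */
    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    static String demo() {
        // What the low-level get would return: a pending byte array.
        CompletableFuture<byte[]> lowLevel = new CompletableFuture<>();

        // 1st callback (higher-level library): byte array in, object out.
        CompletableFuture<User> highLevel =
            lowLevel.thenApply(bytes -> new User(new String(bytes, StandardCharsets.UTF_8)));

        // 2nd callback (user): receives the object, never the raw bytes.
        CompletableFuture<String> greeting = highLevel.thenApply(user -> "hello " + user.name);

        // The response arrives from the network.
        lowLevel.complete("ada".getBytes(StandardCharsets.UTF_8));
        return greeting.join();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```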

Now back to the caching library, which has nothing to do with the higher level library. All it does is, given an object that implements some interface with a get method, it keeps a map of whatever arguments get receives to an Object that was cached for this particular get call. Thanks to the way the callback chain works, it's possible to use the caching library together with the higher-level library transparently. Users who want to use caching simply need to use the caching library together with the higher level library. Now when they call get on the caching library, and there's a cache miss, here's what happens, step by step:

1.  The caching library calls get on the higher-level library.

2.  The higher-level library calls get on the lower-level library.

3.  The lower-level library creates a Deferred, issues out the RPC call and returns its Deferred.

4.  The higher-level library adds its own object de-serialization callback to the Deferred and returns it.

5.  The caching library adds its own cache-updating callback to the Deferred and returns it.

6.  The user gets the Deferred and adds their own callback to do something with the object retrieved from the data store.

              1st callback       2nd callback       3rd callback
 
   Deferred:  de-serialize  -->  add to cache  -->  user callback
   result: (none available)
 

Once the response comes back, the first callback is invoked, it de-serializes the object, returns it. The current result of the Deferred becomes the de-serialized object. The current state of the Deferred is as follows:

              2nd callback       3rd callback
 
   Deferred:  add to cache  -->  user callback
   result: de-serialized object
 

Because there are more callbacks in the chain, the Deferred invokes the next one and gives it the current result (the de-serialized object) in argument. The callback adds that object to its cache and returns it unchanged.

              3rd callback
 
   Deferred:  user callback
   result: de-serialized object
 

Finally, the user's callback is invoked with the object in argument.

   Deferred:  (no more callbacks)
   result: (whatever the user's callback returned)
 

If you think this is becoming interesting, read on, you haven't reached the most interesting thing about Deferred yet.

Building dynamic processing pipelines with Deferred

Let's complicate the previous example a little bit more. Let's assume that the remote storage service that serves those get calls is a distributed service that runs on many machines. The data is partitioned over many nodes and moves around as nodes come and go (due to machine failures and whatnot). In order to execute a get call, the low-level client library first needs to know which server is currently serving that piece of data. Let's assume that there's another server, which is part of that distributed service, that maintains an index and keeps track of where each piece of data is. The low-level client library first needs to look up the location of the data using that first server (that's a first RPC), then retrieves it from the storage node (that's another RPC). End users don't care that retrieving data involves a 2-step process, they just want to call get and be called back when the data (a byte array) is available.

This is where what's probably the most useful feature of Deferred comes in. When the user calls get, the low-level library will issue a first RPC to the index server to locate the piece of data requested by the user. When issuing this lookup RPC, a Deferred gets created. The low-level get code adds a first callback to process the lookup response and then returns it to the user.

              1st callback       2nd callback
 
   Deferred:  index lookup  -->  user callback
   result: (none available)
 

Eventually, the lookup RPC completes, and the Deferred is given the lookup response. So before triggering the first callback, the Deferred will be in this state:

              1st callback       2nd callback
 
   Deferred:  index lookup  -->  user callback
   result: lookup response
 

The first callback runs and now knows where to find the piece of data initially requested. It issues the get request to the right storage node. Doing so creates another Deferred, let's call it (B), which is then returned by the index lookup callback. And this is where the magic happens. Now we're in this state:

   (A)        2nd callback    |   (B)
                              |
   Deferred:  user callback   |   Deferred:  (no more callbacks)
   result: Deferred (B)       |   result: (none available)
 

Because a callback returned a Deferred, we can't invoke the user callback just yet, since the user doesn't want their callback to receive a Deferred, they want it to receive a byte array. The current callback gets paused and stops processing the callback chain. This callback chain needs to be resumed whenever the Deferred of the get call [(B)] completes. In order to achieve that, a callback is added to that other Deferred that will resume the execution of the callback chain.

   (A)        2nd callback    |   (B)        1st callback
                              |
   Deferred:  user callback   |   Deferred:  resume (A)
   result: Deferred (B)       |   result: (none available)
 

Once (A) added the callback on (B), it can return immediately, there's no need to wait, block a thread or anything like that. So the whole process of receiving the lookup response and sending out the get RPC happened really quickly, without blocking anything.

Now when the get response comes back from the network, the RPC layer de-serializes the byte array, as usual, and hands it to (B):

   (A)        2nd callback    |   (B)        1st callback
                              |
   Deferred:  user callback   |   Deferred:  resume (A)
   result: Deferred (B)       |   result: byte array
 

(B)'s first and only callback is going to set the result of (A) and resume (A)'s callback chain.

   (A)        2nd callback    |   (B)        1st callback
                              |
   Deferred:  user callback   |   Deferred:  resume (A)
   result: byte array         |   result: byte array
 

So now (A) resumes its callback chain, and invokes the user's callback with the byte array in argument, which is what they wanted.

   (A)                        |   (B)        1st callback
                              |
   Deferred:  (no more cb)    |   Deferred:  resume (A)
   result: (return value of   |   result: byte array
            the user's cb)
 

Then (B) moves on to its next callback in the chain, but there are none, so (B) is done too.

   (A)                        |   (B)
                              |
   Deferred:  (no more cb)    |   Deferred:  (no more cb)
   result: (return value of   |   result: byte array
            the user's cb)
 

The whole process of reading the get response, resuming the initial Deferred and executing the second Deferred happened all in the same thread, sequentially, and without blocking anything (provided that the user's callback didn't block, as it must not).

What we've done is essentially equivalent to dynamically building an implicit finite state machine to handle the life cycle of the get request. This simple API allows you to build arbitrarily complex processing pipelines that make dynamic decisions at each stage of the pipeline as to what to do next.
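To make the pause and resume steps between (A) and (B) concrete, here is a deliberately tiny toy model of the mechanism. It is not suasync's implementation: ToyDeferred is a made-up class that is single-threaded, untyped, has no errback chain and does no cycle detection. It only mirrors the state diagrams above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Toy model of the pause/resume mechanism only; NOT thread-safe and NOT the suasync code. */
final class ToyDeferred {
    private final ArrayDeque<Function<Object, Object>> chain = new ArrayDeque<>();
    private Object result;       // the "current result" shown in the diagrams
    private boolean hasResult;   // true once the initial result has been handed in
    private boolean paused;      // true while waiting for another ToyDeferred

    ToyDeferred addCallback(Function<Object, Object> cb) {
        chain.add(cb);
        if (hasResult && !paused) run();   // result already available: invoke immediately
        return this;
    }

    void callback(Object initialResult) {
        if (hasResult) throw new IllegalStateException("only one initial result is allowed");
        hasResult = true;
        result = initialResult;
        run();
    }

    private void run() {
        while (!chain.isEmpty()) {
            if (result instanceof ToyDeferred) {
                // A callback returned another deferred (B): pause, and add "resume (A)" to (B).
                paused = true;
                ((ToyDeferred) result).addCallback(r -> {
                    result = r;        // (B)'s result becomes (A)'s current result
                    paused = false;
                    run();             // resume (A)'s callback chain
                    return r;
                });
                return;
            }
            result = chain.poll().apply(result);   // the callback is dropped once executed
        }
    }
}

final class ToyDeferredDemo {
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        ToyDeferred a = new ToyDeferred();   // (A): returned to the user by get
        ToyDeferred b = new ToyDeferred();   // (B): created when the second RPC is issued
        a.addCallback(lookup -> { log.add("lookup -> " + lookup); return b; });   // index lookup
        a.addCallback(bytes -> { log.add("user got " + bytes); return bytes; }); // user callback
        a.callback("node-7");    // the lookup response arrives
        log.add("(A) paused");
        b.callback("payload");   // the get response arrives and resumes (A)
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The user's callback never sees (B) itself, only the value (B) eventually receives, and no thread ever waits.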

Handling errors

A Deferred has in fact not one but two callback chains. The first chain is the "normal" processing chain, and the second is the error handling chain. Twisted calls an error handling callback an "errback", so we've kept that term here. When the asynchronous processing completes with an error, the Deferred must be given the Exception that was caught instead of giving it the result (or if no Exception was caught, one must be created and handed to the Deferred). When the current result of a Deferred is an instance of Exception, the next errback is invoked. As for normal callbacks, whatever the errback returns becomes the current result. If the current result is still an instance of Exception, the next errback is invoked. If the current result is no longer an Exception, the next callback is invoked.

When a callback or an errback itself throws an exception, it is caught by the Deferred and becomes the current result, which means that the next errback in the chain will be invoked with that exception in argument. Note that Deferred will only catch Exceptions, not any Throwable or Error.
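The way control moves between the normal chain and the error chain can be approximated with the JDK. In this sketch CompletableFuture is only a rough analogue of Deferred: exceptionally plays the errback and thenApply the callback. Unlike the Deferred described above, CompletableFuture may wrap a propagated exception in a CompletionException, so the hypothetical helper rootMessage unwraps it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Sketch only: a rough JDK analogue of the callback/errback chains; names are hypothetical. */
final class ErrbackSketch {
    /** A null input simulates a failed RPC; an empty input makes the first callback throw. */
    static String run(String input) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        CompletableFuture<String> end = pending
            .thenApply(v -> {                                        // callback: skipped on error
                if (v.isEmpty()) throw new IllegalArgumentException("empty value");
                return v.toUpperCase();
            })
            .exceptionally(e -> "recovered: " + rootMessage(e))      // errback: skipped on success
            .thenApply(v -> "[" + v + "]");                          // callback: sees the recovered value

        if (input == null) {
            pending.completeExceptionally(new IllegalStateException("rpc failed"));
        } else {
            pending.complete(input);
        }
        return end.join();
    }

    /** CompletableFuture may wrap the original exception; look through the wrapper. */
    private static String rootMessage(Throwable e) {
        Throwable cause = (e instanceof CompletionException && e.getCause() != null) ? e.getCause() : e;
        return cause.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(run("ok"));
        System.out.println(run(""));
        System.out.println(run(null));
    }
}
```

In all three cases the last callback runs, because by then the errback has turned the current result back into a normal value.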

Contract and Invariants

Read this carefully as this is your warranty.

  • A Deferred can receive only one initial result.
  • Only one thread at a time is going to execute the callback chain.
  • Each action taken by a callback happens-before the next callback is invoked. In other words, if a callback chain manipulates a variable (and no one else manipulates it), no synchronization is required.
  • The thread that executes the callback chain is the thread that hands the initial result to the Deferred. This class does not create or manage any thread or executor.
  • As soon as a callback is executed, the Deferred will lose its reference to it.
  • Every method that adds a callback to a Deferred does so in O(1).
  • A Deferred cannot receive itself as an initial or intermediate result, as this would cause an infinite recursion.
  • You must not build a cycle of mutually dependent Deferreds, as this would cause an infinite recursion (thankfully, it will quickly fail with a CallbackOverflowError).
  • Callbacks and errbacks cannot receive a Deferred in argument. This is because they always receive the result of a previous callback, and when the result becomes a Deferred, we suspend the execution of the callback chain until the result of that other Deferred is available.
  • Callbacks cannot receive an Exception in argument. This is because they're always given to the errbacks.
  • Using the monitor of a Deferred can lead to a deadlock, so don't use it. In other words, writing
synchronized (some_deferred) { ... }

(or anything equivalent) voids your warranty.
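The invariant about which thread runs the callback chain can be observed with a small experiment. This sketch uses CompletableFuture as a stand-in and assumes the callback is registered before the result arrives; the class name and the thread name "io-thread" are invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch only: shows which thread runs a callback that was added before the result arrived. */
final class CallbackThreadSketch {
    static String callbackThreadName() {
        CompletableFuture<String> pending = new CompletableFuture<>();
        AtomicReference<String> ranOn = new AtomicReference<>();

        // The callback is registered up front, from the calling thread.
        pending.thenAccept(v -> ranOn.set(Thread.currentThread().getName()));

        // A separate thread plays the I/O layer and hands over the initial result.
        Thread io = new Thread(() -> pending.complete("response"), "io-thread");
        io.start();
        try {
            io.join();   // blocking here is only for the demo, never do this in a callback
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("callback ran on: " + callbackThreadName());
    }
}
```

The practical consequence is that a slow or blocking callback stalls whichever thread delivered the result, typically an I/O thread.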


A simple example:

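package test;

import com.stumbleupon.async.Callback;
import com.stumbleupon.async.Deferred;

public class TestDeferred {

	public static void main(String[] args) throws Exception {
		System.out.println("Zhang San asks Li Si to lend him money");
		// Blocks until the Deferred has a result. Fine in a demo main(), but never do this in a callback.
		long l = lend().joinUninterruptibly();

		System.out.println("Zhang San borrowed from Li Si: " + l);
		System.out.println("Zhang San starts playing cards");
	}

	public static Deferred<Long> lend() {
		System.out.println("Li Si says he will go home and check how much money he has");

		// Callback<R, T>: receives a T (the current result) and returns an R.
		class GetCB implements Callback<Long, Long> {

			@Override
			public Long call(Long arg) throws Exception {
				return arg;
			}
		}

		return check().addCallback(new GetCB());
	}

	public static Deferred<Long> check() {
		System.out.println("Li Si is at home checking how much money he has");
		try {
			// Simulates a slow lookup. This blocks the calling thread, so the
			// Deferred returned below already holds its result when it is created.
			Thread.sleep(10 * 1000);
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
		return Deferred.fromResult(Long.valueOf(1000));
	}

}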
