Kestrel is a very simple message queue that runs on the JVM. Itsupports multiple protocols:
A single kestrel server has a set of queues identified by aname, which isalso the filename of that queue's journal file(usually in/var/spool/kestrel
). Each queue is astrictly-ordered FIFO of "items" ofbinary data. Usually this datais in some serialized format like JSON orruby's marshal format.
Generally queue names should be limited to alphanumerics[A-Za-z0-9]
, dash(-
) and underline(_
). In practice, kestrel doesn't enforceanyrestrictions other than the name can't contain slash(/
) because that can'tbe used in filenames, squiggle(~
) because it's used for temporary files,plus(+
) because it's used for fanout queues, and dot(.
) because it'sreserved for future use. Queue namesare case-sensitive, but if you're runningkestrel on OS X orWindows, you will want to refrain from taking advantage ofthis,since the journal filenames on those two platforms arenotcase-sensitive.
A cluster of kestrel servers is like a memcache cluster: theservers don'tknow about each other, and don't do anycross-communication, so you can add asmany as you like. Clientshave a list of all servers in the cluster, and pickone at randomfor each operation. In this way, each queue appears to be spreadoutacross every server, with items in a loose ordering.
When kestrel starts up, it scans the journal folder and createsqueues basedon any journal files it finds there, to restore stateto the way it was whenit last shutdown (or was killed or died). Newqueues are created by referringto them (for example, adding ortrying to remove an item). A queue can bedeleted with the "delete"command.
The config files for kestrel are scala expressions loaded atruntime, usuallyfrom production.scala
, although youcan use development.scala
bypassing-Dstage=development
to the java commandline.
The config file evaluates to a KestrelConfig
objectthat's used to configurethe server as a whole, a default queue, andany overrides for specific namedqueues. The fields onKestrelConfig
are documented here with theirdefaultvalues:
To confirm the current configuration of each queue, send"dump_config" toa server (which can be done over telnet).
To reload the config file on a running server, send "reload" thesame way.You should immediately see the changes in "dump_config",to confirm. Reloadingwill only affect queue configuration,中云融信, notglobal server configuration. Tochange the server configuration,restart the server.
Logging is configured according to util-logging
.The logging configurationsyntax is described here:
Per-queue configuration is documented here:
Queue alias configuration is documented here:
A queue can have the following limits set on it:
If either of these limits is reached, no new items can be addedto the queue.(Clients will receive an error when trying to add.) Ifyou setdiscardOldWhenFull
to true, then all adds willsucceed, and the oldestitem(s) will be silently discarded until thequeue is back within the itemand size limits.
maxItemSize
limits the size of any individual item.If an add is attemptedwith an item larger than this limit, italways fails.
The journal file is the only on-disk storage of a queue'scontents, and it'sjust a sequential record of each add or removeoperation that's happened onthat queue. When kestrel starts up, itreplays each queue's journal to buildup the in-memory queue that ituses for client queries.
The journal file is rotated in one of two conditions:
For example, if defaultJournalSize
is 16MB (thedefault), then if the queueis empty and the journal is larger than16MB, it will be truncated into a new(empty) file. If the journalis larger than maxJournalSize
(1GB by default),thejournal will be rewritten periodically to contain just the liveitems.
You can turn the journal off for a queue(keepJournal
= false) and the queuewill exist only inmemory. If the server restarts, all enqueued items arelost. You canalso force a queue's journal to be sync'd to disk periodically,oreven after every write operation, at a performance cost,usingsyncJournal
.
If a queue grows past maxMemorySize
bytes (128MB bydefault), only thefirst 128MB is kept in memory. The journal isused to track later items, andas items are removed, the journal isplayed forward to keep 128MB in memory.This is usually known as"read-behind" mode, but Twitter engineers sometimesrefer to it asthe "square snake" because of the diagram used to brainstormtheimplementation. When a queue is in read-behind mode, removing anitem willoften cause 2 disk operations instead of one: one torecord the remove, andone to read an item in from disk to keep128MB in memory. This is thetrade-off to avoid filling memory andcrashing the JVM.
When they come from a client, expiration times are handled inthe same way asmemcache: if the number is small (less than onemillion), it's interpreted asa relative number of seconds from now.Otherwise it's interpreted as anabsolute unix epoch time, inseconds since the beginning of 1 January 1970GMT.
Expiration times are immediately translated into an absolutetime, inmilliseconds, and if it's further in the futurethan the queue's maxAge
,the maxAge
isused instead. An expiration of 0, which is usually thedefault,means an item never expires.
Expired items are flushed from a queue whenever a new item isadded orremoved. Additionally, if the global config optionexpirationTimerFrequency<wbr></wbr>
is set,中云融信, a backgroundthread will periodically remove expired items from thehead of eachqueue. The provided production.conf
sets this to onesecond.If this is turned off, an idle queue won't have any itemsexpired, but youcan still trigger a check by doing a "peek" onit.
Normally, expired items are discarded. IfexpireToQueue
is set, thenexpired items are moved tothe specified queue just as if a client had putit there. The itemis added with no expiration time, but that can beoverridden if thenew queue has a default expiration policy.
To prevent stalling the server when it encounters a swarm ofitems that allexpired at the same time, maxExpireSweep
limits the number of items thatwill be removed by the backgroundthread in a single round. This is primarilyuseful as a throttlingmechanism when using a queue as a way to delay work.
Whole queues can be configured to expire as well. IfmaxQueueAge
issetexpirationTimerFrequency<wbr></wbr>
is used to check the queueage. If the queue isempty, and it has been longer thanmaxQueueAge
since it was created thenthe queue will bedeleted.
If a queue name has a +
in it (like"orders+audit
"), it's treated as afanout queue, usingthe format +
. These queues belong to aparent queue --in this example, the "orders" queue. Every item written intoaparent queue will also be written into each of its children.
Fanout queues each have their own journal file (if the parentqueue has ajournal file) and otherwise behave exactly like anyother queue. You can getand peek and even add items directly to achild queue if you want. It uses theparent queue's configurationinstead of having independent child queueconfiguration blocks.
When a fanout queue is first referenced by a client, the journalfile (if any)is created, and it will start receiving new itemswritten to the parent queue.Existing items are not copied over. Afanout queue can be deleted to stop itfrom receiving new items.
fanoutOnly
may be set to true if the queue inquestion will only serve writepoint for fanout queues. No journalfile will be kept for the parent, onlyfor the child queues. Thissaves the overhead of writing to the parent andremoves the need toempty it. Note that setting fanoutOnly
to trueandhaving no fanouts for the queue effectively makes it a blackhole.
Queue aliases are somewhat similar to fanout queues, but withouta requirednaming convention or implicit creation of child queues. Aqueue alias canonly be used in set operations. Kestrel responds toattempts to retrieveitems from the alias as if it were an emptyqueue. Delete and flush requestsare also ignored.
Kestrel supports three protocols: memcache, thrift and text. Thecan be used to connect clientsto a Kestrel server via the memcacheor thrift protocols.
The thrift protocol is documented in the thrift IDL:
Reliable reads via the thrift protocol are specified byindicating how long the servershould wait before aborting theunacknowledged read.
The official memcache protocol is described here:
The kestrel implementation of the memcache protocol commands isdescribed below.
For example, to open a new read, waiting up to 500msec for anitem:
GET work/t=500/open
Or to close an existing read and open a new one:
GET work/close/open
Note: this section is specific to the memcache protocol.
Normally when a client removes an item from the queue, kestrelimmediatelydiscards the item and assumes the client has takenownership. This isn'talways safe, because a client could crash orlose the network connectionbefore it gets the item. So kestrel alsosupports a "reliable read" thathappens in two stages, using the/open
and /close
options toGET
.
When /open
is used, and an item is available,kestrel will remove it fromthe queue and send it to the client asusual. But it will also set the itemaside. If a client disconnectswhile it has an open read, the item is put backinto the queue, atthe head, so it will be the next item fetched. Only oneitem can be"open" per client connection.
A previous open request is closed with /close
. Theserver will reject anyattempt to open another read when one isalready open, but it will ignore/close
if there's noopen request, so that you can add /close
toeveryGET
request for convenience.
If for some reason you want to abort a read withoutdisconnecting, you can use/abort
. But because aborteditems are placed back at the head of the queue,this isn't a goodway to deal with client errors. Since the error-causing itemwillalways be the next one available, you'll end up bouncing the sameitemaround between clients instead of making progress.
There's always a trade-off: either potentially lose items orpotentiallyreceive the same item multiple times. Reliable readschoose the latter option.To use this tactic successfully, workitems should be idempotent, meaning thework could be done 2 or 3times and have the same effect as if it had beendone only once(except wasting some resources).
Example:
GET dirty_jobs/close/open(receives job 1)GET dirty_jobs/close/open(closes job 1, receives job 2)...etc...
Kestrel supports a limited, text-only protocol. You areencouraged to use thememcache protocol instead.
The text protocol does not support reliable reads.
Global stats reported by kestrel are:
For each queue, the following stats are also reported:
Statistics may be retrieved by accessing the on the admin HTTPport.For example:http://kestrel.host:2223/stats.json?period=60
.
Statistics are also available via the memcache protocol usingthe STATS
command.
You can use kestrel as a library by just sticking the jar onyour classpath.It's a cheap way to get a durable work queue forinter-process or inter-threadcommunication. Each queue isrepresented by a PersistentQueue
object:
class PersistentQueue(val name: String, persistencePath: String, @volatile var config: QueueConfig, timer: Timer, queueLookup: Option[(String => Option[PersistentQueue])]) {
and must be initialized before using:
def setup(): Unit
specifying the path for the journal files (if the queue will bejournaled),the name of the queue, a QueueConfig
object(derived from QueueBuilder
),a timer for handlingtimeout reads, and optionally a way to find other namedqueues (forexpireToQueue
support).
To add an item to a queue:
def add(value: Array[Byte], expiry: Option[Time]): Boolean
It will return false
if the item was rejectedbecause the queue was full.
Queue items are represented by a case class:
case class QItem(addTime: Time, expiry: Option[Time], data: Array[Byte], var xid: Int)
and several operations exist to remove or peek at the headitem:
def peek(): Option[QItem]def remove(): Option[QItem]
To open a reliable read, set transaction
true, andlater confirm or unremovethe item by its xid
:
def remove(transaction: Boolean): Option[QItem]def unremove(xid: Int)def confirmRemove(xid: Int)
You can also asynchronously remove or peek at items usingfutures.
def waitRemove(deadline: Option[Time], transaction: Boolean): Future[Option[QItem]]def waitPeek(deadline: Option[Time]): Future[Option[QItem]]
When done, you should close the queue:
def close(): Unitdef isClosed: Boolean
Here's a short example:
var queue = new PersistentQueue("work", "/var/spool/kestrel", config, timer, None)queue.setup()// add an item with no expiration:queue.add("hello".getBytes, 0)// start to remove it, then back out:val item = queue.remove(true)queue.unremove(item.xid)// remove an item with a 500msec timeout, and confirm it:queue.waitRemove(500.milliseconds.fromNow, true)() match { case None => println("nothing. :(") case Some(item) => println("got: " + new String(item.data)) queue.confirmRemove(item.xid)}queue.close()