Queue.Queue vs collections.deque

Queue.Queue vs. collections.deque

up vote 78 down vote favorite
22

I need a queue which multiple threads can put stuff into, and multiple threads may read from.

Python has at least two queue classes, Queue.Queue and collections.deque, with the former seemingly using the latter internally. Both claim to be thread-safe in the documentation.

However, the Queue docs also state:

collections.deque is an alternative implementation of unbounded queues with fast atomic append() and popleft() operations that do not require locking.

Which I guess I don't quite unterstand: Does this mean deque isn't fully thread-safe after all?

If it is, I may not fully understand the difference between the two classes. I can see that Queue adds blocking functionality. On the other hand, it loses some deque features like support for the in-operator.

Accessing the internal deque object directly, is

x in Queue().deque

thread-safe?

Also, why does Queue employ a mutex for it's operations when deque is thread-safe already?

share improve this question
 
 
RuntimeError: deque mutated during iteration is what you could be getting is using a shared deque between several thread and no locking... –  toine  Nov 7 '15 at 16:28

5 Answers

active oldest votes
up vote 131 down vote accepted

Queue.Queue and collections.deque serve different purposes. Queue.Queue is intended for allowing different threads to communicate using queued messages/data, whereas collections.deque is simply intended as a datastructure. That's why Queue.Queue has methods like put_nowait()get_nowait(), and join(), whereas collections.dequedoesn't. Queue.Queue isn't intended to be used as a collection, which is why it lacks the likes of the in operator.

It boils down to this: if you have multiple threads and you want them to be able to communicate without the need for locks, you're looking for Queue.Queue; if you just want a queue or a double-ended queue as a datastructure, use collections.deque.

Finally, accessing and manipulating the internal deque of a Queue.Queue is playing with fire - you really don't want to be doing that.

share improve this answer
 
 
If you need a thread-safe queue and performance is important, deque seems a better option than Queue: see Jonathan's answer. Plus with deque you get list-like flexibility and visibility as a bonus, although that is not all thread-safe. –  fantabolous  Dec 7 '15 at 7:30
 
No, that's not a good idea at all. If you look at the source of Queue.Queue, it uses deque under the hood. collections.deque is a collection, while Queue.Queue is a communications mechanism. The overhead in Queue.Queue is to make it threadsafe. Using deque for communicating between threads will only lead to painful races. Whenever deque happens to be threadsafe, that's a happy accident of how the interpreter is implemented, and not something to be relied upon. That's why Queue.Queue exists in the first place. –  Keith Gaughan  Dec 7 '15 at 14:02 
 
The docs say deque is thread-safe for pops and appends, which are all you need for a simple queue, perhaps wrapping pops with a try: except IndexError for when it's empty. Some interpreters disobey this? It's true I stick with cpython. I use deque myself because Queue is way too slow for my purposes. docs.python.org/2/library/collections.html#deque-objects –  fantabolous  Dec 8 '15 at 1:22 
 
Just keep in mind that if you're communicating across threads, you're playing with fire by using deque. deque is threadsafe by accident due to the existence of the GIL. An GIL-less implementation will have completely different performance characteristics, so discounting other implementations isn't wise. Besides, have you timed Queue vs deque for use across threads as opposed to a naive benchmark of its use in a single thread? If your code is that sensitive to the speed of Queue vs deque, Python might not be the language you're looking for. –  Keith Gaughan  Dec 9 '15 at 17:24 
 
Ok, thanks for the info. Yes, my benchmarks are across threads. True python isn't the fastest language but I use it because of pandas. –  fantabolous  Dec 10 '15 at 5:38
up vote 14 down vote

If all you're looking for is a thread-safe way to transfer objects between threads, then both would work (both for FIFO and LIFO). For FIFO:

  • Queue.put() and Queue.get() are thread-safe
  • deque.append() and deque.popleft() are thread-safe

Note:

  • Other operations on deque might not be thread safe, I'm not sure.
  • deque does not block on pop() or popleft() so you can't base your consumer thread flow on blocking till a new item arrives.

However, it seems that deque has a significant efficiency advantage. Here are some benchmark results in seconds using CPython 2.7.3 for inserting and removing 100k items

deque 0.0747888759791
Queue 1.60079066852

Here's the benchmark code:

import time
import Queue
import collections

q = collections.deque()
t0 = time.clock()
for i in xrange(100000):
    q.append(1)
for i in xrange(100000):
    q.popleft()
print 'deque', time.clock() - t0

q = Queue.Queue(200000)
t0 = time.clock()
for i in xrange(100000):
    q.put(1)
for i in xrange(100000):
    q.get()
print 'Queue', time.clock() - t0
share improve this answer
 
 
You claim that "Other operations on deque may not be thread safe". Where do you get that from? –  Matt Apr 23 '14 at 18:28
 
@Matt - rephrased to better convey my meaning –  Jonathan  Apr 28 '14 at 21:36
1  
Ok, thanks. That was stopping me from using deque because I thought you knew something I didn't. I guess I'll just assume it's thread safe until I discover otherwise. –  Matt  Apr 29 '14 at 13:17
up vote 2 down vote

deque is thread-safe. "operations that do not require locking" means that you don't have to do the locking yourself, the deque takes care of it.

Taking a look at the Queue source, the internal deque is called self.queue and uses a mutex for accessors and mutations, so Queue().queue is not thread-safe to use.

If you're looking for an "in" operator, then a deque or queue is possibly not the most appropriate data structure for your problem.

share improve this answer
 
1  
Well, what I want to do is make sure that no duplicates are added to the queue. Is this not something a queue could potentially support? –  miracle2k  Apr 4 '09 at 15:06
1  
It'd probably be best to have a separate set, and update that when you add/remove something from the queue. That'll be O(log n) rather than O(n), but you'll have to be careful to keep the set and queue in sync (i.e. locking). –  brian-brazil  Apr 4 '09 at 15:41
 
A Python set is a hash table, so it would be O(1). But yes, you would still have to do locking. –  AFoglia  Jan 18 '12 at 15:27
up vote 1 down vote

For information there is a Python ticket referenced for deque thread-safety (https://bugs.python.org/issue15329). Title "clarify which deque methods are thread-safe"

Bottom line here: https://bugs.python.org/issue15329#msg199368

The deque's append(), appendleft(), pop(), popleft(), and len(d) operations are thread-safe in CPython. The append methods have a DECREF at the end (for cases where maxlen has been set), but this happens after all of the structure updates have been made and the invariants have been restored, so it is okay to treat these operations as atomic.

Anyway, if you are not 100% sure and you prefer reliability over performance, just put a like Lock ;)

你可能感兴趣的:(Python,Queue,collections)