Python: Sequence Types

There are three basic sequence types: lists, tuples, and range objects.

sequence

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and bytes. Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary hashable keys rather than integers.
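
As a minimal sketch of this definition, the class below implements only __getitem__() and __len__(). The name Countdown is hypothetical and chosen for illustration. Iteration and membership tests still work because Python falls back to indexing with 0, 1, 2, ... until IndexError is raised. Negative indices are deliberately not handled in this sketch.

```python
class Countdown:
    """Hypothetical sequence-like class holding n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        # Raising IndexError past the end tells iteration where to stop.
        if not 0 <= i < self.n:
            raise IndexError(i)
        return self.n - i


c = Countdown(3)
items = list(c)     # iteration falls back to __getitem__ with 0, 1, 2, ...
has_two = 2 in c    # membership falls back to iteration as well
```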

Common Sequence Operations

s and t are sequences of the same type; n, i, j and k are integers; and x is an arbitrary object that meets any type and value restrictions imposed by s.

s * n or n * s
equivalent to adding s to itself n times

!Note: Values of n less than 0 are treated as 0 (which yields an empty sequence of the same type as s). Note that items in the sequence s are not copied; they are referenced multiple times.

>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]

What has happened is that [[]] is a one-element list containing an empty list, so all three elements of [[]] * 3 are references to this single empty list. Modifying any of the elements of lists modifies this single list. You can create a list of different lists this way:

>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
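
The same pitfall shows up when building a two-dimensional grid. The sketch below (all variable names are illustrative) contrasts repetition of immutable items, which is safe, with repetition of a mutable inner list, and also shows a negative n.

```python
row = [0] * 3            # safe: the repeated items are immutable ints
empty = [1, 2] * -1      # a negative n behaves like 0

shared = [[0] * 2] * 2   # both rows refer to the same inner list
shared[0][0] = 9         # so the change appears in every row

separate = [[0] * 2 for _ in range(2)]   # a fresh inner list per row
separate[0][0] = 9       # only the first row changes
```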

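in and not in
In general these operators perform simple containment tests. Some specialised sequences (such as str, bytes and bytearray) also use them for subsequence testing.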

>>> 'gg' in 'eggs'
True

Empty strings are always considered to be a substring of any other string.

>>> '' in 'ab'
True
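
The difference between element containment and subsequence testing is easy to miss. A short sketch with illustrative variable names:

```python
numbers = [1, 2, 3]

element_found = 2 in numbers            # True: 2 is an element
sublist_found = [1, 2] in numbers       # False: no element equals [1, 2]
nested_found = [1, 2] in [[1, 2], 3]    # True: [1, 2] is an element here
substring_found = "gg" in "eggs"        # True: str tests for substrings
missing = "x" not in "eggs"             # True: not in is the negation
```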

s[i]
ith item of s, origin 0

!Note: If i is negative, the index is relative to the end of sequence s: len(s) + i is substituted. But note that -0 is still 0.
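
A small sketch of the substitution described in the note, using an illustrative list:

```python
s = ['a', 'b', 'c', 'd']

last = s[-1]                 # len(s) + (-1) == 3
same_as_last = s[len(s) - 1]
first = s[-0]                # -0 is 0, so this is the first item

try:
    s[-5]                    # len(s) + (-5) == -1, which is out of range
    out_of_range = False
except IndexError:
    out_of_range = True
```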

s[i:j]
slice of s from i to j

The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.

>>> a = list(range(10))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[:]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[:3]
[0, 1, 2]
>>> a[5:]
[5, 6, 7, 8, 9]
>>> a[2:2]
[]
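
Unlike indexing, slicing never raises IndexError for out-of-range bounds. The sketch below exercises the clamping rules stated above:

```python
a = list(range(10))

clamped = a[7:100]       # j > len(a), so len(a) is used
beyond = a[20:30]        # both bounds clamp to len(a): empty slice
backwards = a[5:2]       # i >= j: empty slice
explicit_none = a[None:3]   # None behaves like an omitted bound
```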

s[i:j:k]
slice of s from i to j with step k

!Note: The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). When k is positive, i and j are reduced to len(s) if they are greater. When k is negative, i and j are reduced to len(s) - 1 if they are greater. If i or j are omitted or None, they become “end” values (which end depends on the sign of k). Note, k cannot be zero. If k is None, it is treated like 1.

>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[::3]
[0, 3, 6, 9]
>>> a[::-3]
[9, 6, 3, 0]
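
To see which indices a given extended slice visits, the built-in slice object offers an indices() method that resolves omitted and oversized bounds for a sequence of a given length. A short sketch:

```python
a = list(range(10))

# Resolve a[::-3] into concrete start, stop and step values.
start, stop, step = slice(None, None, -3).indices(len(a))
visited = list(range(start, stop, step))

# Oversized bounds are reduced as described in the note.
clamped = slice(2, 100, 3).indices(len(a))
```

Here start, stop and step come out as 9, -1 and -3, so the visited indices are 9, 6, 3 and 0.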

s.index(x[, i[, j]])
index of the first occurrence of x in s (at or after index i and before index j)

!Note: Passing the extra arguments is roughly equivalent to using s[i:j].index(x), only without copying any data and with the returned index being relative to the start of the sequence rather than the start of the slice.

>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a.index(3)
3
>>> a.index(3,4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 3 is not in list
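
The difference between passing a start index and slicing first shows up in the returned position. A sketch with an illustrative list that contains the value twice:

```python
a = [0, 1, 2, 3, 0, 1, 2, 3]

absolute = a.index(3, 4)     # searches from index 4, result relative to a
relative = a[4:].index(3)    # copies the tail, result relative to the slice

try:
    a.index(3, 4, 7)         # index 7 is excluded, so 3 is not found
    found = True
except ValueError:
    found = False
```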

s.count(x)
total number of occurrences of x in s

>>> letters = 'aabbccc'
>>> letters.count('c')
3

Only for Mutable Sequence Types

s is an instance of a mutable sequence type, t is any iterable object and x is an arbitrary object that meets any type and value restrictions imposed by s.

s[i:j] = t
slice of s from i to j is replaced by the contents of the iterable t

>>> s
[0, 1, 2, 3, 4, 5]
>>> s[3:5] = [1]
>>> s
[0, 1, 2, 1, 5]

del s[i:j]
same as s[i:j] = []

>>> s
[0, 1, 2, 3, 4, 5]
>>> del s[3:]
>>> s
[0, 1, 2]

s[i:j:k] = t
the elements of s[i:j:k] are replaced by those of t

!Note: t must have the same length as the slice it is replacing.

>>> s
[0, 1, 2, 3, 4, 5]
>>> s[::2] = [0, 1, 0]
>>> s
[0, 1, 1, 3, 0, 5]
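
The length restriction applies only to extended slices. The sketch below contrasts the two forms on illustrative lists:

```python
s = [0, 1, 2, 3, 4, 5]
s[1:3] = [9]                 # a plain slice may change the length

t = [0, 1, 2, 3, 4, 5]
try:
    t[::2] = [7, 8]          # t[::2] has three items, two were supplied
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```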

del s[i:j:k]
removes the elements of s[i:j:k] from the list

>>> s
[0, 1, 2, 3, 4, 5]
>>> del s[::2]
>>> s
[1, 3, 5]

s.append(x)
appends x to the end of the sequence (same as s[len(s):len(s)] = [x])

>>> s
[0, 1, 0]
>>> s.append(1)
>>> s
[0, 1, 0, 1]

s.clear()
removes all items from s (same as del s[:])

>>> s
[0, 1, 0, 1]
>>> s.clear()
>>> s
[]

s.copy()
creates a shallow copy of s (same as s[:])

>>> s = [0,1,0]
>>> s
[0, 1, 0]
>>> a = s.copy()
>>> a
[0, 1, 0]
>>> id(a) == id(s)
False
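
"Shallow" means that only the outer container is duplicated. In the sketch below the inner lists remain shared; copy.deepcopy from the standard library is shown as one way to duplicate them too.

```python
import copy

outer = [[1, 2], [3]]
shallow = outer.copy()

shallow.append([4])        # affects only the copy
shallow[0].append(99)      # the inner list is shared with outer

deep = copy.deepcopy(outer)
deep[0].append(-1)         # outer is untouched by this change
```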

s.extend(t) or s += t
extends s with the contents of t (for the most part the same as s[len(s):len(s)] = t)

>>> s
[0, 1, 0]
>>> t
[0, 1]
>>> s.extend(t)
>>> s
[0, 1, 0, 0, 1]
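
For lists, s += t extends the existing object in place and accepts any iterable, whereas s = s + t builds a new list and requires t to be a list. A sketch with illustrative names:

```python
s = [0, 1]
alias = s
s += (2, 3)              # in place: the tuple's items are appended

rebound = s
rebound = rebound + [4]  # creates a new list; s is unchanged

try:
    s + (5,)             # list + tuple is not allowed
    concatenated = True
except TypeError:
    concatenated = False
```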

s *= n
updates s with its contents repeated n times

>>> s
[0, 1]
>>> s *= 3
>>> s
[0, 1, 0, 1, 0, 1]

s.insert(i, x)
inserts x into s at the index given by i (same as s[i:i] = [x])

>>> s
[0, 1, 0, 1, 0, 1]
>>> s.insert(0,1)
>>> s
[1, 0, 1, 0, 1, 0, 1]

s.pop([i])
retrieves the item at i and also removes it from s

!Note: The optional argument i defaults to -1, so that by default the last item is removed and returned.

>>> s
[1, 0, 1, 0, 1, 0, 1]

>>> s.pop()
1
>>> s
[1, 0, 1, 0, 1, 0]
>>> s.pop(1)
0
>>> s
[1, 1, 0, 1, 0]

s.remove(x)
removes the first item from s where s[i] == x

>>> s
[0, 1, 0, 1]
>>> s.remove(0)
>>> s
[1, 0, 1]

!Note: remove raises ValueError when x is not found in s.
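
When a missing value is not an error, the exception can be handled explicitly. The helper below is a hypothetical convenience function, not part of the list API:

```python
def remove_if_present(seq, x):
    """Hypothetical helper: remove the first x and report whether it was found."""
    try:
        seq.remove(x)
    except ValueError:
        return False
    return True


s = [0, 1, 0, 1]
removed = remove_if_present(s, 0)    # removes only the first 0
absent = remove_if_present(s, 7)     # 7 is not in s, list is unchanged
```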

s.reverse()
reverses the items of s in place

!Note: The reverse() method modifies the sequence in place for economy of space when reversing a large sequence.

>>> s
[0, 1, 0, 1]
>>> s.reverse()
>>> s
[1, 0, 1, 0]

List

Lists are mutable sequences.

Lists may be constructed in several ways:
- Using a pair of square brackets to denote the empty list: []
- Using square brackets, separating items with commas: [a], [a, b, c]
- Using a list comprehension: [x for x in iterable]
- Using the type constructor: list() or list(iterable)
If iterable is already a list, a copy is made and returned, similar to iterable[:].

>>> a
[1, 2, 3]
>>> b = list(a)
>>> b
[1, 2, 3]
>>> id(a) == id(b)
False

sort(*, key=None, reverse=False)

This method sorts the list in place, using only < comparisons between items. Exceptions are not suppressed - if any comparison operations fail, the entire sort operation will fail (and the list will likely be left in a partially modified state).

>>> a = [2,1,5, 'a', 3]
>>> a.sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'str' and 'int'
>>> a
[1, 2, 5, 'a', 3]

key specifies a function of one argument that is used to extract a comparison key from each list element.
reverse is a boolean value. If set to True, then the list elements are sorted as if each comparison were reversed.
The sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes.

>>> data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]
>>>
>>> data.sort(key=lambda x: x[0])
>>> data
[('blue', 1), ('blue', 2), ('red', 1), ('red', 2)]
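
Stability is what makes sorting in multiple passes work: sort by the secondary key first, then by the primary key. A sketch that orders the same data by colour ascending and, within each colour, by number descending:

```python
data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]

# Pass 1: secondary key, number descending.
data.sort(key=lambda item: item[1], reverse=True)

# Pass 2: primary key, colour ascending. Ties keep the order from pass 1.
data.sort(key=lambda item: item[0])
```

After both passes, data is [('blue', 2), ('blue', 1), ('red', 2), ('red', 1)].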

sorted(iterable, *, key=None, reverse=False)

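Returns a new sorted list from the items in iterable and leaves the original unchanged, whereas list.sort() modifies the list in place and returns None.
Another difference is that the list.sort() method is only defined for lists. In contrast, the sorted() function accepts any iterable.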

>>> sorted({1: 'D', 2: 'B', 3: 'B', 4: 'E', 5: 'A'})
[1, 2, 3, 4, 5]
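
A short sketch of both differences, using illustrative data:

```python
words = ('pear', 'fig', 'apple')       # a tuple has no sort() method
by_length = sorted(words, key=len)     # sorted() returns a new list

letters = sorted('bca')                # a string works as well

nums = [3, 1, 2]
result = nums.sort()                   # sorts in place and returns None
```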

The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes.

>>> student_tuples = [
...     ('john', 'A', 15),
...     ('jane', 'B', 12),
...     ('dave', 'B', 10),
... ]
>>> sorted(student_tuples, key=lambda student: student[2])
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]

operator module

>>> from operator import itemgetter
>>> sorted(student_tuples, key=itemgetter(2))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]

The operator module functions allow multiple levels of sorting.

>>> sorted(student_tuples, key=itemgetter(1,2))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]

The older Decorate-Sort-Undecorate idiom achieves the same kind of ordering without a key function. It is named after its three steps:

First, the initial list is decorated with new values that control the sort order.
Second, the decorated list is sorted.
Finally, the decorations are removed, creating a list that contains only the initial values in the new order.

>>> class Student:
...     def __init__(self, name, grade, age):
...         self.name = name
...         self.grade = grade
...         self.age = age
...     def __repr__(self):
...         return repr((self.name, self.grade, self.age))
... 
>>> 
>>> student_objects = [
...     Student('john', 'A', 15),
...     Student('jane', 'B', 12),
...     Student('dave', 'B', 10),
... ]
>>> 
>>> decorated = [(student.grade, i, student) for i, student in enumerate(student_objects)]
>>> decorated.sort()
>>> [student for grade, i, student in decorated]
[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]

This idiom works because tuples are compared lexicographically; the first items are compared; if they are the same then the second items are compared, and so on.

The sort is stable – if two items have the same key, their order will be preserved in the sorted list.

In Python 2, sort() accepted an optional comparison function. That function takes the two arguments to be compared and returns a negative value for less-than, zero if they are equal, or a positive value for greater-than. Python 3 removed that parameter, so a wrapper such as the following is needed to convert a comparison function into a key function:

>>> def cmp_to_key(mycmp):
...     'Convert a cmp= function into a key= function'
...     class K:
...         def __init__(self, obj, *args):
...             self.obj = obj
...         def __lt__(self, other):
...             return mycmp(self.obj, other.obj) < 0
...         def __gt__(self, other):
...             return mycmp(self.obj, other.obj) > 0
...         def __eq__(self, other):
...             return mycmp(self.obj, other.obj) == 0
...         def __le__(self, other):
...             return mycmp(self.obj, other.obj) <= 0
...         def __ge__(self, other):
...             return mycmp(self.obj, other.obj) >= 0
...         def __ne__(self, other):
...             return mycmp(self.obj, other.obj) != 0
...     return K
... 
>>> def reverse_numeric(x, y):
...     return y - x
... 
>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]
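
The standard library ships this wrapper as functools.cmp_to_key, so the class above rarely needs to be written by hand. The same example, sketched with the library function:

```python
from functools import cmp_to_key


def reverse_numeric(x, y):
    # Negative when y < x, so larger numbers sort first.
    return y - x


result = sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
```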

__lt__()
The sort routines are guaranteed to use __lt__() when making comparisons between two objects.

>>> class Student:
...     def __init__(self, name, grade, age):
...         self.name = name
...         self.grade = grade
...         self.age = age
...     def __lt__(self, other):
...         return self.age < other.age
...     def __repr__(self):
...         return repr((self.name, self.grade, self.age))
...  
>>> student_objects = [
...     Student('john', 'A', 15),
...     Student('jane', 'B', 12),
...     Student('dave', 'B', 10),
... ]
>>> 
>>> sorted(student_objects)
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]

Tuples

Tuples are immutable sequences.

Tuples may be constructed in a number of ways:
- Using a pair of parentheses to denote the empty tuple: ()
- Using a trailing comma for a singleton tuple: a, or (a,)
- Separating items with commas: a, b, c or (a, b, c)
- Using the tuple() built-in: tuple() or tuple(iterable)

Note that it is actually the comma which makes a tuple, not the parentheses.
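
A quick sketch of what the comma does, with illustrative names:

```python
not_a_tuple = (1)        # parentheses alone only group: this is the int 1
singleton = 1,           # the trailing comma creates a one-item tuple
also_singleton = (1,)
empty = ()               # the empty tuple is the one case that needs parentheses

pair = 1, 2              # packing
x, y = pair              # unpacking
```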

If iterable is already a tuple, it is returned unchanged.

>>> a
(1, 2, 3)
>>> b = tuple(a)
>>> b
(1, 2, 3)
>>> id(b) == id(a)
True

The constructor builds a tuple whose items are the same and in the same order as iterable’s items.

>>> tuple('abc')
('a', 'b', 'c')

range

The range type represents an immutable sequence of numbers.

class range(stop)
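class range(start, stop[, step])

The arguments to the range constructor must be integers. If start is omitted, it defaults to 0; if step is omitted, it defaults to 1. A step of zero raises ValueError.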

For a positive step, the contents of a range r are determined by the formula r[i] = start + step*i where i >= 0 and r[i] < stop.

>>> list(range(1,10,2))
[1, 3, 5, 7, 9]

For a negative step, the contents of the range are still determined by the formula r[i] = start + step*i, but the constraints are i >= 0 and r[i] > stop.

>>> list(range(10,1,-3))
[10, 7, 4]

A range object will be empty if r[0] does not meet the value constraint.

>>> list(range(0))
[]
>>> list(range(1, 0))
[]
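
The two formulas can be restated as a short pure-Python function. The name range_items is hypothetical; this is only a sketch of the rule, not how range is implemented.

```python
def range_items(start, stop, step=1):
    """Hypothetical restatement of r[i] = start + step*i with its bound check."""
    if step == 0:
        raise ValueError("step must not be zero")
    items = []
    i = 0
    while True:
        value = start + step * i
        # Positive step: keep values below stop. Negative step: above stop.
        in_bounds = value < stop if step > 0 else value > stop
        if not in_bounds:
            break
        items.append(value)
        i += 1
    return items


odd = range_items(1, 10, 2)
down = range_items(10, 1, -3)
nothing = range_items(1, 0)    # r[0] == 1 already fails 1 < 0
```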

The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).

>>> r = range(0, 10)
>>> r
range(0, 10)
>>> 3 in r
True
>>> r[3]
3
>>> r[4:]
range(4, 10)
>>> r[-1]
9
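
The constant memory footprint can be observed with sys.getsizeof. The sketch below assumes CPython, where the reported size covers the range object itself; other implementations may report different numbers.

```python
import sys

small = range(10)
huge = range(10**12)

# Assumption: CPython, where both objects report the same size.
same_size = sys.getsizeof(small) == sys.getsizeof(huge)

last = huge[-1]              # computed from start, stop and step
member = 10**11 in huge      # answered without iterating over the range
tail = huge[10**12 - 2:]     # slicing yields another range object
```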

Testing range objects for equality with == and != compares them as sequences. That is, two range objects are considered equal if they represent the same sequence of values.

>>> range(0) == range(1, 0)
True
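
Because the comparison is by sequence of values, two ranges with different stop values can still be equal. A short sketch:

```python
same_values = range(0, 3, 2) == range(0, 4, 2)   # both represent 0, 2
both_empty = range(0) == range(2, 1, 3)          # all empty ranges are equal
differs = range(3) != range(4)

# A range never equals a list, even with the same items.
list_vs_range = [0, 1, 2] == range(3)
```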

Sequence Types — list, tuple, range
Sorting HOW TO