4.8. Binary Sequence Types — bytes, bytearray, memoryview
The core built-in types for manipulating binary data are bytes and bytearray. They are supported by memoryview, which uses the buffer protocol to access the memory of other binary objects without needing to make a copy.
The array module supports efficient storage of basic data types like 32-bit integers and IEEE754 double-precision floating values.
4.8.1. Bytes Objects
Bytes objects are immutable sequences of single bytes. Since many major binary protocols are based on the ASCII text encoding, bytes objects offer several methods that are only valid when working with ASCII compatible data and are closely related to string objects in a variety of other ways.
class bytes([source[, encoding[, errors]]])
Firstly, the syntax for bytes literals is largely the same as that for string literals, except that a b prefix is added:
- Single quotes: b'still allows embedded "double" quotes'
- Double quotes: b"still allows embedded 'single' quotes"
- Triple quoted: b'''3 single quotes''', b"""3 double quotes"""
Only ASCII characters are permitted in bytes literals (regardless of the declared source code encoding). Any binary values over 127 must be entered into bytes literals using the appropriate escape sequence.
As with string literals, bytes literals may also use an r prefix to disable processing of escape sequences. See String and Bytes literals for more about the various forms of bytes literal, including supported escape sequences.
While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256 (attempts to violate this restriction will trigger ValueError). This is done deliberately to emphasise that while many binary formats include ASCII based elements and can be usefully manipulated with some text-oriented algorithms, this is not generally the case for arbitrary binary data (blindly applying text processing algorithms to binary data formats that are not ASCII compatible will usually lead to data corruption).
In addition to the literal forms, bytes objects can be created in a number of other ways:
- A zero-filled bytes object of a specified length: bytes(10)
- From an iterable of integers: bytes(range(20))
- Copying existing binary data via the buffer protocol: bytes(obj)
Also see the bytes built-in.
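As a minimal sketch of the constructor forms listed above (the variable names are illustrative and not part of the reference text), the following creates bytes objects in each way and checks that an out-of-range value is rejected:

```python
# Illustrative names; each constructor form mirrors one bullet above.
zeros = bytes(10)                      # zero-filled, length 10
counted = bytes(range(20))             # from an iterable of integers
copied = bytes(bytearray(b'Hi!'))      # copy of another buffer-protocol object

# Values outside 0 <= x < 256 are rejected with ValueError.
try:
    bytes([256])
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

print(zeros)
print(list(counted[:5]))
print(copied, out_of_range_rejected)
```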
Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal numbers are a commonly used format for describing binary data. Accordingly, the bytes type has an additional class method to read data in that format:
classmethod fromhex(string)
This bytes class method returns a bytes object, decoding the given string object. The string must contain two hexadecimal digits per byte, with ASCII whitespace being ignored.
A reverse conversion function exists to transform a bytes object into its hexadecimal representation.
hex()
Return a string object containing two hexadecimal digits for each byte in the instance.
New in version 3.5.
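A short round trip through both conversions may help; the input string here is only an example value:

```python
# Decode a hexadecimal string (whitespace is ignored), then encode it back.
data = bytes.fromhex('2Ef0 F1f2  ')
hex_text = data.hex()                   # lowercase, two digits per byte
round_trip = bytes.fromhex(hex_text)

print(data)        # b'.\xf0\xf1\xf2'
print(hex_text)    # 2ef0f1f2
```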
Since bytes objects are sequences of integers (akin to a tuple), for a bytes object b, b[0] will be an integer, while b[0:1] will be a bytes object of length 1. (This contrasts with text strings, where both indexing and slicing will produce a string of length 1.)
The representation of bytes objects uses the literal format (b'...') since it is often more useful than e.g. bytes([46, 46, 46]). You can always convert a bytes object into a list of integers using list(b).
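The difference between indexing and slicing can be seen in a small sketch (names chosen for illustration):

```python
b = b'abc'
first = b[0]        # indexing yields an int
head = b[0:1]       # slicing yields a bytes object of length 1
values = list(b)    # list of the integer byte values

print(first, head, values)   # 97 b'a' [97, 98, 99]
print(repr(bytes([46, 46, 46])))   # b'...'
```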
Note
For Python 2.x users: In the Python 2.x series, a variety of implicit conversions between 8-bit strings (the closest thing 2.x offers to a built-in binary data type) and Unicode strings were permitted. This was a backwards compatibility workaround to account for the fact that Python originally only supported 8-bit text, and Unicode text was a later addition. In Python 3.x, those implicit conversions are gone - conversions between 8-bit binary data and Unicode text must be explicit, and bytes and string objects will always compare unequal.
4.8.2. Bytearray Objects
bytearray objects are a mutable counterpart to bytes objects.
class bytearray([source[, encoding[, errors]]])
There is no dedicated literal syntax for bytearray objects; instead, they are always created by calling the constructor:
- Creating an empty instance: bytearray()
- Creating a zero-filled instance with a given length: bytearray(10)
- From an iterable of integers: bytearray(range(20))
- Copying existing binary data via the buffer protocol: bytearray(b'Hi!')
As bytearray objects are mutable, they support the mutable sequence operations in addition to the common bytes and bytearray operations described in Bytes and Bytearray Operations.
Also see the bytearray built-in.
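To illustrate the mutability described above, here is a brief sketch (the buffer contents are arbitrary) contrasting in-place changes to a bytearray with the immutability of bytes:

```python
buf = bytearray(b'Hi!')
buf[0] = ord('h')       # item assignment takes an integer
buf.append(0x21)        # append a single byte value ('!')
buf.extend(b' ok')      # extend from any bytes-like object

frozen = b'Hi!'
try:
    frozen[0] = ord('h')            # bytes objects do not support assignment
    bytes_is_mutable = True
except TypeError:
    bytes_is_mutable = False

print(buf)                # bytearray(b'hi!! ok')
print(bytes_is_mutable)   # False
```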
Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal numbers are a commonly used format for describing binary data. Accordingly, the bytearray type has an additional class method to read data in that format:
classmethod fromhex(string)
This bytearray class method returns a bytearray object, decoding the given string object. The string must contain two hexadecimal digits per byte, with ASCII whitespace being ignored.
A reverse conversion function exists to transform a bytearray object into its hexadecimal representation.
hex()
Return a string object containing two hexadecimal digits for each byte in the instance.
New in version 3.5.
Since bytearray objects are sequences of integers (akin to a list), for a bytearray object b, b[0] will be an integer, while b[0:1] will be a bytearray object of length 1. (This contrasts with text strings, where both indexing and slicing will produce a string of length 1.)
The representation of bytearray objects uses the bytes literal format (bytearray(b'...')) since it is often more useful than e.g. bytearray([46, 46, 46]). You can always convert a bytearray object into a list of integers using list(b).
4.8.3. Bytes and Bytearray Operations
Both bytes and bytearray objects support the common sequence operations. They interoperate not just with operands of the same type, but with any bytes-like object. Due to this flexibility, they can be freely mixed in operations without causing errors. However, the return type of the result may depend on the order of operands.
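For instance, concatenation accepts mixed operands, and in this sketch the result takes the type of the left operand (variable names are illustrative):

```python
left_is_bytes = b'ab' + bytearray(b'cd')       # bytes on the left
left_is_array = bytearray(b'ab') + b'cd'       # bytearray on the left

print(type(left_is_bytes).__name__)   # bytes
print(type(left_is_array).__name__)   # bytearray
print(left_is_bytes == left_is_array) # True: same contents compare equal
```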
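Note
The methods on bytes and bytearray objects don’t accept strings as their arguments, just as the methods on strings don’t accept bytes as their arguments. For example, for a str object a you have to write a.replace("a", "f"), and for a bytes object a you have to write a.replace(b"a", b"f").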
Some bytes and bytearray operations assume the use of ASCII compatible binary formats, and hence should be avoided when working with arbitrary binary data. These restrictions are covered below.
Note
Using these ASCII based operations to manipulate binary data that is not stored in an ASCII based format may lead to data corruption.
The following methods on bytes and bytearray objects can be used with arbitrary binary data.
bytes.count(sub[, start[, end]])
bytearray.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of subsequence sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.
Changed in version 3.3: Also accept an integer in the range 0 to 255 as the subsequence.
bytes.decode(encoding="utf-8", errors="strict")
bytearray.decode(encoding="utf-8", errors="strict")
Return a string decoded from the given bytes. Default encoding is 'utf-8'. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace' and any other name registered via codecs.register_error(), see section Error Handlers. For a list of possible encodings, see section Standard Encodings.
Note
Passing the encoding argument to str allows decoding any bytes-like object directly, without needing to make a temporary bytes or bytearray object.
Changed in version 3.1: Added support for keyword arguments.
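The following sketch shows the default decoding, the effect of the errors argument, and decoding a memoryview through str as mentioned in the note. The sample bytes are an arbitrary UTF-8 encoding of a word containing one non-ASCII character:

```python
raw = b'caf\xc3\xa9'                       # UTF-8 bytes for 'caf' + U+00E9

text = raw.decode()                        # default encoding is 'utf-8'

try:
    raw.decode('ascii')                    # 'strict' is the default handler
    strict_failed = False
except UnicodeDecodeError:                 # a subclass of UnicodeError
    strict_failed = True

replaced = raw.decode('ascii', errors='replace')   # U+FFFD per bad byte
ignored = raw.decode('ascii', errors='ignore')     # bad bytes dropped

# str() with an encoding decodes any bytes-like object directly.
from_view = str(memoryview(raw), 'utf-8')

print(text, strict_failed, ignored)
```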
bytes.endswith(suffix[, start[, end]])
bytearray.endswith(suffix[, start[, end]])
Return True if the binary data ends with the specified suffix, otherwise return False. suffix can also be a tuple of suffixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.
The suffix(es) to search for may be any bytes-like object.
bytes.find(sub[, start[, end]])
bytearray.find(sub[, start[, end]])
Return the lowest index in the data where the subsequence sub is found, such that sub is contained in the slice s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 if sub is not found.
The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.
Note
The find() method should be used only if you need to know the position of sub. To check if sub is a substring or not, use the in operator (for example, b'Py' in b'Python' evaluates to True).
Changed in version 3.3: Also accept an integer in the range 0 to 255 as the subsequence.
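A brief sketch of find() alongside the in operator, using an arbitrary sample value:

```python
data = b'Python'

pos_sub = data.find(b'th')       # subsequence given as bytes
pos_int = data.find(ord('o'))    # subsequence given as an integer byte value
pos_missing = data.find(b'xy')   # not found
pos_limited = data.find(b'P', 1) # search restricted to data[1:]
contains = b'Py' in data         # membership test, no position needed

print(pos_sub, pos_int, pos_missing, pos_limited, contains)
# 2 4 -1 -1 True
```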
bytes.index(sub[, start[, end]])
bytearray.index(sub[, start[, end]])
Like find(), but raise ValueError when the subsequence is not found.
The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.
Changed in version 3.3: Also accept an integer in the range 0 to 255 as the subsequence.
bytes.join(iterable)
bytearray.join(iterable)
Return a bytes or bytearray object which is the concatenation of the binary data sequences in iterable. A TypeError will be raised if there are any values in iterable that are not bytes-like objects, including str objects. The separator between elements is the contents of the bytes or bytearray object providing this method.
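As an illustration (the request-line pieces are made-up sample data), join() accepts a mix of bytes-like objects, returns the type of the separator object, and rejects str items:

```python
parts = [b'GET', bytearray(b'/index.html'), memoryview(b'HTTP/1.1')]

line = b' '.join(parts)                         # separator is bytes
array_line = bytearray(b', ').join([b'a', b'b'])  # separator is bytearray

try:
    b' '.join([b'a', 'b'])                      # str is not bytes-like
    str_rejected = False
except TypeError:
    str_rejected = True

print(line)          # b'GET /index.html HTTP/1.1'
print(array_line)    # bytearray(b'a, b')
```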
static bytes.maketrans(from, to)
static bytearray.maketrans(from, to)
This static method returns a translation table usable for bytes.translate() that will map each character in from into the character at the same position in to; from and to must both be bytes-like objects and have the same length.
New in version 3.1.
bytes.partition(sep)
bytearray.partition(sep)
Split the sequence at the first occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself or its bytearray copy, and the part after the separator. If the separator is not found, return a 3-tuple containing a copy of the original sequence, followed by two empty bytes or bytearray objects.
The separator to search for may be any bytes-like object.
bytes.replace(old, new[, count])
bytearray.replace(old, new[, count])
Return a copy of the sequence with all occurrences of subsequence old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
The subsequence to search for and its replacement may be any bytes-like object.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.rfind(sub[, start[, end]])
bytearray.rfind(sub[, start[, end]])
Return the highest index in the sequence where the subsequence sub is found, such that sub is contained within s[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.
The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.
Changed in version 3.3: Also accept an integer in the range 0 to 255 as the subsequence.
bytes.rindex(sub[, start[, end]])
bytearray.rindex(sub[, start[, end]])
Like rfind() but raises ValueError when the subsequence sub is not found.
The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.
Changed in version 3.3: Also accept an integer in the range 0 to 255 as the subsequence.
bytes.rpartition(sep)
bytearray.rpartition(sep)
Split the sequence at the last occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself or its bytearray copy, and the part after the separator. If the separator is not found, return a 3-tuple containing two empty bytes or bytearray objects, followed by a copy of the original sequence.
The separator to search for may be any bytes-like object.
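The sketch below compares partition() and rpartition() on an illustrative header-like value, including where the original sequence ends up when the separator is missing:

```python
header = b'Content-Type: text/html: extra'

first_split = header.partition(b': ')    # split at the first occurrence
last_split = header.rpartition(b': ')    # split at the last occurrence

missing_first = header.partition(b'=')   # original comes first
missing_last = header.rpartition(b'=')   # original comes last

array_split = bytearray(b'k=v').partition(b'=')   # all parts are bytearray

print(first_split)
print(last_split)
```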
bytes.startswith(prefix[, start[, end]])
bytearray.startswith(prefix[, start[, end]])
Return True if the binary data starts with the specified prefix, otherwise return False. prefix can also be a tuple of prefixes to look for. With optional start, test beginning at that position. With optional end, stop comparing at that position.
The prefix(es) to search for may be any bytes-like object.
bytes.translate(table, delete=b'')
bytearray.translate(table, delete=b'')
Return a copy of the bytes or bytearray object where all bytes occurring in the optional argument delete are removed, and the remaining bytes have been mapped through the given translation table, which must be a bytes object of length 256.
You can use the bytes.maketrans() method to create a translation table.
Set the table argument to None for translations that only delete characters.
Changed in version 3.6: delete is now supported as a keyword argument.
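A small sketch of building a table with bytes.maketrans() and of deletion-only translation; the mapping and sample data are arbitrary, and delete is passed positionally so the code does not depend on the 3.6 keyword support:

```python
table = bytes.maketrans(b'abc', b'xyz')   # 256-byte translation table

mapped = b'aabbcc'.translate(table)
# Bytes listed in delete are removed first, the rest are mapped.
mapped_and_deleted = b'aabbcc'.translate(table, b'b')
# table=None performs deletion only.
vowels_removed = b'read this short text'.translate(None, b'aeiou')

print(len(table), mapped, mapped_and_deleted)
print(vowels_removed)    # b'rd ths shrt txt'
```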
The following methods on bytes and bytearray objects have default behaviours that assume the use of ASCII compatible binary formats, but can still be used with arbitrary binary data by passing appropriate arguments. Note that all of the bytearray methods in this section do not operate in place, and instead produce new objects.
bytes.center(width[, fillbyte])
bytearray.center(width[, fillbyte])
Return a copy of the object centered in a sequence of length width. Padding is done using the specified fillbyte (default is an ASCII space). For bytes objects, the original sequence is returned if width is less than or equal to len(s).
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.ljust(width[, fillbyte])
bytearray.ljust(width[, fillbyte])
Return a copy of the object left justified in a sequence of length width. Padding is done using the specified fillbyte (default is an ASCII space). For bytes objects, the original sequence is returned if width is less than or equal to len(s).
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.lstrip([chars])
bytearray.lstrip([chars])
Return a copy of the sequence with specified leading bytes removed. The chars argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or None, the chars argument defaults to removing ASCII whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped.
The binary sequence of byte values to remove may be any bytes-like object.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.rjust(width[, fillbyte])
bytearray.rjust(width[, fillbyte])
Return a copy of the object right justified in a sequence of length width. Padding is done using the specified fillbyte (default is an ASCII space). For bytes objects, the original sequence is returned if width is less than or equal to len(s).
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.rsplit(sep=None, maxsplit=-1)
bytearray.rsplit(sep=None, maxsplit=-1)
Split the binary sequence into subsequences of the same type, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done, the rightmost ones. If sep is not specified or None, any subsequence consisting solely of ASCII whitespace is a separator. Except for splitting from the right, rsplit() behaves like split() which is described in detail below.
bytes.rstrip([chars])
bytearray.rstrip([chars])
Return a copy of the sequence with specified trailing bytes removed. The chars argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or None, the chars argument defaults to removing ASCII whitespace. The chars argument is not a suffix; rather, all combinations of its values are stripped.
The binary sequence of byte values to remove may be any bytes-like object.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.split(sep=None, maxsplit=-1)
bytearray.split(sep=None, maxsplit=-1)
Split the binary sequence into subsequences of the same type, using sep as the delimiter string. If maxsplit is given and non-negative, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or is -1, then there is no limit on the number of splits (all possible splits are made).
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty subsequences (for example, b'1,,2'.split(b',') returns [b'1', b'', b'2']). The sep argument may consist of a multibyte sequence (for example, b'1<>2<>3'.split(b'<>') returns [b'1', b'2', b'3']). Splitting an empty sequence with a specified separator returns [b''] or [bytearray(b'')] depending on the type of object being split. The sep argument may be any bytes-like object.
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive ASCII whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the sequence has leading or trailing whitespace. Consequently, splitting an empty sequence or a sequence consisting solely of ASCII whitespace without a specified separator returns [].
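The two splitting algorithms described above can be compared in a short sketch (sample inputs are arbitrary):

```python
# Explicit separator: empty subsequences are preserved.
with_sep = b'1,2,,3,'.split(b',')
limited = b'1,2,3'.split(b',', maxsplit=1)
empty_with_sep = b''.split(b',')

# No separator: runs of ASCII whitespace collapse, no empty results.
whitespace = b'   1   2   3   '.split()
whitespace_limited = b'1 2 3'.split(maxsplit=1)
only_spaces = b'   '.split()

# rsplit() applies maxsplit from the right.
from_right = b'a b c'.rsplit(maxsplit=1)

print(with_sep)      # [b'1', b'2', b'', b'3', b'']
print(whitespace)    # [b'1', b'2', b'3']
print(from_right)    # [b'a b', b'c']
```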
bytes.strip([chars])
bytearray.strip([chars])
Return a copy of the sequence with specified leading and trailing bytes removed. The chars argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or None, the chars argument defaults to removing ASCII whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped.
The binary sequence of byte values to remove may be any bytes-like object.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
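The point that chars is a set of byte values rather than a prefix or suffix is easiest to see in an example. This sketch uses arbitrary sample data and also checks that the bytearray variant returns a new object:

```python
padded = b'   spacious   '
host = b'www.example.com'

left_default = padded.lstrip()            # ASCII whitespace by default
right_default = padded.rstrip()
left_set = host.lstrip(b'cmowz.')         # any leading byte from the set
both_set = host.strip(b'cmowz.')          # leading and trailing
right_set = b'mississippi'.rstrip(b'ipz')

original = bytearray(b' x ')
stripped = original.strip()               # new object; original is unchanged

print(left_set, both_set, right_set)
# b'example.com' b'example' b'mississ'
```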
The following methods on bytes and bytearray objects assume the use of ASCII compatible binary formats and should not be applied to arbitrary binary data. Note that all of the bytearray methods in this section do not operate in place, and instead produce new objects.
bytes.capitalize()
bytearray.capitalize()
Return a copy of the sequence with each byte interpreted as an ASCII character, and the first byte capitalized and the rest lowercased. Non-ASCII byte values are passed through unchanged.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.expandtabs(tabsize=8)
bytearray.expandtabs(tabsize=8)
Return a copy of the sequence where all ASCII tab characters are replaced by one or more ASCII spaces, depending on the current column and the given tab size. Tab positions occur every tabsize bytes (default is 8, giving tab positions at columns 0, 8, 16 and so on). To expand the sequence, the current column is set to zero and the sequence is examined byte by byte. If the byte is an ASCII tab character (b'\t'), one or more space characters are inserted in the result until the current column is equal to the next tab position. (The tab character itself is not copied.) If the current byte is an ASCII newline (b'\n') or carriage return (b'\r'), it is copied and the current column is reset to zero. Any other byte value is copied unchanged and the current column is incremented by one regardless of how the byte value is represented when printed.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
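A quick way to see the column arithmetic is to expand the same line with two tab sizes. In this sketch (sample line chosen for illustration) the gaps are measured rather than typed out, since the number of spaces depends on the column at which each tab occurs:

```python
line = b'01\t012\t0123\t01234'

default_tabs = line.expandtabs()     # tab stops at columns 0, 8, 16, 24
narrow_tabs = line.expandtabs(4)     # tab stops at columns 0, 4, 8, 12, 16

# Each field starts at a tab stop, so its offset is a multiple of tabsize.
default_offsets = [default_tabs.find(f) for f in (b'012', b'0123 ', b'01234')]

print(default_tabs)
print(narrow_tabs)
print(default_offsets)   # [8, 16, 24]
```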
bytes.isalnum()
bytearray.isalnum()
Return true if all bytes in the sequence are alphabetical ASCII characters or ASCII decimal digits and the sequence is not empty, false otherwise. Alphabetic ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'. ASCII decimal digits are those byte values in the sequence b'0123456789'.
bytes.isalpha()
bytearray.isalpha()
Return true if all bytes in the sequence are alphabetic ASCII characters and the sequence is not empty, false otherwise. Alphabetic ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'.
bytes.isdigit()
bytearray.isdigit()
Return true if all bytes in the sequence are ASCII decimal digits and the sequence is not empty, false otherwise. ASCII decimal digits are those byte values in the sequence b'0123456789'.
bytes.islower()
bytearray.islower()
Return true if there is at least one lowercase ASCII character in the sequence and no uppercase ASCII characters, false otherwise.
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
bytes.isspace()
bytearray.isspace()
Return true if all bytes in the sequence are ASCII whitespace and the sequence is not empty, false otherwise. ASCII whitespace characters are those byte values in the sequence b' \t\n\r\x0b\f' (space, tab, newline, carriage return, vertical tab, form feed).
bytes.istitle()
bytearray.istitle()
Return true if the sequence is ASCII titlecase and the sequence is not empty, false otherwise. See bytes.title() for more details on the definition of “titlecase”.
bytes.isupper()
bytearray.isupper()
Return true if there is at least one uppercase alphabetic ASCII character in the sequence and no lowercase ASCII characters, false otherwise.
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
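The predicate methods described above can be exercised together. In this sketch each entry pairs an input for which the predicate holds with one for which it does not (the sample values are arbitrary):

```python
# Each value is (result for a matching input, result for a non-matching input).
checks = {
    'isalnum': (b'ABCabc1'.isalnum(), b'ABC abc1'.isalnum()),
    'isalpha': (b'ABCabc'.isalpha(), b'ABCabc1'.isalpha()),
    'isdigit': (b'1234'.isdigit(), b'1.23'.isdigit()),
    'islower': (b'hello world'.islower(), b'Hello world'.islower()),
    'isspace': (b' \t\n'.isspace(), b' a '.isspace()),
    'istitle': (b'Hello World'.istitle(), b'Hello world'.istitle()),
    'isupper': (b'HELLO WORLD'.isupper(), b'Hello world'.isupper()),
}
empty_is_alnum = b''.isalnum()    # empty sequences never satisfy these tests

for name, result in checks.items():
    print(name, result)
```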
bytes.lower()
bytearray.lower()
Return a copy of the sequence with all the uppercase ASCII characters converted to their corresponding lowercase counterpart.
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.splitlines(keepends=False)
bytearray.splitlines(keepends=False)
Return a list of the lines in the binary sequence, breaking at ASCII line boundaries. This method uses the universal newlines approach to splitting lines. Line breaks are not included in the resulting list unless keepends is given and true.
Unlike split() when a delimiter string sep is given, this method returns an empty list for the empty string, and a terminal line break does not result in an extra line.
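The sketch below shows the universal newlines behaviour, the effect of keepends, and the contrast with split(); the sample text is arbitrary:

```python
text = b'ab c\n\nde fg\rkl\r\n'

lines = text.splitlines()                     # \n, \r and \r\n all end a line
lines_with_ends = text.splitlines(keepends=True)

# Contrast with split() on an explicit separator.
split_empty = b''.split(b'\n')
split_terminal = b'Two lines\n'.split(b'\n')
splitlines_empty = b''.splitlines()
splitlines_terminal = b'One line\n'.splitlines()

print(lines)             # [b'ab c', b'', b'de fg', b'kl']
print(lines_with_ends)   # [b'ab c\n', b'\n', b'de fg\r', b'kl\r\n']
```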
bytes.swapcase()
bytearray.swapcase()
Return a copy of the sequence with all the lowercase ASCII characters converted to their corresponding uppercase counterpart and vice-versa.
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
Unlike str.swapcase(), it is always the case that bin.swapcase().swapcase() == bin for the binary versions. Case conversions are symmetrical in ASCII, even though that is not generally true for arbitrary Unicode code points.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.title()
bytearray.title()
Return a titlecased version of the binary sequence where words start with an uppercase ASCII character and the remaining characters are lowercase. Uncased byte values are left unmodified.
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. All other byte values are uncased.
The algorithm uses a simple language-independent definition of a word as groups of consecutive letters. The definition works in many contexts but it means that apostrophes in contractions and possessives form word boundaries, which may not be the desired result.
A workaround for apostrophes can be constructed using regular expressions.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
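The apostrophe behaviour, and one possible regular-expression workaround, might look like the following sketch. The helper name titlecase and its pattern are illustrative assumptions, not a method provided by bytes:

```python
import re

plain = b'Hello world'.title()
with_apostrophes = b"they're bill's friends from the UK".title()

def titlecase(s):
    """Hypothetical helper: treat an apostrophe plus letters as part of a word."""
    return re.sub(
        rb"[A-Za-z]+('[A-Za-z]+)?",
        # Uppercase the first byte of each match, lowercase the remainder.
        lambda mo: mo.group(0)[0:1].upper() + mo.group(0)[1:].lower(),
        s,
    )

fixed = titlecase(b"they're bill's friends.")

print(plain)              # b'Hello World'
print(with_apostrophes)   # b"They'Re Bill'S Friends From The Uk"
print(fixed)              # b"They're Bill's Friends."
```

Note that slicing with [0:1] rather than indexing with [0] keeps the first byte as a bytes object, so that upper() can be called on it.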
bytes.upper()
bytearray.upper()
Return a copy of the sequence with all the lowercase ASCII characters converted to their corresponding uppercase counterpart.
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
bytes.zfill(width)
bytearray.zfill(width)
Return a copy of the sequence left filled with ASCII b'0' digits to make a sequence of length width. A leading sign prefix (b'+'/b'-') is handled by inserting the padding after the sign character rather than before. For bytes objects, the original sequence is returned if width is less than or equal to len(seq).
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
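A minimal sketch of zfill(), including the sign handling described above (the values are arbitrary):

```python
unsigned = b'42'.zfill(5)
negative = b'-42'.zfill(5)       # padding goes after the sign
already_wide = b'12345'.zfill(3) # width <= len(seq): contents unchanged

print(unsigned, negative, already_wide)
# b'00042' b'-0042' b'12345'
```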
4.8.4. printf-style Bytes Formatting
Note
The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). If the value being printed may be a tuple or dictionary, wrap it in a tuple.
Bytes objects (bytes/bytearray) have one unique built-in operation: the % operator (modulo). This is also known as the bytes formatting or interpolation operator. Given format % values (where format is a bytes object), % conversion specifications in format are replaced with zero or more elements of values. The effect is similar to using the sprintf() in the C language.
If format requires a single argument, values may be a single non-tuple object. [5] Otherwise, values must be a tuple with exactly the number of items specified by the format bytes object, or a single mapping object (for example, a dictionary).
A conversion specifier contains two or more characters and has the following components, which must occur in this order:
- The '%' character, which marks the start of the specifier.
- Mapping key (optional), consisting of a parenthesised sequence of characters (for example, (somename)).
- Conversion flags (optional), which affect the result of some conversion types.
- Minimum field width (optional). If specified as an '*' (asterisk), the actual width is read from the next element of the tuple in values, and the object to convert comes after the minimum field width and optional precision.
- Precision (optional), given as a '.' (dot) followed by the precision. If specified as '*' (an asterisk), the actual precision is read from the next element of the tuple in values, and the value to convert comes after the precision.
- Length modifier (optional).
- Conversion type.
When the right argument is a dictionary (or other mapping type), then the formats in the bytes object must include a parenthesised mapping key into that dictionary inserted immediately after the '%' character. The mapping key selects the value to be formatted from the mapping.
In this case no * specifiers may occur in a format (since they require a sequential parameter list).
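As an illustration of mapping keys, the sketch below formats values taken from a dictionary. Note that the keys are bytes objects, matching the bytes format; the key names and values are arbitrary:

```python
values = {b'language': b'Python', b'number': 2}

# %(language)s selects b'Python'; %(number)03d zero-pads 2 to width 3.
message = b'%(language)s has %(number)03d quote types.' % values

print(message)   # b'Python has 002 quote types.'
```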
The conversion flag characters are:
Flag | Meaning |
---|---|
'#' | The value conversion will use the “alternate form” (where defined below). |
'0' | The conversion will be zero padded for numeric values. |
'-' | The converted value is left adjusted (overrides the '0' conversion if both are given). |
' ' | (a space) A blank should be left before a positive number (or empty string) produced by a signed conversion. |
'+' | A sign character ('+' or '-') will precede the conversion (overrides a “space” flag). |
A length modifier (h, l, or L) may be present, but is ignored as it is not necessary for Python – so e.g. %ld is identical to %d.
The conversion types are:
Conversion | Meaning | Notes |
---|---|---|
'd' | Signed integer decimal. | |
'i' | Signed integer decimal. | |
'o' | Signed octal value. | (1) |
'u' | Obsolete type – it is identical to 'd'. | (8) |
'x' | Signed hexadecimal (lowercase). | (2) |
'X' | Signed hexadecimal (uppercase). | (2) |
'e' | Floating point exponential format (lowercase). | (3) |
'E' | Floating point exponential format (uppercase). | (3) |
'f' | Floating point decimal format. | (3) |
'F' | Floating point decimal format. | (3) |
'g' | Floating point format. Uses lowercase exponential format if exponent is less than -4 or not less than precision, decimal format otherwise. | (4) |
'G' | Floating point format. Uses uppercase exponential format if exponent is less than -4 or not less than precision, decimal format otherwise. | (4) |
'c' | Single byte (accepts integer or single byte objects). | |
'b' | Bytes (any object that follows the buffer protocol or has __bytes__()). | (5) |
's' | 's' is an alias for 'b' and should only be used for Python2/3 code bases. | (6) |
'a' | Bytes (converts any Python object using repr(obj).encode('ascii', 'backslashreplace')). | (5) |
'r' | 'r' is an alias for 'a' and should only be used for Python2/3 code bases. | (7) |
'%' | No argument is converted, results in a '%' character in the result. | |
Notes:
(1) The alternate form causes a leading octal specifier ('0o') to be inserted before the first digit.
(2) The alternate form causes a leading '0x' or '0X' (depending on whether the 'x' or 'X' format was used) to be inserted before the first digit.
(3) The alternate form causes the result to always contain a decimal point, even if no digits follow it. The precision determines the number of digits after the decimal point and defaults to 6.
(4) The alternate form causes the result to always contain a decimal point, and trailing zeroes are not removed as they would otherwise be. The precision determines the number of significant digits before and after the decimal point and defaults to 6.
(5) If precision is N, the output is truncated to N characters.
(6) b'%s' is deprecated, but will not be removed during the 3.x series.
(7) b'%r' is deprecated, but will not be removed during the 3.x series.
(8) See PEP 237.
Note
The bytearray version of this method does not operate in place - it always produces a new object, even if no changes were made.
See also
PEP 461.
New in version 3.5.
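The conversion table and notes above can be illustrated with a short sketch. The values and variable names are arbitrary choices made for this illustration, and the code assumes Python 3.5 or later, where the % operator is available on bytes and bytearray.

```python
# Integer conversions; '#' selects the alternate form described in notes (1) and (2).
count = b'%d items' % 42
prefixed = b'%#o %#x %#X' % (8, 255, 255)
print(count)       # b'42 items'
print(prefixed)    # b'0o10 0xff 0XFF'

# Floating point conversions; the precision defaults to 6.
fixed = b'%f %.2f' % (3.14159, 3.14159)
general = b'%g %g' % (0.0001, 0.00001)
print(fixed)       # b'3.141590 3.14'
print(general)     # b'0.0001 1e-05'

# Bytes-specific conversions.
single = b'%c%c' % (65, b'B')        # an integer or a length-1 bytes object
truncated = b'%.3b' % b'abcdef'      # precision truncates, as in note (5)
ascii_repr = b'%a' % 'caf\xe9'       # repr() encoded with backslashreplace
print(single)      # b'AB'
print(truncated)   # b'abc'
print(ascii_repr)  # b"'caf\\xe9'"

# The bytearray version returns a new object rather than working in place.
template = bytearray(b'%d%%')
filled = template % 50
print(filled)               # bytearray(b'50%')
print(filled is template)   # False
```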
4.8.5. Memory Views
memoryview objects allow Python code to access the internal data of an object that supports the buffer protocol without copying.
- class
memoryview
(obj) -
Create a memoryview that references obj. obj must support the buffer protocol. Built-in objects that support the buffer protocol include bytes and bytearray.
A memoryview has the notion of an element, which is the atomic memory unit handled by the originating object obj. For many simple types such as bytes and bytearray, an element is a single byte, but other types such as array.array may have bigger elements.
len(view) is equal to the length of tolist(). If view.ndim = 0, the length is 1. If view.ndim = 1, the length is equal to the number of elements in the view. For higher dimensions, the length is equal to the length of the nested list representation of the view. The itemsize attribute will give you the number of bytes in a single element.
A memoryview supports slicing and indexing to expose its data. One-dimensional slicing will result in a subview.
If format is one of the native format specifiers from the struct module, indexing with an integer or a tuple of integers is also supported and returns a single element with the correct type. One-dimensional memoryviews can be indexed with an integer or a one-integer tuple. Multi-dimensional memoryviews can be indexed with tuples of exactly ndim integers where ndim is the number of dimensions. Zero-dimensional memoryviews can be indexed with the empty tuple.
Here is an example with a non-byte format:
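The following is a minimal sketch of the slicing and indexing rules described above, first on a bytes object and then on an array with the non-byte format 'l'. The sample values are arbitrary, and the tuple index assumes Python 3.5 or later.

```python
import array

# Slicing a one-dimensional view gives another memoryview (a subview), not a copy.
v = memoryview(b'abcefg')
sub = v[1:4]
print(type(sub).__name__)   # memoryview
print(bytes(sub))           # b'bce'
print(v[1], v[-1])          # 98 103

# With a non-byte native format, indexing returns an element of the matching type.
a = array.array('l', [-11111111, 22222222, -33333333, 44444444])
m = memoryview(a)
print(m[0])                 # -11111111
print(m[-1])                # 44444444
print(m[::2].tolist())      # [-11111111, -33333333]
print(m[(1,)])              # 22222222 (one-integer tuple index)
```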
If the underlying object is writable, the memoryview supports one-dimensional slice assignment. Resizing is not allowed:
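As a sketch of this rule, the snippet below writes through a view of a bytearray and shows that an assignment which would change the length is refused. The names data, v and resize_refused are chosen for this illustration only.

```python
data = bytearray(b'abcefg')
v = memoryview(data)
print(v.readonly)        # False

v[0] = ord(b'z')         # a single element of format 'B' is assigned as an int
v[1:4] = b'123'          # same-length slice assignment writes through to data
print(data)              # bytearray(b'z123fg')

resize_refused = False
try:
    v[2:3] = b'spam'     # four bytes into a one-byte slice would resize the buffer
except ValueError as exc:
    resize_refused = True
    print('refused:', exc)

v[2:6] = b'spam'         # four bytes into a four-byte slice is fine
print(data)              # bytearray(b'z1spam')
```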
One-dimensional memoryviews of hashable (read-only) types with formats ‘B’, ‘b’ or ‘c’ are also hashable. The hash is defined as hash(m) == hash(m.tobytes()).
Changed in version 3.3: One-dimensional memoryviews can now be sliced. One-dimensional memoryviews with formats ‘B’, ‘b’ or ‘c’ are now hashable.
Changed in version 3.4: memoryview is now registered automatically with collections.abc.Sequence.
Changed in version 3.5: memoryviews can now be indexed with tuples of integers.
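The hashing rule stated above can be checked with a small sketch. It compares hashes of read-only views with hashes of the equivalent bytes objects, then shows that a view of a writable object is rejected; the variable names are illustrative.

```python
v = memoryview(b'abcefg')
same_whole = hash(v) == hash(b'abcefg')
same_slice = hash(v[2:4]) == hash(b'ce')
same_strided = hash(v[::-2]) == hash(b'abcefg'[::-2])
print(same_whole, same_slice, same_strided)   # True True True

# A view of a writable object such as a bytearray cannot be hashed.
w = memoryview(bytearray(b'abc'))
unhashable = False
try:
    hash(w)
except ValueError as exc:
    unhashable = True
    print('not hashable:', exc)
```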
memoryview has several methods:
__eq__
(exporter)-
A memoryview and a PEP 3118 exporter are equal if their shapes are equivalent and if all corresponding values are equal when the operands’ respective format codes are interpreted using struct syntax.
For the subset of struct format strings currently supported by tolist(), v and w are equal if v.tolist() == w.tolist().
If either format string is not supported by the struct module, then the objects will always compare as unequal (even if the format strings and buffer contents are identical).
Note that, as with floating point numbers, v is w does not imply v == w for memoryview objects.
Changed in version 3.3: Previous versions compared the raw memory disregarding the item format and the logical array structure.
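A sketch of both comparison rules follows. The first part compares arrays whose formats differ but whose values match. The second part uses a ctypes structure, whose format string is assumed not to be understood by the struct module; BEPoint is a hypothetical type defined only for this illustration.

```python
import array
from ctypes import BigEndianStructure, c_long

# Values are compared after interpreting each operand with its own format code.
a = array.array('I', [1, 2, 3, 4, 5])
b = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.0])
c = array.array('b', [5, 3, 1])
x = memoryview(a)
y = memoryview(b)
print(x == a == y == b)            # True
z = y[::-2]
print(z == c)                      # True
print(z.tolist() == c.tolist())    # True

# Hypothetical structure whose format is outside struct module syntax.
class BEPoint(BigEndianStructure):
    _fields_ = [("x", c_long), ("y", c_long)]

point = BEPoint(100, 200)
p = memoryview(point)
q = memoryview(point)
print(p == point)                  # False
print(p == q)                      # False
```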
tobytes
()-
Return the data in the buffer as a bytestring. This is equivalent to calling the bytes constructor on the memoryview.
For non-contiguous arrays the result is equal to the flattened list representation with all elements converted to bytes. tobytes() supports all format strings, including those that are not in struct module syntax.
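As a brief sketch, tobytes() and the bytes constructor give the same result, and a strided (non-contiguous) view is flattened into a new contiguous bytes object. The names m and every_other are illustrative.

```python
m = memoryview(b"abcdef")
print(m.tobytes())             # b'abcdef'
print(bytes(m) == m.tobytes()) # True

# A strided subview is non-contiguous; tobytes() copies its elements in order.
every_other = m[::2]
print(every_other.contiguous)  # False
print(every_other.tobytes())   # b'ace'
```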
hex
()-
Return a string object containing two hexadecimal digits for each byte in the buffer.
New in version 3.5.
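A short sketch of hex(), assuming Python 3.5 or later. The round trip through bytes.fromhex() is included only to show that the string encodes every byte of the buffer.

```python
m = memoryview(b"abc")
digits = m.hex()
print(digits)                    # 616263
print(bytes.fromhex(digits))     # b'abc'
```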
tolist
()-
Return the data in the buffer as a list of elements.
Changed in version 3.3: tolist() now supports all single character native formats in struct module syntax as well as multi-dimensional representations.
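The sketch below calls tolist() on a byte view, on a view of doubles, and on a two-dimensional view. The two-dimensional view is built with cast() purely for illustration, and the sample data is arbitrary.

```python
import array

byte_list = memoryview(b'abc').tolist()
print(byte_list)      # [97, 98, 99]

doubles = memoryview(array.array('d', [1.1, 2.2, 3.3])).tolist()
print(doubles)        # [1.1, 2.2, 3.3]

# A multi-dimensional view produces nested lists.
grid = memoryview(bytes(range(6))).cast('B', shape=[2, 3])
nested = grid.tolist()
print(nested)         # [[0, 1, 2], [3, 4, 5]]
```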
release
()-
Release the underlying buffer exposed by the memoryview object. Many objects take special actions when a view is held on them (for example, a bytearray would temporarily forbid resizing); therefore, calling release() is handy to remove these restrictions (and free any dangling resources) as soon as possible.
After this method has been called, any further operation on the view raises a ValueError (except release() itself which can be called multiple times).
The context management protocol can be used for a similar effect, using the with statement.
New in version 3.2.
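The behaviour described above can be sketched as follows. The snippet shows a bytearray refusing to grow while a view is held, the view becoming unusable after release(), and the with statement releasing a view automatically. The flag names are illustrative.

```python
buf = bytearray(b'abc')
m = memoryview(buf)

resize_blocked = False
try:
    buf.append(100)          # resizing is forbidden while the view is exported
except BufferError as exc:
    resize_blocked = True
    print('blocked:', exc)

m.release()
m.release()                  # releasing again is allowed
buf.append(100)              # resizing works once the view is released
print(buf)                   # bytearray(b'abcd')

use_after_release = False
try:
    m[0]
except ValueError as exc:
    use_after_release = True
    print('released:', exc)

# The with statement releases the view on exit.
with memoryview(b'abc') as view:
    first = view[0]
print(first)                 # 97

released_by_with = False
try:
    view[0]
except ValueError:
    released_by_with = True
print(released_by_with)      # True
```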
cast
(format[, shape])-
Cast a memoryview to a new format or shape. shape defaults to [byte_length//new_itemsize], which means that the result view will be one-dimensional. The return value is a new memoryview, but the buffer itself is not copied. Supported casts are 1D -> C-contiguous and C-contiguous -> 1D.
The destination format is restricted to a single element native format in struct syntax. One of the formats must be a byte format (‘B’, ‘b’ or ‘c’). The byte length of the result must be the same as the original length.
Cast 1D/long to 1D/unsigned bytes:
Cast 1D/unsigned bytes to 1D/char:
Cast 1D/bytes to 3D/ints to 1D/signed char:
Cast 1D/unsigned char to 2D/unsigned long:
New in version 3.3.
Changed in version 3.5: The source format is no longer restricted when casting to a byte view.
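The four casts named above can be sketched as below. Item sizes for 'l', 'i' and 'L' vary by platform, so the checks are written in terms of itemsize and nbytes rather than fixed numbers; the variable names are illustrative.

```python
import array
import struct

# 1D/long to 1D/unsigned bytes
longs = memoryview(array.array('l', [1, 2, 3]))
as_bytes = longs.cast('B')
print(longs.format, as_bytes.format)          # l B
print(len(as_bytes) == 3 * longs.itemsize)    # True
print(as_bytes.nbytes == longs.nbytes)        # True

# 1D/unsigned bytes to 1D/char: elements are then read and written as length-1 bytes
raw = bytearray(b'zyz')
chars = memoryview(raw).cast('c')
chars[0] = b'a'
print(raw)                                    # bytearray(b'ayz')

# 1D/bytes to 3D/ints to 1D/signed char
packed = struct.pack("i" * 12, *range(12))
cube = memoryview(packed).cast('i', shape=[2, 2, 3])
print(cube.tolist())   # [[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]]
flat = cube.cast('b')
print(flat.ndim, flat.nbytes == len(packed))  # 1 True

# 1D/unsigned char to 2D/unsigned long
packed_longs = struct.pack("L" * 6, *range(6))
table = memoryview(packed_longs).cast('L', shape=[2, 3])
print(table.tolist())                         # [[0, 1, 2], [3, 4, 5]]
```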
There are also several readonly attributes available:
obj
-
The underlying object of the memoryview:
New in version 3.3.
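A minimal sketch of the obj attribute: the view refers back to the exact object it was created from, and a subview is shown referring to the same object.

```python
b = bytearray(b'xyz')
m = memoryview(b)
print(m.obj is b)        # True
print(m[1:].obj is b)    # True
```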
nbytes
-
nbytes == product(shape) * itemsize == len(m.tobytes()). This is the amount of space in bytes that the array would use in a contiguous representation. It is not necessarily equal to len(m).
Multi-dimensional arrays:
New in version 3.3.
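The difference between len(m) and nbytes can be sketched with a one-dimensional array and a two-dimensional view. The sizes are expressed through itemsize where they depend on the platform; the data is arbitrary.

```python
import array
import struct

m = memoryview(array.array('i', [1, 2, 3, 4, 5]))
print(len(m))                                   # 5
print(m.nbytes == 5 * m.itemsize)               # True

strided = m[::2]
print(len(strided))                             # 3
print(strided.nbytes == len(strided.tobytes())) # True

# Multi-dimensional: len() counts rows, nbytes counts every byte.
packed = struct.pack("d" * 12, *[1.5 * x for x in range(12)])
grid = memoryview(packed).cast('d', shape=[3, 4])
print(len(grid))                                # 3
print(grid.nbytes == 12 * grid.itemsize)        # True
```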
readonly
-
A bool indicating whether the memory is read only.
format
-
A string containing the format (in struct module style) for each element in the view. A memoryview can be created from exporters with arbitrary format strings, but some methods (e.g. tolist()) are restricted to native single element formats.
Changed in version 3.3: format 'B' is now handled according to the struct module syntax. This means that memoryview(b'abc')[0] == b'abc'[0] == 97.
itemsize
-
The size in bytes of each element of the memoryview:
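As a sketch, the format and itemsize attributes of a view over an array of unsigned shorts agree with what the struct module reports for the same format code. The sample values are arbitrary.

```python
import array
import struct

m = memoryview(array.array('H', [32000, 32001, 32002]))
print(m.format)                               # H
print(m.itemsize == struct.calcsize('H'))     # True
print(m[0])                                   # 32000

# For plain bytes the format is 'B' and each element is one byte.
b = memoryview(b'abc')
print(b.format, b.itemsize)                   # B 1
```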
ndim
-
An integer indicating how many dimensions of a multi-dimensional array the memory represents.
shape
-
A tuple of integers the length of ndim giving the shape of the memory as an N-dimensional array.
Changed in version 3.3: An empty tuple instead of None when ndim = 0.
strides
-
A tuple of integers the length of ndim giving the size in bytes to access each element for each dimension of the array.
Changed in version 3.3: An empty tuple instead of None when ndim = 0.
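The ndim, shape and strides attributes can be sketched with byte-sized elements, so that the stride values do not depend on the platform. The three-dimensional view is created with cast() for illustration only.

```python
flat = memoryview(bytes(range(24)))
print(flat.ndim, flat.shape, flat.strides)      # 1 (24,) (1,)

# A strided slice keeps one dimension but steps two bytes at a time.
step = flat[::2]
print(step.shape, step.strides)                 # (12,) (2,)

# Reshaped to 2 x 3 x 4: the last dimension is adjacent in memory.
cube = flat.cast('B', shape=[2, 3, 4])
print(cube.ndim, cube.shape, cube.strides)      # 3 (2, 3, 4) (12, 4, 1)
```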
suboffsets
-
Used internally for PIL-style arrays. The value is informational only.
c_contiguous
-
A bool indicating whether the memory is C-contiguous.
New in version 3.3.
f_contiguous
-
A bool indicating whether the memory is Fortran contiguous.
New in version 3.3.
contiguous
-
A bool indicating whether the memory is contiguous.
New in version 3.3.
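A small sketch of the three contiguity flags follows, using a plain one-dimensional view, a strided slice, and a two-dimensional view in C order. The names and data are illustrative.

```python
m = memoryview(b'abcdef')
flags_1d = (m.c_contiguous, m.f_contiguous, m.contiguous)
print(flags_1d)        # (True, True, True)

# Skipping every other byte leaves gaps, so no layout is contiguous.
s = m[::2]
flags_strided = (s.c_contiguous, s.f_contiguous, s.contiguous)
print(flags_strided)   # (False, False, False)

# A 2 x 3 view in row-major order is C-contiguous but not Fortran contiguous.
grid = m.cast('B', shape=[2, 3])
flags_2d = (grid.c_contiguous, grid.f_contiguous, grid.contiguous)
print(flags_2d)        # (True, False, True)
```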