The most classical topic for Python novices

Python 101 -- Introduction to Python

Dave Kuhlman

http://www.rexx.com/~dkuhlman
Email:

Release 1.00
June 6, 2003


Front Matter

Copyright (c) 2003 Dave Kuhlman

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Abstract:

This document is a syllabus for a first course in Python programming. This course contains an introduction to the Python language, instruction in the important and commonly used features of the language, and practical exercises in the use of those features.



Contents

  • 1. Python 101 -- Introduction to Python
    • 1.1 Important Features of Python
    • 1.2 Where to Go For Additional help
  • 2. Python -- Feature by Feature
    • 2.1 Interactive Python
    • 2.2 Data Types
      • 2.2.1 Strings
      • 2.2.2 Sequences
      • 2.2.3 Dictionaries
    • 2.3 Simple Statements
      • 2.3.1 print
      • 2.3.2 import
      • 2.3.3 assert
      • 2.3.4 global
    • 2.4 Control Structures
      • 2.4.1 if
      • 2.4.2 for
      • 2.4.3 while
      • 2.4.4 try-except and raise -- Exceptions
      • 2.4.5 Reading Text Files
      • 2.4.6 Iterator objects
    • 2.5 Organization
      • 2.5.1 Functions
      • 2.5.2 Classes and Instances
      • 2.5.3 Modules
      • 2.5.4 Packages


1. Python 101 -- Introduction to Python

Python is a high-level, general-purpose programming language. Because code is automatically compiled to byte code and executed, Python is suitable for use as a scripting language, Web application implementation language, etc. Because Python can be extended in C and C++, Python can provide the speed needed for even compute-intensive tasks.

1.1 Important Features of Python

  • Built-in high level data types: strings, lists, dictionaries, etc.

  • The usual control structures: if, if-else, if-elif-else, while, plus a powerful collection iterator (for).

  • Multiple levels of organizational structure: functions, classes, modules, and packages. These assist in organizing code. An excellent and large example is the Python standard library.

  • Compile on the fly to byte code -- Source code is compiled to byte code without a separate compile step. Source code modules can also be "pre-compiled" to byte code files.

  • Extensions in C and C++ -- Extension modules and extension types can be written by hand. There are also tools that help with this, for example, SWIG, sip, Pyrex.
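
The "compile on the fly" feature listed above can be observed from Python itself. The following is a minimal sketch, not a description of what the interpreter does internally: it uses the built-in compile function to turn a source string into a code object (byte code) and then executes it. The source string and the names source, code, and namespace are made up for illustration, and the example is written in Python 3 syntax, which differs from the Python 2 syntax used elsewhere in this document.

```python
# Compile a small piece of source text to a code object (byte code).
source = "total = 2 + 3"
code = compile(source, "<example>", "exec")

# Execute the code object in a fresh, empty namespace.
namespace = {}
exec(code, namespace)

print(type(code).__name__)   # code
print(namespace["total"])    # 5
```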

Some things you will need to know:

  • Python uses indentation to show block structure. Indent one level to show the beginning of a block. Out-dent one level to show the end of a block. As an example, the following C-style code:

    if (x)
    {
        if (y)
        {
            f1()
        }
        f2()
    }
    

    in Python would be:

    if x:
        if y:
            f1()
        f2()
    

    And, the convention is to use four spaces (and no tabs) for each level of indentation.

1.2 Where to Go For Additional help

  • The standard Python documentation set -- It contains a tutorial, a language reference, the standard library reference, and documents on extending Python in C/C++.

  • Other Python tutorials -- See especially:

    • Python introductions

    • Python for beginners

  • Other Python resources -- See especially:

    • Python documentation

    • the Python home Web site

    • The whole Python FAQ


2. Python -- Feature by Feature

2.1 Interactive Python

If you execute Python from the command line with no script, Python gives you an interactive prompt. This is an excellent facility for learning Python and for trying small snippets of code. Many of the examples that follow were developed using the Python interactive prompt.

In addition, there are tools that will give you a more powerful and fancy Python interactive mode. One example is IPython, which is available at http://ipython.scipy.org/. You may also want to consider using IDLE. IDLE is a graphical integrated development environment for Python; it contains a Python shell. You will find a script to start up IDLE in the Tools/scripts directory of your Python distribution. IDLE requires Tkinter.

2.2 Data Types

2.2.1 Strings

2.2.1.1 What

In Python, strings are immutable sequences of characters. They are immutable in that in order to modify a string, you must produce a new string.
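
To make the idea of immutability concrete, here is a small sketch. It assumes Python 3 syntax (print as a function), unlike the Python 2 examples in this document; the variable names are illustrative only.

```python
s = 'cat'

# Assigning to a position in a string is not allowed.
try:
    s[0] = 'b'
    modified_in_place = True
except TypeError:
    modified_in_place = False

# Build a new string instead; the original is unchanged.
t = 'b' + s[1:]

print(modified_in_place)  # False
print(s)                  # cat
print(t)                  # bat
```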

2.2.1.2 When

Any text information.

2.2.1.3 How

Create a new string from a constant:

s1 = 'abce'
s2 = "xyz"
s3 = """A
multi-line
string.
"""

Use any of the string methods, for example:

>>> 'The happy cat ran home.'.upper()
'THE HAPPY CAT RAN HOME.'
>>> 'The happy cat ran home.'.find('cat')
10
>>> 'The happy cat ran home.'.find('kitten')
-1
>>> 'The happy cat ran home.'.replace('cat', 'dog')
'The happy dog ran home.'

Type "help(str)" or see http://www.python.org/doc/current/lib/string-methods.html for more information on string methods.

You can also use the equivalent functions from the string module. For example:

>>> import string
>>> s1 = 'The happy cat ran home.'
>>> string.find(s1, 'happy')
4

See http://www.python.org/doc/current/lib/module-string.html for more information on the string module.

There is also a string formatting operator: "%".

>>> state = 'California'
>>> 'It never rains in sunny %s.' % state
'It never rains in sunny California.'

You can use any of the following formatting characters:

Conversion  Meaning                                             Notes
d           Signed integer decimal.
i           Signed integer decimal.
o           Unsigned octal.                                     (1)
u           Unsigned decimal.
x           Unsigned hexadecimal (lowercase).                   (2)
X           Unsigned hexadecimal (uppercase).                   (2)
e           Floating point exponential format (lowercase).
E           Floating point exponential format (uppercase).
f           Floating point decimal format.
F           Floating point decimal format.
g           Same as "e" if exponent is less than -4 or not less than precision, "f" otherwise.
G           Same as "E" if exponent is less than -4 or not less than precision, "F" otherwise.
c           Single character (accepts integer or single character string).
r           String (converts any Python object using repr()).   (3)
s           String (converts any Python object using str()).    (4)
%           No argument is converted, results in a "%" character in the result.

And these flags:

Flag       Meaning
#          The value conversion will use the "alternate form" (for example, a leading "0x" for hexadecimal).
0          The conversion will be zero padded for numeric values.
-          The converted value is left adjusted (overrides the "0" conversion if both are given).
(a space)  A blank should be left before a positive number (or empty string) produced by a signed conversion.
+          A sign character ("+" or "-") will precede the conversion (overrides a "space" flag).

See http://www.python.org/doc/current/lib/typesseq-strings.html for more information on string formatting.
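
The following sketch shows a few of the conversions and flags from the tables above in use. The values (count, price, name) are hypothetical, and the example uses Python 3 syntax for print; the "%" operator itself behaves the same way in the Python 2 examples in this document.

```python
# Hypothetical values, chosen only for illustration.
count = 42
price = 3.14159
name = 'kiwi'

padded = '%05d' % count                    # "0" flag: zero padded to width 5
left = '%-6s|' % name                      # "-" flag: left adjusted, width 6
signed = '%+d' % count                     # "+" flag: always show the sign
alt_hex = '%#x' % 255                      # "#" flag: alternate form, 0x prefix
rounded = '%.2f' % price                   # precision: two decimal places
several = '%s costs %.2f' % (name, price)  # a tuple supplies several values

print(padded)    # 00042
print(left)      # kiwi  |
print(signed)    # +42
print(alt_hex)   # 0xff
print(rounded)   # 3.14
print(several)   # kiwi costs 3.14
```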

You can also write strings to a file and read them from a file. Here are some examples:

  • Writing - For example:

    >>> outfile = file('tmp.txt', 'w')
    >>> outfile.write('This is line #1\n')
    >>> outfile.write('This is line #2\n')
    >>> outfile.write('This is line #3\n')
    >>> outfile.close()
    

    Notes:

    • Note the end-of-line character at the end of each string.

    • The file constructor creates a file object. It takes as arguments (1) the file name and (2) a mode. Commonly used modes are "r" (read), "w" (write), and "a" (append). See http://www.python.org/doc/current/lib/built-in-funcs.html for more modes and more on file.

  • Reading an entire file:

    >>> infile = file('tmp.txt', 'r')
    >>> content = infile.read()
    >>> print content
    This is line #1
    This is line #2
    This is line #3
    
    >>> infile.close()
    

  • Reading a file one line at a time:

    >>> infile = file('tmp.txt', 'r')
    >>> for line in infile.readlines():
    ...     print 'Line:', line
    ...
    Line: This is line #1
    
    Line: This is line #2
    
    Line: This is line #3
    
    >>> infile.close()
    

    Notes:

    • "infile.readlines()" returns a list of lines in the file. For large files use the file object itself or "infile.xreadlines()", both of which are iterators for the lines in the file.
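
As a sketch of the advice about large files, the following example iterates over the file object itself rather than calling readlines. It is written for Python 3 and makes several assumptions that are not part of this document: it uses open in place of file, the with statement to close files automatically, and a temporary directory from the tempfile module so that it does not leave files behind.

```python
import os
import tempfile

lines = []
with tempfile.TemporaryDirectory() as folder:
    # Hypothetical file name inside a temporary directory.
    path = os.path.join(folder, 'tmp.txt')

    with open(path, 'w') as outfile:
        for number in (1, 2, 3):
            outfile.write('This is line #%d\n' % number)

    # Iterate over the file object itself: one line is read at a time,
    # so the whole file is never held in memory at once.
    with open(path, 'r') as infile:
        for line in infile:
            lines.append(line.rstrip('\n'))   # drop the end-of-line character

print(lines)
```

Stripping the end-of-line character from each line avoids the blank lines that appear between the "Line:" entries in the output shown earlier.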

A few additional comments about strings:

  • A string is a special kind of sequence. So, you can index into the characters of a string and you can iterate over the characters in a string. For example:

    >>> s1 = 'abcd'
    >>> s1[1]
    'b'
    >>> s1[2]
    'c'
    >>> for ch in s1:
    ...   print ch
    ...
    a
    b
    c
    d
    

  • If you need to do fast or complex string searches, there is a regular expression module in the standard library: re.

  • An interesting feature of string formatting is the ability to use dictionaries to supply the values that are inserted. Here is an example:

    names = {'tree': 'sycamore', 'flower': 'poppy', 'herb': 'arugula'}
    
    print 'The tree is %(tree)s' % names
    print 'The flower is %(flower)s' % names
    print 'The herb is %(herb)s' % names
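
The re module mentioned in the comments above can be sketched as follows. This is only a small taste of the module, using the functions re.search, re.findall, and re.sub; the patterns are made up for illustration, and the example is written in Python 3 syntax.

```python
import re

text = 'The happy cat ran home.'

# search returns a match object, or None when nothing matches.
match = re.search(r'c[a-z]t', text)
position = match.start()

# findall returns every non-overlapping match as a list.
words = re.findall(r'[A-Za-z]+', text)

# sub replaces each match with the given string.
replaced = re.sub(r'cat', 'dog', text)

print(position)   # 10
print(words)      # ['The', 'happy', 'cat', 'ran', 'home']
print(replaced)   # The happy dog ran home.
```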
    

2.2.2 Sequences

2.2.2.1 What

There are several types of sequences in Python. We've already discussed strings. In this section we will describe lists and tuples. See http://www.python.org/doc/current/lib/typesseq.html for a description of the other sequence types (e.g. buffers and xrange objects).

Lists are dynamic arrays. They are arrays in the sense that you can index items in a list (for example "mylist[3]") and you can select sub-ranges (for example "mylist[2:4]"). They are dynamic in the sense that you can add and remove items after the list is created.

Tuples are light-weight lists, but differ from lists in that they are immutable. That is, once a tuple has been created, you cannot modify it. You can, of course, modify any (modifiable) objects that the tuple refers to.

Capabilities of lists:

  • Append items.

  • Insert items.

  • Add a list of items.

Capabilities of lists and tuples:

  • Index items.

  • Select a subsequence of items (also known as a slice).

  • Iterate over the items in the list or tuple.

2.2.2.2 When

  • Whenever you want to process a collection of items.

  • Whenever you want to iterate over a collection of items.

  • Whenever you want to index into a collection of items.

2.2.2.3 How

To create a list use:

>>> items = [111, 222, 333]
>>> items
[111, 222, 333]

To add an item to the end of a list, use:

>>> items.append(444)
>>> items
[111, 222, 333, 444]

To insert an item into a list, use:

>>> items.insert(0, -1)
>>> items
[-1, 111, 222, 333, 444]

You can also push items onto the right end of a list and pop items off the right end of a list with append and pop.

>>> items.append(555)
>>> items
[-1, 111, 222, 333, 444, 555]
>>> items.pop()
555
>>> items
[-1, 111, 222, 333, 444]

And, you can iterate over the items in a list with the for statement:

>>> for item in items:
...   print 'item:', item
...
item: -1
item: 111
item: 222
item: 333
item: 444
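
The capabilities listed earlier in this section that are not shown above (slices, adding a list of items, and the immutability of tuples) can be sketched as follows. The example assumes Python 3 syntax, and the names middle and point are illustrative only.

```python
items = [111, 222, 333, 444]

# A slice selects a sub-range and returns a new list.
middle = items[1:3]            # [222, 333]

# extend adds every item of another list to the end.
items.extend([555, 666])       # [111, 222, 333, 444, 555, 666]

# A tuple cannot be modified after it has been created ...
point = (1, 2, ['a'])
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# ... but a modifiable object that it refers to can be changed.
point[2].append('b')           # (1, 2, ['a', 'b'])

print(middle)
print(items)
print(tuple_changed)
print(point)
```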

2.2.3 Dictionaries

2.2.3.1 What

Associative arrays.

Capabilities:

  • Ability to iterate over keys or values.

  • Ability to add key-value pairs dynamically.

  • Look-up by key.

For help on dictionaries, type:

>>> help(dict)

at Python's interactive prompt, or:

$ pydoc dict

at the command line.

2.2.3.2 When

  • When you need look-up by key.

  • When you need a "structured" lightweight object or an object with named fields. (But, don't forget classes.)

  • When you need to map a name or label to any kind of object, even an executable one such as a function.

2.2.3.3 How

Create a dictionary with:

>>> lookup = {}
>>> lookup
{}

or:

>>> def fruitfunc():
...    print "I'm a fruit."
...
>>> def vegetablefunc():
...    print "I'm a vegetable."
...
>>>
>>> lookup = {'fruit': fruitfunc, 'vegetable': vegetablefunc}
>>> lookup
{'vegetable': <function vegetablefunc at 0x4028980c>,
'fruit': <function fruitfunc at 0x4028e614>}
>>> lookup['fruit']()
I'm a fruit.
>>> lookup['vegetable']()
I'm a vegetable.

or:

>>> lookup = dict((('aa', 11), ('bb', 22), ('cc', 33)))
>>> lookup
{'aa': 11, 'cc': 33, 'bb': 22}
>>>

Test for the existence of a key with has_key (these examples assume that lookup is the fruit and vegetable dictionary created above):

>>> if lookup.has_key('fruit'):
...   print 'contains key "fruit"'
...
contains key "fruit"
>>>

or:

>>> if 'fruit' in lookup:
...   print 'contains key "fruit"'
...
contains key "fruit"
>>>

Access the value of a key as follows:

>>> print lookup['fruit']
<function fruitfunc at 0x4028e614>
>>>

>>> for key in lookup:
...     print 'key: %s' % key
...     lookup[key]()
...
key: vegetable
I'm a vegetable.
key: fruit
I'm a fruit.
>>>

And, remember that you can sub-class dictionaries. Here are two versions of the same example. The keyword arguments in the second version require Python 2.3:

#
# This example works with Python 2.2.
class MyDict_for_python_22(dict):
    def __init__(self, **kw):
        for key in kw.keys():
            self[key] = kw[key]
    def show(self):
        print 'Showing example for Python 2.2 ...'
        for key in self.keys():
            print 'key: %s  value: %s' % (key, self[key])

def test_for_python_22():
    d = MyDict_for_python_22(one=11, two=22, three=33)
    d.show()

test_for_python_22()

#
# This example works with Python 2.3.
#   Keyword support, when subclassing dictionaries, seems to have
#   been enhanced in Python 2.3.
class MyDict(dict):
    def show(self):
        print 'Showing example for Python 2.3 ...'
        for key in self.keys():
            print 'key: %s  value: %s' % (key, self[key])

def test():
    d = MyDict(one=11, two=22, three=33)
    d.show()

test()

Running this example produces:

Showing example for Python 2.2 ...
key: one  value: 11
key: three  value: 33
key: two  value: 22
Showing example for Python 2.3 ...
key: three  value: 33
key: two  value: 22
key: one  value: 11

A few comments about this example:

  • The class MyDict does not define a constructor (__init__). This enables us to re-use the constructor from dict and any of its forms. Type "help(dict)" at the Python interactive prompt to learn about the various ways to call the dict constructor.

  • The show method is the specialization added to our sub-class.

  • In our sub-class, we can refer to any methods in the super-class (dict). For example: "self.keys()".

  • In our sub-class, we can refer to the dictionary itself. For example: "self[key]".
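
The first comment above notes that a sub-class without its own __init__ can re-use any form of the dict constructor. Here is a sketch of three of those forms. It is written in Python 3 syntax, and this version of show returns its lines (sorted by key) instead of printing them, which is a change made here only to keep the result easy to inspect.

```python
class MyDict(dict):
    def show(self):
        # Return one formatted line per key, in sorted key order.
        return ['key: %s  value: %s' % (key, self[key]) for key in sorted(self)]

d1 = MyDict(one=11, two=22)                 # keyword arguments
d2 = MyDict([('one', 11), ('two', 22)])     # sequence of (key, value) pairs
d3 = MyDict({'one': 11, 'two': 22})         # another dictionary

print(d1 == d2 == d3)   # True
for line in d1.show():
    print(line)
```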

2.3 Simple Statements

2.3.1 print

The print statement sends output to stdout.

Here are a few examples:

print obj
print obj1, obj2, obj3
print "My name is %s" % name

Notes:

  • To print multiple items, separate them with commas. The print statement inserts a blank between objects.

  • The print statement automatically appends a newline to output. To print without a newline, add a comma after the last object, or use "sys.stdout", for example:

    print 'Output with no newline',
    

    which will append a blank, or:

    import sys
    sys.stdout.write("Some output")
    

  • To re-define the destination of output from the print statement, replace sys.stdout with an instance of a class that supports the write method. For example:

    import sys
    
    class Writer:
        def __init__(self, filename):
            self.filename = filename
        def write(self, msg):
            f = file(self.filename, 'a')
            f.write(msg)
            f.close()
    
    sys.stdout = Writer('tmp.log')
    print 'Log message #1'
    print 'Log message #2'
    print 'Log message #3'
    

More information on the print statement is at http://www.python.org/doc/current/ref/print.html.

Note to Jython users: Jython does not appear to support the file constructor for files. In the above example, replace file with open.
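
When replacing sys.stdout, it is usually wise to restore the original object afterwards. The following sketch shows one way to do that. It assumes Python 3 syntax, and the Collector class is hypothetical: it keeps output in a list in memory rather than appending to a log file.

```python
import sys

class Collector:
    """Hypothetical writer that stores output in a list."""
    def __init__(self):
        self.chunks = []
    def write(self, msg):
        self.chunks.append(msg)
    def flush(self):
        # Some callers expect a flush method on stdout-like objects.
        pass

saved_stdout = sys.stdout
collector = Collector()
sys.stdout = collector
try:
    print('Log message #1')
    print('Log message #2')
finally:
    # Always put the real stdout back, even if printing fails.
    sys.stdout = saved_stdout

restored = sys.stdout is saved_stdout
captured = ''.join(collector.chunks)
print(captured)
```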

2.3.2 import

The import statement makes a module and its contents available for use.

Here are several forms of the import statement:

import test
    Import module test. Refer to x in test with "test.x".

from test import x
    Import x from test. Refer to x in test with "x".

from test import *
    Import all objects from test. Refer to x in test with "x".

import test as theTest
    Import test; make it available as theTest. Refer to object x with "theTest.x".

A few comments about import:

  • The import statement also evaluates the code in the imported module.

  • But, the code in a module is only evaluated the first time it is imported in a program. So, for example, if a module mymodule.py is imported from two other modules in a program, the statements in mymodule will be evaluated only the first time it is imported.

  • If you need even more variety than the import statement offers, see the imp module. Documentation at http://www.python.org/doc/current/lib/module-imp.html. Also see the __import__ built-in function. Documentation at http://www.python.org/doc/current/lib/built-in-funcs.html.
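
The point that a module's code is evaluated only on the first import can be checked with sys.modules, the dictionary in which Python keeps already imported modules. This is a small sketch in Python 3 syntax; the json module is an arbitrary choice from the standard library, and any other module would do.

```python
import sys

import json            # first import: the module's code is evaluated
import json as again   # second import: the cached module object is re-used

same_object = again is json
cached = sys.modules['json'] is json

print(same_object)  # True
print(cached)       # True
```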

More information on import at http://www.python.org/doc/current/ref/import.html.

2.3.3 assert

Use the assert statement to place error checking statements in your code. Here is an example:

def test(arg1, arg2):
    arg1 = float(arg1)
    arg2 = float(arg2)
    assert arg2 != 0, 'Bad divisor -- arg1: %f arg2: %f' % (arg1, arg2)
    ratio = arg1 / arg2
    print 'ratio:', ratio

When arg2 is zero, running this code will produce something like the following:

Traceback (most recent call last):
  File "tmp.py", line 22, in ?
    main()
  File "tmp.py", line 18, in main
    test(args[0], args[1])
  File "tmp.py", line 8, in test
    assert arg2 != 0, 'Bad divisor -- arg1: %f arg2: %f' % (arg1, arg2)
AssertionError: Bad divisor -- arg1: 2.000000 arg2: 0.000000

A few comments:

  • Notice that the trace-back identifies the file and line where the test is made and shows the test itself.

  • If you run python with the optimize options (-O and -OO), the assertion test is not performed.

  • The second argument to assert is optional.
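
Because assertions are skipped under the optimize options, one common approach (offered here as a suggestion, not as a rule from this document) is to use an explicit test and raise for checks that must always run, and to keep assert for internal sanity checks. A sketch in Python 3 syntax, with a hypothetical function named ratio:

```python
def ratio(arg1, arg2):
    # Explicit check: still performed when Python runs with -O.
    if arg2 == 0:
        raise ValueError('arg2 must not be zero')
    result = arg1 / float(arg2)
    # Internal sanity check: skipped when Python runs with -O.
    assert isinstance(result, float), 'expected a float result'
    return result

try:
    ratio(2, 0)
    message = None
except ValueError as exc:
    message = str(exc)

print(ratio(3, 4))   # 0.75
print(message)       # arg2 must not be zero
```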

2.3.4 global

The problem -- Imagine a global variable NAME. If, in a function, I only read that variable, for example with "name = NAME", then I'll get the value of the global variable NAME. But if the function assigns to that variable (anywhere in its body), then Python treats the name as a local variable throughout the function, and the assignment will not affect the global variable at all. Consider:

NAME = "Peach"

def show_global():
    name = NAME
    print '(show_global) name: %s' % name

def set_global():
    NAME = 'Nectarine'
    name = NAME
    print '(set_global) name: %s' % name

show_global()
set_global()
show_global()

Running this code produces:

(show_global) name: Peach
(set_global) name: Nectarine
(show_global) name: Peach

The set_global function modifies a local variable, not the global variable as I might have intended.

The solution -- How can I fix that? Here is how:

NAME = "Peach"

def show_global():
    name = NAME
    print '(show_global) name: %s' % name

def set_global():
    global NAME
    NAME = 'Nectarine'
    name = NAME
    print '(set_global) name: %s' % name

show_global()
set_global()
show_global()

Notice the global statement in function set_global. Running this code does modify the global variable NAME, and produces the following output:

(show_global) name: Peach
(set_global) name: Nectarine
(show_global) name: Nectarine

Comments:

  • You can list more than one variable in the global statement. For example:

    global NAME1, NAME2, NAME3
    

2.4 Control Structures

2.4.1 if

The if statement enables us to execute code (or not) depending on a condition.

  • "if condition: ..."

  • "if condition: ... else: ..."

  • "if condition1: ... elif condition2: ... else: ..."

Here is an example:

>>> y = 25
>>>
>>> if y > 15:
...     print 'y is large'
... else:
...     print 'y is small'
...
y is large

A few notes:

  • The condition can be any expression, i.e. something that returns a value. A detailed description of expressions can be found at http://www.python.org/doc/current/ref/expressions.html.

  • Parentheses are not needed around the condition. Use parentheses to group sub-expressions and control order of evaluation when the natural operator precedence is not what you want. Python's operator precedences are described at http://www.python.org/doc/current/ref/summary.html.

  • Python has no switch statement. Use if-elif. Or consider using a dictionary, for example:

    def function1():
        print "Hi. I'm function1."
    def function2():
        print "Hi. I'm function2."
    def function3():
        print "Hi. I'm function3."
    
    mapper = {'one': function1, 'two': function2, 'three': function3}
    
    while 1:
        code = raw_input('Enter "one", "two", "three", or "quit": ')
        if code == 'quit':
            break
        if code not in mapper:
            continue
        mapper[code]()
    

2.4.2 for

The for statement enables us to iterate over collections. It enables us to repeat a set of lines of code once for each item in a collection. Collections are things like strings (arrays of characters), lists, tuples, and dictionaries.

Here is an example:

>>> collection = [111,222,333]
>>> for item in collection:
...     print 'item:', item
...
item: 111
item: 222
item: 333

Comments:

  • You can iterate over strings, lists, and tuples.

  • Iterate over the keys or values in a dictionary with "aDict.keys()" and "aDict.values()". Here is an example:

    >>> aDict = {'cat': 'furry and cute', 'dog': 'friendly and smart'}
    >>> aDict.keys()
    ['dog', 'cat']
    >>> aDict.values()
    ['friendly and smart', 'furry and cute']
    >>> for key in aDict.keys():
    ...     print 'A %s is %s.' % (key, aDict[key])
    ...
    A dog is friendly and smart.
    A cat is furry and cute.
    

  • In recent versions of Python, a dictionary is itself iterable: iterating over it produces its keys. Therefore, you can also do the following:

    >>> for key in aDict:
    ...     print 'A %s is %s.' % (key, aDict[key])
    ...
    A dog is friendly and smart.
    A cat is furry and cute.
    

  • And, in recent versions of Python, a file is also an iterator over the lines in the file. Therefore, you can do the following:

    >>> infile = file('tmp.txt', 'r')
    >>> for line in infile:
    ...       print line,
    ... 
    This is line #1
    This is line #2
    This is line #3
    >>> infile.close()
    

  • There are other kinds of iterations. For example, the built-in iter will produce an iterator from a collection. Here is an example:

    >>> anIter = iter([11,22,33])
    >>> for item in anIter:
    ...     print 'item:', item
    ...
    item: 11
    item: 22
    item: 33
    

  • You can also implement iterators of your own. To do so, define a function that returns values with yield (instead of with return). Here is an example:

    >>> def t(collection):
    ...     icollection = iter(collection)
    ...     for item in icollection:
    ...         yield '||%s||' % item
    ...
    >>> for x in t(collection): print x
    ...
    ||111||
    ||222||
    ||333||
    

2.4.3 while

while is another repeating statement. It executes a block of code repeatedly for as long as a condition is true; the loop ends when the condition becomes false.

Here is an example:

>>> reply = 'repeat'
>>> while reply == 'repeat':
...     print 'Hello'
...     reply = raw_input('Enter "repeat" to do it again: ')
...
Hello
Enter "repeat" to do it again: repeat
Hello
Enter "repeat" to do it again: bye

Comments:

  • Use the break statement to exit from a loop. This works for both for and while. Here is an example that uses break in a for statement:

    # for_break.py
    """Count lines until a line that begins with a double #.
    """
    
    import sys
    
    def countLines(infilename):
        infile = file(infilename, 'r')
        count = 0
        for line in infile.readlines():
            line = line.strip()
            if line[:2] == '##':
                break
            count += 1
        return count
    
    def usage():
        print 'Usage: python for_break.py <infilename>'
        sys.exit(-1)
    
    def main():
        args = sys.argv[1:]
        if len(args) != 1:
            usage()
        count = countLines(args[0])
        print 'count:', count
    
    if __name__ == '__main__':
        main()
    

  • Use the continue statement to skip the remainder of the code block in a for or while. A continue is a short-circuit which, in effect, branches back to the top of the for or while (or if you prefer, to the end of the block).

  • The test "if __name__ == '__main__':" is used to enable the script to both be (1) imported and (2) run from the command line. That condition is true only when the script is run, but not imported. This is a common Python idiom, which you should consider including at the end of your scripts, whether (1) to give your users a demonstration of what your script does and how to use it or (2) to provide a test of the script.
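
The break and continue statements described above can be seen side by side in the following sketch. It uses a hypothetical list of strings in place of a file so that it is self-contained, and it is written in Python 3 syntax.

```python
# Hypothetical input lines; a real script might read them from a file.
lines = ['alpha', '', '# a comment', 'beta', '## stop here', 'gamma']

kept = []
for line in lines:
    if line[:2] == '##':
        break          # leave the loop entirely
    if not line or line[0] == '#':
        continue       # skip blank lines and comments, go to the next item
    kept.append(line)

print(kept)   # ['alpha', 'beta']
```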

2.4.4 try-except and raise -- Exceptions

Use a try:except: block to catch an exception.

Use raise to raise an exception.

Comments and hints:

  • Catch all exceptions with a "bare" except:. For example:

    >>> try:
    ...     x = y
    ... except:
    ...     print 'y not defined'
    ...
    y not defined
    

  • Catch a specific error by referring to an exception class in the except:. To determine what error or exception you want to catch, generate it and try it. Because Python reports errors with a traceback that ends by reporting the exception, you can learn which exception to catch. For example, suppose I want to learn which exception is thrown when Python can't open a file. I can try the following from the interactive prompt:

    >>> myfile = file('amissingfile.py', 'r')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    IOError: [Errno 2] No such file or directory: 'amissingfile.py'
    

    So, now I know that I can do:

    >>> try:
    ...     myfile = file('amissingfile.py', 'r')
    ... except IOError:
    ...     print 'amissingfile.py is missing'
    ...
    amissingfile.py is missing
    

  • You can customize your error handling still further by passing an object on the raise and catching that object in the except:. By doing so, you can pass information up from the raise statement to an exception handler. A reasonable strategy is to define a sub-class of a standard exception. For example:

    >>> class E(RuntimeError):
    ...   def __init__(self, msg):
    ...     self.msg = msg
    ...   def getMsg(self):
    ...     return self.msg
    ...
    >>>
    >>> try:
    ...   raise E('my test error')
    ... except E, obj:
    ...   print 'Msg:', obj.getMsg()
    ...
    Msg: my test error
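
In Python 3 (and in Python 2.6 and later), the handler is written "except E as obj:" rather than "except E, obj:". The following sketch repeats the idea of passing information from raise to a handler in that newer syntax. The ConfigError class, the load function, and the file name are all hypothetical.

```python
class ConfigError(RuntimeError):
    """Hypothetical exception that carries an extra field."""
    def __init__(self, msg, filename):
        RuntimeError.__init__(self, msg)
        self.filename = filename

def load(filename):
    # Signal the problem to whoever called us.
    raise ConfigError('cannot load configuration', filename)

try:
    load('settings.cfg')
    report = None
except ConfigError as err:
    # The handler receives the object that was raised.
    report = '%s: %s' % (err.filename, err)

print(report)   # settings.cfg: cannot load configuration
```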
    

2.4.5 Reading Text Files

To read a text file, first create a file object. Here is an example:

inFile = file('messages.log', 'r')

Then use one or more of the file object's methods to process the contents of the file. Here are a few strategies:

  • Use "inFile.read()" to get the entire contents of the file (a string). Example:

    >>> inFile = file('tmp.txt', 'r')
    >>> content = inFile.read()
    >>> inFile.close()
    >>> print content
    aaa bbb ccc
    ddd eee fff
    ggg hhh iii
    
    >>> words = content.split()
    >>> print words
    ['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg', 'hhh', 'iii']
    >>> for word in words:
    ...     print word
    ...
    aaa
    bbb
    ccc
    ddd
    eee
    fff
    ggg
    hhh
    iii
    

  • Use "for line in inFile:" to process one line at a time. You can do this because (at least since Python 2.3) file objects obey the iterator protocol, that is they support methods __iter__ and next. For more on the iterator protocol see http://www.python.org/doc/current/lib/typeiter.html.

    Example:

    >>> inFile = file('tmp.txt', 'r')
    >>> for line in inFile:
    ...     print 'Line:', line,
    ...
    Line: aaaaa
    Line: bbbbb
    Line: ccccc
    Line: ddddd
    Line: eeeee
    >>> inFile.close()
    

  • For earlier versions of Python, use "inFile.readlines()" or "inFile.xreadlines()".

  • If you want to get the contents of an entire text file as a collection of lines, use readlines. Example:

    >>> inFile = file('tmp.txt', 'r')
    >>> lines = inFile.readlines()
    >>> print lines
    ['aaaaa\n', 'bbbbb\n', 'ccccc\n', 'ddddd\n', 'eeeee\n']
    >>> print lines[2]
    ccccc
    
    >>> inFile.close()
    

2.4.6 Iterator objects

This section explains how to implement iterator objects, that is, instances of a class that obeys the iterator protocol.

We explain generators first -- A generator is a function which uses yield. Because it returns values with yield instead of with return, the function can be resumed immediately after the yield. Here is an example:

def generateItems(seq):
    for item in seq:
        yield 'item: %s' % item

anIter = generateItems([])
print 'dir(anIter):', dir(anIter)
anIter = generateItems([111,222,333])
for x in anIter:
    print x
anIter = generateItems(['aaa', 'bbb', 'ccc'])
print anIter.next()
print anIter.next()
print anIter.next()
print anIter.next()

Running this example produces the following output:

dir(anIter): ['__class__', '__delattr__', '__doc__', '__getattribute__',
'__hash__', '__init__', '__iter__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__str__', 'gi_frame',
'gi_running', 'next']
item: 111
item: 222
item: 333
item: aaa
item: bbb
item: ccc
Traceback (most recent call last):
  File "iterator_generator.py", line 14, in ?
    print anIter.next()
StopIteration

Notes:

  • Notice that the value returned by the call to the generator (function) is an iterator. It obeys the iterator protocol. That is, "dir(anIter)" shows that it has both __iter__ and next.

  • Because it is an iterator, we can use a for statement to iterate over the values returned by the generator.

  • We can also get its values by repeatedly calling the next method, until it raises the StopIteration exception. This ability to call the next method enables us to pass the iterator object around and get values at different locations in our code.

  • Once we have obtained all the values from an iterator, it is, in effect, "empty". The iterator protocol, in fact, specifies that once an iterator raises the StopIteration exception, it should continue to do so. Another way to say this is that there is no "rewind" operation. But, you can call the generator function again to get a "fresh" iterator.
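
The last note, about an iterator being "empty" once it has been used up, can be sketched as follows. The example is written in Python 3 syntax, where the built-in function next(iterator) takes the place of the iterator.next() method used in this document; the names are illustrative.

```python
def generate_items(seq):
    for item in seq:
        yield 'item: %s' % item

an_iter = generate_items([111, 222])

first_pass = list(an_iter)      # consumes every value
second_pass = list(an_iter)     # the iterator is exhausted, so this is empty

# With a default value, next returns it instead of raising StopIteration.
after_end = next(an_iter, 'no more items')

# Calling the generator function again gives a fresh iterator.
fresh = list(generate_items([111, 222]))

print(first_pass)    # ['item: 111', 'item: 222']
print(second_pass)   # []
print(after_end)     # no more items
print(fresh)         # ['item: 111', 'item: 222']
```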

Now, we will implement a class that obeys the iterator protocol. By doing so, we can produce "iterable" objects. Here is an example:

class IteratorExample:
    def __init__(self, s):
        self.s = s
        self.next = self._next().next
        self.exhausted = 0
    def _next(self):
        if not self.exhausted:
            flag = 0
            for x in self.s:
                if flag:
                    flag = 0
                    yield x
                else:
                    flag = 1
            self.exhausted = 1
    def __iter__(self):
        return self._next()

def main():
    a = IteratorExample('edcba')
    for x in a:
        print x
    print '=' * 30
    a = IteratorExample('abcde')
    print a.next()
    print a.next()
    print a.next()
    print a.next()
    print a.next()
    print a.next()

if __name__ == '__main__':
    main()

Iterating over an instance of the above class produces every other object from the sequence with which the instance was constructed. Running the above example produces the following:

d
b
==============================
b
d
Traceback (most recent call last):
  File "iterator_class.py", line 24, in ?
    print a.next()
StopIteration

Explanation:

  • Method __iter__ should return an iterator. So, in our example it calls method _next to produce that iterator. _next is a generator (it contains yield), which means that it returns an iterator.

  • Method next should return the next item in the sequence. But, this is exactly what the next method of the iterator returned by _next does. So, we just capture that using "self.next = self._next().next".

The definition of the iterator protocol is at http://www.python.org/doc/current/lib/typeiter.html.
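
For comparison, here is a minimal sketch of the same "every other item" idea written as a class that is its own iterator. It assumes Python 3, where the method is named __next__ instead of next. The class name EveryOther is hypothetical and this is not the implementation shown above.

```python
class EveryOther:
    """Hypothetical iterator: yields every other item, starting with the second."""
    def __init__(self, seq):
        self._it = iter(seq)

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        next(self._it)          # skip one item (may raise StopIteration)
        return next(self._it)   # return the following item (may raise StopIteration)

items = list(EveryOther('edcba'))
print(items)
```

Because __iter__ returns the object itself, such an instance can be consumed only once; the class shown earlier instead returns a new generator from each call to __iter__.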

2.5 Organization

This section describes Python features that you can use to organize and structure your code.

2.5.1 Functions

2.5.1.1 A basic function

Use def to define a function. Here is a simple example:

def test(msg, count):
    for idx in range(count):
        print '%s %d' % (msg, idx)

test('Test #', 4)

Comments:

  • When the def statement is evaluated, it creates a function object and binds it to the function's name.

  • Call the function using the parentheses function call notation, in this case "test('Test #', 4)".

  • As with other Python objects, you can stuff a function object into other structures such as tuples, lists, and dictionaries. Here is an example:

    # Create a tuple:
    val = (test, 'A label:', 5)
    
    # Call the function:
    val[0](val[1], val[2])
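
A dictionary works the same way. The following sketch (Python 3 syntax assumed; the function and variable names are hypothetical) stores function objects in a dictionary and calls them by key:

```python
def greet(name):
    return 'Hello, %s' % name

def shout(name):
    return name.upper() + '!'

# Hypothetical dispatch table: names mapped to function objects.
handlers = {'greet': greet, 'shout': shout}

results = [handlers[key]('world') for key in ('greet', 'shout')]
print(results)
```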
    

2.5.1.2 A function with default arguments

Providing default arguments allows the caller to omit some arguments. Here is an example:

def testDefaultArgs(arg1='default1', arg2='default2'):
    print 'arg1:', arg1
    print 'arg2:', arg2

testDefaultArgs('Explicit value')

The above example prints:

arg1: Explicit value
arg2: default2
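
A caller can also omit every argument, or override only a later one by naming it. A small sketch, assuming Python 3 syntax and a hypothetical variant of the function that returns its values instead of printing them:

```python
def test_default_args(arg1='default1', arg2='default2'):
    # Returns the values instead of printing them so they are easy to inspect.
    return arg1, arg2

no_args = test_default_args()
second_only = test_default_args(arg2='Explicit value')
print(no_args)
print(second_only)
```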

2.5.1.3 Argument lists and keyword argument lists

Here is an example:

def testArgLists_1(*args, **kwargs):
    print 'args:', args
    print 'kwargs:', kwargs
            
testArgLists_1('aaa', 'bbb', arg1='ccc', arg2='ddd')

def testArgLists_2(arg0, *args, **kwargs):
    print 'arg0: "%s"' % arg0
    print 'args:', args
    print 'kwargs:', kwargs
            
print '=' * 40
testArgLists_2('a first argument', 'aaa', 'bbb', arg1='ccc', arg2='ddd')

Running this example displays:

args: ('aaa', 'bbb')
kwargs: {'arg1': 'ccc', 'arg2': 'ddd'}
========================================
arg0: "a first argument"
args: ('aaa', 'bbb')
kwargs: {'arg1': 'ccc', 'arg2': 'ddd'}

A little guidance:

  • Positional arguments must precede all keyword arguments when you call the function.

  • You can also have "normal" arguments in the function definition. For example: "def test(arg0, *args, **kwargs):". See the second example above.

  • The keyword argument parameter is a dictionary, so you can do anything with it that you do with a normal dictionary.
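
The same "*" and "**" notation can be used on the calling side to unpack a sequence and a dictionary into arguments. A sketch, assuming Python 3 syntax; the function name describe is hypothetical:

```python
def describe(arg0, *args, **kwargs):
    # kwargs is an ordinary dict: sort its keys, look values up, and so on.
    keys = sorted(kwargs.keys())
    return arg0, args, keys

positional = ('aaa', 'bbb')
keywords = {'arg1': 'ccc', 'arg2': 'ddd'}

# "*" unpacks a sequence into positional arguments;
# "**" unpacks a dictionary into keyword arguments.
result = describe('first', *positional, **keywords)
print(result)
```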

2.5.2 Classes and Instances

2.5.2.1 A basic class

Define a basic class as follows:

class Basic:
    def __init__(self, name):
        self.name = name
    def show(self):
        print 'Basic -- name: %s' % self.name

obj1 = Basic('Apricot')
obj1.show()

Running the above example produces the following:

Basic -- name: Apricot

Explanation:

  • Methods are added to the class with def. The first argument to a method is the class instance. By convention it is spelled "self".

  • The constructor for a class is a method named "__init__".

  • The self variable must be explicitly listed as the first argument to a method. You could spell it differently from "self", but don't do so.

  • Instance variables are referred to with "self.XXX". Notice how in our example an argument to the constructor is saved as an instance variable.

  • An instance is created by calling the class. For example: "obj1 = Basic('Apricot')".

  • In addition to __init__ there are other special method names of the form "__XXX__", which are used to customize classes and their instances. These are described at http://www.python.org/doc/current/ref/specialnames.html.

A few more notes on self:

  • self is a reference to the instance. Think of it (in part) as a reference to the container for the data or state for the object.

  • In many object-oriented programming languages, the instance is hidden in the method definitions. These languages typically explain this by saying something like "The instance is passed as an implicit first argument to the method."

  • In Python, the instance is visible and explicit in method definitions. You must explicitly declare the instance as the first parameter of each (instance) method. This first parameter is (almost) always spelled "self".
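
One way to see that self is just the instance is to call the method through the class and pass the instance yourself. A minimal sketch, assuming Python 3 syntax and a variant of Basic whose show method returns its text instead of printing it:

```python
class Basic:
    def __init__(self, name):
        self.name = name

    def show(self):
        return 'Basic -- name: %s' % self.name

obj = Basic('Apricot')

# These two calls are equivalent: the first passes the instance implicitly,
# the second passes it explicitly as the "self" argument.
via_instance = obj.show()
via_class = Basic.show(obj)
print(via_instance == via_class)
```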

2.5.2.2 Inheritance

Define a class Special that inherits from a super-class Basic as follows:

class Basic:
    def __init__(self, name):
        self.name = name
    def show(self):
        print 'Basic -- name: %s' % self.name

class Special(Basic):
    def __init__(self, name, edible):
        Basic.__init__(self, name)
        self.upper = name.upper()
        self.edible = edible
    def show(self):
        Basic.show(self)
        print 'Special -- upper name: %s.' % self.upper,
        if self.edible:
            print "It's edible."
        else:
            print "It's not edible."
    def isEdible(self):
        return self.edible

obj1 = Basic('Apricot')
obj1.show()
print '=' * 30
obj2 = Special('Peach', 1)
obj2.show()

Running this example produces the following:

Basic -- name: Apricot
==============================
Basic -- name: Peach
Special -- upper name: PEACH. It's edible.

Commentary:

  • The super-class is named after the class name in parentheses. For multiple inheritance, separate the super-classes with commas.

  • Call a method in the super-class, by-passing the method with the same name in the sub-class, from the sub-class by using the super-class name. For example: "Basic.__init__(self, name)" and "Basic.show(self)".

  • In our example (above), the sub-class (Special) specializes the super-class (Basic) by adding additional member variables (self.upper and self.edible) and by adding an additional method (isEdible). The method needs a name different from the member variable self.edible; otherwise the value assigned in __init__ would hide the method on every instance.
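
Multiple inheritance, mentioned in the commentary, can be sketched as follows. This assumes Python 3 syntax; the class names Named, Priced, and Product are hypothetical.

```python
class Named:
    def __init__(self, name):
        self.name = name

    def label(self):
        return 'name: %s' % self.name

class Priced:
    def price_tag(self):
        return 'price: %d' % self.price

# Multiple inheritance: list the super-classes separated by commas.
class Product(Named, Priced):
    def __init__(self, name, price):
        Named.__init__(self, name)   # call the super-class method by name
        self.price = price

item = Product('Peach', 3)
summary = (item.label(), item.price_tag())
print(summary)
```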

2.5.2.3 Class data

A class data member is a member that has only one value for the class and all its instances. Here is an example from the Python FAQ at http://www.python.org/doc/FAQ.html:

class C:
    count = 0   # number of times C.__init__ called
    def __init__(self):
        C.count = C.count + 1
    def getcount(self):
        return C.count  # or return self.count

c1 = C()
print 'Current count:', c1.getcount()
c2 = C()
print 'Current count:', c2.getcount()

Running this example produces:

Current count: 1
Current count: 2
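
One detail worth knowing: reading self.count finds the class data member, but assigning to it through an instance creates a separate instance variable. A sketch, assuming Python 3 syntax; the class name Counter is hypothetical:

```python
class Counter:
    count = 0   # class data: one value shared by the class and its instances

    def __init__(self):
        Counter.count = Counter.count + 1

a = Counter()
b = Counter()
shared = (Counter.count, a.count, b.count)    # all read the class attribute

# Assigning through an instance creates an instance variable that hides
# the class attribute for that one instance only.
a.count = 100
after_assignment = (Counter.count, a.count, b.count)
print(shared, after_assignment)
```

This is why the example above updates the value with "C.count = ..." rather than "self.count = ...".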


2.5.2.4 Static methods and class methods

Here is an example that shows how to define static methods and class methods:

class Advanced:
    def __init__(self, name):
        self.name = name
    def Description():
        return 'This is an advanced class.'
    def ClassDescription(cls):
        return 'This is advanced class: %s' % repr(cls)
    Description = staticmethod(Description)
    ClassDescription = classmethod(ClassDescription)

obj1 = Advanced('Nectarine')
print obj1.Description()
print obj1.ClassDescription()
print '=' * 30
print Advanced.Description()
print Advanced.ClassDescription()

Running the above produces the following output:

This is an advanced class.
This is advanced class: <class __main__.Advanced at 0x401c926c>
==============================
This is an advanced class.
This is advanced class: <class __main__.Advanced at 0x401c926c>

Notes:

  • Create a static method with "x = staticmethod(y)", where y is a normal method but without the self/first parameter.

  • Create a class method with "x = classmethod(y)", where y is a normal method whose first parameter receives the class (conventionally spelled "cls") rather than the instance.

  • The difference between static and class methods is that a class method receives the class (not the instance) as its first argument.

  • You can call static and class methods using either an instance or a class. In our example either "obj1.Description()" or "Advanced.Description()" will work.

You should also review the relevant standard Python documentation which you can find at Python Library Reference - 2.1 Built-in Functions.

By now, you are likely to be asking: "Why and when should I use class methods and static methods?" Here is a bit of guidance:

  • Most of the time, almost always, implement plain instance methods. Implement an instance method whenever the method needs access to the values that are specific to the instance or needs to call other methods that have access to instance specific values. If the method needs self, then you probably need an instance method.

  • Implement a class method (1) when the method does not need access to instance variables and (2) when you do not want to require the caller of the method to create an instance and (3) when the method needs access to class variables. A class method may be called on either an instance or the class. A class method gets the class as a first argument, whether it is called on the class or the instance. If the method needs access to the class but does not need self, then think class method.

  • Implement a static method if you merely want to put the code of the method within the scope of the class, perhaps for purposes of organizing your code, but the method needs access to neither class nor instance variables (though you can access class variables through the class itself). A static method may be called on either an instance or the class. A static method gets neither the class nor the instance as an argument.

To summarize:

  • Implement an instance method, unless ...

  • ... the method needs access to class variables but not instance variables, then implement a class method, unless ...

  • ... the method needs access to neither instance variables nor class variables and you still want to include it within the class definition, then implement a static method.

  • Above all, write clear, plain code that will be understandable to your readers. Do not use a more confusing language feature and do not force your readers to learn a new language feature unless you have a good reason.
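
The three kinds of method can be seen side by side in one small class. This sketch assumes Python 3 and uses the decorator spelling (@classmethod, @staticmethod), which is available from Python 2.4 on and is equivalent to the "x = classmethod(x)" form shown earlier. The class Temperature and its methods are hypothetical.

```python
class Temperature:
    scale = 'Celsius'               # class variable

    def __init__(self, degrees):
        self.degrees = degrees      # instance variable

    def describe(self):
        # Instance method: needs self.
        return '%.1f %s' % (self.degrees, self.scale)

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Class method: needs the class (to build an instance), not self.
        return cls(cls.to_celsius(fahrenheit))

    @staticmethod
    def to_celsius(fahrenheit):
        # Static method: needs neither the class nor an instance.
        return (fahrenheit - 32) * 5.0 / 9.0

boiling = Temperature.from_fahrenheit(212)
print(boiling.describe())
```

Here from_fahrenheit can be called without first creating an instance, which matches conditions (1) and (2) in the guidance above.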

2.5.3 Modules

You can use a module to organize a number of Python definitions in a single file. Here is an example:

# python_101_module_simple.py

"""
This simple module contains definitions of a class and several 
functions.
"""

LABEL = '===== Testing a simple module ====='

class Person:
    """Sample of a simple class definition.
    """
    def __init__(self, name, description):
        self.name = name
        self.description = description
    def show(self):
        print 'Person -- name: %s  description: %s' % (self.name, self.description)

def test(msg, count):
    """A sample of a simple function.
    """
    for idx in range(count):
        print '%s %d' % (msg, idx)

def testDefaultArgs(arg1='default1', arg2='default2'):
    """A function with default arguments.
    """
    print 'arg1:', arg1
    print 'arg2:', arg2

def testArgLists(*args, **kwargs):
    """
    A function which references the argument list and keyword arguments.
    """
    print 'args:', args
    print 'kwargs:', kwargs

def main():
    """
    A test harness for this module.
    """
    print LABEL
    person = Person('Herman', 'A cute guy')
    person.show()
    print '=' * 30
    test('Test #', 4)
    print '=' * 30
    testDefaultArgs('Explicit value')
    print '=' * 30
    testArgLists('aaa', 'bbb', arg1='ccc', arg2='ddd')

if __name__ == '__main__':
    main()

Running the above produces the following output:

===== Testing a simple module =====
Person -- name: Herman  description: A cute guy
==============================
Test # 0
Test # 1
Test # 2
Test # 3
==============================
arg1: Explicit value
arg2: default2
==============================
args: ('aaa', 'bbb')
kwargs: {'arg1': 'ccc', 'arg2': 'ddd'}

Comments:

  • The strings at the beginning of the module, of each class definition, and of each function definition (docstrings) serve as documentation for these items. You can show this documentation with the following from the command-line:

    $ pydoc python_101_module_simple
    

    Or this, from the Python interactive prompt:

    >>> import python_101_module_simple
    >>> help(python_101_module_simple)
    

  • It is common and it is a good practice to include a test harness for the module at the end of the source file. Note that the test "if __name__ == '__main__':" will be true only when the file is run (e.g. from the command-line with "$ python python_101_module_simple.py"), but not when the module is imported.

  • Remember that the code in a module is evaluated only the first time the module is imported in a program. Later imports reuse the already loaded module, so global variables in a module are shared by all of its users, and changing them can cause behavior that users of the module might not expect.

  • Constants, on the other hand, are safe. A constant, in Python, is a variable whose value is initialized but not changed. An example is LABEL, above.
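
The point about module code being evaluated only once can be demonstrated with a throwaway module. This is a sketch under stated assumptions: Python 3, a hypothetical module named demo_once, and a temporary directory created just for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Write a hypothetical module, "demo_once", into a temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'demo_once.py'), 'w') as f:
    f.write("LABEL = 'a constant'\n"
            "state = []          # a module-level (global) variable\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import demo_once
demo_once.state.append('changed by the first importer')

# A second import does not re-run the module; it returns the same object,
# so the earlier change to the global variable is still visible.
import demo_once as second_reference
same_module = second_reference is demo_once
print(same_module, second_reference.state, second_reference.LABEL)
```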

2.5.4 Packages

A package is a way to organize a number of modules together as a unit. Python packages can also contain other packages.

To give us an example to talk about, consider the following package structure:

package_example/
package_example/__init__.py
package_example/module1.py
package_example/module2.py
package_example/A.py
package_example/B.py

And, here are the contents:

# __init__.py

# Expose definitions from modules in this package.
from module1 import class1
from module2 import class2

# module1.py

class class1:
    def __init__(self):
        self.description = 'class #1'
    def show(self):
        print self.description

# module2.py

class class2:
    def __init__(self):
        self.description = 'class #2'
    def show(self):
        print self.description

# A.py

import B

# B.py

def function_b():
    print 'Hello from function_b'

In order to be used as a Python package (e.g. so that modules can be imported from it) a directory must contain a file whose name is __init__.py. The code in this module is evaluated the first time a module is imported from the package.

In order to import modules from a package, you may either add the package directory to sys.path or, if the parent directory is on sys.path, use dot-notation to explicitly specify the path. In our example, you might use: "import package_example.module1".

A module in a package can import another module from the same package directly without using the path to the package. For example, the module A in our sample package package_example can import module B in the same package with "import B". Module A does not need to use "import package_example.B".

You can find additional information on packages at http://www.python.org/doc/essays/packages.html.
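
The following sketch builds a small package on disk and imports from it with dot-notation. It assumes Python 3, where an import of a sibling module inside a package must be written with a leading dot ("from .module1 import class1") rather than the Python 2 form used in this section. The package name pkg_demo and the temporary directory are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Build a hypothetical package, "pkg_demo", in a temporary directory.
parent_dir = tempfile.mkdtemp()
package_dir = os.path.join(parent_dir, 'pkg_demo')
os.mkdir(package_dir)

files = {
    # Python 3 requires the leading dot for an import within the package.
    '__init__.py': "from .module1 import class1\n",
    'module1.py': ("class class1:\n"
                   "    def __init__(self):\n"
                   "        self.description = 'class #1'\n"),
}
for filename, source in files.items():
    with open(os.path.join(package_dir, filename), 'w') as f:
        f.write(source)

# Put the parent directory on sys.path, then use dot-notation.
sys.path.insert(0, parent_dir)
importlib.invalidate_caches()

import pkg_demo.module1
from pkg_demo import class1

description = class1().description
print(description, class1 is pkg_demo.module1.class1)
```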

Suggested techniques:

  • In the __init__.py file, import and make available objects defined in modules in the package. Our sample package package_example does this. Then, you can use "from package_example import *" to import the package and its contents. For example:

    >>> from package_example import *
    >>> dir()
    ['__builtins__', '__doc__', '__file__', '__name__',
    'atexit', 'class1', 'class2', 'module1', 'module2',
    'readline', 'rlcompleter', 'sl', 'sys']
    >>>
    >>> c1 = class1()
    >>> c2 = class2()
    >>> c1.show()
    class #1
    >>> c2.show()
    class #2
    

A few additional notes:

  • With Python 2.3, you can collect the modules in a package into a Zip file by using PyZipFile from the Python standard library. See http://www.python.org/doc/current/lib/pyzipfile-objects.html.

    >>> import zipfile
    >>> a = zipfile.PyZipFile('mypackage.zip', 'w', zipfile.ZIP_DEFLATED)
    >>> a.writepy('Examples')
    >>> a.close()
    

    Then you can import and use this archive by inserting its path in sys.path. In the following example, class_basic_1 is a module within package mypackage:

    >>> import sys
    >>> sys.path.insert(0, '/w2/Txt/Training/mypackage.zip')
    >>> import class_basic_1
    Basic -- name: Apricot
    >>> obj = class_basic_1.Basic('Wilma')
    >>> obj.show()
    Basic -- name: Wilma
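
A self-contained variant of the same idea, as a sketch: it assumes Python 3, writes plain source into the archive with ZipFile.writestr instead of using PyZipFile, and uses a hypothetical module name (zipped_demo) and a temporary archive path.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Create an archive holding one hypothetical module, "zipped_demo".
archive_path = os.path.join(tempfile.mkdtemp(), 'demo_archive.zip')
with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as archive:
    archive.writestr('zipped_demo.py',
                     "def hello():\n    return 'Hello from the archive'\n")

# Insert the path of the archive itself (not its directory) in sys.path.
sys.path.insert(0, archive_path)
importlib.invalidate_caches()

import zipped_demo
message = zipped_demo.hello()
print(message)
```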
    


End Matter

Acknowledgements and Thanks

Thanks to the implementors of Python for producing an exceptionally usable and enjoyable programming language.

See Also:

The main Python Web Site
for more information on Python

Python Documentation
for lots of documentation on Python

The Python XML Special Interest Group
for more information on processing XML with Python

Dave's Web Site
for more software and information on using Python for XML and the Web

About this document ...

Python 101 -- Introduction to Python, June 6, 2003, Release 1.00

This document was generated using the LaTeX2HTML translator.

LaTeX2HTML is Copyright © 1993, 1994, 1995, 1996, 1997, Nikos Drakos, Computer Based Learning Unit, University of Leeds, and Copyright © 1997, 1998, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The application of LaTeX2HTML to the Python documentation has been heavily tailored by Fred L. Drake, Jr. Original navigation icons were contributed by Christopher Petrilli.
