Python Basics Study Notes (Part 1)



http://blog.csdn.net/zouxy09

 

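       A while ago I attended a short Microsoft training course on Windows Azure cloud computing, and Python was used quite a lot in it. Big companies such as Google (many components of its web crawler and search engine), Yahoo (managing its discussion groups), NASA, and YouTube (most of the video-sharing service is written in Python) are also fond of Python. In China, Douban is devoted to Python: both its front end and its back end are built with it. OpenCV, which we use heavily in computer vision, provides a Python interface as well, and there are many Python machine learning libraries available online (for example milk, scikit-learn, and Pylearn2), along with theano, a well-known deep learning library, and NLTK, a natural language processing library. Python also has good third-party libraries for mathematics, science, engineering, and plotting (for example NumPy, SciPy, and matplotlib), which has led some Matlab users to gradually switch to Python. After all, Matlab is expensive! Of course, Python is not just free: it also covers areas where Matlab is weak or lacking, such as file management, GUI design, and network communication. This gives Python an even broader user base.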

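       The word "Python" means a python snake, but there is a story behind the name that goes back to the language's creator, Guido van Rossum. Over the Christmas holiday of 1989, Guido was in Amsterdam. The beauty and bustle of the city could not fill the emptiness he felt, and the noise around him left his loneliness with nowhere to settle (the dark of night was no longer pure, haha). To pass the time, Guido decided to write an interpreter for a new scripting language, a descendant of the ABC language he had worked on earlier. That interpreter became the famous Python (loneliness makes men excel!). When it came to naming it, Guido took "Python" from Monty Python, a comedy group he was fond of.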


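       Officially speaking, Python is an interpreted, object-oriented, dynamically typed high-level programming language. Its syntax is concise and clear, and it comes with a rich and powerful class library. It is often nicknamed a glue language, because it can easily join together modules written in other languages (especially C/C++). Since its birth in the early 1990s, Python has gradually become widely used for system administration tasks and web programming, and it is now one of the most popular programming languages. In January 2011, the TIOBE programming language index named it the language of the year for 2010. Since 2004, Python's usage has grown roughly linearly.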

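       This post is a set of notes I took while watching the "中谷教育-Python视频教程" video tutorial. It records some Python fundamentals, organized here both as a summary and for future reference. There are other good tutorials online as well, such as "Python 精要参考".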

 

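Contents

I. Installation, Compilation, and Running

II. Variables, Operators, and Expressions

III. Data Types
      1. Numbers
      2. Strings
      3. Tuples
      4. Lists
      5. Dictionaries

IV. Flow Control
      1. if-else
      2. for
      3. while
      4. switch

V. Functions
      1. User-Defined Functions
      2. Lambda Functions
      3. Python Built-in Functions

VI. Packages and Modules
      1. Modules
      2. Packages

VII. Regular Expressions
      1. Metacharacters
      2. Common Functions
      3. Groups
      4. A Small Example: a Web Crawler

VIII. Deep Copy and Shallow Copy

IX. Files and Directories
      1. Reading and Writing Files
      2. The os Module
      3. Directory Traversal

X. Exception Handling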

 

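I. Installation, Compilation, and Running

       Installing Python is easy: download it from the official site, http://www.python.org/, and install it. Ubuntu usually ships with Python preinstalled; if it is missing, run apt-get install python as root. On Windows, download the msi package and install it. Python programs are executed by an interpreter, so after installation you will see two ways to start it: IDLE (Python GUI) and Python (command line). The former has a graphical interface, and the latter is the same as running python at the command prompt. Once the interpreter is running it shows the prompt >>>. Type a statement after the prompt and it is executed immediately, just like in Matlab.

       Also, just as Matlab has .m script files, Python has script files with the .py suffix. Besides being run directly from source, such a file can be compiled to bytecode ahead of time. The interpreter executes bytecode either way, so precompiling saves the compilation step when the file is loaded rather than making the code itself run faster.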

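       For example, suppose I want to print helloWorld (the code in these notes uses Python 2 syntax).

Method 1: directly in the interpreter: >>> print 'helloWorld'

Method 2: write this line of code into a file, for example hello.py. There are three ways to run the file:

1) In a terminal: python hello.py

2) Compile it to a .pyc file first:

import py_compile

py_compile.compile("hello.py")

Then in a terminal: python hello.pyc

3) In a terminal:

python -O -m py_compile hello.py

python hello.pyo

       A .pyc or .pyo file loads faster because the compilation step is skipped, but the code does not execute any faster once it is loaded. This is why modules that are imported again and again are cached in these compiled forms.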
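The compile-then-run workflow can also be scripted. The following is a minimal sketch, assuming Python 3 rather than the Python 2 used in these notes; the temporary directory, the file contents, and the variable names are made up for illustration. It uses only the standard library modules py_compile and runpy.

```python
import os
import py_compile
import runpy
import tempfile

# Write a tiny script into a temporary directory (a stand-in for hello.py).
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "hello.py")
with open(source_path, "w") as handle:
    handle.write("message = 'helloWorld'\nprint(message)\n")

# Compile to bytecode at an explicit location; doraise=True turns
# compilation problems into exceptions instead of printed messages.
compiled_path = os.path.join(workdir, "hello.pyc")
py_compile.compile(source_path, cfile=compiled_path, doraise=True)

# Execute the bytecode file directly; the returned dict holds its globals.
namespace = runpy.run_path(compiled_path)
```

Passing cfile explicitly keeps the .pyc next to the source, which mirrors the hello.pyc naming used above; without it, Python 3 places the file in a __pycache__ directory.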

 

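II. Variables, Operators, and Expressions

         There is not much to say here: anyone with a background in another programming language will have no trouble. The syntax is quite close to Matlab's, and the differences are small.

         One thing to note: in Python 2, 5/2 equals 2, and only 5.0/2 equals 2.5, because dividing one integer by another discards the fractional part.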

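###################################
### compute #######
# raw_input() reads a line from the keyboard and returns it as a string,
# so we convert it to int before doing arithmetic
# Supported operators include:
# and or not in is < <= > >= != == | ^ & << >> + - * / % ~ **
print 'Please input a number:'
number = int(raw_input())
number += 1
print number**2 # ** is exponentiation (written ^ in Matlab)
print number and 1 # prints 1 if number is non-zero, otherwise 0
print number or 1 # prints number if it is non-zero, otherwise 1
print not number # prints True only when number is 0
5/2 # is 2 in Python 2 (integer division)
5.0/2 # is 2.5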
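The division behaviour is one of the places where Python 2 and Python 3 differ. The sketch below assumes Python 3, where / always performs true division and // performs floor division; the variable names are only for illustration.

```python
# Python 3 semantics: / is true division, // is floor division.
true_quotient = 5 / 2                    # 2.5
floor_quotient = 5 // 2                  # 2
quotient_and_remainder = divmod(5, 2)    # (2, 1)
power = 3 ** 2                           # 9

# and/or return one of their operands rather than a strict boolean.
number = 0
and_result = number and 1    # 0, the first falsy operand
or_result = number or 1      # 1, the first truthy operand
not_result = not number      # True
```

In Python 2 the same true-division behaviour is available through from __future__ import division, which is used later in these notes.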

III. Data Types

1. Numbers

         The usual int, long, float, and complex types are all supported. Python picks the type of a variable from the value you assign to it, as follows:

###################################
### type of value #######
# int, long, float, complex
# there is no need to declare the type of a variable, Python
# infers it from the value
num = 1   # stored as int type
num = 1111111111111   # promoted to long automatically if it does not fit in an int
num = 1.0   # stored as float type
num = 12L # the L suffix forces the long type
num = 1 + 12j # the j suffix marks the imaginary part of a complex number
num = '1' # string type
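To see which type Python actually chose, type() can be applied to each value. This is a small sketch assuming Python 3, where the separate long type no longer exists and int has arbitrary precision; the list name samples is just for illustration.

```python
# Each value gets its type from the literal used to write it.
samples = [1, 1111111111111, 1.0, 1 + 12j, '1']
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(repr(value), '->', name)
```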

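2. Strings

         Single quotes, double quotes, and triple quotes can all be used to define strings. Triple quotes define a string that keeps its layout across several lines. As a sequence type, a string supports index access and slice access much like Matlab, except that indices start at 0.

###################################
### type of string #######
num = "1" # string type
num = "Let's go" # double quotes allow a single quote inside
num = "He's \"old\"" # escape inner double quotes with a backslash
mail = "Xiaoyi: \n hello \n I am you!"
mail = """Xiaoyi:
	hello
	I am you!
	""" # triple quotes keep line breaks and indentation
string = 'xiaoyi' # access characters by index, starting at 0
copy = string[0] + string[1] + string[2:6] # [2:6] covers indices 2 to 5, i.e. [2, 6)
copy = string[:4] # 'xiao', from index 0 up to but not including 4
copy = string[2:] # 'aoyi', from index 2 to the end
copy = string[::1] # step is 1, from start to end
copy = string[::2] # 'xay', step is 2
copy = string[-1] # 'i', the last character
copy = string[-2:-4:-1] # 'yo', a negative step walks backwards
memAddr = id(num) # id() returns the identity of the object (its memory address in CPython)
type(num) # get the type of num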
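Slices with a negative step are easy to get wrong, because the start index has to lie to the right of the stop index. The following sketch (plain Python, valid in both Python 2 and 3; the variable names are illustrative) collects the common cases for the string 'xiaoyi'.

```python
text = 'xiaoyi'

first_four = text[:4]        # 'xiao': indices 0..3
from_third = text[2:]        # 'aoyi': index 2 to the end
every_other = text[::2]      # 'xay': indices 0, 2, 4
reversed_text = text[::-1]   # 'iyoaix': the whole string backwards

# With step -1 the slice walks from start down to (but excluding) stop.
backwards = text[-2:-4:-1]   # 'yo': indices 4, 3
empty = text[-4:-2:-1]       # '': start is already left of stop
```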

3. Tuples

         A tuple is defined with (). It works like an array that can hold values of different types. Its elements can be accessed by index, but note that they cannot be modified.

###################################
### sequence type #######
## elements can be accessed by index or slice
## includes: string, tuple (roughly an array, struct or cell in Matlab terms), list
# basic operations on sequence types
firstName = 'Zou'
lastName = 'Xiaoyi'
len(string) # the length
name = firstName + lastName # concatenate two strings
firstName * 3 # repeat firstName 3 times
'Z' in firstName # membership test, returns True
string = '123'
max(string) # '3'
min(string) # '1'
cmp(firstName, lastName) # returns 1, -1 or 0 (here 1, because 'Z' > 'X')

## tuple (roughly an array, struct or cell in Matlab terms)
## define this type using ()
user = ("xiaoyi", 25, "male")
name = user[0]
age = user[1]
gender = user[2]
t1 = () # empty tuple
t2 = (2, ) # a tuple with a single element needs a trailing comma
user[1] = 26 # TypeError!! the elements of a tuple cannot be changed
name, age, gender = user # unpack the three elements into three variables
a, b, c = (1, 2, 3)
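Because a tuple cannot be changed in place, an "update" means building a new tuple. A minimal sketch, valid in both Python 2 and 3, with illustrative names such as older_user:

```python
user = ("xiaoyi", 25, "male")
name, age, gender = user          # unpacking works for any sequence

# Item assignment on a tuple raises TypeError.
try:
    user[1] = 26
    outcome = "modified"
except TypeError:
    outcome = "TypeError"

# Build a new tuple from slices of the old one instead.
older_user = user[:1] + (26,) + user[2:]

single = (2,)       # a one-element tuple
not_a_tuple = (2)   # just the integer 2 in parentheses
```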

4. Lists

         A list is defined with []. It serves the same purpose as a tuple, except that its elements can be modified. list is a class, and it provides many methods for operating on list objects.

## list type (the elements can be modified)
## define this type using []
userList = ["xiaoyi", 25, "male"]
name = userList[0]
age = userList[1]
gender = userList[2]
userList[3] = 88888 # IndexError! assigning past the end does not grow the list, unlike Matlab
userList.append(8888) # add a new element at the end
"male" in userList # membership test
userList[2] = 'female' # an element can be modified (the list object itself stays the same)
userList.remove(8888) # remove the first element equal to 8888
userList.remove(userList[2]) # remove by value
del userList[1] # del is a statement that removes the element at an index
## help(list.append)

################################
######## object and class ######
## object = property + method
## Python treats everything as an object; the list type is a class,
## so defining a list "userList" creates an object, and we use
## its methods to operate on the elements
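The list methods mentioned above can be traced step by step. This is a small sketch that runs under Python 2 and 3; the values are invented for illustration.

```python
user_list = ["xiaoyi", 25, "male"]

# Assigning to an index that does not exist raises IndexError.
try:
    user_list[3] = 88888
    grew_by_assignment = True
except IndexError:
    grew_by_assignment = False

user_list.append(8888)        # ['xiaoyi', 25, 'male', 8888]
user_list.insert(1, "Zou")    # ['xiaoyi', 'Zou', 25, 'male', 8888]
user_list.remove(8888)        # ['xiaoyi', 'Zou', 25, 'male']
popped = user_list.pop()      # removes and returns the last element
del user_list[0]              # removes the element at index 0
```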

5. Dictionaries

         A dictionary is defined with {}. Its strength is that it stores key-value pairs, much like a struct. The dict class also provides methods for creating and manipulating dictionaries.

################################
######## dictionary type ######
## define this type using {}
item = ['name', 'age', 'gender']
value = ['xiaoyi', '25', 'male']
zip(item, value) # zip() produces a new list of pairs:
# [('name', 'xiaoyi'), ('age', '25'), ('gender', 'male')]
# but a list of pairs cannot be looked up by name;
# a dictionary stores the same relationship as key-value pairs
# dic = {key1: value1, key2: value2, ...}
# a value can be of any type, a key must be hashable (number, string, tuple, ...)
dic = {'name': 'xiaoyi', 'age': 25, 'gender': 'male'}
dic = {1: 'zou', 'name': 'xiaoyi', 'age': 25, 'gender': 'male'} # keys of different types can be mixed
# access a value with dic[key], using the key as the index
print dic['name']
print dic[1]
# other ways to create a dictionary
fdict = dict([['x', 1], ['y', 2]]) # factory function, takes a sequence of pairs
ddict = {}.fromkeys(('x', 'y'), -1) # every key gets the same value, -1 here (None if omitted)
# iterate with a for loop
for key in dic:
	print key
	print dic[key]

# a dictionary is unordered, so a new key-value pair
# can be added directly like this:
dic['tel'] = 88888
# look up, update or delete elements
dic.get(1) # get the value of key 1, or None if the key is missing
dic.get(2, 'error') # return a user-defined default if the key is missing
dic.keys()
dic.values()
dic.has_key('tel') # True, the same as: 'tel' in dic
del dic[1] # delete this key
dic.pop('tel') # return the value and delete this key
dic.clear() # empty the dictionary
del dic # delete the variable itself
# dictionaries have many more operations, use help(dict) to check them out
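The zip() call at the start of the listing can be fed straight into dict() to build the dictionary. A minimal sketch, assuming Python 3 (where has_key() no longer exists and the in operator is used instead); the key 'email' is a made-up example of a missing key.

```python
items = ['name', 'age', 'gender']
values = ['xiaoyi', 25, 'male']

# Pair the two lists up and turn the pairs into a dictionary.
user = dict(zip(items, values))

user['tel'] = 88888                       # add a new key-value pair
missing = user.get('email', 'error')      # default for a missing key
has_tel = 'tel' in user                   # replaces has_key()
tel = user.pop('tel')                     # remove the key, keep its value

# Sort the pairs by key to get a predictable order for display.
pairs = sorted(user.items())
```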

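IV. Flow Control

         Here Python differs markedly from most other languages: it uses indented blocks to express program logic, where most other languages use braces or similar. For example:

if age < 21:
    print("You can't buy alcohol.")
    print("But you can buy chewing gum.")
print("This line is outside the if block.")

         This code is equivalent to the following C:

if (age < 21)
{
    printf("You can't buy alcohol.\n");
    printf("But you can buy chewing gum.\n");
}
printf("This line is outside the if block.\n");

       As you can see, Python uses indentation to mark where a block starts and ends (the off-side rule) rather than braces or keywords. Increasing the indentation starts a block (note the colon on the preceding line), and decreasing it ends the block. PEP 8 recommends 4 spaces per indentation level. The interpreter itself accepts tabs or any number of spaces, as long as every line in the same block is indented identically, but other widths do not follow the style guide.

       So that our own code fits in well with other people's, it is best to follow the convention and indent with four spaces (and whichever you choose, use all spaces or all tabs, never a mix of the two).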

1. if-else

         if-else tests conditions so that a block of code runs only when its condition holds.

################################
######## procedure control #####
## if else
if expression: # a boolean expression, do not forget the colon
	statement(s) # the body must be indented one level

if expression: 
statement(s) # error!!!! the body is not indented
	
if 1<2:
	print 'ok, ' # indented one level
	print 'yeah' # every line of the block uses the same indentation
	
if True: # True must be capitalized
	print 'true'

def fun():
	return 1

if fun():
	print 'ok'
else:
	print 'no'
	
con = int(raw_input('please input a number:'))
if con < 2:
	print 'small'
elif con > 3:
	print 'big'
else:
	print 'middle'
	
if 1 < 2:
	if 2 < 3:
		print 'yeah'
	else:
		print 'no'	
	print 'out'
else:
	print 'bad'

if (1 < 2 and 2 < 3) or not 0: # conditions can be combined with and, or, not
	print 'yeah'
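The if/elif/else chain above is easier to try out when it is wrapped in a function. This is a sketch that runs under Python 2 and 3; the function name classify is made up, and the thresholds are the ones used in the listing.

```python
def classify(number):
    """Return 'small', 'big' or 'middle' using the thresholds from the notes."""
    if number < 2:
        return 'small'
    elif number > 3:
        return 'big'
    else:
        return 'middle'

# Exactly one branch runs for each input.
labels = [classify(n) for n in (1, 2, 3, 4)]
```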

2. for

         for repeatedly executes a block of code. It can also be used to iterate over the sequence types described above.

################################
######## procedure control #####
## for
for iterating_val in sequence:
	statements(s)
# sequence type can be string, tuple or list

for i in "abcd":
	print i

for i in [1, 2, 3, 4]:
	print i

# range(start, end, step): step defaults to 1 and start defaults to 0;
# note that the interval is half-open, [start, end), not [start, end]
range(5) # [0, 1, 2, 3, 4]
range(1, 5) # [1, 2, 3, 4]
range(1, 10, 2) # [1, 3, 5, 7, 9]
for i in range(1, 100, 1): 
	print i

# traverse a basic sequence by index
fruits = ['apple', 'banana', 'mango']
for index in range(len(fruits)): 
	print 'current fruit: ', fruits[index]

# traverse a dictionary (iterating over it yields the keys)
dic = {1: 111, 2: 222, 5: 555}
for x in dic:
	print x, ': ', dic[x]
	
dic.items() # returns a list of (key, value) pairs such as [(1, 111), (2, 222), (5, 555)]
for key,value in dic.items(): # each pair is unpacked, just like a,b=[1,2]
	print key, ': ', value
else: # runs once after the loop finishes without hitting break
	print 'ending'

################################
import time
# break and continue can also be used to control the loop
for x in range(1, 11):
	print x
	time.sleep(1) # sleep 1s
	if x == 3:
		pass # do nothing
	if x == 2:
		continue # skip the rest of this iteration
	if x == 6:
		break # leave the loop, so x never reaches 7 here
	if x == 7:	
		exit() # would exit the whole program
	print '#'*50
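The else clause of a for loop is mostly useful for searches: it runs only when the loop ends without break. The sketch below runs under Python 2 and 3; find_first_even is a hypothetical helper, and enumerate() is shown as an alternative to range(len(...)) for index-based traversal.

```python
def find_first_even(numbers):
    """Return the first even number, or None when there is none."""
    for number in numbers:
        if number % 2 == 0:
            result = number
            break               # found one: the else clause is skipped
    else:
        result = None           # the loop ran to the end without break
    return result

# enumerate() yields (index, element) pairs directly.
fruits = ['apple', 'banana', 'mango']
indexed = list(enumerate(fruits))
```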

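3. while

         while is also used for looping. It first checks the loop condition that follows it; if the condition is true, it executes the block after the colon and then tests the condition again, repeating until the condition is false. The indented block after the colon is the loop body.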

################################
######## procedure control #####
## while
while expression:
	statement(s)

while True:
	print 'hello'
	x = raw_input('please input something, q for quit:')
	if x == 'q':
		break
else: # runs only when the condition becomes false; never reached here, because break skips it
	print 'ending'
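To see both outcomes of while/else without keyboard input, the loop can be driven by a counter. A minimal sketch for Python 2 and 3; countdown and its stop_at parameter are invented for this illustration.

```python
def countdown(start, stop_at=None):
    """Count down from start, recording what the loop did."""
    events = []
    current = start
    while current > 0:
        if current == stop_at:
            events.append('break')
            break                   # leaving via break skips the else clause
        events.append(current)
        current -= 1
    else:
        events.append('ending')     # condition became false normally
    return events

finished = countdown(3)
interrupted = countdown(3, stop_at=2)
```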

4. switch

         Python does not actually provide a switch statement, but we can easily build one from a dictionary and functions. For example:

#############################
## switch ####
## Python does not support this structure,
## but we can implement it with a dictionary and functions
## cal.py ##
#!/usr/local/python

from __future__ import division
# with this import, 5/2=2.5 and 6/2=3.0

def add(x, y):
	return x + y
def sub(x, y):
	return x - y
def mul(x, y):
	return x * y
def div(x, y):
	return x / y

operator = {"+": add, "-": sub, "*": mul, "/": div}
operator["+"](1, 2) # the same as add(1, 2)
operator["%"](1, 2) # KeyError: there is no key "%"
operator.get("+")(1, 2) # the same as add(1, 2); get() returns None for a missing key instead of raising

def cal(x, o, y):
	print operator.get(o)(x, y)
cal(2, "+", 3)
# a dictionary lookup avoids walking through a long if-else chain
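One gap in the dictionary-based switch is the default case: get() returns None for an unknown key, and calling None fails with a confusing TypeError. The sketch below is one possible way to handle it, runnable under Python 2 and 3. It swaps the hand-written add/sub/mul/div for the equivalent functions from the standard operator module; OPERATIONS and calculate are hypothetical names.

```python
import operator as op

# Map each symbol to the function that implements it.
OPERATIONS = {'+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv}

def calculate(x, symbol, y):
    """Dispatch on symbol; unknown symbols raise a clear error."""
    handler = OPERATIONS.get(symbol)
    if handler is None:
        raise ValueError('unsupported operator: %r' % symbol)
    return handler(x, y)

total = calculate(2, '+', 3)
ratio = calculate(5, '/', 2)
```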


V. Functions

1. User-Defined Functions

         In Python, functions are created with the def statement:

################################
######## function #####	
def functionName(parameters): # a function may also take no parameters
	bodyOfFunction

def add(a, b):
	return a+b # a function without a return statement returns None by default
	
a = 100
b = 200
total = add(a, b)

##### function.py #####
#!/usr/bin/python
#coding:utf8  # declare the source encoding so that Chinese text is supported
def add(a = 1, b = 2): # default parameter values
	return a+b	# any type of data can be returned
# the following calls are all ok
add() # 3
add(2) # a = 2, b = 2
add(b = 1) # a = 1, b = 1; a keyword argument must use a parameter name
add(3, 4) # 7

###### global and local variables #####
## global variable: defined outside any function, and can be read
##				anywhere, even inside functions
## local variable: defined inside a function, and can only be used
##				inside that function
## a local variable hides a global variable of the same name
val = 100 # global variable
def fun():
	print val # reads the global val = 100
print val # reads the global val = 100, too

def fun():
	a = 100 # local variable
	print a
print a # NameError: the local a is not visible here

def fun():
	global a # declare a as a global variable
	a = 100
	print a

print a # NameError: a does not exist yet, because fun() has not been called
fun()
print a # now prints 100

############################
## other types of parameters
def fun(x):
	print x
# the following calls are all ok
fun(10)	# int
fun('hello') # string
fun(('x', 2, 3))  # tuple
fun([1, 2, 3])	  # list
fun({1: 1, 2: 2}) # dictionary

## tuple
def fun(x, y):
	print "%s : %s" % (x,y) # %s formats the value as a string
fun('Zou', 'xiaoyi')
tu = ('Zou', 'xiaoyi')
fun(*tu)	# * unpacks the tuple into positional arguments

## dictionary
def fun(name = "name", age = 0):
	print "name: %s" % name
	print "age: %s" % age
dic = {'name': "xiaoyi", 'age': 25} # the keys must be strings matching the parameter names of fun()
fun(**dic) # ** unpacks the dictionary into keyword arguments
fun(age = 25, name = 'xiaoyi') # the result is the same
## the advantage of a dictionary is that each value is passed by name

#############################
## variable-length arguments ####
## extra positional arguments
def fun(x, *args): # extra positional arguments are collected in args as a tuple
	print x
	print args
# the following calls are ok
fun(10) # x = 10, args = ()
fun(10, 12, 24) # x = 10, args = (12, 24)

## extra keyword arguments
def fun(x, **args): # extra keyword arguments are collected in args as a dictionary
	print x
	print args
# the following calls are ok
fun(10) # x = 10, args = {}
fun(x = 10, y = 12, z = 15) # x = 10, args = {'y': 12, 'z': 15}

# mix of tuple and dictionary
def fun(x, *args, **kwargs):
	print x
	print args
	print kwargs
fun(1, 2, 3, 4, y = 10, z = 12) # x = 1, args = (2, 3, 4), kwargs = {'y': 10, 'z': 12}
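Returning the collected arguments instead of printing them makes the packing and unpacking rules easy to check. A minimal sketch for Python 2 and 3; describe and greet are hypothetical functions modelled on the fun() examples above.

```python
def describe(x, *args, **kwargs):
    """Return the arguments exactly as Python collected them."""
    return x, args, kwargs

def greet(name='name', age=0):
    return 'name: %s, age: %s' % (name, age)

# Extra positional arguments land in a tuple, extra keywords in a dict.
packed = describe(1, 2, 3, 4, y=10, z=12)

# * and ** do the reverse at the call site.
from_tuple = greet(*('xiaoyi', 25))
from_dict = greet(**{'name': 'xiaoyi', 'age': 25})
```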

2. Lambda Functions

         A lambda function defines a single-line function. Its convenience shows in the following examples:

#############################
## lambda function ####
## define a quick single-line function
fun = lambda x,y : x*y # fun is a function object
fun(2, 3)
# equivalent to
def fun(x, y):
	return x*y

## recursion
# 5! = 5*4*3*2*1
def recursion(n):
	if n > 1:
		return n * recursion(n-1)
	return 1 # base case; without it the function returns None and the multiplication fails

def mul(x, y):
	return x * y
numList = range(1, 6) # [1, 2, 3, 4, 5]
reduce(mul, numList) # 5! = 120
reduce(lambda x,y : x*y, numList) # 5! = 120, lambda avoids defining a separate function

### list comprehension
numList = [1, 2, 6, 7]
filter(lambda x : x % 2 == 0, numList) # [2, 6]
print [x for x in numList if x % 2 == 0] # the same as above
map(lambda x : x * 2 + 10, numList) # [12, 14, 22, 24]
print [x * 2 + 10 for x in numList] # the same as above
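Under Python 3 the same ideas need two small adjustments: reduce() has moved to the functools module, and filter() and map() return lazy iterators that must be wrapped in list() to see the values. A sketch under those assumptions, with factorial as an illustrative name:

```python
from functools import reduce

def factorial(n):
    """n! as a reduce over 1..n; the initial value 1 also covers n == 0."""
    return reduce(lambda x, y: x * y, range(1, n + 1), 1)

numbers = [1, 2, 6, 7]
evens = list(filter(lambda x: x % 2 == 0, numbers))
scaled = list(map(lambda x: x * 2 + 10, numbers))

# The list comprehension forms give the same results.
evens_again = [x for x in numbers if x % 2 == 0]
scaled_again = [x * 2 + 10 for x in numbers]
```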

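3. Python Built-in Functions

       Python has many built-in functions that are available without importing anything, and the standard library modules that ship as .py files in the Python installation directory add many more. Knowing what is available is very useful for programming efficiently and avoids reinventing the wheel. Only some commonly used ones are listed below: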

###################################
## built-in function of python ####
## if you do not know how to use one, call help() on it
abs, max, min, len, divmod, pow, round, callable,
isinstance, cmp, range, xrange, type, id, int(),
list(), tuple(), hex(), oct(), chr(), ord(), long()

callable # tests whether an object can be called, returns True if it can
# (handy for checking that a name refers to a function)

isinstance # tests the type of an object
numList = [1, 2]
if type(numList) == type([]):
	print "It is a list"
if isinstance(numList, list): # the same check, and the preferred one
	print "It is a list"
	
for i in range(1, 10001): # builds a list of 10000 elements first, which costs memory
	pass
for i in xrange(1, 10001): # produces the values one at a time, no list is built
	pass

## some basic string methods
s = 'hello world' # named s so that the built-in str is not shadowed
s.capitalize() # 'Hello world', only the first letter is capitalized
s.replace("hello", "good") # 'good world'
ip = "192.168.1.123"
ip.split('.') # return ['192', '168', '1', '123']
help(str.split)

import string
s = 'hello world'
string.replace(s, "hello", "good") # 'good world'

## some basic functions for sequences
len, max, min
# filter(function or None, sequence)
def fun(x):
	return x > 5
numList = [1, 2, 6, 7]
filter(fun, numList) # [6, 7]: an element is kept when fun returns True, otherwise it is dropped
filter(lambda x : x % 2 == 0, numList) # [2, 6]
# zip()
name = ["me", "you"]
age = [25, 26]
tel = ["123", "234"]
zip(name, age, tel) # returns a list: [('me', 25, '123'), ('you', 26, '234')]
# map()
map(None, name, age, tel) # with None as the function, also returns [('me', 25, '123'), ('you', 26, '234')]
test = ["hello1", "hello2", "hello3"]
zip(name, age, tel, test) # stops at the shortest input: [('me', 25, '123', 'hello1'), ('you', 26, '234', 'hello2')]
map(None, name, age, tel, test) # pads the shorter inputs with None: [('me', 25, '123', 'hello1'), ('you', 26, '234', 'hello2'), (None, None, None, 'hello3')]
a = [1, 3, 5]
b = [2, 4, 6]
def mul(x, y):
	return x*y
map(mul, a, b) # returns [2, 12, 30]
# reduce()
reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) # computes ((((1+2)+3)+4)+5) = 15
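The map(None, ...) padding trick is specific to Python 2. A sketch of the Python 3 counterpart, assuming itertools.zip_longest as the replacement; the list names are shortened versions of the ones above.

```python
from itertools import zip_longest

names = ['me', 'you']
ages = [25, 26]
tests = ['hello1', 'hello2', 'hello3']

# zip() stops at the shortest input.
truncated = list(zip(names, ages, tests))

# zip_longest() pads the shorter inputs with None, like map(None, ...) did.
padded = list(zip_longest(names, ages, tests))

# map() with several sequences still calls the function pairwise.
products = list(map(lambda x, y: x * y, [1, 3, 5], [2, 4, 6]))
```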

