计算机科学和Python编程导论week5

算法复杂度

不再使用毫秒测量时间,而是以程序执行的基本步数为单位进行测量。而人们通常最关注最差情形。

通常对于规模特别大的输入,我们才会担心算法的效率。作为一种对“特别大”的表示方法,渐近表示法描述了输入规模
趋近于无穷大时的算法复杂度。

渐进复杂度

# 书中例子练习如下：

>>> def f(x):
...         """假设x是正整数"""
...         ans = 0
...         #常数时间循环
...         for i in range(1000):
...                 ans += 1
...         print('Number of additions so far', ans)
...         #x时间循环
...         for i in range(x):
...                 ans += 1
...         print('Number of additions so far', ans)
...         #x**2时间的嵌套循环
...         for i in range(x):
...                 for j in range(x):
...                         ans += 1
...                         ans += 1
...         print('Number of additions so far', ans)
...         return ans
... 
# 调用10次
>>> f(10)
Number of additions so far 1000
Number of additions so far 1010
Number of additions so far 1210
1210

#调用1000次数
>>> f(1000)
Number of additions so far 1000
Number of additions so far 2000
Number of additions so far 2002000
2002000

如果假设执行每行代码需要一个单位时间,那么这个函数的运行时间可以描述为1000 + x +2x^2 。常数1000对应着第一个循环执行的次数。x项对应着第二个循环执行的次数。

通过上面的分析,我们可以使用以下规则描述算法的渐近复杂度:
1、如果运行时间是一个多项式的和,那么保留增长速度最快的项,去掉其他各项;
2、如果剩下的项是个乘积,那么去掉所有常数。

1、常数复杂度

2、对数复杂度

>>> def intToStr(i):
...     """假设i是非负整数，返回一个表示i的十进制字符串"""
...     digits = '0123456789'
...     if i == 0:
...             return '0'
...     result = ''
...     while i > 0:
...             result = digits[i%10] + result
...             i = i//10
...     return result
... 
>>> intToStr(10)
'10'

# intToStr 的复杂度是O(log(i))。

3、线性复杂度

>>> def addDigits(s):
...     """假设s是字符串，其中每个字符都是十进制数。decimal digit. 返回s中所有数值之和"""
...     val = 0 
...     for c in s:
...             val += int(c)
...     return val
... 
>>> addDigits('12345')
15

# 我们仍然假设表示数字的字符在常数时间内被转换为整数,那么这个函数的复杂度就与s的长度成线性关系,也就是O(len(s))。

4、对数线性复杂度

5、多项式复杂度

>>> def isSubset(L1, L2):
...     """假设L1和L2是列表。
...     如果L1中的每个元素也在L2中出现,则返回True
...     否则返回False。"""
...     for e1 in L1:
...             matched = False
...             for e2 in L2:
...                     if e1 == e2:
...                             matched = True
...                             break
...             if not matched:
...                     return False
...     return True
... 
>>> isSubset([1,2,3],[3])
False
>>> isSubset([1,2,3],[3,2,1,6,9])
True
>>> 
"""
程序每次执行到内层循环时,内层循环都要执行O(len(L2))次。函数 isSubset 要执行外部循
环O(len(L1))次,所以执行到内层循环的次数也是O(len(L1))。因此,函数 isSubset 的复杂度是
O(len(L1))*O(len(L2))。
"""

6、指数复杂度

>>> def getBinaryRep(n, numDigits):
...     """
...     假设n和numDigits为非负整数
...     返回一个长度为numDigits的字符串,为n的二进制表示
...     """
...     result = ''
...     while n > 0:
...             result = str(n%2) + result
...             n = n//2
...     if len(result) > numDigits:
...             raise ValueError('not enough digits')
...     for i in range(numDigits - len(result)):
...             result = '0' + result
...     return result
... 
>>> def genPowerset(L):
...     """
...     假设L是列表
...     返回一个列表,包含L中元素所有可能的集合。例如,如果L=[1, 2],则返回的列表包含元素[1]、[2]和[1,2]
...     """
...     powerset = []
...     for i in range(0, 2**len(L)):
...             binStr = getBinaryRep(i, len(L))
...             subset = []
...             for j in range(len(L)):
...                     if binStr[j] == '1':
...                             subset.append(L[j])
...             powerset.append(subset)
...     return powerset
... 
>>> genPowerset(['x','y',1])
[[], [1], ['y'], ['y', 1], ['x'], ['x', 1], ['x', 'y'], ['x', 'y', 1]]
"""
函数 genPowerset(L) 具有指数复杂度，函数中调用getBinaryRep()函数。
返回一个列表的列表,包含 L 中元素所有可能的组合。
例如,如果 L 是['x', 'y'] ,那么 L 的幂集就是包含 [ ] 、 ['x'] 、 ['y'] 和 ['x', 'y'] 这些列表的列表。
"""

一些简单算法和数据结构

下面列出了一些最常用的大O表示法实例。 n表示函数的输入规模。
1、O(1)表示常数运行时间。
2、 O(logn)表示对数运行时间。
3、O(n)表示线性运行时间。
4、 O(nlogn)表示对数线性运行时间。
5、 O(nk)表示多项式运行时间，注意k是常数。
6、 O(cn)表示指数运行时间，这时常数c为底数，复杂度为c的n次方。

简单算法及数据结构

1、归并排序

# 书中的这个例子，自己练习完后发现无输出，应该是哪里出现问题了。
>>> def merge(left, right, compare):
...     """
...     假设left和right是两个有序列表,compare定义了一种元素排序规则。
...     返回一个新的有序列表(按照compare定义的顺序),其中包含与
...     (left+right)相同的元素。"""
...     result = []
...     i,j = 0, 0
...     while i < len(left) and j < len(right):
...             if compare(left[i], right[j]):
...                     result.append(left[i])
...                     i += 1
...             else:
...                     result.append(right[j])
...     j += 1
...     while (i < len(left)):
...             result.append(left[i])
...             i += 1
...     while (j < len(right)):
...             result.append(right[j])
...             j += 1
...     return result
... 
>>> def mergeSort(L, compare = lambda x, y: x < y):
...     """
...     假设L是列表,compare定义了L中元素的排序规则
...     on elements of L
...     返回一个新的具有L中相同元素的有序列表。"""
...     if len(L) < 2:
...             return L[:]
...     else:
...             middle = len(L)//2
...             left = mergeSort(L[:middle], compare)
...             right = mergeSort(L[middle:], compare)
...             return merge(left, right, compare)

2、将函数用作参数

>>> def lastNameFirstName(name1, name2):
...     arg1 = name1.split(' ')
...     arg2 = name2.split(' ')
...     if arg1[1] != arg2[1]:
...             return arg1[1] < arg2[1]
...     else: #姓相同,则按照名排序
...             return arg1[0] < arg2[0]
... 
>>> def firstNameLastName(name1, name2):
...     arg1 = name1.split(' ')
...     arg2 = name2.split(' ')
...     if arg1[0] != arg2[0]:
...             return arg1[0] < arg2[0]
...     else: #名相同,则按照姓排序
...             return arg1[1] < arg2[1]
... 
>>> L = ['Tom Brady', 'Eric Grimson', 'Gisele Bundchen']
>>> newL = mergeSort(L, lastNameFirstName)
print('Sorted by last name =', newL)

# 也是按照书中的例子输入完之后，并没有输出。也应该是哪里有问题，可能是版本问题。

3、python中的排序

>>> L = [3,5,2]
>>> L
[3, 5, 2]
>>> sorted(L)
[2, 3, 5]
>>> D = {'a':12,'c':5,'b':'dog'}
>>> L.sort()
>>> print(L)
[2, 3, 5]
>>> L
[2, 3, 5]
>>> sorted(D)
['a', 'b', 'c']
>>> D.sort()
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'dict' object has no attribute 'sort'

视频中的一些练习

对列表进行升序排序

In [27]: def swapSort(L): 
    ...:     """ L is a list on integers """
    ...:     print ("Original L: ", L)
    ...:     for i in range(len(L)):
    ...:         for j in range(i+1, len(L)):
    ...:             if L[j] < L[i]:
    ...:                 # the next line is a short 
    ...:                 # form for swap L[i] and L[j]
    ...:                 L[j], L[i] = L[i], L[j] 
    ...:                 print (L)
    ...:     print ("Final L: ", L)
    ...:     

In [28]: L = [3,4,1,2,9,8]

In [30]: swapSort(L)
Original L:  [3, 4, 1, 2, 9, 8]
[1, 4, 3, 2, 9, 8]
[1, 3, 4, 2, 9, 8]
[1, 2, 4, 3, 9, 8]
[1, 2, 3, 4, 9, 8]
[1, 2, 3, 4, 8, 9]
Final L:  [1, 2, 3, 4, 8, 9]

变为降序排序

注意与上面进行区分，以及复杂度之间的变化

In [31]: def modSort(L): 
    ...:     """ L is a list on integers """
    ...:     print ("Original L: ", L)
    ...:     for i in range(len(L)):
    ...:         for j in range(len(L)):  #与上面的代码相比，这里进行一下变更，
    ...:             if L[j] < L[i]:
    ...:                 # the next line is a short 
    ...:                 # form for swap L[i] and L[j]
    ...:                 L[j], L[i] = L[i], L[j] 
    ...:                 print (L)
    ...:     print ("Final L: ", L)
    ...:     

In [32]: modSort(L)
Original L:  [1, 2, 3, 4, 8, 9]
[2, 1, 3, 4, 8, 9]
[3, 1, 2, 4, 8, 9]
[3, 2, 1, 4, 8, 9]
[4, 2, 1, 3, 8, 9]
[4, 3, 1, 2, 8, 9]
[4, 3, 2, 1, 8, 9]
[8, 3, 2, 1, 4, 9]
[8, 4, 2, 1, 3, 9]
[8, 4, 3, 1, 2, 9]
[8, 4, 3, 2, 1, 9]
[9, 4, 3, 2, 1, 8]
[9, 8, 3, 2, 1, 4]
[9, 8, 4, 2, 1, 3]
[9, 8, 4, 3, 1, 2]
[9, 8, 4, 3, 2, 1]
Final L:  [9, 8, 4, 3, 2, 1]

第10讲内存和查找

L10 PROBLEM 4

In [16]: def search(L, e):
    ...:     for i in range(len(L)):
    ...:         if L[i] == e:
    ...:             return True
    ...:         if L[i] > e:
    ...:             return False
    ...:     return False
    ...: 

In [17]: def search3(L, e):
    ...:     if L[0] == e:
    ...:         return True
    ...:     elif L[0] > e:
    ...:         return False
    ...:     else:
    ...:         return search3(L[1:], e)

答案给的是：
search and search3 return the same answers provided L is non-empty and e is in L

实际上下面这样输出结果也是一致的
In [18]: search([2,3,4,8],5)
Out[18]: False

In [19]: search3([2,3,4,8],5)
Out[19]: False

参考链接：
计算机科学和Python编程导论