Heap: a (binary) heap is an array that can be viewed as a nearly complete binary tree.
Max-heap property: every node $i$ other than the root satisfies $A[\mathrm{PARENT}(i)] \ge A[i]$.
Min-heap property: every node $i$ other than the root satisfies $A[\mathrm{PARENT}(i)] \le A[i]$.
Height of a node: the number of edges on the longest simple downward path from that node to a leaf.
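Common notation (array indices start at 1):
Number of nodes in the heap: $\mathit{heapsize} = n$
Number of internal (non-leaf) nodes: $\lfloor n/2 \rfloor$
Height of the heap: $\lfloor \lg n \rfloor$
Maximum number of nodes of height $h$ in an $n$-element heap: $\lceil n/2^{h+1} \rceil$
Parent of node $i$: $\lfloor i/2 \rfloor$
Left child of node $i$: $2i$
Right child of node $i$: $2i+1$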
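As a small illustration of the index arithmetic above, the sketch below translates the 1-based formulas to Python's 0-based lists and checks the max-heap property. The helper names (`parent`, `left`, `right`, `is_max_heap`, `heap_height`) are illustrative choices and do not come from the notes.

```python
def parent(i):
    """Index of the parent of node i (0-based list indexing)."""
    return (i - 1) // 2

def left(i):
    """Index of the left child of node i (0-based)."""
    return 2 * i + 1

def right(i):
    """Index of the right child of node i (0-based)."""
    return 2 * i + 2

def is_max_heap(a):
    """Return True if every non-root element is no larger than its parent."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))

def heap_height(n):
    """Height of an n-element heap, i.e. floor(lg n), for n >= 1."""
    return n.bit_length() - 1

heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(is_max_heap(heap))       # True
print(heap_height(len(heap)))  # 3
print(len(heap) // 2)          # 5 internal nodes
```

With 0-based indexing the children of `i` sit at `2i+1` and `2i+2`, which is the 1-based rule shifted by one position.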
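MAX-HEAPIFY is the key procedure for maintaining the max-heap property.
Given an index i into the array A, A[i] may violate the max-heap property, that is, A[i] may be smaller than one of its children, while the two subtrees rooted at LEFT(i) and RIGHT(i) are already max-heaps. MAX-HEAPIFY makes the subtree rooted at i satisfy the max-heap property by letting the value A[i] **"float down"** level by level through the heap.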
Pseudocode: MAX-HEAPIFY(A, i)
    l = LEFT(i)
    r = RIGHT(i)
    if l <= A.heap-size and A[l] > A[i]
        largest = l
    else largest = i
    if r <= A.heap-size and A[r] > A[largest]
        largest = r
    if largest != i
        exchange A[i] with A[largest]
        MAX-HEAPIFY(A, largest)
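Python code:
def max_heapify(A, i):
    # Python lists are 0-based, so the children of index i are 2i+1 and 2i+2.
    heap_size = len(A)
    left = 2 * i + 1
    right = 2 * i + 2
    if left < heap_size and A[left] > A[i]:
        largest = left
    else:
        largest = i
    if right < heap_size and A[right] > A[largest]:
        largest = right
    if largest != i:
        # Swap A[i] with its larger child, then continue down that subtree.
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest)
    return A

B = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
k = 1  # index of the value 4, the only node that violates the max-heap property
print(max_heapify(B, k))  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]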
Running time: $T(n) = O(\lg n)$
Time complexity on a node of height $h$: $O(h)$
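To make the $O(h)$ bound concrete, here is a minimal iterative sketch of the same sift-down that also counts how many levels the element descends. The function name `max_heapify_iter` and the optional `heap_size` parameter are assumptions made for this example; indices are 0-based.

```python
def max_heapify_iter(a, i, heap_size=None):
    """Sift a[i] down in place (0-based); return the number of levels it moved."""
    if heap_size is None:
        heap_size = len(a)
    steps = 0
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return steps  # heap property holds at i, nothing left to do
        a[i], a[largest] = a[largest], a[i]
        i = largest
        steps += 1

data = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
moved = max_heapify_iter(data, 1)
print(data, moved)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] 2
```

The node at index 1 has height 2 in this 10-element heap, and the element moves exactly 2 levels, so the count never exceeds the height of the starting node.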
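Initialization: before the first iteration of the BUILD-MAX-HEAP loop, $i = \lfloor n/2 \rfloor$, and the nodes $\lfloor n/2 \rfloor + 1, \lfloor n/2 \rfloor + 2, \ldots, n$ are all leaves, so each of them is the root of a trivial max-heap.
Maintenance: each iteration preserves the loop invariant. The children of node $i$ have larger indices than $i$, so they are already roots of max-heaps; MAX-HEAPIFY(A, i) then makes node $i$ the root of a max-heap as well.
Termination: when the procedure ends, $i = 0$, so every node $1, 2, \ldots, n$, and in particular node 1, is the root of a max-heap.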
Pseudocode: BUILD-MAX-HEAP(A)
    A.heap-size = A.length
    for i = ⌊A.length/2⌋ downto 1
        MAX-HEAPIFY(A, i)
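Python code:
def build_max_heap(A):
    heap_size = len(A)
    # With 0-based indexing the last internal node is at index heap_size // 2 - 1.
    # Sift down every internal node, from the last one up to the root (index 0).
    for i in range(heap_size // 2 - 1, -1, -1):
        max_heapify(A, i)
    return A

B = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
print(build_max_heap(B))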
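The loop invariant described above (initialization, maintenance, termination) can be checked mechanically. The sketch below is a self-contained variant that asserts the invariant before every iteration; `sift_down`, `is_heap_root`, and `build_max_heap_checked` are hypothetical names introduced only for this example, and indices are 0-based.

```python
def sift_down(a, i, n):
    """Restore the max-heap property for the subtree rooted at index i (0-based)."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def is_heap_root(a, root, n):
    """True if the subtree rooted at `root` satisfies the max-heap property."""
    for child in (2 * root + 1, 2 * root + 2):
        if child < n and (a[child] > a[root] or not is_heap_root(a, child, n)):
            return False
    return True

def build_max_heap_checked(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        # Loop invariant: every node after index i is the root of a max-heap.
        assert all(is_heap_root(a, j, n) for j in range(i + 1, n))
        sift_down(a, i, n)
    # Termination: the invariant now also covers index 0, the root.
    assert is_heap_root(a, 0, n)
    return a

print(build_max_heap_checked([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
# [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The assertions are far more expensive than the construction itself, so this form is only meant for studying the invariant, not for production use.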
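Steps: build a max-heap from the array; then repeatedly exchange the root (the current maximum) with the last element of the heap, shrink the heap by one, and restore the max-heap property at the root.
Pseudocode: HEAPSORT(A)
    BUILD-MAX-HEAP(A)
    for i = A.length downto 2
        exchange A[1] with A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)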
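Python code:
# Convention in this listing: A[0] stores the heap size and the keys live in
# A[1..A[0]], so the 1-based index formulas of the pseudocode apply directly.
def max_heapify(A, i):
    heap_size = A[0]
    left = 2 * i
    right = 2 * i + 1
    if left <= heap_size and A[left] > A[i]:
        largest = left
    else:
        largest = i
    if right <= heap_size and A[right] > A[largest]:
        largest = right
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest)
    return A

def build_max_heap(A):
    heap_size = A[0]
    # Sift down every internal node, from the last one (heap_size // 2) to the root (1).
    for i in range(heap_size // 2, 0, -1):
        max_heapify(A, i)
    return A

def heapsort(A):
    build_max_heap(A)
    for i in range(len(A) - 1, 1, -1):
        A[1], A[i] = A[i], A[1]  # move the current maximum to its final position
        A[0] = A[0] - 1          # shrink the heap by one
        max_heapify(A, 1)
    return A[1:]  # drop the size slot and return only the sorted keys

B = [10, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]  # B[0] = 10 is the heap size
print(heapsort(B))  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]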
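For comparison, the same idea can be sketched with Python's standard `heapq` module. Note that `heapq` implements a min-heap rather than the max-heap used above, so popping repeatedly yields the elements in ascending order directly. The function name `heapsort_with_heapq` is an illustrative choice.

```python
import heapq

def heapsort_with_heapq(items):
    """Return a new ascending list, using the standard library's binary min-heap."""
    heap = list(items)   # copy so the caller's list is left untouched
    heapq.heapify(heap)  # bottom-up heap construction in linear time
    # Each heappop removes and returns the current minimum in O(lg n).
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort_with_heapq([16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))
# [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

Unlike the in-place version, this sketch uses a second list for the output, trading memory for brevity.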
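Priority queue: a priority queue is a data structure for maintaining a set S of elements, each with an associated key.
Operations (for a max-priority queue backed by a max-heap A):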
Pseudocode:
MAXIMUM(A)
    return A[1]    // A is already a max-heap, so the maximum is at the root
EXTRACT-MAX(A)
    if A.heap-size < 1
        error "heap underflow"
    max = A[1]
    A[1] = A[A.heap-size]
    A.heap-size = A.heap-size - 1
    MAX-HEAPIFY(A, 1)
    return max
INCREASE-KEY(A, i, key)
    if key < A[i]
        error "new key is smaller than current key"
    A[i] = key
    while i > 1 and A[PARENT(i)] < A[i]
        exchange A[i] with A[PARENT(i)]
        i = PARENT(i)
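The three operations above can be sketched as a small Python class. This is a minimal illustration under stated assumptions: the class name `MaxPriorityQueue` and its method names are hypothetical, the heap is stored in a 0-based list, and elements are identified by their list index as in the pseudocode.

```python
class MaxPriorityQueue:
    """Max-priority queue on a 0-based list; methods mirror the pseudocode."""

    def __init__(self, keys=()):
        self.a = list(keys)
        # Build a max-heap bottom-up from the initial keys.
        for i in range(len(self.a) // 2 - 1, -1, -1):
            self._sift_down(i)

    def _sift_down(self, i):
        n = len(self.a)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.a[child] > self.a[largest]:
                    largest = child
            if largest == i:
                return
            self.a[i], self.a[largest] = self.a[largest], self.a[i]
            i = largest

    def maximum(self):
        if not self.a:
            raise IndexError("heap underflow")
        return self.a[0]

    def extract_max(self):
        if not self.a:
            raise IndexError("heap underflow")
        top = self.a[0]
        last = self.a.pop()  # shrink the heap by one
        if self.a:
            self.a[0] = last  # move the last leaf to the root, then repair
            self._sift_down(0)
        return top

    def increase_key(self, i, key):
        if key < self.a[i]:
            raise ValueError("new key is smaller than current key")
        self.a[i] = key
        # Bubble the enlarged key up while it is larger than its parent.
        while i > 0 and self.a[(i - 1) // 2] < self.a[i]:
            p = (i - 1) // 2
            self.a[i], self.a[p] = self.a[p], self.a[i]
            i = p

pq = MaxPriorityQueue([16, 4, 10, 14, 7, 9, 3, 2, 8, 1])
print(pq.maximum())      # 16
print(pq.extract_max())  # 16
print(pq.maximum())      # 14
pq.increase_key(len(pq.a) - 1, 20)
print(pq.maximum())      # 20
```

Reading the maximum takes constant time, while extraction and key increase each follow a single root-to-leaf path and therefore cost $O(\lg n)$.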
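Procedure: partition A[p..r] around a pivot so that every element to the left of the pivot is less than or equal to it and every element to the right is greater, then sort the two sides recursively.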
Pseudocode:
QUICKSORT(A, p, r)
    if p < r
        q = PARTITION(A, p, r)
        QUICKSORT(A, p, q-1)
        QUICKSORT(A, q+1, r)
PARTITION(A, p, r)
    x = A[r]
    i = p - 1
    for j = p to r-1
        if A[j] <= x
            i = i + 1
            exchange A[i] with A[j]
    exchange A[i+1] with A[r]
    return i + 1
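Python code (a variant that is not in place and uses the first element as the pivot):
def quicksort(array):
    # Base case: a list with fewer than two elements is already sorted.
    if len(array) < 2:
        return array
    else:
        pivot = array[0]
        # Elements no larger than the pivot go left, the rest go right.
        less = [i for i in array[1:] if i <= pivot]
        greater = [i for i in array[1:] if i > pivot]
        return quicksort(less) + [pivot] + quicksort(greater)

print(quicksort([10, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))  # [1, 2, 3, 4, 7, 8, 9, 10, 10, 14, 16]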
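The list-based version above allocates new lists at every level. As a complement, here is a sketch that follows the QUICKSORT and PARTITION pseudocode more literally and sorts in place; it assumes 0-based inclusive bounds `p` and `r`, and the name `quicksort_inplace` is chosen only for this example.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive, 0-based) around the pivot a[r]."""
    x = a[r]
    i = p - 1  # a[p..i] holds the elements <= x seen so far
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]  # put the pivot between the two regions
    return i + 1

def quicksort_inplace(a, p=0, r=None):
    """Sort a[p..r] in place and return the list for convenience."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort_inplace(a, p, q - 1)
        quicksort_inplace(a, q + 1, r)
    return a

demo = [2, 8, 7, 1, 3, 5, 6, 4]
print(partition(demo, 0, len(demo) - 1), demo)  # 3 [2, 1, 3, 4, 7, 5, 6, 8]
print(quicksort_inplace([10, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))
# [1, 2, 3, 4, 7, 8, 9, 10, 10, 14, 16]
```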
Pseudocode:
RANDOMIZED-PARTITION(A, p, r)
    i = RANDOM(p, r)
    exchange A[r] with A[i]
    return PARTITION(A, p, r)
RANDOMIZED-QUICKSORT(A, p, r)
    if p < r
        q = RANDOMIZED-PARTITION(A, p, r)
        RANDOMIZED-QUICKSORT(A, p, q-1)
        RANDOMIZED-QUICKSORT(A, q+1, r)
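A possible Python rendering of the randomized pseudocode above is sketched below. It assumes 0-based inclusive bounds and uses `random.randint`, whose range includes both endpoints, in the role of RANDOM(p, r); the function names are illustrative.

```python
import random

def randomized_partition(a, p, r):
    """Swap a random element of a[p..r] into position r, then partition around it."""
    i = random.randint(p, r)  # both endpoints are included
    a[r], a[i] = a[i], a[r]
    x = a[r]
    k = p - 1
    for j in range(p, r):
        if a[j] <= x:
            k += 1
            a[k], a[j] = a[j], a[k]
    a[k + 1], a[r] = a[r], a[k + 1]
    return k + 1

def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place with a randomly chosen pivot at every level."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
    return a

print(randomized_quicksort([10, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))
# [1, 2, 3, 4, 7, 8, 9, 10, 10, 14, 16]
```

The sorted result does not depend on the random choices; only the shape of the recursion does, which is why no fixed input can reliably trigger the worst case.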
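Idea: for each input element x, determine the number of elements less than x. That count directly gives the position of x in the output array: if 17 elements are less than x, then x belongs in output position 18.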
Pseudocode: COUNTING-SORT(A, B, k)
    // A is the input array, B holds the sorted output, k is the largest value in A
    let C[0..k] be a new array       // temporary working storage
    for i = 0 to k                   // initialize C
        C[i] = 0
    for j = 1 to A.length            // C[v] now counts the elements equal to v
        C[A[j]] = C[A[j]] + 1
    for i = 1 to k
        C[i] = C[i] + C[i-1]         // C[i] now counts the elements less than or equal to i
    for j = A.length downto 1
        B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] - 1
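Python code:
def counting_sort(old_array, new_array, k):
    # k is an upper bound on the values in old_array, so indices 0..k are needed.
    temp_array = (k + 1) * [0]
    for j in range(0, len(old_array)):
        temp_array[old_array[j]] = temp_array[old_array[j]] + 1
    for i in range(1, k + 1):
        temp_array[i] = temp_array[i] + temp_array[i-1]
    # temp_array[v] is now the number of elements less than or equal to v.
    for j in range(len(old_array) - 1, -1, -1):
        # Python lists are 0-based, so the 1-based output position is shifted by one.
        new_array[temp_array[old_array[j]] - 1] = old_array[j]
        temp_array[old_array[j]] = temp_array[old_array[j]] - 1
    return new_array

A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
B = len(A) * [0]
k = 30
print(counting_sort(A, B, k))  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]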
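To see how the auxiliary array evolves, the following sketch returns the intermediate states of counting sort on a small input. The function name `counting_sort_trace` and the returned tuple layout are assumptions made for illustration only.

```python
def counting_sort_trace(a, k):
    """Return (counts, prefix, output) for values in the range 0..k."""
    counts = [0] * (k + 1)
    for x in a:
        counts[x] += 1                # counts[v] = number of elements equal to v
    prefix = counts[:]
    for i in range(1, k + 1):
        prefix[i] += prefix[i - 1]    # prefix[v] = number of elements <= v
    out = [None] * len(a)
    pos = prefix[:]                   # working copy, consumed while placing elements
    for x in reversed(a):             # scanning backwards keeps equal keys in order
        pos[x] -= 1
        out[pos[x]] = x
    return counts, prefix, out

counts, prefix, out = counting_sort_trace([2, 5, 3, 0, 2, 3, 0, 3], 5)
print(counts)  # [2, 0, 2, 3, 0, 1]
print(prefix)  # [2, 2, 4, 7, 7, 8]
print(out)     # [0, 0, 2, 2, 3, 3, 3, 5]
```

Reading `prefix[3] = 7` as "seven elements are at most 3" explains why the last 3 in the input lands at 0-based index 6.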
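Idea: assume the input is drawn from a uniform distribution (generated by a random process) over the interval [0, 1). Distribute the values into a number of ordered buckets, sort each bucket separately, and then read the buckets out in order to obtain the sorted sequence.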
Pseudocode: BUCKET-SORT(A)
    n = A.length
    let B[0..n-1] be a new array
    for i = 0 to n-1
        make B[i] an empty list
    for i = 1 to n
        insert A[i] into list B[⌊n·A[i]⌋]
    for i = 0 to n-1
        sort list B[i] with insertion sort
    concatenate the lists B[0], B[1], ..., B[n-1] together in order
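Python code:
import math
import numpy as np

# Insertion sort
def insert_sort(mylist):
    length = len(mylist)  # length of the list
    for i in range(1, length):
        j = i - 1  # index of the element just before the current one
        if mylist[i] < mylist[j]:
            # The current value is smaller than its predecessor:
            # save it and shift the predecessor one slot to the right.
            temp = mylist[i]
            mylist[i] = mylist[j]
            j = j - 1
            # Keep shifting larger elements right until a smaller one
            # or the start of the list is reached.
            while j >= 0 and mylist[j] > temp:
                mylist[j+1] = mylist[j]
                j = j - 1
            mylist[j+1] = temp  # put the saved value in its place
    return mylist

# Bucket sort
def bucket_sort(old_list):
    n = len(old_list)
    buckets = [[] for _ in range(n)]  # n empty buckets
    for x in old_list:
        # Inputs are assumed to lie in [0, 1), so floor(n * x) is in 0..n-1.
        buckets[math.floor(n * x)].append(x)
    new_list = []
    for bucket in buckets:
        new_list.extend(insert_sort(bucket))  # sort each bucket, then concatenate
    return new_list

A = np.random.random(10)  # 10 samples drawn uniformly from [0, 1)
print(bucket_sort(A))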
Running time: $\Theta(n)$
Finding the minimum: scan every element of the set in turn, keeping track of the smallest element seen so far.
Pseudocode: MINIMUM(A)
    min = A[1]
    for i = 2 to A.length
        if min > A[i]
            min = A[i]
    return min
Python code:
def minimum(A):
    smallest = A[0]  # named `smallest` to avoid shadowing the built-in min()
    for i in range(1, len(A)):
        if smallest > A[i]:
            smallest = A[i]
    return smallest

A = [10, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
print(minimum(A))  # 1
Pseudocode: RANDOMIZED-SELECT(A, p, r, i)
    if p == r
        return A[p]
    q = RANDOMIZED-PARTITION(A, p, r)
    k = q - p + 1
    if i == k
        return A[q]
    else if i < k
        return RANDOMIZED-SELECT(A, p, q-1, i)
    else return RANDOMIZED-SELECT(A, q+1, r, i-k)
Worst-case running time: $\Theta(n^2)$
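A minimal Python sketch of RANDOMIZED-SELECT follows. It assumes 0-based inclusive bounds `p` and `r`, a 1-based rank `i` (so `i = 1` asks for the minimum), and it reorders the list it is given; the function names are illustrative.

```python
import random

def randomized_partition(a, p, r):
    """Partition a[p..r] (inclusive) around a randomly chosen pivot; return its index."""
    s = random.randint(p, r)  # both endpoints are included
    a[r], a[s] = a[s], a[r]
    x = a[r]
    k = p - 1
    for j in range(p, r):
        if a[j] <= x:
            k += 1
            a[k], a[j] = a[j], a[k]
    a[k + 1], a[r] = a[r], a[k + 1]
    return k + 1

def randomized_select(a, p, r, i):
    """Return the i-th smallest element of a[p..r]; a is reordered in place."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)

values = [10, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
print(randomized_select(list(values), 0, len(values) - 1, 1))  # 1, the minimum
print(randomized_select(list(values), 0, len(values) - 1, 6))  # 8, the median
```

Only one side of each partition is explored, which is what distinguishes selection from a full quicksort.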