A Python Implementation of Merge Sort

Approach

Merge sort is a classic divide-and-conquer algorithm. To sort an array A, split it into two parts B and C, sort B and C separately, and then merge B and C in order.

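This divide-and-conquer idea maps naturally onto the MapReduce architecture. Because B and C are sorted independently of each other, the two sorts can run in parallel (the Map step), while the merge of B and C can be implemented as the Reduce step.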
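The map/reduce shape described above can be sketched roughly as follows. This is only an illustration under some assumptions: the name parallel_sort and the chunks parameter are hypothetical, each chunk is sorted with the built-in sorted() rather than with merge sort itself, and a thread pool is used purely to show the structure (in CPython, threads are unlikely to speed up CPU-bound sorting).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from heapq import merge


def parallel_sort(data, chunks=4):
    """Sort `data` by sorting chunks independently, then merging them."""
    if not data:
        return []
    size = -(-len(data) // chunks)  # ceiling division
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    # "Map" step: each chunk is sorted independently of the others.
    with ThreadPoolExecutor() as pool:
        sorted_parts = list(pool.map(sorted, parts))
    # "Reduce" step: fold the sorted chunks together with a two-way merge.
    return reduce(lambda a, b: list(merge(a, b)), sorted_parts)


print(parallel_sort([1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]))
# [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
```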

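Merge sort is a recursive process: the original sequence is repeatedly split into two smaller sequences until each contains a single element (which is trivially sorted), and the recursion then returns level by level to each call site, merging the pieces pairwise.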
[Figure 1: Python implementation of the merge sort algorithm]
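To make the order of splitting and merging concrete, here is a small sketch that records which index ranges would be merged, and in what order, without sorting anything. The helper name merge_order is hypothetical; it uses the same midpoint split q = (p + r) // 2 as the code later in this post.

```python
def merge_order(p, r, calls=None):
    """Return the (p, q, r) triples in the order the merges would run."""
    if calls is None:
        calls = []
    if p < r:
        q = (p + r) // 2
        merge_order(p, q, calls)      # split the left half further
        merge_order(q + 1, r, calls)  # split the right half further
        calls.append((p, q, r))       # merge only after both halves return
    return calls


print(merge_order(0, 3))
# [(0, 0, 1), (2, 2, 3), (0, 1, 3)]
```

The single-element ranges produce no merge at all; the first merges to run are the ones over two adjacent elements, and the merge over the full range comes last.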

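To implement this process, two functions are needed:

  • merge()
    merge() merges two subarrays; it corresponds to the merge step in the figure above
  • mergeSort()
    mergeSort() drives the recursive calls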

Code

For the latest code, please see my GitHub.

# merge sort


# assume that A is an array (a Python list)
# the goal is to sort a subsequence of A in ascending order
# p is the index of the first element of the subsequence
# r is the index of the last element of the subsequence
# elements [p, q] form a sorted subsequence
# elements [q+1, r] also form a sorted subsequence
def merge(A, p, q, r):
    L = A[p:q+1]
    R = A[q+1:r+1]
    # append positive infinity as a sentinel so that neither
    # L nor R runs out of elements during traversal;
    # this just keeps the code clean
    L.append(float("inf"))
    R.append(float("inf"))
    print("L: ", L)
    print("R: ", R)

    idxL = 0
    idxR = 0
    for idxA in range(p, r+1):
        print("idxL: ", idxL, " idxR: ", idxR)
        if L[idxL] <= R[idxR]:
            A[idxA] = L[idxL]
            idxL = idxL + 1
        else:
            A[idxA] = R[idxR]
            idxR = idxR + 1
        print("round ", idxA, "of sorting result: ", A)


# recursion process
# A is the array to be sorted
# p is the index of the start element
# r is the index of the end element
def mergeSort(A, p, r):
    if p < r:
        # try to split equally
        q = (p + r)//2
        mergeSort(A, p, q)
        mergeSort(A, q+1, r)
        merge(A, p, q, r)


if __name__ == '__main__':
    print("--->Test function merge()...")
    A = [2, 4, 5, 7, 9, 1, 2, 3, 6]
    print("original seq: ", A)
    merge(A, 0, 4, 8)

    print("\n--->Test function mergeSort()...")
    B = [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
    mergeSort(B, 0, len(B)-1)

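The idea behind the merge() function above is as follows:
Because the two subsequences being merged are each already sorted (in ascending order), each loop iteration only needs to compare the smallest remaining value of each subsequence (the value its current index points at) and write the smaller of the two back into the original sequence. The lengths of the two subsequences add up to N, the length of the range being merged, so N iterations complete the process.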
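The same idea can be sketched on two standalone lists, with one loop iteration per output element. This is a minimal illustration rather than the code from this post: the name merge_lists is hypothetical, and it returns a new list and checks bounds explicitly instead of writing back into A.

```python
def merge_lists(left, right):
    """Merge two ascending lists into a new ascending list."""
    result = []
    i = j = 0
    # One iteration per output element: len(left) + len(right) in total.
    for _ in range(len(left) + len(right)):
        # Take from `left` if `right` is used up, or if left's head is not larger.
        if j == len(right) or (i < len(left) and left[i] <= right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    return result


print(merge_lists([2, 4, 5, 7, 9], [1, 2, 3, 6]))
# [1, 2, 2, 3, 4, 5, 6, 7, 9]
```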

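The code above uses a small trick: a positive-infinity value is appended to the end of each of the two subarrays. This prevents an index from running off the end once one subsequence has been fully consumed. Suppose all the elements of subsequence B have been consumed; B's index then points at positive infinity. Every remaining element of C is smaller than positive infinity (assuming the data itself contains no infinite values), so the same comparison in the loop body copies all the remaining elements of C into the original array A, and no separate logic is needed for the case where one subsequence is empty.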
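The sentinel trick assumes every element can be compared with float("inf"); a list of strings, for instance, would raise a TypeError in Python 3. For comparison, here is a sketch of a sentinel-free variant that copies the leftover tail explicitly. The name merge_no_sentinel is hypothetical, and the debug prints are omitted.

```python
def merge_no_sentinel(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] in place without sentinels."""
    L = A[p:q + 1]
    R = A[q + 1:r + 1]
    i = j = 0
    k = p
    # Compare heads while both halves still have elements.
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
        k += 1
    # At most one of these slices is non-empty; copy the leftover tail.
    A[k:r + 1] = L[i:] + R[j:]


words = ["kiwi", "pear", "apple", "fig"]
merge_no_sentinel(words, 0, 1, 3)
print(words)
# ['apple', 'fig', 'kiwi', 'pear']
```

The price is a few extra lines for the tail copy, which is exactly what the sentinel version avoids.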

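The mergeSort() function ties the whole recursive process together. Note that the original array can be split evenly or in any custom way; an even split is used here.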
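As an illustration of a custom split, the sketch below takes the split rule as a parameter. The names merge_sort_split and one_third are hypothetical, and heapq.merge from the standard library stands in for the merge() function of this post. Any rule works as long as it returns q with p <= q < r, although uneven splits tend to deepen the recursion.

```python
from heapq import merge as merge_sorted


def merge_sort_split(A, p, r, split=lambda p, r: (p + r) // 2):
    """Sort A[p..r] in place; `split` picks q with p <= q < r."""
    if p < r:
        q = split(p, r)
        merge_sort_split(A, p, q, split)
        merge_sort_split(A, q + 1, r, split)
        # Merge the two sorted runs and write them back into A[p..r].
        A[p:r + 1] = list(merge_sorted(A[p:q + 1], A[q + 1:r + 1]))


def one_third(p, r):
    # Put roughly one third of the elements in the left part.
    return p + (r - p) // 3


B = [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
merge_sort_split(B, 0, len(B) - 1, split=one_third)
print(B)
# [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
```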

Below is the output of the code. It tests merge() and mergeSort() in turn, and the printed trace shows how the merging and the recursion proceed:

--->Test function merge()...
original seq:  [2, 4, 5, 7, 9, 1, 2, 3, 6]
L:  [2, 4, 5, 7, 9, inf]
R:  [1, 2, 3, 6, inf]
idxL:  0  idxR:  0
round  0 of sorting result:  [1, 4, 5, 7, 9, 1, 2, 3, 6]
idxL:  0  idxR:  1
round  1 of sorting result:  [1, 2, 5, 7, 9, 1, 2, 3, 6]
idxL:  1  idxR:  1
round  2 of sorting result:  [1, 2, 2, 7, 9, 1, 2, 3, 6]
idxL:  1  idxR:  2
round  3 of sorting result:  [1, 2, 2, 3, 9, 1, 2, 3, 6]
idxL:  1  idxR:  3
round  4 of sorting result:  [1, 2, 2, 3, 4, 1, 2, 3, 6]
idxL:  2  idxR:  3
round  5 of sorting result:  [1, 2, 2, 3, 4, 5, 2, 3, 6]
idxL:  3  idxR:  3
round  6 of sorting result:  [1, 2, 2, 3, 4, 5, 6, 3, 6]
idxL:  3  idxR:  4
round  7 of sorting result:  [1, 2, 2, 3, 4, 5, 6, 7, 6]
idxL:  4  idxR:  4
round  8 of sorting result:  [1, 2, 2, 3, 4, 5, 6, 7, 9]

--->Test function mergeSort()...
L:  [1, inf]
R:  [3, inf]
idxL:  0  idxR:  0
round  0 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  1 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
L:  [1, 3, inf]
R:  [7, inf]
idxL:  0  idxR:  0
round  0 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  1 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  2  idxR:  0
round  2 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
L:  [2, inf]
R:  [4, inf]
idxL:  0  idxR:  0
round  3 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  4 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
L:  [2, 4, inf]
R:  [9, inf]
idxL:  0  idxR:  0
round  3 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  4 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  2  idxR:  0
round  5 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
L:  [1, 3, 7, inf]
R:  [2, 4, 9, inf]
idxL:  0  idxR:  0
round  0 of sorting result:  [1, 3, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  1 of sorting result:  [1, 2, 7, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  1  idxR:  1
round  2 of sorting result:  [1, 2, 3, 2, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  2  idxR:  1
round  3 of sorting result:  [1, 2, 3, 4, 4, 9, 10, 1, 11, 12, 18, 9]
idxL:  2  idxR:  2
round  4 of sorting result:  [1, 2, 3, 4, 7, 9, 10, 1, 11, 12, 18, 9]
idxL:  3  idxR:  2
round  5 of sorting result:  [1, 2, 3, 4, 7, 9, 10, 1, 11, 12, 18, 9]
L:  [10, inf]
R:  [1, inf]
idxL:  0  idxR:  0
round  6 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 1, 11, 12, 18, 9]
idxL:  0  idxR:  1
round  7 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 12, 18, 9]
L:  [1, 10, inf]
R:  [11, inf]
idxL:  0  idxR:  0
round  6 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  7 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 12, 18, 9]
idxL:  2  idxR:  0
round  8 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 12, 18, 9]
L:  [12, inf]
R:  [18, inf]
idxL:  0  idxR:  0
round  9 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 12, 18, 9]
idxL:  1  idxR:  0
round  10 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 12, 18, 9]
L:  [12, 18, inf]
R:  [9, inf]
idxL:  0  idxR:  0
round  9 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 9, 18, 9]
idxL:  0  idxR:  1
round  10 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 9, 12, 9]
idxL:  1  idxR:  1
round  11 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 9, 12, 18]
L:  [1, 10, 11, inf]
R:  [9, 12, 18, inf]
idxL:  0  idxR:  0
round  6 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 10, 11, 9, 12, 18]
idxL:  1  idxR:  0
round  7 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 9, 11, 9, 12, 18]
idxL:  1  idxR:  1
round  8 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 9, 10, 9, 12, 18]
idxL:  2  idxR:  1
round  9 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 9, 10, 11, 12, 18]
idxL:  3  idxR:  1
round  10 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 9, 10, 11, 12, 18]
idxL:  3  idxR:  2
round  11 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 9, 10, 11, 12, 18]
L:  [1, 2, 3, 4, 7, 9, inf]
R:  [1, 9, 10, 11, 12, 18, inf]
idxL:  0  idxR:  0
round  0 of sorting result:  [1, 2, 3, 4, 7, 9, 1, 9, 10, 11, 12, 18]
idxL:  1  idxR:  0
round  1 of sorting result:  [1, 1, 3, 4, 7, 9, 1, 9, 10, 11, 12, 18]
idxL:  1  idxR:  1
round  2 of sorting result:  [1, 1, 2, 4, 7, 9, 1, 9, 10, 11, 12, 18]
idxL:  2  idxR:  1
round  3 of sorting result:  [1, 1, 2, 3, 7, 9, 1, 9, 10, 11, 12, 18]
idxL:  3  idxR:  1
round  4 of sorting result:  [1, 1, 2, 3, 4, 9, 1, 9, 10, 11, 12, 18]
idxL:  4  idxR:  1
round  5 of sorting result:  [1, 1, 2, 3, 4, 7, 1, 9, 10, 11, 12, 18]
idxL:  5  idxR:  1
round  6 of sorting result:  [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
idxL:  6  idxR:  1
round  7 of sorting result:  [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
idxL:  6  idxR:  2
round  8 of sorting result:  [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
idxL:  6  idxR:  3
round  9 of sorting result:  [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
idxL:  6  idxR:  4
round  10 of sorting result:  [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
idxL:  6  idxR:  5
round  11 of sorting result:  [1, 1, 2, 3, 4, 7, 9, 9, 10, 11, 12, 18]
