Python: Heaps

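Whenever I needed a heap in my code I used to reach for priority_queue from the C++ STL. While learning Python recently I decided to implement one myself, and I also learned how to use Python's built-in heapq module. A heap can be used to implement heap sort, a priority queue, and more. Ready-made modules exist, but working through an implementation is a good way to understand the data structure.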

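Heap is the general name for a class of special data structures in computer science. A heap is usually an array object that can be viewed as a tree. A heap always satisfies the following properties:

(1) The value of any node is always no greater than, or always no less than, the value of its parent;

(2) A heap is always a complete binary tree.

A heap whose root is the largest element is called a max-heap, and a heap whose root is the smallest element is called a min-heap.

The rest of this post uses a min-heap to walk through a simple Python implementation.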
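To make the two properties concrete, here is a minimal sketch of a checker for the min-heap case. It assumes the array layout used by the listings in this post: index 0 is an unused placeholder, the root sits at index 1, and the parent of index i is i // 2. The name is_min_heap is made up for illustration. Storing the tree in a contiguous array makes it a complete binary tree automatically, so only the ordering property needs to be checked.

```python
def is_min_heap(heap):
    """Return True if heap[1:] satisfies the min-heap ordering property.

    Assumes a 1-based layout: heap[0] is an unused placeholder.
    """
    n = len(heap) - 1  # number of real elements
    for i in range(2, n + 1):
        parent = i // 2
        if heap[parent] > heap[i]:
            return False  # a child is smaller than its parent
    return True


print(is_min_heap([0, 1, 2, 3, 5, 4]))  # True
print(is_min_heap([0, 5, 3, 2, 1, 4]))  # False
```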

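1. Python implementation

The most important heap operations are: inserting a new element, checking whether the heap is empty, popping the top element, and converting an array into a heap.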

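(1) Inserting a new element

Every insertion must preserve the heap property. For a min-heap this means that, after the insertion, every element is still no greater than its left and right children.

This relies on an operation called swim (sift up), which adjusts the newly inserted element until the heap property holds again. If the current node is smaller than its parent, swap the two and treat the parent position as the new current node. This is a recursive process (it can also be implemented without recursion) that stops once the current node is no smaller than its parent or has reached the root. The element appears to float up through the binary tree from the bottom, hence the name.

Insertion proceeds as follows:

1) Append the new element to the end of the array (the heap we are building);

2) Apply swim to that element.

Implementation: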

# Swap two elements
def swap(heap, x, y):
    heap[x], heap[y] = heap[y], heap[x]

# Sift up: x is the value currently stored at index location
def swim(heap, x, location):
    parent = location // 2
    if location > 1 and x < heap[parent]:
        swap(heap, location, parent)
        swim(heap, x, parent)

# Return the size of the heap (index 0 is a placeholder)
def heapLen(heap):
    return len(heap) - 1

# Add a new element to the heap
def push(heap, x):
    heap.append(x)
    location = heapLen(heap)
    swim(heap, x, location)

# An example
if __name__ == "__main__":
    data = [5, 3, 2, 1, 4]
    # Placeholder so that array indices start at 1
    heap = [0]
    for i in data:
        print("Insert :", i)
        push(heap, i)
        print("The heap :", heap[1:], "Heap size =", heapLen(heap))

Output:

Insert : 5
The heap : [5] Heap size = 1
Insert : 3
The heap : [3, 5] Heap size = 2
Insert : 2
The heap : [2, 5, 3] Heap size = 3
Insert : 1
The heap : [1, 2, 3, 5] Heap size = 4
Insert : 4
The heap : [1, 2, 3, 5, 4] Heap size = 5
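The description of swim mentions that it can also be written without recursion. A minimal sketch of that variant follows, assuming the same 1-based layout with a placeholder at index 0. The names swim_iterative and push_iterative are hypothetical and not part of the listing above.

```python
def swim_iterative(heap, location):
    """Move heap[location] up until its parent is no larger (min-heap)."""
    while location > 1:
        parent = location // 2
        if heap[location] >= heap[parent]:
            break  # heap property already holds here
        heap[location], heap[parent] = heap[parent], heap[location]
        location = parent


def push_iterative(heap, x):
    """Append x and restore the heap property."""
    heap.append(x)
    swim_iterative(heap, len(heap) - 1)


heap = [0]  # index 0 is a placeholder so that the root is at index 1
for value in [5, 3, 2, 1, 4]:
    push_iterative(heap, value)
print(heap[1:])  # [1, 2, 3, 5, 4]
```

The loop performs the same comparisons and swaps as the recursive version, so it produces the same array, without adding a call-stack frame per level.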

(2) Checking whether the heap is empty

Whether the heap is empty can be determined directly from the length of the array.

# Check whether the heap is empty
def empty(heap):
    return heapLen(heap) == 0

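(3) Popping the top element

Popping removes the element at the top of the heap and then adjusts the array so that it is still a heap.

This uses an operation called sink (sift down). In a min-heap, if the current element is larger than one of its children, it is swapped with the smaller of the two children; choosing the smaller child means a single swap restores the heap property at that level. The child position then becomes the new current node and sink continues from there, until the current node is no greater than either child or has no children. The element appears to keep sinking down through the binary tree.

Popping proceeds as follows:

1) Swap the last element of the array with the top of the heap (the first element of the array), then remove the last element;

2) Apply sink to the element now at the top.

Implementation: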

# Swap two elements
def swap(heap, x, y):
    heap[x], heap[y] = heap[y], heap[x]

# Sift down: x is the value currently stored at index location
def sink(heap, x, location):
    left = location * 2
    right = left + 1
    nextLocation = location
    minValue = x
    heapLength = heapLen(heap)
    # Find the smallest of x and its (existing) children
    if left <= heapLength and minValue > heap[left]:
        minValue = heap[left]
        nextLocation = left
    if right <= heapLength and minValue > heap[right]:
        nextLocation = right
    if nextLocation != location:
        swap(heap, location, nextLocation)
        sink(heap, x, nextLocation)

# Sift up: x is the value currently stored at index location
def swim(heap, x, location):
    parent = location // 2
    if location > 1 and x < heap[parent]:
        swap(heap, location, parent)
        swim(heap, x, parent)

# Return the size of the heap (index 0 is a placeholder)
def heapLen(heap):
    return len(heap) - 1

# Remove and return the top element
def pop(heap):
    heap_top = heap[1]
    heapLength = heapLen(heap)
    swap(heap, 1, heapLength)
    heap.pop()
    if heapLen(heap) > 0:
        sink(heap, heap[1], 1)
    return heap_top

# Add a new element to the heap
def push(heap, x):
    heap.append(x)
    location = heapLen(heap)
    swim(heap, x, location)

# Check whether the heap is empty
def empty(heap):
    return heapLen(heap) == 0

# An example
if __name__ == "__main__":
    data = [5, 3, 2, 1, 4]
    # Placeholder so that array indices start at 1
    heap = [0]
    for i in data:
        push(heap, i)
    print("The heap is", heap[1:])
    while not empty(heap):
        print("The element pop :", pop(heap))
        print("The heap :", heap[1:], "Heap size =", heapLen(heap))

Output:

The heap is [1, 2, 3, 5, 4]
The element pop : 1
The heap : [2, 4, 3, 5] Heap size = 4
The element pop : 2
The heap : [3, 4, 5] Heap size = 3
The element pop : 3
The heap : [4, 5] Heap size = 2
The element pop : 4
The heap : [5] Heap size = 1
The element pop : 5
The heap : [] Heap size = 0
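Like swim, sink can be written as a loop. The sketch below is one possible iterative variant under the same assumptions (1-based layout, placeholder at index 0). The names sink_iterative and pop_iterative are hypothetical. Instead of swapping the first and last elements, this pop moves the last element to the root, which leaves the array in the same state.

```python
def sink_iterative(heap, location):
    """Move heap[location] down until neither child is smaller (min-heap)."""
    n = len(heap) - 1  # number of real elements
    while True:
        left = 2 * location
        right = left + 1
        smallest = location
        if left <= n and heap[left] < heap[smallest]:
            smallest = left
        if right <= n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == location:
            break  # no child is smaller, so we are done
        heap[location], heap[smallest] = heap[smallest], heap[location]
        location = smallest


def pop_iterative(heap):
    """Remove and return the smallest element of a non-empty heap."""
    top = heap[1]
    last = heap.pop()
    if len(heap) > 1:  # elements remain: move the last one to the root
        heap[1] = last
        sink_iterative(heap, 1)
    return top


heap = [0, 1, 2, 3, 5, 4]
print([pop_iterative(heap) for _ in range(5)])  # [1, 2, 3, 4, 5]
```

Popping every element in turn yields the values in ascending order, which is the idea behind heap sort.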

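(4) Converting an array into a heap

This is usually done bottom-up. If the array holds n elements, every heap[i] with ⌊n/2⌋ < i ≤ n is a leaf of the binary tree. Starting from the last non-leaf node, heap[⌊n/2⌋], we call sink on every non-leaf node in reverse order (within one level, reverse order means right to left; within the tree, it means from the bottom up). The result is a min-heap.

Implementation: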

# Swap two elements
def swap(heap, x, y):
    heap[x], heap[y] = heap[y], heap[x]

# Sift down: x is the value currently stored at index location
def sink(heap, x, location):
    left = location * 2
    right = left + 1
    nextLocation = location
    minValue = x
    heapLength = heapLen(heap)
    # Find the smallest of x and its (existing) children
    if left <= heapLength and minValue > heap[left]:
        minValue = heap[left]
        nextLocation = left
    if right <= heapLength and minValue > heap[right]:
        nextLocation = right
    if nextLocation != location:
        swap(heap, location, nextLocation)
        sink(heap, x, nextLocation)

# Return the size of the heap (index 0 is a placeholder)
def heapLen(heap):
    return len(heap) - 1

# Convert the array into a heap in place
def heapAdjust(heap):
    lastLoc = heapLen(heap) // 2  # last non-leaf node
    for i in range(lastLoc, 0, -1):
        sink(heap, heap[i], i)

# An example
if __name__ == "__main__":
    data = [5, 3, 2, 1, 4, 9, 7, 8, 6, 0]
    # Placeholder so that array indices start at 1
    heap = [0]
    heap.extend(data)
    heapAdjust(heap)
    print("The heap is", heap[1:])

Output:

The heap is [0, 1, 2, 5, 3, 9, 7, 8, 6, 4]
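As a quick check of why the loop in heapAdjust starts at ⌊n/2⌋: in a 1-based array holding n elements, node i has a left child only when 2i ≤ n, so the nodes with children are exactly 1 through ⌊n/2⌋ and the rest are leaves. The small sketch below lists both groups for n = 10, the size used in the example above. The helper name has_child is made up for illustration.

```python
def has_child(i, n):
    """In a 1-based heap with n nodes, node i has a left child iff 2*i <= n."""
    return 2 * i <= n


n = 10
internal = [i for i in range(1, n + 1) if has_child(i, n)]
leaves = [i for i in range(1, n + 1) if not has_child(i, n)]
print(internal)  # [1, 2, 3, 4, 5]
print(leaves)    # [6, 7, 8, 9, 10]
```

A leaf is already a valid one-element heap, which is why sink only needs to be called on the internal nodes.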

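2. Python heapq

Python also provides heaps as a module. Importing heapq gives you functions that operate on an ordinary list as a heap, which is both more concise and more efficient than the code above. Only some basic operations are shown here.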

import heapq

if __name__ == "__main__":
    data = [5, 3, 2, 1, 4]
    # Create the heap; it must be a list
    heap = []
    # Insert elements into the heap
    for i in data:
        print("Insert :", i)
        heapq.heappush(heap, i)
        print("The heap is", heap)
    print("Finally :", heap)
    print("")
    k = 3
    # Return the k largest elements of heap
    maxk = heapq.nlargest(k, heap)
    print("The max k element", maxk)
    # Return the k smallest elements of heap
    mink = heapq.nsmallest(k, heap)
    print("The min k element", mink)
    print("")
    # If the elements are tuples, a key function can also be given
    #klist = heapq.nsmallest(k, heap, key=lambda x: x[0])
    # Pop the top element until the heap is empty
    while len(heap) > 0:
        top = heapq.heappop(heap)
        print("The element pop :", top)
        print("The heap is", heap)
    print("")
    # Turn a list into a heap in place, in linear time
    print("Before adjust :", data)
    heapq.heapify(data)
    print("After adjust :", data)

Output:

Insert : 5
The heap is [5]
Insert : 3
The heap is [3, 5]
Insert : 2
The heap is [2, 5, 3]
Insert : 1
The heap is [1, 2, 3, 5]
Insert : 4
The heap is [1, 2, 3, 5, 4]
Finally : [1, 2, 3, 5, 4]

The max k element [5, 4, 3]
The min k element [1, 2, 3]

The element pop : 1
The heap is [2, 4, 3, 5]
The element pop : 2
The heap is [3, 4, 5]
The element pop : 3
The heap is [4, 5]
The element pop : 4
The heap is [5]
The element pop : 5
The heap is []

Before adjust : [5, 3, 2, 1, 4]
After adjust : [1, 3, 2, 5, 4]
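The introduction mentions priority queues and heap sort as typical uses of a heap. Both can be sketched with heapq alone, as below. The task data and the helper name heap_sort are made up for illustration. Tuples are compared element by element, so putting the priority first makes heappop return the entry with the smallest priority.

```python
import heapq

# (priority, task name): a smaller priority is served first
tasks = [(3, "write report"), (1, "fix bug"), (2, "review patch")]

queue = []
for priority, name in tasks:
    heapq.heappush(queue, (priority, name))

order = []
while queue:
    priority, name = heapq.heappop(queue)
    order.append(name)
print(order)  # ['fix bug', 'review patch', 'write report']

# nsmallest accepts a key function, as the commented-out line above hints
print(heapq.nsmallest(2, tasks, key=lambda t: t[0]))
# [(1, 'fix bug'), (2, 'review patch')]


def heap_sort(items):
    """Return the items in ascending order using a heap."""
    heap = list(items)  # copy so the input is left untouched
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort([5, 3, 2, 1, 4]))  # [1, 2, 3, 4, 5]
```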

 
