For example, given the 8 numbers 1, 2, 3, 4, 5, 6, 7 and 8, the 4 smallest numbers are 1, 2, 3 and 4.
The approach comes from the blog 结构之法: http://blog.csdn.net/v_july_v/article/details/6370650
Method 1:
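Build a max-heap from the first k elements, with Xmax as the element at the top of the heap. Then traverse the remaining n-k elements: whenever an element is smaller than Xmax, swap it with the heap top and restore the heap. Each adjustment costs O(log k), so the overall time complexity is O(n*log k).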
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Find the k smallest numbers
def heapAdjust(A, i, length):
    # Sift A[i] down so that the subtree rooted at i is a max-heap within A[:length]
    pa = i
    child = 2*i + 1
    while child < length:
        # pick the larger of the two children
        if child < length-1 and A[child] < A[child+1]:
            child += 1
        if A[pa] >= A[child]:
            break
        else:
            A[pa],A[child] = A[child],A[pa]
            pa = child
            child = 2*pa + 1
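def findKmin(A, k, length):
    # Build a max-heap from the first k elements
    for i in range(k//2)[::-1]:
        heapAdjust(A, i, k)
    print('The heap is : %s' % A[:k])
    # Any later element smaller than the heap top replaces it
    for i in range(k,length):
        if A[i] < A[0]:
            A[i],A[0] = A[0],A[i]
            heapAdjust(A, 0, k)
    print('The result is : %s' % A[:k])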
if __name__ == '__main__':
    A = [6,3,7,2,9,1,4,5,11,10,8]
    lens = len(A)
    findKmin(A, 10, lens)
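The max-heap idea of Method 1 can also be sketched with the standard library module `heapq`. Since `heapq` only provides a min-heap, this sketch stores negated values to simulate a max-heap, so it assumes numeric input. The function name `k_smallest_heapq` is hypothetical and used only for illustration.

```python
import heapq


def k_smallest_heapq(nums, k):
    """Return the k smallest values of nums in ascending order."""
    if k <= 0:
        return []
    # heapq is a min-heap, so negated values make heap[0] the (negated) maximum.
    heap = [-x for x in nums[:k]]
    heapq.heapify(heap)
    for x in nums[k:]:
        # -heap[0] is the largest of the k candidates kept so far.
        if x < -heap[0]:
            heapq.heapreplace(heap, -x)
    return sorted(-x for x in heap)


if __name__ == '__main__':
    data = [6, 3, 7, 2, 9, 1, 4, 5, 11, 10, 8]
    print(k_smallest_heapq(data, 4))
    # The library also offers a direct call for the same task.
    print(heapq.nsmallest(4, data))
```

Unlike `findKmin`, this version leaves the input list untouched and returns a new list.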
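Method 2:
Similar to quicksort. With a randomly chosen pivot the expected time complexity is O(n); with the median of medians of groups of five as the pivot, O(n) holds even in the worst case.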
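As a rough illustration of the "median of medians of groups of five" pivot rule, the sketch below selects the k-th smallest value (1-based). It is not in place: it builds new lists at every step to keep the logic easy to follow. The names `select_kth` and `pivot_by_median_of_medians` are hypothetical.

```python
def pivot_by_median_of_medians(s):
    """Return a pivot value: the median of the medians of groups of five."""
    if len(s) <= 5:
        return sorted(s)[len(s) // 2]
    medians = []
    for i in range(0, len(s), 5):
        group = sorted(s[i:i + 5])
        medians.append(group[len(group) // 2])
    # The median of the group medians is itself found by selection.
    return select_kth(medians, len(medians) // 2 + 1)


def select_kth(s, k):
    """Return the k-th smallest (1-based) value of the non-empty list s."""
    if not 1 <= k <= len(s):
        raise ValueError('k is out of range')
    pivot = pivot_by_median_of_medians(s)
    lows = [x for x in s if x < pivot]
    highs = [x for x in s if x > pivot]
    n_equal = len(s) - len(lows) - len(highs)
    if k <= len(lows):
        return select_kth(lows, k)
    if k <= len(lows) + n_equal:
        return pivot
    return select_kth(highs, k - len(lows) - n_equal)


if __name__ == '__main__':
    data = [6, 3, 7, 2, 9, 1, 4, 5, 11, 10, 8]
    print(select_kth(data, 6))
```

The pivot chosen this way always discards a constant fraction of the elements, which is what gives the worst-case linear bound; in practice a random or median-of-three pivot is usually faster.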
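The algorithm below does not find the k smallest values directly; it finds the k-th smallest value. Once that value is found, the array has already been partitioned as in quicksort, so the k smallest values are simply the elements at or before its position. The time complexity can reach linear. For details, see http://blog.csdn.net/v_july_v/article/details/6403777.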
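First, use a quicksort-style partition to split the array around a pivot, which ends up at index par, into s[m..par-1] and s[par+1..n]; then compare par with k-1: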
if par == k-1:
    return s[par]
elif k-1 < par:
    return findKth(s, m, par-1, k)
else:
    return findKth(s, par+1, n, k)
The complete code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Use q_select (quickselect) to find the k-th smallest value
def partition(s, m, n):
    # s is a list; medin3 has already placed the pivot at s[n-1]
    key = s[n-1]
    l,r = m,n-2
    while True:
        while l <= n-2 and s[l] <= key:
            l += 1
        while r >= m and s[r] > key:
            r -= 1
        if l < r:
            s[l],s[r] = s[r],s[l]
        else:
            break
    # move the pivot to its final position
    s[l],s[n-1] = s[n-1],s[l]
    return l
def medin3(s, m, n):
    # Median of three: order s[m], s[md], s[n], then park the median at s[n-1]
    md = m + (n-m)//2
    if s[m] > s[md]:
        s[m],s[md] = s[md],s[m]
    if s[m] > s[n]:
        s[m],s[n] = s[n],s[m]
    if s[md] > s[n]:
        s[md],s[n] = s[n],s[md]
    s[md],s[n-1] = s[n-1],s[md]
    return s[n-1]
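def findKth(s, m, n, k):
    # Return the k-th smallest (1-based) value of s[m..n], or None if k is out of range
    if k-1 > n or k-1 < m:
        return None
    # a single element needs no partitioning
    if m == n:
        return s[m]
    medin3(s, m, n)
    par = partition(s, m, n)
    if par == k-1:
        return s[par]
    elif k-1 < par:
        return findKth(s, m, par-1, k)
    else:
        return findKth(s, par+1, n, k)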
if __name__ == '__main__':
    A = [6,3,7,2,9,1,4,5,11,10,8]
    lens = len(A)
    print(findKth(A, 0, lens-1, 6))
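Method 2 notes that once the k-th smallest value sits at index k-1, the k smallest values are the elements in front of and including it. A minimal sketch of that step follows, using a random pivot instead of the median-of-three rule above and a simple one-pass partition. The name `k_smallest_quickselect` is hypothetical.

```python
import random


def k_smallest_quickselect(nums, k):
    """Return the k smallest values of nums, in no particular order."""
    s = list(nums)  # work on a copy so the caller's list is unchanged
    if k <= 0:
        return []
    if k >= len(s):
        return s
    lo, hi = 0, len(s) - 1
    while lo < hi:
        # Move a randomly chosen pivot to the end of the current range.
        p = random.randint(lo, hi)
        s[p], s[hi] = s[hi], s[p]
        pivot = s[hi]
        store = lo
        for i in range(lo, hi):
            if s[i] < pivot:
                s[i], s[store] = s[store], s[i]
                store += 1
        # The pivot lands at its final sorted position.
        s[store], s[hi] = s[hi], s[store]
        if store == k - 1:
            break
        elif store > k - 1:
            hi = store - 1
        else:
            lo = store + 1
    return s[:k]


if __name__ == '__main__':
    data = [6, 3, 7, 2, 9, 1, 4, 5, 11, 10, 8]
    print(sorted(k_smallest_quickselect(data, 4)))
```

The result is sorted only for display; the selection itself does not order the k values.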