Binary Search on Sorted Data, Starting from binarySearch in Java's Arrays Utility Class

Binary search in Java

The Arrays utility class in the Java 8 source provides binary search methods. Its Javadoc states:
The array must be sorted. If the array contains multiple elements with the specified value, there is no guarantee which one will be found.

public static int binarySearch(int[] a, int key) {
	return binarySearch0(a, 0, a.length, key);
}
public static int binarySearch(int[] a, int fromIndex, int toIndex, int key) {
	rangeCheck(a.length, fromIndex, toIndex);
	return binarySearch0(a, fromIndex, toIndex, key);
}
// Like public version, but without range checks.
private static int binarySearch0(int[] a, int fromIndex, int toIndex, int key) {
	int low = fromIndex;
	int high = toIndex - 1;
	while (low <= high) {
		int mid = (low + high) >>> 1;
		int midVal = a[mid];
		if (midVal < key)
			low = mid + 1;
		else if (midVal > key)
			high = mid - 1;
		else
			return mid; // key found
	}
	return -(low + 1);  // key not found.
}
private static void rangeCheck(int arrayLength, int fromIndex, int toIndex) {
	if (fromIndex > toIndex) {
		throw new IllegalArgumentException("fromIndex(" + fromIndex + ") > toIndex(" + toIndex + ")");
	}
	if (fromIndex < 0) {
		throw new ArrayIndexOutOfBoundsException(fromIndex);
	}
	if (toIndex > arrayLength) {
		throw new ArrayIndexOutOfBoundsException(toIndex);
	}
}

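The overload that takes no start and end positions searches the whole array by default: binarySearch calls binarySearch0(a, 0, a.length, key). This shows that toIndex is exclusive, which is why the method sets high = toIndex - 1. When the key is not found, the return value -(low + 1) is negative and encodes the insertion point low, the index at which the key would have to be inserted to keep the array sorted.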
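
To make the exclusive toIndex and the negative return value concrete, here is a small usage sketch. The class name BinarySearchUsage and the helper insertionPoint are made up for this illustration; only Arrays.binarySearch comes from the JDK.

```java
import java.util.Arrays;

public class BinarySearchUsage {
    // Hypothetical helper: recovers the insertion point from a binarySearch result.
    // For a non-negative result (key found) it simply returns the index.
    static int insertionPoint(int result) {
        return result >= 0 ? result : -(result + 1);
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9};

        // Key present: the index of the key is returned.
        System.out.println(Arrays.binarySearch(a, 7));        // 3

        // Key absent: -(insertion point) - 1 is returned; 4 would go at index 2.
        System.out.println(Arrays.binarySearch(a, 4));        // -3

        // toIndex is exclusive: only a[0], a[1], a[2] are searched, so 7 is not found.
        System.out.println(Arrays.binarySearch(a, 0, 3, 7));  // -4

        System.out.println(insertionPoint(Arrays.binarySearch(a, 4))); // 2
    }
}
```

Encoding the insertion point in the negative result means a caller can tell "found at index 0" apart from "not found, would be inserted at index 0" (which is returned as -1).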
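A note on overflow: the JDK's (low + high) >>> 1 is already safe. low + high may wrap around to a negative int, but the unsigned right shift treats the bits as an unsigned sum, so the result is still the correct midpoint for non-negative low and high. The form that actually overflows is (low + high) / 2. An equivalent safe way to write the midpoint is int mid = low + ((high - low) >>> 1).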
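
The difference between the three midpoint formulas can be checked directly. This is a minimal sketch; the class and method names are made up for the illustration, and the index values are chosen only so that low + high exceeds Integer.MAX_VALUE.

```java
public class MidpointOverflow {
    // Overflow-prone: the wrapped (negative) sum is divided as a signed value.
    static int midDivide(int low, int high) {
        return (low + high) / 2;
    }

    // Form used in the JDK source: the unsigned shift reads the wrapped sum as unsigned.
    static int midUnsignedShift(int low, int high) {
        return (low + high) >>> 1;
    }

    // Alternative form: never computes low + high at all.
    static int midOffset(int low, int high) {
        return low + ((high - low) >>> 1);
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000; // low + high = 3_500_000_000 > Integer.MAX_VALUE

        System.out.println(midDivide(low, high));        // -397483648 (wrong)
        System.out.println(midUnsignedShift(low, high)); // 1750000000
        System.out.println(midOffset(low, high));        // 1750000000
    }
}
```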
Because every comparison halves the remaining search range, the time complexity is O(log n).

A recursive implementation of binary search

The version above is iterative. Binary search can also be written recursively:

private static int binarySearch0(int[] a, int fromIndex, int toIndex, int key) {
	int low = fromIndex;
	int high = toIndex - 1;
	if (low > high)
		return -(low + 1); // key not found; low is the insertion point
	int mid = low + ((high - low) >>> 1);
	int midVal = a[mid];
	if (midVal == key)
		return mid; // key found
	else if (midVal < key)
		return binarySearch0(a, mid + 1, toIndex, key); // search the right half
	else
		return binarySearch0(a, low, mid, key); // search the left half; mid is the exclusive upper bound
}

Limitations of binary search

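Binary search only works on a sorted data set. It also relies on random access by index, so the data needs to be stored in an array; it is a poor fit for structures without constant-time indexing, such as linked lists.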
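
Both limitations can be seen with the JDK's own APIs. The sketch below uses a made-up class BinarySearchLimits with a hypothetical helper sortThenSearch that sorts a copy before searching. It also calls Collections.binarySearch on a LinkedList: that method accepts any List, but according to its Javadoc, for a large list that does not implement RandomAccess it performs O(n) link traversals even though the number of comparisons stays O(log n).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class BinarySearchLimits {
    // Hypothetical helper: sorts a copy of the data, then searches the copy.
    // The returned index refers to the sorted copy, not to the original array.
    static int sortThenSearch(int[] data, int key) {
        int[] copy = Arrays.copyOf(data, data.length);
        Arrays.sort(copy);
        return Arrays.binarySearch(copy, key);
    }

    public static void main(String[] args) {
        int[] unsorted = {9, 1, 7, 3, 5};
        // Searching the unsorted array directly would give an undefined result.
        System.out.println(sortThenSearch(unsorted, 7)); // 3, the index in {1, 3, 5, 7, 9}

        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 3, 5, 7, 9));
        List<Integer> linkedList = new LinkedList<>(arrayList);

        // ArrayList supports constant-time indexing; LinkedList does not.
        System.out.println(arrayList instanceof RandomAccess);  // true
        System.out.println(linkedList instanceof RandomAccess); // false

        // Still returns the right index, but without the O(log n) access cost on large lists.
        System.out.println(Collections.binarySearch(linkedList, 7)); // 3
    }
}
```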

Variations on binary search

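The Javadoc quoted earlier points out that if the array contains multiple elements equal to the search value, there is no guarantee which one will be found. So how do we find the first or the last matching element? Going one step further, how do we find the first element not less than the search value, or the last element not greater than it? Implementations of two of these variants follow; the others work the same way.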

Finding the first matching value

private static int binarySearch0(int[] a, int fromIndex, int toIndex, int key) {
	int low = fromIndex;
	int high = toIndex - 1;
	while (low <= high) {
		int mid = (low + high) >>> 1;
		int midVal = a[mid];
		if (midVal < key)
			low = mid + 1;
		else if (midVal > key)
			high = mid - 1;
		else {
			// Compare against fromIndex, not 0, so a[mid-1] is never read outside the search range.
			if (mid == fromIndex || a[mid - 1] != key)
				return mid; // first occurrence found
			else
				high = mid - 1; // an earlier occurrence exists; keep searching to the left
		}
	}
	return -(low + 1);  // key not found.
}
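
A quick way to check the first-occurrence variant is to run it on an array with duplicates. The sketch below is self-contained and repeats the search under the made-up name firstIndexOf (in a made-up class FirstOccurrence), with the boundary check done against fromIndex so that a sub-range search never looks at elements before the range.

```java
public class FirstOccurrence {
    // Returns the index of the first element equal to key in a[fromIndex, toIndex),
    // or -(insertion point) - 1 if key is absent. The array must be sorted.
    static int firstIndexOf(int[] a, int fromIndex, int toIndex, int key) {
        int low = fromIndex;
        int high = toIndex - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (a[mid] < key) {
                low = mid + 1;
            } else if (a[mid] > key) {
                high = mid - 1;
            } else if (mid == fromIndex || a[mid - 1] != key) {
                return mid;     // no equal element to the left within the range
            } else {
                high = mid - 1; // an equal element exists to the left
            }
        }
        return -(low + 1);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 2, 2, 3, 5};
        System.out.println(firstIndexOf(a, 0, a.length, 2)); // 1
        System.out.println(firstIndexOf(a, 2, a.length, 2)); // 2, the range starts at index 2
        System.out.println(firstIndexOf(a, 0, a.length, 4)); // -6, insertion point 5
    }
}
```

The second call shows why the boundary test uses fromIndex: with a test against 0, the search would read a[1], which lies outside the requested range, and report the key as missing.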

Finding the first element not less than the search value

private static int binarySearch0(int[] a, int fromIndex, int toIndex, int key) {
	int low = fromIndex;
	int high = toIndex - 1;
	while (low <= high) {
		int mid = (low + high) >>> 1;
		int midVal = a[mid];
		if (midVal >= key) {
			// Compare against fromIndex, not 0, so a[mid-1] is never read outside the search range.
			if (mid == fromIndex || a[mid - 1] < key)
				return mid; // first element >= key
			else
				high = mid - 1;
		}
		else
			low = mid + 1;
	}
	return -(low + 1);  // every element is less than key; low == toIndex here
}
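
The remaining two variants, the last matching value and the last element not greater than the search value, mirror the two implementations above with the roles of left and right swapped. The following is only a sketch under assumptions: the class name LastVariants and the method names are made up, and returning -1 from lastNotGreater when no element qualifies is a convention chosen here, not something taken from the JDK.

```java
public class LastVariants {
    // Index of the last element equal to key in a[fromIndex, toIndex),
    // or -(insertion point) - 1 if key is absent.
    static int lastIndexOf(int[] a, int fromIndex, int toIndex, int key) {
        int low = fromIndex;
        int high = toIndex - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (a[mid] < key) {
                low = mid + 1;
            } else if (a[mid] > key) {
                high = mid - 1;
            } else if (mid == toIndex - 1 || a[mid + 1] != key) {
                return mid;    // no equal element to the right within the range
            } else {
                low = mid + 1; // an equal element exists to the right
            }
        }
        return -(low + 1);
    }

    // Index of the last element <= key in a[fromIndex, toIndex),
    // or -1 (assumed convention) if every element is greater than key.
    static int lastNotGreater(int[] a, int fromIndex, int toIndex, int key) {
        int low = fromIndex;
        int high = toIndex - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (a[mid] <= key) {
                if (mid == toIndex - 1 || a[mid + 1] > key) {
                    return mid;
                }
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 2, 2, 3, 5};
        System.out.println(lastIndexOf(a, 0, a.length, 2));    // 3
        System.out.println(lastNotGreater(a, 0, a.length, 4)); // 4, the element 3
        System.out.println(lastNotGreater(a, 0, a.length, 0)); // -1, nothing is <= 0
    }
}
```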
