事实上Collections.sort方法底层就是调用的Arrays.sort方法,而Arrays.sort使用了两种排序方法,快速排序和优化的归并排序。
快速排序主要是对那些基本类型数据(int,short,long等)排序, 而归并排序用于对Object类型进行排序。
使用不同类型的排序算法主要是由于快速排序是不稳定的,而归并排序是稳定的。这里的稳定是指比较相等的数据在排序之后仍然按照排序之前的前后顺序排列。对于基本数据类型,稳定性没有意义,而对于Object类型,稳定性是比较重要的,因为对象相等的判断可能只是判断关键属性,最好保持相等对象的非关键属性的顺序与排序前一致;另外一个原因是由于归并排序相对而言比较次数比快速排序少,移动(对象引用的移动)次数比快速排序多,而对于对象来说,比较一般比移动耗时。
public static > void sort(List list) {
list.sort(null);
}
default void sort(Comparator super E> c) {
Object[] a = this.toArray();
Arrays.sort(a, (Comparator) c);
ListIterator i = this.listIterator();
for (Object e : a) {
i.next();
i.set((E) e);
}
}
public static void sort(T[] a, Comparator super T> c) {
if (c == null) {
sort(a);
} else {
if (LegacyMergeSort.userRequested)
legacyMergeSort(a, c);
else
TimSort.sort(a, 0, a.length, c, null, 0, 0);
}
}
public static void sort(Object[] a) {
if (LegacyMergeSort.userRequested)
legacyMergeSort(a);
else
ComparableTimSort.sort(a, 0, a.length, null, 0, 0);
}
结合了归并排序和插入排序的混合算法,它基于一个简单的事实,实际中大部分数据都是部分有序(升序或降序)的。
TimSort 算法为了减少对升序部分的回溯和对降序部分的性能倒退,将输入按其升序和降序特点进行了分区。排序的输入的单位不是一个个单独的数字,而是一个个的块-分区。其中每一个分区叫一个run。针对这些 run 序列,每次拿一个 run 出来按规则进行合并。每次合并会将两个 run合并成一个 run。合并的结果保存到栈中。合并直到消耗掉所有的 run,这时将栈上剩余的 run合并到只剩一个 run 为止。这时这个仅剩的 run 便是排好序的结果。
综上述过程,Timsort算法的过程包括
(0)如果数组长度小于某个值,直接用二分插入排序算法
(1)找到各个run,并入栈
(2)按规则合并run
/**
* Sorts the given range, using the given workspace array slice
* for temp storage when possible. This method is designed to be
* invoked from public methods (in class Arrays) after performing
* any necessary array bounds checks and expanding parameters into
* the required forms.
*
* @param a the array to be sorted
* @param lo the index of the first element, inclusive, to be sorted
* @param hi the index of the last element, exclusive, to be sorted
* @param work a workspace array (slice)
* @param workBase origin of usable space in work array
* @param workLen usable size of work array
* @since 1.8
*/
static void sort(Object[] a, int lo, int hi, Object[] work, int workBase, int workLen) {
assert a != null && lo >= 0 && lo <= hi && hi <= a.length;
int nRemaining = hi - lo;
if (nRemaining < 2)
return; // Arrays of size 0 and 1 are always sorted
// If array is small, do a "mini-TimSort" with no merges
if (nRemaining < MIN_MERGE) {
int initRunLen = countRunAndMakeAscending(a, lo, hi);
binarySort(a, lo, hi, lo + initRunLen);
return;
}
/**
* March over the array once, left to right, finding natural runs,
* extending short natural runs to minRun elements, and merging runs
* to maintain stack invariant.
*/
ComparableTimSort ts = new ComparableTimSort(a, work, workBase, workLen);
int minRun = minRunLength(nRemaining);
do {
// Identify next run
int runLen = countRunAndMakeAscending(a, lo, hi);
// If run is short, extend to min(minRun, nRemaining)
if (runLen < minRun) {
int force = nRemaining <= minRun ? nRemaining : minRun;
binarySort(a, lo, lo + force, lo + runLen);
runLen = force;
}
// Push run onto pending-run stack, and maybe merge
ts.pushRun(lo, runLen);
ts.mergeCollapse();
// Advance to find next run
lo += runLen;
nRemaining -= runLen;
} while (nRemaining != 0);
// Merge all remaining runs to complete sort
assert lo == hi;
ts.mergeForceCollapse();
assert ts.stackSize == 1;
}
/**
* Creates a TimSort instance to maintain the state of an ongoing sort.
*
* @param a the array to be sorted
* @param work a workspace array (slice)
* @param workBase origin of usable space in work array
* @param workLen usable size of work array
*/
private ComparableTimSort(Object[] a, Object[] work, int workBase, int workLen) {
this.a = a;
// Allocate temp storage (which may be increased later if necessary)
int len = a.length;
int tlen = (len < 2 * INITIAL_TMP_STORAGE_LENGTH) ?
len >>> 1 : INITIAL_TMP_STORAGE_LENGTH;
if (work == null || workLen < tlen || workBase + tlen > work.length) {
tmp = new Object[tlen];
tmpBase = 0;
tmpLen = tlen;
}
else {
tmp = work;
tmpBase = workBase;
tmpLen = workLen;
}
/*
* Allocate runs-to-be-merged stack (which cannot be expanded). The
* stack length requirements are described in listsort.txt. The C
* version always uses the same stack length (85), but this was
* measured to be too expensive when sorting "mid-sized" arrays (e.g.,
* 100 elements) in Java. Therefore, we use smaller (but sufficiently
* large) stack lengths for smaller arrays. The "magic numbers" in the
* computation below must be changed if MIN_MERGE is decreased. See
* the MIN_MERGE declaration above for more information.
* The maximum value of 49 allows for an array up to length
* Integer.MAX_VALUE-4, if array is filled by the worst case stack size
* increasing scenario. More explanations are given in section 4 of:
* http://envisage-project.eu/wp-content/uploads/2015/02/sorting.pdf
*/
int stackLen = (len < 120 ? 5 :
len < 1542 ? 10 :
len < 119151 ? 24 : 49);
runBase = new int[stackLen];
runLen = new int[stackLen];
}
private static void legacyMergeSort(Object[] a) {
Object[] aux = a.clone();
mergeSort(aux, a, 0, a.length, 0);
}
private static void mergeSort(Object[] src,
Object[] dest,
int low,
int high,
int off) {
int length = high - low;
// 7
// Insertion sort on smallest arrays
if (length < INSERTIONSORT_THRESHOLD) {
for (int i=low; ilow &&
((Comparable) dest[j-1]).compareTo(dest[j])>0; j--)
swap(dest, j, j-1);
return;
}
// Recursively sort halves of dest into src
int destLow = low;
int destHigh = high;
low += off;
high += off;
int mid = (low + high) >>> 1;
mergeSort(dest, src, low, mid, -off);
mergeSort(dest, src, mid, high, -off);
// If list is already sorted, just copy from src to dest. This is an
// optimization that results in faster sorts for nearly ordered lists.
if (((Comparable)src[mid-1]).compareTo(src[mid]) <= 0) {
System.arraycopy(src, low, dest, destLow, length);
return;
}
// Merge sorted halves (now in src) into dest
for(int i = destLow, p = low, q = mid; i < destHigh; i++) {
if (q >= high || p < mid && ((Comparable)src[p]).compareTo(src[q])<=0)
dest[i] = src[p++];
else
dest[i] = src[q++];
}
}
小于60:使用插入排序,插入排序是稳定的
大于60的数据量会根据数据类型选择排序方式:
基本类型:使用快速排序。因为基本类型。1、2都是指向同一个常量池不需要考虑稳定性。
Object类型:使用归并排序。因为归并排序具有稳定性。
注意:不管是快速排序还是归并排序。在二分的时候小于60的数据量依旧会使用插入排序