Week 3, Part 1: MergeSort

3.1 MergeSort

Mergesort: Java's sort for objects

1. Merge sort (recursive, top-down)

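  1. Idea:

    • Split the array in half
    • Recursively sort each half
    • Merge the two halves
      • Copy the items into an auxiliary array aux[]
      • This gives two sorted subarrays: aux[lo..mid] and aux[mid+1..hi]
      • Keep an index into each (i and j), compare aux[i] with aux[j], and copy the smaller one back to a[]; if they are equal, copy aux[i] back to a[]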
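  2. Performance (array of size $N$):

    • worst case: at most $N \lg N$ compares and $6N \lg N$ array accesses
    • best case (input array is sorted): ~ $\frac{1}{2} N \lg N$ compares
      • optimized version (one extra line in sort(), i.e. optimization 2, which skips the merge when a[mid] <= a[mid+1]): $N - 1$ compares
  3. Complexity:
    • counting compares: mergesort is optimal
    • counting space: mergesort is not optimal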
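The compare counts for a plain top-down mergesort (at most $N \lg N$ in the worst case, about $\frac{1}{2} N \lg N$ on sorted input) can be checked empirically. The following is a minimal sketch, not the course's code: the class name `CompareCount` and the method `sortAndCount` are hypothetical, and the sort has neither the insertion-sort cutoff nor the a[mid] <= a[mid+1] test, so that every compare goes through `less()`.

```java
import java.util.Random;

// Hypothetical helper for illustration: counts the compares made by a plain top-down mergesort.
class CompareCount {
    private static long compares = 0;

    private static boolean less(Integer v, Integer w) {
        compares++;
        return v.compareTo(w) < 0;
    }

    private static void merge(Integer[] a, Integer[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) a[k] = aux[j++];
            else if (j > hi) a[k] = aux[i++];
            else if (less(aux[j], aux[i])) a[k] = aux[j++];
            else a[k] = aux[i++];
        }
    }

    private static void sort(Integer[] a, Integer[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    // Sorts a in place and returns the number of compares used.
    static long sortAndCount(Integer[] a) {
        compares = 0;
        sort(a, new Integer[a.length], 0, a.length - 1);
        return compares;
    }

    public static void main(String[] args) {
        int n = 1024; // a power of two, so lg N is exactly 10
        Random rnd = new Random(42);
        Integer[] random = new Integer[n];
        Integer[] sorted = new Integer[n];
        for (int i = 0; i < n; i++) {
            random[i] = rnd.nextInt();
            sorted[i] = i;
        }
        System.out.println("N lg N       = " + (n * 10));
        System.out.println("random input : " + sortAndCount(random));
        System.out.println("sorted input : " + sortAndCount(sorted)); // 5120, i.e. (1/2) N lg N
    }
}
```

On sorted input every merge of two halves of size m uses exactly m compares, which is why the count comes out at exactly half of $N \lg N$ when $N$ is a power of two.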
  4. Memory (array of size $N$):

    • extra memory: proportional to $N$
    • an in-place sorting algorithm, by contrast, uses at most $c \log N$ extra memory, e.g. insertion sort, selection sort, shellsort
  5. Stability:

    • sort(): stable
    • merge(): stable
  6. Java Implementation

    public class Merge{
     private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi){
         
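         // assert expression (a boolean expression)
         // If the expression is true, the assertion passes and execution continues; if it is false, an AssertionError is thrown (assertions must be enabled with -ea)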
         assert isSorted(a, lo, mid);   // precondition: a[lo..mid] sorted
         assert isSorted(a, mid+1, hi); // precondition: a[mid+1..hi] sorted
    
         // copy
         for(int k = lo; k <= hi; k++){
             aux[k] = a[k];
         }
    
         // merge
         int i = lo;
         int j = mid+1;
         for(int k = lo; k <= hi; k++){
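             if(i>mid){
                 // i has passed mid: the subarray indexed by i is exhausted,
                 // so just copy the rest of the subarray indexed by j back to a[]
                 a[k] = aux[j++]; // same as two statements: a[k] = aux[j]; j++;
             }else if(j>hi){
                 // j has passed hi: same reasoning as above
                 a[k] = aux[i++];
             }else if(less(aux[j],aux[i])){
                 // aux[j] < aux[i]: copy aux[j] back to a[]
                 a[k] = aux[j++];
             }else{
                 // aux[j] >= aux[i]: copy aux[i] back to a[] (taking the left item on ties keeps the merge stable)
                 a[k] = aux[i++];
             }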
         }
    
         assert isSorted(a, lo, hi); // postcondition: a[lo..hi] sorted
     }
    
     private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi){
         // base case of the recursion
         if(hi<=lo){
             return;
         }
        
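        // Optimization 1: merge sort has too much overhead for small subarrays (cutoff = 7), so switch to insertion sort
         int cutoff = 7;
         if(hi <= lo+cutoff - 1){
             Insertion.sort(a, lo, hi); // Insertion.java sits in the same directory as Merge.java and must provide sort(a, lo, hi)
             return;
         }
        // end of optimization 1
        
         int mid = lo + (hi - lo) / 2; // same midpoint computation as in binary search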
         sort(a, aux, lo, mid);
         sort(a, aux, mid+1, hi);
        
        // Optimization 2: if the largest item in the first half is <= the smallest item in the second half, the merge is unnecessary
         if(!less(a[mid+1], a[mid])){
             return;
         }
         // end of optimization 2
        
         merge(a, aux, lo, mid, hi);
     }
    
     public static void sort(Comparable[] a){
         Comparable[] aux = new Comparable[a.length];
         sort(a, aux, 0, a.length-1);
     }
    
     private static boolean less(Comparable v, Comparable w){
         return v.compareTo(w) < 0;
     }
    
     private static void exch(Comparable[] a, int i, int j){
         Comparable swap = a[i];
         a[i] = a[j];
         a[j] = swap;
     }
    
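     // the source listing is cut off at this point; the rest is reconstructed to match the assert calls in merge()
     private static boolean isSorted(Comparable[] a){
         return isSorted(a, 0, a.length - 1);
     }
    
     // is a[lo..hi] sorted?
     private static boolean isSorted(Comparable[] a, int lo, int hi){
         for(int i = lo+1; i <= hi; i++){
             if(less(a[i], a[i-1])){
                 return false;
             }
         }
         return true;
     }
    }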

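    Optimization 3: eliminate the copy to the auxiliary array. In the loop of merge(), swap the roles of aux[] and a[]; in the private sort(), swap aux and a in the calls to sort() and merge(). Switching the roles of the two arrays at each level of recursion saves time (but not space).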
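The role swap is easy to get wrong, so here is a minimal sketch of one way to write it. The class name `MergeSwap` and the parameter names `src` and `dst` are hypothetical, generics replace the raw `Comparable` of the listing above, and the cutoff and the skip-merge test are left out to keep the swap visible.

```java
import java.util.Arrays;

// Hypothetical class name; a sketch of optimization 3, not the course's reference code.
class MergeSwap {
    // Merges the sorted runs src[lo..mid] and src[mid+1..hi] into dst[lo..hi]; there is no copy step.
    private static <T extends Comparable<T>> void merge(T[] src, T[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) dst[k] = src[j++];
            else if (j > hi) dst[k] = src[i++];
            else if (src[j].compareTo(src[i]) < 0) dst[k] = src[j++];
            else dst[k] = src[i++];
        }
    }

    // Leaves dst[lo..hi] sorted. Precondition: src[lo..hi] and dst[lo..hi] hold the same items.
    private static <T extends Comparable<T>> void sort(T[] src, T[] dst, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(dst, src, lo, mid);      // roles swapped: the sorted halves end up in src
        sort(dst, src, mid + 1, hi);
        merge(src, dst, lo, mid, hi); // merge the halves from src into dst
    }

    static <T extends Comparable<T>> void sort(T[] a) {
        T[] aux = a.clone();          // aux must start as a copy of a, not as an empty array
        sort(aux, a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        Integer[] a = {5, 2, 9, 1, 5, 6};
        sort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 5, 5, 6, 9]
    }
}
```

The one-time `clone()` replaces the per-merge copy loop: each range is touched for the first time when its call to sort() begins, so both arrays still hold the same items there, and a one-element range is already correct in `dst`.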

2. Merge sort (non-recursive, bottom-up)

  1. Idea:

    • Pass through the whole array, merging subarrays of size 1
    • Start over from the beginning and repeat for subarrays of size 2, 4, 8, 16, ...
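To see which ranges get merged in each pass, the loop structure can be run on its own, without any sorting. This is only an illustrative sketch: the class `BottomUpTrace` and its method `merges` are hypothetical names, and the strings it returns simply describe the `merge(lo, mid, hi)` calls a bottom-up mergesort would make on an array of length n.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: lists the merges a bottom-up mergesort performs.
class BottomUpTrace {
    // Returns one description per merge, in execution order, for an array of length n.
    static List<String> merges(int n) {
        List<String> out = new ArrayList<>();
        for (int sz = 1; sz < n; sz += sz) {               // subarray size: 1, 2, 4, 8, ...
            for (int lo = 0; lo < n - sz; lo += 2 * sz) {  // left end of each pair of subarrays
                int mid = lo + sz - 1;
                int hi = Math.min(lo + 2 * sz - 1, n - 1); // the last pair may be shorter
                out.add("sz=" + sz + " merge(" + lo + ", " + mid + ", " + hi + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // For n = 5 this prints:
        // sz=1 merge(0, 0, 1)
        // sz=1 merge(2, 2, 3)
        // sz=2 merge(0, 1, 3)
        // sz=4 merge(0, 3, 4)
        for (String m : merges(5)) System.out.println(m);
    }
}
```

The condition `lo < n - sz` skips a trailing subarray that has no partner in the current pass; it is picked up by a later, larger pass.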
  2. Java implementation

    public class MergeBU{
     private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi){
         
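         // assert expression (a boolean expression)
         // If the expression is true, the assertion passes and execution continues; if it is false, an AssertionError is thrown (assertions must be enabled with -ea)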
         assert isSorted(a, lo, mid);   // precondition: a[lo..mid] sorted
         assert isSorted(a, mid+1, hi); // precondition: a[mid+1..hi] sorted
    
         // copy
         for(int k = lo; k <= hi; k++){
             aux[k] = a[k];
         }
    
         // merge
         int i = lo;
         int j = mid+1;
         for(int k = lo; k <= hi; k++){
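             if(i>mid){
                 // i has passed mid: the subarray indexed by i is exhausted,
                 // so just copy the rest of the subarray indexed by j back to a[]
                 a[k] = aux[j++]; // same as two statements: a[k] = aux[j]; j++;
             }else if(j>hi){
                 // j has passed hi: same reasoning as above
                 a[k] = aux[i++];
             }else if(less(aux[j],aux[i])){
                 // aux[j] < aux[i]: copy aux[j] back to a[]
                 a[k] = aux[j++];
             }else{
                 // aux[j] >= aux[i]: copy aux[i] back to a[] (taking the left item on ties keeps the merge stable)
                 a[k] = aux[i++];
             }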
         }
    
         assert isSorted(a, lo, hi); // postcondition: a[lo..hi] sorted
     }
    
    
     public static void sort(Comparable[] a){
         int n = a.length;
         Comparable[] aux = new Comparable[n];
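         // the source listing is cut off inside this loop; the rest is reconstructed from the idea described above
         for(int sz = 1; sz < n; sz = sz+sz){          // subarray size: 1, 2, 4, 8, ...
             for(int lo = 0; lo < n-sz; lo += sz+sz){  // merge each adjacent pair of subarrays
                 merge(a, aux, lo, lo+sz-1, Math.min(lo+sz+sz-1, n-1));
             }
         }
     }
    
     private static boolean less(Comparable v, Comparable w){
         return v.compareTo(w) < 0;
     }
    
     // is a[lo..hi] sorted?
     private static boolean isSorted(Comparable[] a, int lo, int hi){
         for(int i = lo+1; i <= hi; i++){
             if(less(a[i], a[i-1])){
                 return false;
             }
         }
         return true;
     }
    }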

3. Comparator interface

  1. Advantage

    • Unlike Comparable, a Comparator supports multiple orderings for a given data type
  2. Usage:

    • Create a Comparator object

    • Pass it as the second argument to Arrays.sort to sort by that custom order

      String[] a;
      ...
      // sort by the natural order
      Arrays.sort(a); 
      // sort by the custom order defined by a Comparator object
      Arrays.sort(a, String.CASE_INSENSITIVE_ORDER);
      Arrays.sort(a, new BritishPhoneBookOrder());
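The snippet above can be turned into a small runnable program. In this sketch the class `ComparatorDemo` and the comparator `BY_LENGTH` are made up for illustration (they stand in for a custom order such as BritishPhoneBookOrder, whose definition is not shown here); `Arrays.sort` and `String.CASE_INSENSITIVE_ORDER` are the standard library's.

```java
import java.util.Arrays;
import java.util.Comparator;

class ComparatorDemo {
    // Hypothetical custom order: shorter strings first, ties broken by natural order.
    static final Comparator<String> BY_LENGTH = new Comparator<String>() {
        @Override
        public int compare(String v, String w) {
            if (v.length() != w.length()) return Integer.compare(v.length(), w.length());
            return v.compareTo(w);
        }
    };

    public static void main(String[] args) {
        String[] a = {"banana", "apple", "Cherry", "fig", "Apple"};

        Arrays.sort(a); // natural order: uppercase letters come before lowercase ones
        System.out.println(Arrays.toString(a)); // [Apple, Cherry, apple, banana, fig]

        Arrays.sort(a, String.CASE_INSENSITIVE_ORDER);
        System.out.println(Arrays.toString(a)); // [Apple, apple, banana, Cherry, fig]

        Arrays.sort(a, BY_LENGTH);
        System.out.println(Arrays.toString(a)); // [fig, Apple, apple, Cherry, banana]
    }
}
```

The same array is sorted three different ways without touching the String class, which is the point of passing the order in as a separate object.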
      
  3. Application (example: insertion sort)

    import java.util.Comparator;

    public class Insertion{
     public static void sort(Object[] a, Comparator comparator){
         int n = a.length;
         // move the pointer i to the right
         for(int i = 0; i < n; i++){
             // j moves from right to left; a[j] is exchanged with each larger item to its left
             for (int j = i; j > 0; j--){
                 if (less(comparator, a[j], a[j-1])){
                     exch(a, j, j-1);
                 }else{
                     break;
                 }
             }
         }
     }
    
     // is item v less than item w under comparator c?
     private static boolean less(Comparator c, Object v, Object w){
         return c.compare(v,w) < 0;
     }
    
      // exchange a[i] and a[j]
     private static void exch(Object[] a, int i, int j){
         Object swap = a[i];
         a[i] = a[j];
         a[j] = swap;
     }
      
      // check whether the array is sorted
     private static boolean isSorted(Object[] a, Comparator comparator){
         for(int i = 1; i < a.length; i++){
             if(less(comparator, a[i], a[i-1])){
                 return false;
             }
         }
         return true;
     }
    }
    
  4. Comparator interface: implementing

    • Idea: define a nested class that implements the Comparator interface, and write a compare() method in that class
    public class Student{
      public static final Comparator<Student> BY_NAME = new ByName();
      public static final Comparator<Student> BY_SECTION = new BySection();
      private final String name;
      private final int section;
      ...
      
      // static here, and on the fields above, means there is a single comparator of each kind for the whole class
      private static class ByName implements Comparator<Student>{
        public int compare(Student v, Student w){
          return v.name.compareTo(w.name);
        }
      }
      
      private static class BySection implements Comparator<Student>{
        public int compare(Student v, Student w){
          return v.section - w.section; // no danger of overflow here
        }
      }
    }
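A client picks the order by naming the comparator it wants. The sketch below is a self-contained variant of the Student class: the constructor, `toString()` and the sample data are assumptions added for illustration (the listing above elides that part of the class), and `Integer.compare` is used in place of subtraction, which is the overflow-safe way to write the same comparison.

```java
import java.util.Arrays;
import java.util.Comparator;

class Student {
    static final Comparator<Student> BY_NAME = new ByName();
    static final Comparator<Student> BY_SECTION = new BySection();
    private final String name;
    private final int section;

    // Assumed constructor, added for illustration only.
    Student(String name, int section) {
        this.name = name;
        this.section = section;
    }

    private static class ByName implements Comparator<Student> {
        @Override
        public int compare(Student v, Student w) {
            return v.name.compareTo(w.name);
        }
    }

    private static class BySection implements Comparator<Student> {
        @Override
        public int compare(Student v, Student w) {
            return Integer.compare(v.section, w.section); // safe even for extreme values
        }
    }

    @Override
    public String toString() {
        return name + "/" + section;
    }

    public static void main(String[] args) {
        Student[] a = {
            new Student("Chen", 2), new Student("Andrews", 3),
            new Student("Rohde", 2), new Student("Battle", 1)
        };
        Arrays.sort(a, Student.BY_NAME);
        System.out.println(Arrays.toString(a)); // [Andrews/3, Battle/1, Chen/2, Rohde/2]
        Arrays.sort(a, Student.BY_SECTION);
        System.out.println(Arrays.toString(a)); // [Battle/1, Chen/2, Rohde/2, Andrews/3]
    }
}
```

Because `Arrays.sort` on objects is documented as stable, the two students in section 2 stay in name order after the second sort.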
    

4. Stability

  1. Why it matters: suppose the data has already been sorted by order A and is then sorted again by order B. If the sorting algorithm is not stable, the second sort may scramble the A-order. If it is stable, items that are equal under order B still keep their original A-order after the second sort.
  2. Stable: insertion sort, mergesort
    • Equal items never move past each other
  3. Not stable: selection sort, shellsort
    • A long-distance exchange might move an item past some equal item
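A three-item input is enough to see the difference. In the sketch below, which uses hypothetical names throughout (`StabilityDemo`, `BY_FIRST_CHAR`), the comparator looks only at the first character, and the digit records the original position among equal keys.

```java
import java.util.Arrays;
import java.util.Comparator;

class StabilityDemo {
    // Compares by first character only, so "B1" and "B2" count as equal keys.
    static final Comparator<String> BY_FIRST_CHAR =
        (v, w) -> Character.compare(v.charAt(0), w.charAt(0));

    // Stable: an item is exchanged only with a strictly larger left neighbour.
    static <T> void insertionSort(T[] a, Comparator<T> c) {
        for (int i = 1; i < a.length; i++)
            for (int j = i; j > 0 && c.compare(a[j], a[j - 1]) < 0; j--)
                exch(a, j, j - 1);
    }

    // Not stable: the exchange with the minimum can jump over equal items.
    static <T> void selectionSort(T[] a, Comparator<T> c) {
        for (int i = 0; i < a.length; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++)
                if (c.compare(a[j], a[min]) < 0) min = j;
            exch(a, i, min);
        }
    }

    private static <T> void exch(T[] a, int i, int j) {
        T swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }

    public static void main(String[] args) {
        String[] x = {"B1", "B2", "A1"};
        String[] y = x.clone();
        insertionSort(x, BY_FIRST_CHAR);
        selectionSort(y, BY_FIRST_CHAR);
        System.out.println(Arrays.toString(x)); // [A1, B1, B2]  B1 still precedes B2
        System.out.println(Arrays.toString(y)); // [A1, B2, B1]  B1 was moved past B2
    }
}
```

In the selection sort run, the first exchange swaps "A1" (index 2) with "B1" (index 0), which carries "B1" past the equal item "B2".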
