Finding Kth Minimum (partial ordering) – Using Tournament Algorithm

In my previous blog entries on finding the 2nd minimum (and the variant for repeating elements), I described an efficient algorithm for finding the second minimum element of an array. That optimized algorithm takes O(n + log(n)) time in the worst case.
The next obvious question is: how can we efficiently find the Kth minimum element of an unsorted array, where k ranges from 1 to n? And what about partial sorting, that is, efficiently returning the first K minimum values of an unsorted array in order? As you might have guessed, we can achieve this by extending the logic used before and tweaking our implementation a little. We can find the Kth minimum (or return a partially sorted list of the K minimum elements) in O(n + klog(n)) time.

Design Details

For simplicity, we will assume that the unsorted input array consists of non-repeating, non-negative integers only. The solution can be extended to arrays with repeating numbers using considerations similar to those outlined in my blog entry on finding the second minimum with repeating values.

As noted in the previous blog entry, the tournament method builds an output tree by comparing adjacent elements and moving the lower of the two to the next level. Repeating this process yields a tree of depth log(n) whose root element is the minimum value. The second minimum can then easily be obtained by finding the minimum of all the values that the root was compared against at each level. Since the depth of the tree is log(n), this takes at most O(log(n)) comparisons. For example, Figure 1 below uses the tournament method to find the root element. For this particular case, the minimum value is 1 and the second minimum is 2.
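The tournament described above can be sketched in a few lines of Java. This is only an illustrative sketch, not the code from the previous entry: the class name `TournamentSketch`, the helper names, and the 16-element sample array are assumptions chosen so that the minimum is 1 and the second minimum is 2, and the values are assumed to be distinct.

```java
import java.util.ArrayList;
import java.util.List;

class TournamentSketch {

    // Builds every level of the tournament: level 0 is the input,
    // the last level holds the single winner (the minimum).
    static List<int[]> buildLevels(int[] input) {
        List<int[]> levels = new ArrayList<>();
        levels.add(input);
        int[] current = input;
        while (current.length > 1) {
            int[] next = new int[(current.length + 1) / 2];
            for (int i = 0; i < current.length; i += 2) {
                // an unpaired last element advances without a comparison
                next[i / 2] = (i + 1 < current.length)
                        ? Math.min(current[i], current[i + 1])
                        : current[i];
            }
            levels.add(next);
            current = next;
        }
        return levels;
    }

    // Walks from the root back to the input and keeps the smallest
    // value the winner was compared against (assumes distinct values).
    static int secondMinimum(List<int[]> levels) {
        int index = 0; // position of the winner in the current level
        int winner = levels.get(levels.size() - 1)[0];
        int second = Integer.MAX_VALUE;
        for (int level = levels.size() - 1; level > 0; level--) {
            int[] above = levels.get(level - 1);
            int left = 2 * index;
            int right = left + 1;
            if (right >= above.length) { // winner advanced unpaired
                index = left;
            } else if (above[left] == winner) {
                second = Math.min(second, above[right]);
                index = left;
            } else {
                second = Math.min(second, above[left]);
                index = right;
            }
        }
        return second;
    }

    public static void main(String[] args) {
        // assumed sample input
        int[] input = {2, 14, 5, 13, 1, 8, 17, 10, 6, 12, 9, 4, 11, 15, 3, 16};
        List<int[]> levels = buildLevels(input);
        System.out.println("min = " + levels.get(levels.size() - 1)[0]);
        System.out.println("second min = " + secondMinimum(levels));
    }
}
```

Building the levels costs n - 1 comparisons, and the walk back to the input touches one opponent per level, which is where the log(n) term comes from.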

 

 

Let’s say we want to find the 3rd minimum value. Extending the logic used to find the second minimum, the third minimum must have met either the root (the minimum) or the second minimum before its progression to the next level of the tree ended (as in a tournament round, only the smaller of the two compared values moves on to the next level). Thus the 3rd minimum can be obtained by finding the least of all the values compared against the minimum and the second minimum, i.e. by back-tracking both the root and the second minimum up to the top of the tree (the original array) and noting every value each of them was compared against.

 Take the sample in Figure 1. Here is the list of values obtained by back-tracking the root (call it Set A).

Set A (for root) - [3, 2, 10, 8]

From this list we obtain the second minimum as 2. Back-tracking from 2, we obtain following list (Set B):

Set B (for Second minimum) [5, 14]

Thus the third minimum is the least value in the union of Set A and Set B, which is 3 (ignoring the value 2, which was already established as the second minimum). Note that the 3rd minimum must be in either Set A or Set B. In the above case it was found in Set A.
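As a small illustration of that selection step, the following sketch picks the least value in the union of the sets that is greater than the last minimum found. The class and method names are hypothetical, and the two sets are simply typed in from the example above rather than computed from a tree.

```java
class NextMinimumFromSets {

    // Returns the smallest value across all candidate sets that is
    // strictly greater than the last minimum already established.
    static int nextMinimum(int lastMinimum, int[]... candidateSets) {
        int best = Integer.MAX_VALUE;
        for (int[] set : candidateSets) {
            for (int value : set) {
                if (value > lastMinimum && value < best) {
                    best = value;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] setA = {3, 2, 10, 8}; // values compared against the root (1)
        int[] setB = {5, 14};       // values compared against the second minimum (2)

        int second = nextMinimum(1, setA);
        int third = nextMinimum(second, setA, setB);
        System.out.println("second = " + second + ", third = " + third);
        // prints "second = 2, third = 3"
    }
}
```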

How about finding the 4th minimum? We back-track from the 3rd least value and collect the values it was compared against, one level up at a time. Adding this set to the sets for the root and the second minimum gives a collection that contains the 2nd, 3rd and 4th minimum. Continuing our sample, let Set C be the elements obtained by back-tracking the 3rd minimum: [4, 11, 16]. The minimum value in the union of Sets A, B and C, ignoring 2 and 3 (already established as the second and third minimum respectively), is 4, which is the fourth least value. Note that to calculate the kth minimum we do not back-track again to rebuild the adjacency lists for the minimum through the (k-1)th minimum; we reuse the results already calculated. This is the reason why this technique falls under Dynamic Programming: we have optimal substructure and avoid repeated computations.

Now it is time to formalize. To find the Kth minimum, back-track the root, the second minimum, the third minimum, and so on up to the (k-1)th minimum, and collect the adjacent values for each, moving from the root (of the sub-tree) all the way to the top. The resulting collection will contain the 2nd, 3rd, …, Kth minimum.
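Here is a compact sketch of that procedure, written independently of the full listing further down. It departs from the design described in this post in two ways, both of which are my own assumptions for brevity: each candidate is stored as a `{value, level, index}` triple in one flat list (levels here are 0-based, with the input at level 0), and a candidate is removed from the list once it is selected instead of being filtered out by value. The names `KthMinimumSketch`, `kSmallest` and `collectOpponents` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KthMinimumSketch {

    // Returns the k smallest values in ascending order (1 <= k <= input.length).
    static int[] kSmallest(int[] input, int k) {
        if (k < 1 || k > input.length) {
            throw new IllegalArgumentException("k must be between 1 and input.length");
        }
        // Tournament levels: level 0 is the input, the last level is the minimum.
        List<int[]> levels = new ArrayList<>();
        levels.add(input);
        while (levels.get(levels.size() - 1).length > 1) {
            int[] cur = levels.get(levels.size() - 1);
            int[] next = new int[(cur.length + 1) / 2];
            for (int i = 0; i < cur.length; i += 2) {
                next[i / 2] = (i + 1 < cur.length) ? Math.min(cur[i], cur[i + 1]) : cur[i];
            }
            levels.add(next);
        }

        int[] result = new int[k];
        List<int[]> candidates = new ArrayList<>(); // {value, level, index}
        int level = levels.size() - 1;
        int index = 0;
        result[0] = levels.get(level)[0];
        for (int found = 1; found < k; found++) {
            // back-track only the most recently found minimum
            collectOpponents(levels, level, index, candidates);
            int best = 0;
            for (int c = 1; c < candidates.size(); c++) {
                if (candidates.get(c)[0] < candidates.get(best)[0]) {
                    best = c;
                }
            }
            int[] next = candidates.remove(best);
            result[found] = next[0];
            level = next[1];
            index = next[2];
        }
        return result;
    }

    // Adds every value that the element at (level, index) was compared
    // against on its way up from the input to that position.
    static void collectOpponents(List<int[]> levels, int level, int index, List<int[]> out) {
        int value = levels.get(level)[index];
        for (int l = level; l > 0; l--) {
            int[] above = levels.get(l - 1);
            int left = 2 * index;
            int right = left + 1;
            if (right >= above.length) { // advanced unpaired, no opponent
                index = left;
            } else if (above[left] == value) {
                out.add(new int[] {above[right], l - 1, right});
                index = left;
            } else {
                out.add(new int[] {above[left], l - 1, left});
                index = right;
            }
        }
    }

    public static void main(String[] args) {
        // assumed sample input
        int[] input = {2, 14, 5, 13, 1, 8, 17, 10, 6, 12, 9, 4, 11, 15, 3, 16};
        System.out.println(Arrays.toString(kSmallest(input, 5)));
        // prints "[1, 2, 3, 4, 5]"
    }
}
```

The linear scan over the candidates keeps the sketch short; since every back-track adds at most one opponent per level, the list grows by roughly log(n) entries per minimum found.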

Implementation Details

To obtain the Kth minimum, we need the collection of all comparisons for the minimum (root) through the (K-1)th minimum. Thus we need to make the following changes to our previous code for finding the second minimum.

 

  1. Return an array of values (as indicated in the rectangular boxes in Figure 2) while back-tracking the root (which could also be a sub-tree root). In addition to the values, we need the level and index of each of these values. We need them because once we determine that a particular value from this list is the next minimum, we must back-track the sub-tree from that value (hence we need to know the location of the element at that level) in order to obtain a new collection and determine the next lowest value. Thus we return a two-dimensional array of pairs, where the first entry of each pair is the value and the second is its index. We can infer the level from the position of the pair in the 2-D array.
    For example, for the case above, the first pass (back-tracking the minimum value) returns the following (let’s call it the adjacency list), written as (value, index) pairs ordered from level 1 to level 4:

    (8, 5), (10, 3), (2, 0), (3, 1)

    Thus from the above array we can conclude that the second minimum value is 2 (the least of the first values of all pairs), and that this value appears two levels above the root (which is at level 5), at index 0 of that level. We will use this information to identify the sub-tree for our next round of back-tracking.
  2. Change the API to back-track a portion of the tree. To achieve that, we pass the level and index that define the root of the sub-tree to back-track from.
  3. The 2-dimensional list obtained from each run is added to a 3-dimensional list: the full adjacency list for the minimum through the (K-1)th minimum. A 3-D list is used, rather than merging everything into one bigger 2-D list, to preserve and identify the results of each run. The reason is that once we identify the minimum element across all k-1 runs, the level of that element is obtained from the particular list in which it was found (alternatively, our back-tracking API could have returned a 3-D array with the level as one more dimension; either way we need this information).
  4. An API that takes the full adjacency list and the last minimum value and returns the location of the next minimum element. The API identifies the run in which it found the minimum and the index of the element within that run. Refer to Figure 4 below. When this list is passed to the API with the value of the second minimum (2), it returns (0,3), which means that the next minimum (the third minimum) was found in the first array (obtained from the first back-track), as the 4th element of that array. With this information we can look into the first array and locate the third minimum value, 3, at index 1 of level 4 (remember that the original array is level 1, see Figure 2). Note that for m elements in the full adjacency list we make m comparisons, which may not seem efficient at first glance; but since each run adds only O(log(n)) elements to the list, the cost is not significant.
  5. To find the Kth minimum, we first need to find the 1st through (K-1)th minimums. Hence this algorithm can also return a partially sorted array of up to k elements.
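To make the data layout in points 1, 3 and 4 concrete, here is a small sketch of the lookup over the 3-D full adjacency list. The two runs are typed in by hand for the sample discussed above (the index values are derived on the assumption that the sample input is the 16-element array whose comparison sets are A and B), and `AdjacencyLookupSketch` and `locateNext` are hypothetical names, not the API of the listing below.

```java
class AdjacencyLookupSketch {

    // Returns {run, position} of the smallest value in the full adjacency
    // list that is strictly greater than the last minimum found.
    static int[] locateNext(int[][][] fullAdjacencyList, int lastMinimum) {
        int[] location = {-1, -1};
        int best = Integer.MAX_VALUE;
        for (int run = 0; run < fullAdjacencyList.length; run++) {
            for (int pos = 0; pos < fullAdjacencyList[run].length; pos++) {
                int value = fullAdjacencyList[run][pos][0];
                if (value > lastMinimum && value < best) {
                    best = value;
                    location[0] = run;
                    location[1] = pos;
                }
            }
        }
        return location;
    }

    public static void main(String[] args) {
        // Each pair is {value, index}; the position of the pair gives the level
        // (position 0 is level 1, the original array).
        int[][][] full = {
            {{8, 5}, {10, 3}, {2, 0}, {3, 1}}, // run 0: back-track of the minimum (1)
            {{14, 1}, {5, 1}}                  // run 1: back-track of the second minimum (2)
        };
        int[] loc = locateNext(full, 2);
        int[] pair = full[loc[0]][loc[1]];
        System.out.println("run=" + loc[0] + ", position=" + loc[1]
                + ", value=" + pair[0] + ", index=" + pair[1]
                + ", level=" + (loc[1] + 1));
        // prints "run=0, position=3, value=3, index=1, level=4"
    }
}
```

Keeping the runs separate is what lets the position within a run double as the level; a merged 2-D list would need the level stored explicitly.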

 

 

Code 

Can be downloaded from here - KthMinimum.java (remove the .txt extension)

 

/**
 * Copyright (c) 2010-2020 Malkit S. Bhasin. All rights reserved.
 *
 * All source code and material on this Blog site is the copyright of Malkit S.
 * Bhasin, 2010 and is protected under copyright laws of the United States. This
 * source code may not be hosted on any other site without my express, prior,
 * written permission. Application to host any of the material elsewhere can be
 * made by contacting me at mbhasin at gmail dot com
 *
 * I have made every effort and taken great care in making sure that the source
 * code and other content included on my web site is technically accurate, but I
 * disclaim any and all responsibility for any loss, damage or destruction of
 * data or any other property which may arise from relying on it. I will in no
 * case be liable for any monetary damages arising from such loss, damage or
 * destruction.
 *
 * I further grant you ("Licensee") a non-exclusive, royalty free, license to
 * use, modify and redistribute this software in source and binary code form,
 * provided that i) this copyright notice and license appear on all copies of
 * the software;
 *
 * As with any code, ensure to test this code in a development environment
 * before attempting to run it in production.
 *
 * @author Malkit S. Bhasin
 *
 */

public class KThMinimum {

    /**
     * @param inputArray
     *            unordered array of non-negative integers
     * @param k
     *            order of minimum value desired
     * @return kth minimum value
     */
    public static int getKthMinimum(int[] inputArray, int k) {
        return findKthMinimum(inputArray, k)[k - 1];
    }

    /**
     * @param inputArray
     *            unordered array of non-negative integers
     * @param k
     *            ordered number of minimum values
     * @return k ordered minimum values
     */
    public static int[] getMinimumKSortedElements(int[] inputArray, int k) {
        return findKthMinimum(inputArray, k);
    }

    /**
     * First output tree will be obtained using tournament method. For k
     * minimum, the output tree will be backtracked k-1 times for each sub tree
     * identified by the minimum value in the aggregate adjacency list obtained
     * from each run. The minimum value after each run will be recorded and
     * successive runs will produce next minimum value.
     *
     * @param inputArray
     * @param k
     * @return ordered array of k minimum elements
     */
    private static int[] findKthMinimum(int[] inputArray, int k) {
        int[] partiallySorted = new int[k];
        int[][] outputTree = getOutputTree(inputArray);
        int root = getRootElement(outputTree);
        partiallySorted[0] = root;
        int rootIndex = 0;
        int level = outputTree.length;
        int[][][] fullAdjacencyList = new int[k - 1][][];
        int[] kthMinIdx = null;
        for (int i = 1; i < k; i++) {
            fullAdjacencyList[i - 1] = getAdjacencyList(outputTree, root,
                    level, rootIndex);
            kthMinIdx = getKthMinimum(fullAdjacencyList, i, root);
            int row = kthMinIdx[0];
            int column = kthMinIdx[1];
            root = fullAdjacencyList[row][column][0];
            partiallySorted[i] = root;
            // position within a run is the 0-based row of the tree,
            // so the 1-based level is one more
            level = column + 1;
            rootIndex = fullAdjacencyList[row][column][1];
        }

        return partiallySorted;
    }

    /**
     * Takes an input array and generates a two-dimensional array whose rows are
     * generated by comparing adjacent elements and selecting minimum of two
     * elements. Thus the output is inverse triangle (root at bottom)
     *
     * @param values
     * @return
     */
    public static int[][] getOutputTree(int[] values) {
        double treeDepth = Math.log(values.length) / Math.log(2);
        int intTreeDepth = (int) (Math.ceil(treeDepth)) + 1;
        int[][] outputTree = new int[intTreeDepth][];

        // first row is the input
        outputTree[0] = values;
        printRow(outputTree[0]);

        int[] currentRow = values;
        int[] nextRow = null;
        for (int i = 1; i < intTreeDepth; i++) {
            nextRow = getNextRow(currentRow);
            outputTree[i] = nextRow;
            currentRow = nextRow;
            printRow(outputTree[i]);
        }
        return outputTree;
    }

    /**
     * Compares adjacent elements (starting from index 0), and construct a new
     * array with elements that are smaller of the adjacent elements.
     *
     * For even sized input, the resulting array is half the size, for odd size
     * array, it is half + 1.
     *
     * @param values
     * @return
     */
    private static int[] getNextRow(int[] values) {
        int rowSize = getNextRowSize(values);
        int[] row = new int[rowSize];
        int i = 0;
        for (int j = 0; j < values.length; j++) {
            if (j == (values.length - 1)) {
                // this is the case where there are odd number of elements
                // in the array. Hence the last loop will have only one element.
                row[i++] = values[j];
            } else {
                row[i++] = getMin(values[j], values[++j]);
            }
        }
        return row;
    }

    /**
     * From the passed full adjacency list and min value scans the list and
     * returns the information about next minimum value. It returns int array
     * with two values:
     * first value: index of the back-track (the min value was found in the
     * Adjacency list for min value, second min etc.)
     * second value: index within the identified run.
     *
     * @param fullAdjacencyList
     *            Adjacency list obtained after k-1 backtracks
     * @param kth
     *            Order of minimum value desired
     * @param kMinusOneMin
     *            value of k-1 min element
     * @return
     */
    private static int[] getKthMinimum(int[][][] fullAdjacencyList, int kth,
            int kMinusOneMin) {
        int kThMin = Integer.MAX_VALUE;
        int[] minIndex = new int[2];
        int j = 0, k = 0;
        int temp = -1;

        for (int i = 0; i < kth; i++) {
            for (j = 0; j < fullAdjacencyList.length; j++) {
                int[][] row = fullAdjacencyList[j];
                if (row != null) {
                    for (k = 0; k < fullAdjacencyList[j].length; k++) {
                        temp = fullAdjacencyList[j][k][0];
                        // skip values already established as minimums
                        // (and the -1 placeholders)
                        if (temp <= kMinusOneMin) {
                            continue;
                        }
                        if ((temp > kMinusOneMin) && (temp < kThMin)) {
                            kThMin = temp;
                            minIndex[0] = j;
                            minIndex[1] = k;
                        }
                    }
                }
            }
        }
        return minIndex;
    }

    /**
     * Back-tracks a sub-tree (specified by the level and index) parameter and
     * returns array of elements (during back-track path) along with their index
     * information. The order elements of output array indicate the level at
     * which these elements were found (with elements closest to the root at the
     * end of the list)
     *
     * Starting from root element (which is minimum element), find the lower of
     * two adjacent element one row above. One of the two element must be root
     * element. If the root element is left adjacent, the root index (for one
     * row above) is two times the root index of any row. For right-adjacent, it
     * is two times plus one. Select the other element (of two adjacent
     * elements) as second minimum.
     *
     * Then move to one row further up and find elements adjacent to lowest
     * element, again, one of the element must be root element (again, depending
     * upon the fact that it is left or right adjacent, you can derive the root
     * index for this row). Compare the other element with the second least
     * selected in previous step, select the lower of the two and update the
     * second lowest with this value.
     *
     * Continue this till you exhaust all the rows of the tree.
     *
     * @param tree
     *            output tree
     * @param rootElement
     *            root element (could be of sub-tree or outputtree)
     * @param level
     *            the level to find the root element. For the output tree the
     *            level is depth of the tree.
     * @param rootIndex
     *            index for the root element. For output tree it is 0
     * @return
     */
    public static int[][] getAdjacencyList(int[][] tree, int rootElement,
            int level, int rootIndex) {
        int[][] adjacencyList = new int[level - 1][2];
        int adjacentleftElement = -1, adjacentRightElement = -1;
        int adjacentleftIndex = -1, adjacentRightIndex = -1;
        int[] rowAbove = null;

        // we have to scan in reverse order
        for (int i = level - 1; i > 0; i--) {
            // one row above
            rowAbove = tree[i - 1];
            adjacentleftIndex = rootIndex * 2;
            adjacentleftElement = rowAbove[adjacentleftIndex];

            // the root element could be the last element carried from row above
            // because of odd number of elements in array, you need to do
            // following
            // check. if you don't, this case will blow {8, 4, 5, 6, 1, 2}
            if (rowAbove.length >= ((adjacentleftIndex + 1) + 1)) {
                adjacentRightIndex = adjacentleftIndex + 1;
                adjacentRightElement = rowAbove[adjacentRightIndex];
            } else {
                adjacentRightElement = -1;
            }

            // if there is no right adjacent value, then adjacent left must be
            // root continue the loop.
            if (adjacentRightElement == -1) {
                // just checking for error condition
                if (adjacentleftElement != rootElement) {
                    throw new RuntimeException(
                            "This is error condition. Since there "
                                    + " is only one adjacent element (last element), "
                                    + " it must be root element");
                } else {
                    rootIndex = rootIndex * 2;
                    // no comparison happened at this level; store a placeholder
                    // in the slot for this row (index i - 1)
                    adjacencyList[i - 1][0] = -1;
                    adjacencyList[i - 1][1] = -1;
                    continue;
                }
            }

            // one of the adjacent number must be root (min value).
            // Get the other number and compared with second min so far
            if (adjacentleftElement == rootElement
                    && adjacentRightElement != rootElement) {
                rootIndex = rootIndex * 2;
                adjacencyList[i - 1][0] = adjacentRightElement;
                adjacencyList[i - 1][1] = rootIndex + 1;
            } else if (adjacentleftElement != rootElement
                    && adjacentRightElement == rootElement) {
                rootIndex = rootIndex * 2 + 1;
                adjacencyList[i - 1][0] = adjacentleftElement;
                adjacencyList[i - 1][1] = rootIndex - 1;
            } else if (adjacentleftElement == rootElement
                    && adjacentRightElement == rootElement) {
                // This is case where the root element is repeating, we are not
                // handling this case.
                throw new RuntimeException(
                        "Duplicate Elements. This code assumes no repeating elements in the input array");
            } else {
                throw new RuntimeException(
                        "This is error condition. One of the adjacent "
                                + "elements must be root element");
            }
        }

        return adjacencyList;
    }

    /**
     * Returns minimum of two passed in values.
     *
     * @param num1
     * @param num2
     * @return
     */
    private static int getMin(int num1, int num2) {
        return Math.min(num1, num2);
    }

    /**
     * following uses Math.ceil(double) to round to upper integer value..since
     * this function takes double value, dividing an int by double results in
     * double.
     *
     * Another way of achieving this is for number x divided by n would be -
     * (x+n-1)/n
     *
     * @param values
     * @return
     */
    private static int getNextRowSize(int[] values) {
        return (int) Math.ceil(values.length / 2.0);
    }

    /**
     * Returns the root element of the two-dimensional array.
     *
     * @param tree
     * @return
     */
    public static int getRootElement(int[][] tree) {
        int depth = tree.length;
        return tree[depth - 1][0];
    }

    private static void printRow(int[] values) {
        for (int i : values) {
            // System.out.print(i + " ");
        }
        // System.out.println(" ");
    }

    public static void main(String args[]) {
        int[] input = { 2, 14, 5, 13, 1, 8, 17, 10, 6, 12, 9, 4, 11, 15, 3, 16 };
        System.out.println("Fifth Minimum: " + getKthMinimum(input, 5));

        int minimumSortedElementSize = 10;
        int[] tenMinimum = getMinimumKSortedElements(input,
                minimumSortedElementSize);
        System.out.print("Minimum " + minimumSortedElementSize + " Sorted: ");
        for (int i = 0; i < minimumSortedElementSize; i++) {
            System.out.print(tenMinimum[i] + " ");
        }
    }

}

 

Reposted from: http://blogs.sun.com/malkit/entry/finding_kth_minimum_partial_ordering
