1. MST Review :
a) Input : Undirected graph G = (V, E), with a cost c_e for each edge e.
b) Output : Min-cost spanning tree (no cycles, connected).
c) Assumptions : G is connected
d) Cut Property : If e is the cheapest edge crossing some cut (A , B), then e belongs to the MST.
2. Kruskal's MST Algorithm
- Sort edges in order of increasing cost
(Rename edges 1, 2, ... , m so that c1 < c2 < ... < cm)
- T = empty set;
- For i = 1 to m
    - If T U {i} has no cycles
        - Add i to T
- Return T
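The pseudocode above can be sketched directly in Python. This is only an illustrative sketch under assumptions not in the notes: vertices are numbered 0..n-1, edges are (u, v, cost) tuples, and the names has_path and kruskal_naive are made up for the example. The cycle check uses the fact that T U {(u, v)} has a cycle exactly when T already contains a u - v path, tested here with BFS.

```python
from collections import deque


def has_path(n, tree_edges, s, t):
    """BFS in the graph (V, T): is t reachable from s using only tree_edges?"""
    adj = [[] for _ in range(n)]
    for u, v, _ in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    seen[s] = True
    queue = deque([s])
    while queue:
        x = queue.popleft()
        if x == t:
            return True
        for y in adj[x]:
            if not seen[y]:
                seen[y] = True
                queue.append(y)
    return False


def kruskal_naive(n, edges):
    """Return the edges chosen by Kruskal's algorithm, in the order added."""
    T = []
    # Consider edges in order of increasing cost.
    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        # Adding (u, v) creates a cycle exactly when T already joins u and v.
        if not has_path(n, T, u, v):
            T.append((u, v, cost))
    return T


# Small example graph on 4 vertices (hypothetical data).
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
tree = kruskal_naive(4, edges)
total_cost = sum(c for _, _, c in tree)
```

On this example the edges of cost 3 and 5 are skipped because each would close a cycle, leaving a tree with n - 1 = 3 edges.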
3. Correctness of Kruskal's algorithm
Let T* = output of Kruskal's algorithm on input graph G.
(1) Clearly T* has no cycles.
(2) T* is connected.
- By the Empty Cut Lemma, we only need to show that T* crosses every cut.
- Fix a cut (A , B). Since G is connected, at least one of its edges crosses (A , B).
- Kruskal will include the first edge crossing (A , B) that it sees: at that point no chosen edge crosses (A , B), so by the Lonely Cut Corollary this edge cannot create a cycle.
(3) Every edge of T* is justified by the Cut Property. (Implies T* is the MST)
- Consider the iteration where edge (u , v) is added to the current set T. Since T U {(u , v)} has no cycle, T has no u - v path.
- ==> there exists an empty cut (A , B) separating u and v. (As in the proof of the Empty Cut Lemma)
- ==> no edges crossing (A , B) were previously considered by Kruskal's algorithm. (Any such edge would have been added, since it could not create a cycle.)
- ==> (u , v) is the first (hence the cheapest!) edge crossing (A , B).
- ==> (u , v) is justified by the Cut Property.
4. Running Time of Kruskal's Algorithm
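- Sorting edges : O(m log m) = O(m log n), because m = O(n^2) assuming nonparallel edges, so log m = O(log n)
- O(m) iterations and O(n) time per iteration to check for a cycle (use BFS or DFS in the graph (V , T), which contains <= n - 1 edges) ==> O(mn) time overall for this straightforward implementation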
Plan : Data structure for O(1)-time cycle checks ==> overall algorithm has O(m log n) running time.
5. The Union-Find Data Structure
- Maintain partition of a set of objects.
- FIND(X): Return name of group that X belongs to.
- UNION(Ci , Cj ): Fuse groups Ci , Cj into a single one.
6. Applying the Union-Find Data Structure in Kruskal's Algorithm:
- Groups = Connected components w.r.t. chosen edges T.
- Adding new edge (u , v) to T <==> Fusing connected components of u, v.
7. Union-Find Data Structure Implementation:
- Maintain one linked structure per connected component of (V , T).
- Each component has an arbitrary leader vertex.
- Invariant : Each vertex points to the leader of its component.
- Given edge (u , v), can check whether u and v are already in the same component in O(1) time : they are if and only if FIND(u) = FIND(v)
- Maintain the invariant : When two components merge, have smaller one inherit the leader of the larger one.
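A single vertex v has its leader pointer updated at most O(log n) times over the course of Kruskal's algorithm. Because : Every time v's leader pointer gets updated, the population of its component at least doubles ==> this can happen at most log2 n times. (the total number of vertices is n) Hence all leader updates together take O(n log n) time, and with O(m log n) for sorting and O(m) for the cycle checks, the overall running time is O(m log n).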
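The leader-pointer scheme above might look as follows in Python. This is a sketch under assumptions not in the notes: vertices are numbered 0..n-1, edges are (u, v, cost) tuples, and the class name LeaderUnionFind, the members lists, and the updates counter are all illustrative. The counter exists only to make the log2 n bound on leader updates observable.

```python
class LeaderUnionFind:
    """Union-find in which every object stores its group's leader directly."""

    def __init__(self, n):
        self.leader = list(range(n))            # invariant: leader[x] names x's group
        self.members = [[x] for x in range(n)]  # members[l] is meaningful only for leaders l
        self.updates = [0] * n                  # leader-pointer rewrites per object

    def find(self, x):
        # O(1) because of the invariant.
        return self.leader[x]

    def union(self, a, b):
        """Fuse the groups whose leaders are a and b."""
        if a == b:
            return
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                         # make a the leader of the larger group
        for x in self.members[b]:               # smaller group inherits leader a
            self.leader[x] = a
            self.updates[x] += 1
        self.members[a].extend(self.members[b])
        self.members[b] = []


def kruskal(n, edges):
    """Kruskal's algorithm with O(1) cycle checks via FIND."""
    uf = LeaderUnionFind(n)
    T = []
    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        lu, lv = uf.find(u), uf.find(v)
        if lu != lv:                            # different components: no cycle
            T.append((u, v, cost))
            uf.union(lu, lv)
    return T, uf


# Small example graph on 4 vertices (hypothetical data).
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
tree, uf = kruskal(4, edges)
```

Keeping an explicit member list per leader is one simple way to find the vertices whose pointers must be rewritten; the notes only require that the smaller component adopt the larger one's leader.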