1. Informal Goal: Connect a bunch of points together as cheaply as possible.
Blazingly Fast Greedy Algorithms:
- Prim's Algorithm
- Kruskal's algorithm
Both run in O(m log n) time, where m is the # of edges and n is the # of vertices.
2. Problem Definition:
Input: Undirected graph G = (V, E) and a cost c_e for each edge e in E.
- Assume adjacency list representation
- OK if edge costs are negative
Output: minimum-cost (sum of edge costs) tree T, a subset of E, that spans all vertices.
Definition of spanning tree:
a) T has no cycles
b) The subgraph (V, T) is connected (i.e., contains a path between each pair of vertices)
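The two conditions above can be checked mechanically. The following is only a sketch, assuming vertices are numbered 0..n-1 and edges are given as (u, v) pairs; the function name is_spanning_tree is illustrative and not part of the notes. It relies on the fact that n - 1 edges with no cycle must connect all n vertices.

```python
def is_spanning_tree(n, tree_edges):
    """Return True if tree_edges is a spanning tree on vertices 0..n-1."""
    # A spanning tree on n vertices has exactly n - 1 edges.
    if len(tree_edges) != n - 1:
        return False

    # Union-find: parent[x] points towards the representative of x's component.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # u and v already connected: this edge closes a cycle
        parent[ru] = rv

    # n - 1 edges and no cycle: the forest has exactly one component.
    return True
```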
3. Standing Assumptions:
Assumption #1: Input graph G is connected.
- Else no spanning trees.
- Easy to check in preprocessing (e.g., depth-first search).
Assumption #2: Edge costs are distinct.
- Prim + Kruskal remain correct with ties (which can be broken arbitrarily).
- Correctness proof a bit more annoying.
4. Prim's MST Algorithm:
- Initialize X = {s} [s in V chosen arbitrarily]
- T = empty set [invariant: X = vertices spanned by tree-so-far T]
- While X <> V
- Let e = (u, v) be the cheapest edge of G with u in X, v not in X.
- Add e to T
- Add v to X.
While loop: Increase # of spanned vertices in cheapest way possible.
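A direct transcription of the loop above might look as follows. This is a minimal sketch of the straightforward version that rescans every edge in each iteration, assuming vertices are numbered 0..n-1 and edges are (u, v, cost) triples; the name prim_naive is illustrative.

```python
def prim_naive(n, edges, s=0):
    """Return the list of tree edges chosen by Prim's algorithm.

    n: number of vertices (labelled 0..n-1)
    edges: list of (u, v, cost) triples of an undirected graph
    s: arbitrary start vertex
    """
    X = {s}   # vertices spanned so far
    T = []    # tree-so-far
    while len(X) < n:
        best = None
        for u, v, c in edges:
            # An edge crosses (X, V - X) if exactly one endpoint is in X.
            if (u in X) != (v in X):
                if best is None or c < best[2]:
                    best = (u, v, c)
        if best is None:
            raise ValueError("graph is not connected")
        u, v, c = best
        T.append(best)
        X.add(v if u in X else u)  # add the endpoint that was outside X
    return T
```

For example, on the edge list [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)] with n = 4, the sketch picks the edges of cost 1, 2 and 4.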
5. Definition of Cut: A cut of a graph G = (V, E) is a partition of V into 2 non-empty sets. (A graph on n vertices has 2^(n-1) - 1 cuts, i.e., exponentially many.)
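To make the count concrete, the cuts of a small vertex set can be enumerated. This is only an illustrative sketch (the name all_cuts is an assumption); it fixes the first vertex on side A so that each unordered partition is produced once.

```python
from itertools import combinations

def all_cuts(vertices):
    """Return every cut (A, B) of the given vertices, each exactly once."""
    vertices = list(vertices)
    first, rest = vertices[0], vertices[1:]
    cuts = []
    # k = number of additional vertices placed in A alongside `first`.
    # k stops short of len(rest), so B is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            A = {first, *extra}
            B = set(vertices) - A
            cuts.append((A, B))
    return cuts
```

For 4 vertices this yields 7 cuts, matching 2^(4-1) - 1.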
6. Empty Cut Lemma: A graph is not connected <==> there exists a cut (A, B) with no crossing edges.
Proof: <== Choose u in A and v in B. Any path from u to v would have to cross the cut, so there is no path from u to v and G is not connected.
==> Suppose G is not connected, so there are vertices u, v with no path from u to v. Define:
A = {vertices reachable from u in G}
B = V - A
Both sides are non-empty (u in A, v in B), and no edge crosses from A to B; otherwise A would be bigger.
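The ==> direction of the proof is constructive, and the same construction doubles as a connectivity check. A sketch, assuming an adjacency-list dict (vertex -> list of neighbours); the name find_empty_cut is illustrative.

```python
def find_empty_cut(adj):
    """Return a cut (A, B) with no crossing edges, or None if the graph is connected.

    adj: dict mapping each vertex to a list of its neighbours (undirected graph).
    """
    vertices = list(adj)
    start = vertices[0]
    A = {start}          # vertices reachable from `start`
    stack = [start]
    while stack:         # iterative depth-first search
        u = stack.pop()
        for w in adj[u]:
            if w not in A:
                A.add(w)
                stack.append(w)
    B = set(vertices) - A
    # If B is empty, every vertex is reachable and no empty cut exists.
    return (A, B) if B else None
```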
7. Double-Crossing Lemma: Suppose the cycle C in E has an edge crossing the cut (A, B). Then so does some other edge of C. (The number of edges of C crossing the cut is even.)
8. Lonely Cut Corollary: If e is the only edge crossing some cut (A , B), then it is not in any cycle.
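The parity behind the Double-Crossing Lemma is easy to observe on a small cycle. A sketch, assuming the cycle is given as a list of vertices in order (the last vertex connects back to the first); crossing_count is an illustrative name.

```python
def crossing_count(cycle, A):
    """Count the edges of a cycle that cross the cut (A, V - A).

    cycle: list of vertices in cyclic order
    A: set of vertices on one side of the cut
    """
    k = len(cycle)
    return sum(
        (cycle[i] in A) != (cycle[(i + 1) % k] in A)  # endpoints on opposite sides
        for i in range(k)
    )
```

Walking around the cycle, every crossing from A to B must be matched by a crossing back, so the count is always even.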
9. Claim: Prim's algorithm outputs a spanning tree.
Proof: (1) Algorithm maintains invariant that T spans X
(2) Can't get stuck with X <> V (otherwise the cut (X, V - X) has no crossing edge ==> G is disconnected, by the Empty Cut Lemma)
(3) No cycles ever get created in T. A newly added edge e is the 1st edge crossing (X, V - X) that gets added to T ==> its addition can't create a cycle in T (Lonely Cut Corollary)
10. Cut Property: Consider an edge e of G. Suppose there is a cut (A , B) such that e is the cheapest
edge of G that crosses it. Then e belongs to the MST of G.
Proof (by exchange argument): Suppose e is the cheapest edge crossing a cut (A, B), yet e is not in the MST T*.
Idea: Exchange e with another edge of T* to get a cheaper spanning tree (contradiction).
Since T* is connected, it must contain an edge f (<> e) crossing (A, B).
However, exchanging an arbitrary such f for e may leave T* no longer a spanning tree, so the edge to remove must be chosen carefully:
How to find e': Let C = cycle created by adding e to T*. (T* already contains a path between the endpoints of e, so adding e to T* creates a cycle.)
By the Double-Crossing Lemma: Some other edge e' of C [with e' <> e and e' in T*] crosses (A, B).
T = T* U {e} - {e'} is also a spanning tree. Since c_e < c_e', T is cheaper than the purported MST T*, a contradiction.
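The exchange step can be carried out concretely. The sketch below is illustrative only (the name exchange and the (u, v, cost) edge format are assumptions): it finds the tree path between the endpoints of e, picks an edge e' on that path that crosses the cut, and swaps it for e.

```python
def exchange(tree, e, A):
    """Swap e into a spanning tree in place of a cycle edge that crosses the cut.

    tree: list of (u, v, cost) edges forming a spanning tree
    e: (u, v, cost) edge not in tree that crosses the cut (A, V - A)
    A: set of vertices on one side of the cut
    """
    # Adjacency lists of the tree, remembering which edge leads to each neighbour.
    adj = {}
    for edge in tree:
        a, b, _ = edge
        adj.setdefault(a, []).append((b, edge))
        adj.setdefault(b, []).append((a, edge))

    u, v, _ = e
    # Search the tree from u, recording how each vertex was reached.
    reached_by = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        for y, edge in adj.get(x, []):
            if y not in reached_by:
                reached_by[y] = (x, edge)
                stack.append(y)

    # Walk the tree path from v back to u; together with e it forms the cycle C.
    x = v
    while x != u:
        prev, edge = reached_by[x]
        if (edge[0] in A) != (edge[1] in A):  # this tree edge crosses the cut
            return [t for t in tree if t != edge] + [e]
        x = prev
    raise ValueError("e does not cross the cut")
```

For instance, with tree [(0, 1, 1), (1, 2, 5), (2, 3, 2)], cut A = {0, 1} and e = (0, 3, 3), the edge of cost 5 is replaced by e and the total cost drops from 8 to 6.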
11. Claim: Cut Property ==> Prim's algorithm is correct.
Proof: By the previous claim, Prim's algorithm outputs a spanning tree T*.
Key point: Every edge e in T* is explicitly justified by the Cut Property.
==> T* is a subset of the MST
==> Since T* is already a spanning tree, it must be the MST
12. Running time of straightforward implementation:
- O(n) iterations [where n = # of vertices]
- O(m) time per iteration [where m = # of edges]
==> O(mn) time
13. Prim's Algorithm with Heaps:
Invariant #1: Elements in heap = vertices of V - X.
Invariant #2: For v in V - X, key[v] = cost of the cheapest edge (u, v) with u in X (or +infinity if no such edge exists).
Given invariants, Extract-Min yields next vertex v not in X and edge (u , v) crossing (X , V - X) to add to X and T, respectively.
Can initialize heap with O(m + n log n) = O(m log n) preprocessing: O(m) time to compute the initial keys, plus (n - 1) Inserts [m >= n - 1 since G connected].
Pseudocode: When v added to X:
- For each edge (v , w) in E:
- If w in V - X ==> these are the only vertices whose keys might have changed (update key if needed):
- Delete w from heap
- Recompute key[w] := min{key[w], c_vw}
- Re-Insert into heap
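A heap-based version can be sketched in Python. Note the substitution: Python's heapq module has no Delete operation, so instead of the Delete / Re-Insert step above, this sketch pushes a new entry whenever a cheaper edge to w is found and skips stale entries when they are popped ("lazy deletion"). The heap can then hold up to O(m) entries, which still gives O(m log m) = O(m log n) time. The name prim_heap and the adjacency format (vertex -> list of (neighbour, cost) pairs, with integer vertex labels) are assumptions.

```python
import heapq

def prim_heap(adj, s):
    """Return (total_cost, tree_edges) of the MST containing s.

    adj: dict mapping each vertex to a list of (neighbour, cost) pairs
    s: start vertex
    """
    X = set()     # vertices spanned so far
    T = []        # tree edges, as (u, v, cost)
    total = 0
    # Heap entries are (key, vertex, edge that attains the key).
    heap = [(0, s, None)]
    while heap and len(X) < len(adj):
        cost, v, edge = heapq.heappop(heap)
        if v in X:
            continue  # stale entry: v was already added via a cheaper edge
        X.add(v)
        if edge is not None:
            T.append(edge)
            total += cost
        # Only neighbours of v outside X can have their key change.
        for w, c in adj[v]:
            if w not in X:
                heapq.heappush(heap, (c, w, (v, w, c)))
    if len(X) < len(adj):
        raise ValueError("graph is not connected")
    return total, T
```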
14. Running Time with Heaps :
- Dominated by time required for heap operations
- (n - 1) Inserts during preprocessing
- (n - 1) Extract-Mins (one per iteration of while loop)
- Each edge (v, w) triggers at most one Delete/Insert combo
[when its first endpoint is sucked into X]
==> O(m) heap operations [ m >= n - 1 since G connected]
==> O(m log n) time [As fast as sorting!]