Hi peers,
We know Prim’s algorithm correctly finds a minimum spanning tree (MST) of a given graph. Kruskal’s algorithm (“the algorithm” in the following context) is another simple but powerful way to find an MST. In this short essay, I will provide the pseudocode and the proof for the algorithm in two parts.
Pseudocode
Setup:
Given a connected, undirected graph G(V, E), V: {v1, v2, …, v_n} is the set of vertices in the graph and E: {e1, e2, …, e_m} is the set of edges.
∀ e_i ∈ E, ∃ a corresponding cost, c_i; i ∈ {1, 2, 3, …, m}. C: {c1, c2, …, c_m} denotes the set of costs associated with E.
T is the set of edges that the algorithm has selected so far.
m denotes the number of edges, while n denotes the number of vertices.
Initialization: T= ∅
Steps:
- Sort E in order of increasing cost, and relabel the edges so that c1 ≤ c2 ≤ … ≤ c_m.
- For the counter k = 1 to m:
    If (T ∪ {e_k} has no cycles)
        Add e_k to T.
- Return T as the MST.
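The steps above can be sketched in Python roughly as follows. This is only a minimal sketch under my own assumptions: vertices are labelled 0 to n-1, each edge is a (cost, u, v) tuple, and the "has no cycles" test is done with a small union-find structure, which the pseudocode does not prescribe. The names kruskal and find are mine.

```python
def kruskal(n, edges):
    """Return the list of selected edges for a connected graph.

    n     -- number of vertices, labelled 0 .. n-1 (an assumption of this sketch)
    edges -- list of (cost, u, v) tuples
    """
    parent = list(range(n))  # union-find forest: parent[x] == x marks a root

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []  # plays the role of T
    for cost, u, v in sorted(edges):  # tuples sort by cost first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # endpoints lie in different components: no cycle
            parent[root_u] = root_v
            tree.append((cost, u, v))
    return tree


example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst = kruskal(4, example_edges)
total_cost = sum(cost for cost, _, _ in mst)
```

On this small example the sketch keeps the three cheapest edges and rejects the other two, because each of them would close a triangle.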
Proof of Correctness:
To prove correctness, we use the following two lemmas. Both follow directly from the definitions, so I state them without proof.
L1 - Empty Cut Lemma: Graph G is connected ⟺ G has no empty cut.
*Empty cut: A partition of the vertices of the graph into two non-empty sets such that no edge crosses it, i.e., no edge has one endpoint in each set.
L2 - Lonely Cut Lemma: If an edge, e, is the only edge that crosses some cut, then e is not in any cycle.
We first prove that the output of the algorithm, T, is a tree. In the second part, we prove that T is an MST.
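For readers who like to see the lemmas on a concrete graph, here is a small brute-force sketch. It assumes vertices are labelled 0 to n-1 and edges are (u, v) pairs; the helper names crossing_edges, has_empty_cut and is_connected are hypothetical. It enumerates every cut, so it is only meant for tiny graphs.

```python
from itertools import combinations


def crossing_edges(edges, side_a):
    """Edges with exactly one endpoint in side_a, i.e. those crossing the cut."""
    return [(u, v) for u, v in edges if (u in side_a) != (v in side_a)]


def has_empty_cut(n, edges):
    """True if some cut (A, B) with A and B non-empty has no crossing edge."""
    for size in range(1, n):
        for side_a in combinations(range(n), size):
            if not crossing_edges(edges, set(side_a)):
                return True
    return False


def is_connected(n, edges):
    """Depth-first search from vertex 0; connected iff every vertex is reached."""
    adjacency = {v: [] for v in range(n)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


path = [(0, 1), (1, 2), (2, 3)]  # connected, so lemma 1 says no empty cut
split = [(0, 1), (2, 3)]         # two components, so some cut is empty
lonely = crossing_edges(path, {0, 1})  # the only edge crossing ({0, 1}, {2, 3})
```

In the path graph, (1, 2) is the only edge crossing the cut ({0, 1}, {2, 3}), and it indeed lies on no cycle, as lemma 2 says.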
Part 1: T is a tree
Setup:
Cut(A, B) is an arbitrary cut that partitions the vertices of graph G into two non-empty sets, A and B.
Proof:
We first prove that T is connected. Because graph G is connected, by lemma 1, at least one edge must cross Cut(A, B). Let E’ denote the set of edges that cross Cut(A, B).
Recall that the algorithm iterates through all edges in a single for loop. Consider the moment the algorithm first encounters an edge, e’, that belongs to E’. At that moment T contains no edge that crosses Cut(A, B), so e’ is the only edge of T ∪ {e’} that crosses the cut. By lemma 2, e’ is not in any cycle of T ∪ {e’}, and since T itself is acyclic, the algorithm includes e’ in T. Thus, T must include at least one edge that crosses Cut(A, B). Since Cut(A, B) is an arbitrary cut, T includes at least one edge that crosses every cut in G. By lemma 1, T is connected because T has no empty cut.
Having proved that T is connected, we now prove that T is acyclic. This is immediate, because the if-condition in the loop explicitly prevents any cycle from being formed in T.
Since T is both connected and acyclic, T is a tree. (Recall that a tree is defined as an acyclic, connected graph.)
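The two conditions in Part 1 can also be checked mechanically on a given edge set. Below is a minimal sketch, again assuming vertices labelled 0 to n-1 and edges given as (u, v) pairs; is_tree is a hypothetical helper, not part of the algorithm itself.

```python
def is_tree(n, edges):
    """True if the edges form an acyclic, connected graph on vertices 0 .. n-1."""
    parent = list(range(n))  # union-find forest over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # u and v were already connected: this edge closes a cycle
        parent[root_u] = root_v
        components -= 1  # each accepted edge merges two components
    return components == 1  # connected iff a single component remains


star = [(0, 1), (1, 3), (1, 2)]      # acyclic and connected
triangle = [(0, 1), (1, 2), (2, 0)]  # contains a cycle
halves = [(0, 1), (2, 3)]            # acyclic but not connected
```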
Part 2: T is MST
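Because the algorithm sorts the edges in order of increasing cost, every edge it adds is the cheapest edge crossing some cut. To see this, suppose the algorithm adds edge e = (u, v). Let A be the set of vertices connected to u in T just before e is added, and let B be the remaining vertices, so v is in B and no edge of T crosses Cut(A, B). Any edge that crosses Cut(A, B) and is cheaper than e was examined earlier and was not added, so it would have closed a cycle in T; but then its two endpoints were already connected in T, which contradicts the edge crossing the cut. Hence e is the cheapest edge crossing Cut(A, B). By the cut property (the same fact that makes Prim’s algorithm correct, assuming for simplicity that all costs are distinct), the cheapest edge crossing any cut belongs to the MST. So every edge of T belongs to the MST, and since Part 1 shows that T is a spanning tree, T is the MST. Q.E.D.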
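As a sanity check rather than a proof, one can compare the greedy result against a brute-force search over all spanning trees of a tiny graph. The sketch below makes the same assumptions as before (vertices 0 to n-1, edges as (cost, u, v) tuples), and all function names are my own.

```python
from itertools import combinations


def find(parent, x):
    # Root of x in the union-find forest, with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def spanning_forest(n, edges):
    """Scan edges in the given order and keep each one that joins two components."""
    parent = list(range(n))
    kept = []
    for cost, u, v in edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_u] = root_v
            kept.append((cost, u, v))
    return kept


def kruskal_cost(n, edges):
    """Total cost of the tree chosen by scanning edges in increasing cost."""
    return sum(cost for cost, _, _ in spanning_forest(n, sorted(edges)))


def brute_force_cost(n, edges):
    """Minimum total cost over every spanning tree, by exhaustive search."""
    best = None
    for subset in combinations(edges, n - 1):
        # n - 1 edges form a spanning tree exactly when none of them closes a cycle.
        if len(spanning_forest(n, subset)) == n - 1:
            cost = sum(c for c, _, _ in subset)
            if best is None or cost < best:
                best = cost
    return best


edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
greedy = kruskal_cost(4, edges)
exhaustive = brute_force_cost(4, edges)
```

The exhaustive search grows very quickly with the number of edges, so this is only practical for graphs with a handful of edges.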
Hope you enjoy this one.
Best,
Ben