Expected linear time MST algorithm
From Wikipedia, the free encyclopedia
The expected linear time MST algorithm is a randomized algorithm for computing the minimum spanning forest of a weighted graph with no isolated vertices. It was developed by David Karger, Philip Klein, and Robert Tarjan.[1] The algorithm relies on techniques from Borůvka's algorithm along with an algorithm for verifying a minimum spanning tree in linear time.[2][3] It combines the design paradigms of divide and conquer algorithms, greedy algorithms, and randomized algorithms to achieve expected linear performance.
Deterministic algorithms that find the minimum spanning tree include Prim's algorithm, Kruskal's algorithm, reverse-delete algorithm, and Borůvka's algorithm.
Borůvka Step
The key insight to the algorithm is a random sampling step which partitions a graph into two subgraphs by randomly selecting edges to include in each subgraph. The algorithm recursively finds the minimum spanning forest of the first subproblem and uses the solution in conjunction with a linear time verification algorithm to discard edges in the graph that cannot be in the minimum spanning tree. A procedure taken from Borůvka's algorithm is also used to reduce the size of the graph at each recursion.
Each iteration of the algorithm relies on an adaptation of Borůvka's algorithm referred to as a Borůvka step:
Input: A graph G with no isolated vertices 1 For each vertex v, select the lightest edge incident on v 2 Create a contracted graph G' by replacing each component of G connected by the edges selected in step 1 with a single vertex 3 Remove all isolated vertices, self-loops, and non-minimal repetitive edges from G' Output: The edges selected in step 1 and the contracted graph G'
A Borůvka step is equivalent to the inner loop of Borůvka's algorithm, which runs in O(m) time where m is the number of edges in G. Furthermore, since each edge can be selected at most twice (once by each incident vertex) the maximum number of disconnected components after step 1 is equal to half the number of vertices. Thus, a Borůvka step reduces the number of vertices in the graph by at least a factor of two and deletes at least n/2 edges where n is the number of vertices in G.
Example execution of a Borůvka step
F-heavy and F-light edges
In each iteration the algorithm removes edges with particular properties that exclude them from the minimum spanning tree. These are called F-heavy edges and are defined as follows. Let F be a forest on the graph H. An F-heavy edge is an edge e connecting vertices u,v whose weight is strictly greater than the weight of the heaviest edge on the path from u to v in F. (If a path does not exist in F it is considered to have infinite weight). Any edge that is not F-heavy is F-light. If F is a subgraph of G then any F-heavy edge in G cannot be in the minimum spanning tree of G by the cycle property. Given a forest, F-heavy edges can be computed in linear time using a minimum spanning tree verification algorithm.[2][3]
Algorithm
Input: A graph G with no isolated vertices
- If G is empty return an empty forest
- Create a contracted graph G' by running two successive Borůvka steps on G
- Create a subgraph H by selecting each edge in G' with probability 1/2. Recursively apply the algorithm to H to get its minimum spanning forest F.
- Remove all F-heavy edges from G' (where F is the forest from step 3) using a linear time minimum spanning tree verification algorithm.[2][3]
- Recursively apply the algorithm to G' to get its minimum spanning forest.
Output: The minimum spanning forest of G' and the contracted edges from the Borůvka steps
