Union find (Disjoint set)
1. Disjoint set in math
In mathematics, two sets are said to be disjoint sets
if they have no element in common. Equivalently, disjoint sets are sets whose intersection is the empty set. For example, {1, 2, 3} and {4, 5, 6} are disjoint sets, while {1, 2, 3} and {3, 4, 5} are not.
2. Disjoint set data structure
Union find(Disjoint set) is a data structure that stores a collection of disjoint (non-overlapping) sets. Its has 2 primary operations:
- Find: Determine which subset a particular element is in. This can be used for determining if two elements are in the same subset.
- Union: Join two subsets into a single subset
Below is the sample code which implements union-find algorithms
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
|
3. Optimization
Path compression
This optimization is designed for speeding up find
method.
The naive find() method is read-only, when find() is called for an element i, root of the tree is returned, the find() operation traverses up from i to find root.
The idea of path compression is to make the found root as parent of i so that we don’t have to traverse all intermediate nodes again. If i is root of a subtree, then path (to root) from all nodes under i also compresses.
Below is the optimized find() method with Path Compression.
1 2 3 4 5 6 7 |
|
We can also implement find
without recursion.
1 2 3 4 5 6 7 8 |
|
Union by rank
In this optimization we will change the unionSet
method.
In the naive implementation the second tree always got attached to the first one. In practice that can lead to trees containing chains of length O(n).
The solution is to always attach smaller depth tree under the root of the deeper tree.
Below is the optimized unionSet()
method with Union by rank.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
|