Friday, January 14, 2011

UNION FIND ALGORITHM



A union-find algorithm is an algorithm that performs two useful operations on such a data structure:
  • Find: Determine which set a particular element is in. Also useful for determining if two elements are in the same set.
  • Union: Combine or merge two sets into a single set.

The Union-Find algorithm


This algorithm works for a set of size n where we start out with each element regarded as a singleton set represented by it's own value. Thus 0 represents the set containing 0, 1 represents the set containing 1, ... , n-1 representing the set n-2. Pictorially, we picture the sets as a series of these singletons:

    0   1   2   3   4   .....   n-2   n-1

Then we represent the union of two of these singletons as a tree where one points to the other and the top singleton is now the "set" containing both of the elements. For example, if we did Union(3,0) we would get the tree:

uf1

If we now did Union(3,2) we would first do a Find(3) which would return 0 since 3 is in set 0 and a Find(2) which would return 2 since 2 is still a singleton set. We would now union 2 and 0 and since 0 is larger we would get:

uf2

Suppose we had continued this for a while and we now had the two sets:

uf3

Suppose we now did Union(2,4). A Find(2) would return 0 (the set it is in) and a Find(4) would return 1 (the set it is in) so we do a union of 0 and 1. Since 0 is the larger set, we make 1 point to 0 giving:

uf4

Suppose we continued on and we now had the following:

uf5

Suppose we do a Union(4,7). Then we Find(4) which gives us 0 and Find(7) which gives us 5 so we will union sets 0 and 5. However, during the Find, an interesting optimization takes place. What happens is that when you do, for example, the Find(4) this Find works by following the path up from 4 to 0. After it does it, Find repeats the path from 4 to 0 but it changes each thing it passes so that it points directly to 0. That is, it "compresses" the things along it's path so they point directly to 0. This means that Find(4) on the first set changes that set so it looks like:

uf6

Similarly, when we follow the path from 7 to 5, compression yields:

uf7

Now, when we union 0 and 5, 5 is the smaller set so we make it point to 0 giving:

uf8



this post was made through the help of this website sources


http://www.users.csbsju.edu/~lziegler/CS239/UnionFind.html


http://en.wikipedia.org/wiki/Disjoint-set_data_structure