Kruskal's Algorithm
This algorithm creates a forest of trees. Initially the forest consists of n single node trees (and no edges). At each step, we add one (the cheapest one) edge so that it joins two trees together. If it were to form a cycle, it would simply link two nodes that were already part of a single connected tree, so that this edge would not be needed.The basic algorithm looks like this:
Forest MinimumSpanningTree( Graph g, int n, double **costs ) { Forest T; Queue q; Edge e; T = ConsForest( g ); q = ConsEdgeQueue( g, costs ); for(i=0;i<(n1);i++) { do { e = ExtractCheapestEdge( q ); } while ( Cycle( e, T ) ); AddEdge( T, e ); } return T; }The steps are:
 Construct a forest  with each node in a separate tree.
 Place the edges in a priority queue.
 Until we've added n1 edges,
 Continue extracting the cheapest edge from the queue,
until we find one that does not form a cycle,  Add it to the forest. Adding it to the forest will join two trees together.
 Continue extracting the cheapest edge from the queue,
We can use a heap for the priority queue. The trick here is to detect cycles. For this, we need a unionfind structure.
Unionfind structure
To understand the unionfind structure, we need to look at a partition of a set.Partitions
A partitions is a set of sets of elements of a set. Every element of the set belong to one of the sets in the partition.
 No element of the set belong to more than one of the subsets.
 Every element of a set belongs to one and only one of the sets of a partition.
The forest of trees is a partition of the original set of nodes. Initially all the subsets have exactly one node in them. As the algorithm progresses, we form a union of two of the trees (subsets), until eventually the partition has only one subset containing all the nodes.
A partition of a set may be thought of as a set of equivalence classes. Each subset of the partition contains a set of equivalent elements (the nodes connected as one of the trees of the forest). This notion is the key to the cycle detection algorithm. For each subset, we denote one element as the representative of that subset or equivalence class. Each element in the subset is, somehow, equivalent and represented by the nominated representative.
As we add elements to a tree, we arrange that all the elements point to their representative. As we form a union of two sets, we simply set the representative of one of the sets to point to any element of the other set.
So the test for a cycle reduces to: for the two nodes at the ends of the candidate edge, find their representatives. If the two representatives are the same, the two nodes are already in a connected tree and adding this edge would form a cycle. The search for the representative simply follows a chain of links.
Each node will need a representative pointer. Initially, each node is its own representative, so the pointer is set to NULL. As the initial pairs of nodes are joined to form a tree, the representative pointer of one of the nodes is made to point to the other, which becomes the representative of the tree. As trees are joined, the representative pointer of the representative of one of them is set to point to any element of the other. (Obviously, representative searches will be somewhat faster if one of the representatives is made to point directly to the other.)
Select diagrams of Kruskal's algorithm in operation.
Greedy operation
At no stage did we try to look ahead more than one edge  we simply chose the best one at any stage. Naturally, in some situations, this myopic view would lead to disaster! The simplistic approach often makes it difficult to prove that a greedy algorithm leads to the optimal solution. proof by contradiction is a common proof technique used: we demonstrate that if we didn't make the greedy choice now, a nonoptimal solution would result. Proving the MST algorithmis, happily, one of the simpler proofs by contradiction!
Data structures for graphs
You should note that we have discussed graphs in an abstract way: specifying that they contain nodes and edges and using operations like AddEdge, Cycle, etc. This enables us to define an abstract data type without considering implementation details, such as how we will store the attributes of a graph! This means that a complete solution to, for example, the MST problem can be specified before we've even decided how to store the graph in the computer. However, representation issues can't be deferred forever, so we need to examine ways of representing graphs in a machine. As before, the performance of the algorithm will be determined by the data structure chosen.The following sequence of diagrams illustrates Kruskal's algorithm in operation.
gh is shortest.
Either g or h could be the representative,  
ci creates two trees.
c chosen as representative for second.  
fg is next shortest.
Add it, choose g as representative.  
ab creates a 3rd tree  
Add cf, merging two trees. c is chosen as the representative.  
gi is next cheapest, but a cycle would be created. c is the representative of both. 

Add cd instead  
hi would make a cycle  
Add ah instead  
bc would create a cycle.
Add de instead 