Kahn's Algorithm
Last updated
Last updated
A topological sort is a graph traversal where each node is visited only after all its dependencies have first been visited.
Topological sorting is used where things need to be scheduled based on their dependencies. For example, course schedules. A student needs to take CS101 before they can take CS201, and they need to take CS201 before they can take CS301. Other areas of applications are build programs in CS (a program's dependencies need to be built in a certain order), event scheduling, assembly instructions etc.
There can be more than one valid topological sort. In the following graph the following topological sorts are possible:
A B C D
A C B D
A C D B
A topological sort is only possible on a directed acyclic graph (DAG). This is because if the graph contained a cycle, then a sort order cannot be found. For example, in the following graph, C depends on A and B. But B depends on D which depends on C. C depends on a dependency (D) which depends on C! So, it is invalid.
Kahn's Alogorithm finds a topological sort of a DAG. It works by repeatedly removing vertices that have no incoming edges. Those removed vertices are recorded to the topological sort set/array.
In the above DAG graph, vertex A is the only vertex with no incoming edges. It is the starting point for the algorithm. It is removed and put in the topological sort set. Now, B and C vertices have no incoming edges. One is picked arbitrarily to process. Say B. B is removed to the topolgoical sort set. Now C is the only vertice without any incoming edges. C is removed. And finally D. The topological sort in this example is A, B, C, D.
The code below is a simple implementation of Kahn's algorithm. It assumes the vertices value are integers in the range of 0...number of vertices. This allows for an array to track the number of incoming edges for each vertex.