🤖
Data Structures and Algorithms
  • CS Theory And Problems
  • Big O
    • Big O Notation
  • Algorithms
    • Binary Search
    • Breadth First Search (BFS)
    • Depth First Search (DFS)
    • Dynamic Programming
    • Kadane's Algorithm
    • Kahn's Algorithm
    • Quickselect
    • Recursion
    • Sorting Algorithms
    • Sliding Window
  • Data Structures
    • Binary Heap
    • Graph
    • Linked List
    • Trees
  • Problems
    • LeetCode
      • 1. Two Sum
      • 3. Longest Substring Without Repeating Characters
      • 5. Longest Palindromic Substring
      • 11. Container With Most Water
      • 15. 3 Sum
      • 19. Remove Nth Node From End of List
      • 20. Valid Parentheses
      • 33. Search in Rotated Sorted Array
      • 49. Group Anagrams
      • 53. Maximum Subarray
      • 55. Jump Game
      • 56. Merge Intervals
      • 76. Minimum Window Substring
      • 98. Validate Binary Search Tree
      • 105. Construct Binary Tree from Preorder and Inorder Traversal
      • 121. Best Time to Buy and Sell Stock
      • 133. Clone Graph
      • 141. Linked List Cycle
      • 152. Maximum Product Subarray
      • 153. Find Minimum in Rotated Sorted Array
      • 200. Number of Islands
      • 206. Reverse Linked List
      • 207. Course Schedule
      • 217. Contains Duplicate
      • 226. Invert Binary Tree
      • 238. Product of Array Except Self
      • 242. Valid Anagram
      • 297. Serialize and Deserialize Binary Tree
      • 347. Top K Frequent Elements
      • 417. Pacific Atlantic Water Flow
      • 424. Longest Repeating Character Replacement
      • 435. Non-overlapping Intervals
      • 647. Palindromic Substrings
Powered by GitBook
On this page
  • Topological Sort
  • Kahn's Algorithm
  1. Algorithms

Kahn's Algorithm

PreviousKadane's AlgorithmNextQuickselect

Last updated 2 years ago

Topological Sort

A topological sort is a graph traversal where each node is visited only after all its dependencies have first been visited.

Topological sorting is used where things need to be scheduled based on their dependencies. For example, course schedules. A student needs to take CS101 before they can take CS201, and they need to take CS201 before they can take CS301. Other areas of applications are build programs in CS (a program's dependencies need to be built in a certain order), event scheduling, assembly instructions etc.

There can be more than one valid topological sort. In the following graph the following topological sorts are possible:

  • A B C D

  • A C B D

  • A C D B

A topological sort is only possible on a directed acyclic graph (DAG). This is because if the graph contained a cycle, then a sort order cannot be found. For example, in the following graph, C depends on A and B. But B depends on D which depends on C. C depends on a dependency (D) which depends on C! So, it is invalid.

Kahn's Algorithm

Kahn's Alogorithm finds a topological sort of a DAG. It works by repeatedly removing vertices that have no incoming edges. Those removed vertices are recorded to the topological sort set/array.

In the above DAG graph, vertex A is the only vertex with no incoming edges. It is the starting point for the algorithm. It is removed and put in the topological sort set. Now, B and C vertices have no incoming edges. One is picked arbitrarily to process. Say B. B is removed to the topolgoical sort set. Now C is the only vertice without any incoming edges. C is removed. And finally D. The topological sort in this example is A, B, C, D.

The code below is a simple implementation of Kahn's algorithm. It assumes the vertices value are integers in the range of 0...number of vertices. This allows for an array to track the number of incoming edges for each vertex.

'''
Kahn's algorithm - used for finding a topological sort (ordering) of a directed acyclic graph.
'''
from typing import List
from collections import defaultdict, deque


class Graph:

    __MIN_VALUE__ = 0
    
    def __init__(self, num_vertices):
        self.graph = defaultdict(list)
        self.num_vertices = num_vertices

    def __is_valid__(self, num: int) -> bool:
        return num >= Graph.__MIN_VALUE__ and num < self.num_vertices

    def add_edge(self, u: int, v: int):
        '''
        Add an edge from vertex u to vertex v
        '''
        if not self.__is_valid__(u) or not self.__is_valid__(v):
            raise Exception(f"Invalid vertex value. Must be between {Graph.__MIN_VALUE__}  and {self.num_vertices}")
        self.graph[u].append(v)

    def add_non_connected_vertex(self, u):
        self.graph[u].append(None)



class Kahns:
   def kahn(self, graph: Graph) -> List[int]:
        # this array keeps track of how many edges are incoming to a vertex (how many dependecies a course has)
        # because the vertices ints an array can be used. The array index is the vertex int
        incoming_edges = [0] * graph.num_vertices

        # populate the incoming_edges with the number of incoming edges for each vertex
        for u in graph.graph:
            for v in graph.graph[u]:
                if v:
                    incoming_edges[v] += 1


        # initialize the queue with any vertices that do NOT have any incoming edges (0 dependencies)
        processing_queue = deque()
        for vertex, edges in enumerate(incoming_edges):
            if edges == 0:
                processing_queue.append(vertex)

       
        topo_sort = []
        num_vertices_processed = 0
        while processing_queue:
            vertex = processing_queue.pop()
            num_vertices_processed += 1
            topo_sort.append(vertex)
            for v in graph.graph[vertex]:
                if v:
                    incoming_edges[v] -= 1
                    if incoming_edges[v] == 0:
                        processing_queue.append(v)

        # The graph contains a cycle if these two values do not match
        if num_vertices_processed != graph.num_vertices:
            raise Exception("Invalid graph")
        return topo_sort
Directed Acyclic Graph
A graph which contains a cycle. A topological sort is not possible