417. Pacific Atlantic Water Flow

Description

See https://leetcode.com/problems/pacific-atlantic-water-flow/

We are given a matrix. The matrix is surrounded by two oceans. The pacific on the top and left. The atlantic on the bottom and right. A matrix cell's value is a height.

We need to find cells where water would flow into both oceans. Water flows into an ocean, when adjacent cells have heights that are equal or less than the current cell. Water flows downhill and on flats, not uphill. For example:

+ is pacific ocean
~ is atlantic ocean
++++++++++
+[2, 1, 3]~
+[4, 5, 1]~
+[6, 2, 7]~
+[7, 4, 1]~
~~~~~~~~~~

The cells that flow into both oceans are: [[0, 2], [1, 1], [3, 0]]

  • The top row [2, 1, 3] flows into the pacific above it.

  • The bottom row [7, 4, 1] flows into the atlantic below it.

  • The left most column [2, 4, 6, 7] flow into the pacific beside it.

  • The right most column [3, 1, 7, 1] flow into the atlantic beside it.

  • Cell [0, 0] (2) flows into the pacific as it's on the edge that touches the pacific. It cannot flow to the atlantic to the right because cell [0, 2] (3) blocks it. The water would flow from [0, 0] to [0, 1], but then be blocked by [0, 2] as its height (3) is greater than cell's [0, 1] (1). Furthermore, cell [0, 1] is blocked by cell [1, 1] below it (5 > 1). Cell [0, 0] also cannot flow to the atlantic below because cell [1,0] (4) blocks it (4 > 2). Water cannot flow up hill.

  • Cell [0, 1] (1) flows into the pacific as its on top edge. It cannot flow to the atlantic because the cells around it [0, 2] (3) and [1, 1] (5) have greater heights. Water cannot flow up hill.

  • Cell [0, 2] (3) flows into both the atlantic and the pacific, as it is situated at the corner of both oceans.

  • Cell [1,0] (4) flows into the pacific as it's on the edge. It cannot flow into the atlantic ocean because cells [1, 1] (5) and [2, 0] (6) have greater values than it does. Water cannot flow up hill.

  • Cell [1, 1] (5) can flow into the pacific ocean via [0, 1] or [1, 0] as its value is greater than either cell. It can also flow into the atlantic ocean via [1, 2] (1) as its value is greater 5 > 1. It cannot flow into the atlantic via the cells below it. Although it can flow to the cell directly below it [2, 1] (2), the water is trapped there as all surrounding cells have greater values.

  • Cell [1, 2] (1) can flow into the atlantic as it is on the edge. It cannot flow to the pacific as all cells in those directions are greater than its value.

  • Cell [2, 0] (6) flows into the pacific ocean as it is on the edge. It cannot flow into the atlantic as there is not a path.

  • Cell [2, 1] (2) cannot flow anywhere as it is a hole, all surrounding cells have greater values.

  • Cell [2, 2] (7) flows into the atlantic ocean. It cannot flow into the pacific ocean as there is not path. It is able to flow water to the adjacent cells, [1, 2] and [2, 1] but then the water is trapped in those cells as their surround cells have greater heights.

  • Cell [3, 0] (7) flows into both oceans at is at the corner of both.

  • Cell [3, 1] (4) flows into the atlantic as it is on the bottom edge. There is no path the pacific ocean.

  • Cell [3, 2] (1) flows into the atlantic ocean. There is no path to the pacific ocean.

Failed approach

It took me awhile to figure out what the question was asking. I used the leetcode testcase runner to enter in test cases to verify my understanding. This worked well and I was able to figure out what the question was asking. I also had a set of test cases from doing this.

I tried the brute force solution of visiting every cell and running a dfs algorithm on it. I tried to figure out how to not reprocess nodes, but failed. I ran into trouble as I was trying to accomplish too much at once. I was trying to keep track of what cells flowed into both oceans and what had been processed all at once in one hash. I was trying to keep track of too many things at once in one data structure.

An example of a brute force solution is at: Algorithms-Made-Easy /March-Leetcoding-Challenge

Solution

We do not need to iterate through every cell and call dfs on it. Instead, we only need to iterate over the edges of the matrix. Why? Because, the top and left side of the matrix flows into the pacific. And the bottom and right side flow into the atlantic. Since every cell that flows into an ocean must go through one of these edge cells, we only need to call dfs on the edge cells. We can discover the cells that flow into both, by starting at the oceans. There is no point in calling dfs on an interior cell as it may lead nowhere.

The solution also doesn't try to figure out which cells flow into both oceans at once. Instead it finds the cells that flow into the pacific and stores that in one set. It also finds all the cells that flow into the atlantic and stores them in a different set. At the end we then find the intersect of the two sets for the answer (cells that flow into both oceans).

class Solution:

    def pacificAtlantic(self, heights: List[List[int]]) -> List[List[int]]:
        num_rows = len(heights)
        num_cols = len(heights[0])

        # implemented dfs function here so I could make use of the heights, num_rows and num_cols variables. Passing 
        # them in as parameters resulted in too many parameters.
        def dfs(row: int, col: int, prev_height: int, visited: Set):
            # Since we are doing the dfs from the outer edges and working in we need to reverse the heights logic.
            # we do heights[row][col] < prev_height as water in a low area on the inside of the matrix won't make it 
            # over a high area on the outer side of the matrix. So, we just return in that case.
            if (row, col) in visited or row < 0 or row >= num_rows or col < 0 or col >= num_cols or \
                    heights[row][col] < prev_height:
                return

            visited.add((row, col))
            cell_height = heights[row][col]
            dfs(row + 1, col, cell_height, visited)
            dfs(row - 1, col, cell_height, visited)
            dfs(row, col + 1, cell_height, visited)
            dfs(row, col - 1, cell_height, visited)

        pacific = set()
        atlantic = set()
        for c in range(num_cols):
            # the top row of the matrix flows into the pacific
            dfs(0, c, heights[0][c], pacific)
            # the bottom row of the matrix flows into the atlantic
            dfs(num_rows - 1, c, heights[num_rows - 1][c], atlantic)

        for r in range(num_rows):
            # the left side of the matrix (column 0 in all rows) flows into the pacific
            dfs(r, 0, heights[r][0], pacific)
            # the right side of the matrix (last column in all rows) flows into the atlantic
            dfs(r, num_cols - 1, heights[r][num_cols - 1], atlantic)

        # the answer requires a matrix. Otherwise, we could have done a set intersection instead, which would have been
        # more efficient. i.e. result = atlantic & pacific
        result = []
        for r in range(num_rows):
            for c in range(num_cols):
                if (r, c) in pacific and (r, c) in atlantic:
                    result.append([r, c])

        return result

Take away

  • Break things up into separate data structures and logic. Don't try to store multiple different states in one data structure.

  • Consider how we can eliminate data. In this case, only use cells that are on the edges of the matrix, as all solutions must flow through those cells. This means we can eliminate running dfs on the internal cells.

Resources

Video explanation: Pacific Atlantic Water Flow - Leetcode 417 - Python

Last updated