# 347. Top K Frequent Elements

Given an integer array`nums`

and an integer`k`

, returnthe`k`

most frequent elements. You may return the answer inany order.Constraints:

`1 <= nums.length <= 105`

`k`

is in the range`[1, the number of unique elements in the array]`

. It isguaranteedthat the answer isunique.

There are various solutions to this problem. The simple approach is to use the built in sort function. But, the follow up to question is to provide a solution that performs in under O(n log n) time. I did it in O(n log k) time using a heap. However, there are two approaches that give a O(n) solution.

The bucket sort solution is the best answer for this problem as it is fast (O(n)) and the code is straight forward.

In all solutions, the first step is to create a dict that maps the numbers to their frequencies.

This approach uses the built in sort function. The frequency dict is created. The built in sorted function is used to sort on the dict keys, using their values as the sort key.

class Solution:

def topKFrequent(self, nums: List[int], k: int) -> List[int]:

freq = Counter(nums)

result = sorted(freq.keys(), key=lambda n: freq[n], reverse=True)

return result[:k]

Time complexity: O(n log n) - This is the time complexity of the built in sort function. There is also the O(n) time to build the hash and reverse the sorted array. But since these are additions, the complexity reduces to O(n log n)

Space complexity: O(n) - Building the hash is O(n) and the sorted function creates a new list which is also O(n). O(n) + O(n) reduces to O(n)

Use a heap as the max frequency will be at the root. Popping the root k times will give the k most frequent items.

Python has the built in heapq module which makes working with heaps easy. It has the heapq.nlargest function which does which this question asks for.

By doing the

`len(nums) == k `

check at the start ensures that the code below it will operate in k time, since k will be less than n. class Solution:

def topKFrequent(self, nums: List[int], k: int) -> List[int]:

# This is the case where nums is unique numbers, so everything has a frequency of 1.

# Example nums=[1,2,3,4], k=4

if len(nums) == k:

return nums

freq = Counter(nums)

# return k elements for the iterable freq.keys() (numbers), sort by freq values (which is the frequency of each number)

return heapq.nlargest(k, freq.keys(), key=freq.get)

Time Complexity: O(n log k) - heapify and retrieve the k elements. The n scenario is covered by the

`len(nums) == k `

check, which runs in O(1) time.Space Complexity: O(n + k) - dict of

`n`

elements and Heap of `k`

elementsIn this solution an array where the indexes represent the frequencies is created. The numbers are sorted into these buckets. This is known as a bucket sort.

Example: nums = [1, 4, 1, 3, 4, 2, 1, 2]

buckets = [[], [3], [4, 2], [1]]

- index 0 - will always be empty, as it represents frequency 0. We do not consider numbers that are non-existent in the array.
- index 1 - represents the numbers that have a frequency of 1. In this case the number 3
- index 2 - represents the numbers that have a frequency of 2. In this case the numbers 4 and 2
- index 3 - represent the nubers that have a frequency of 3. In this case the number 1.

The size of the array is the size of the nums array + 1. Why? Because index 0 will never be used. For the case where all the numbers are the same, example nums = [5, 5, 5] the frequency will be len(nums) + 1 (3 in this case).

class Solution:

def topKFrequent_bucket_sort(self, nums: List[int], k: int) -> List[int]:

if len(nums) == k:

return nums

freq = Counter(nums)

buckets = [[] for i in range(len(nums) + 1)]

for num, count in freq.items():

buckets[count].append(num)

result = []

for i in range(len(buckets) - 1, 0, -1):

for num in buckets[i]:

result.append(num)

if len(result) == k:

return result

Time Complexity: O(n) - The nested for loops are not O(n^2) because they are iterating over different things. The outer loop is iterating over the bucket array. The inner loop is iterating over the array in each bucket. See Big O Notation

Space Complexity: O(n) - Requires space for the freq dict, the buckets array and the result array

class Solution:

def topKFrequent(self, nums: List[int], k: int) -> List[int]:

if len(nums) == k:

return nums

counts = Counter(nums)

unique_nums = list(counts.keys())

def partition(left, right):

pivot_index = (left + right) // 2

unique_nums[right], unique_nums[pivot_index] = unique_nums[pivot_index], unique_nums[right]

mark = left

for i in range(left, right):

if counts[unique_nums[i]] < counts[unique_nums[right]]:

unique_nums[i], unique_nums[mark] = unique_nums[mark], unique_nums[i]

mark += 1

unique_nums[mark], unique_nums[right] = unique_nums[right], unique_nums[mark]

return mark

def select(left, right, kth_position):

val = partition(left, right)

if val < kth_position:

select(val + 1, right, kth_position)

elif val > kth_position:

select(left, val - 1, kth_position)

else:

return

k_position = len(unique_nums) - k

select(0, len(unique_nums) - 1, k_position)

return unique_nums[k_position:]