Quickselect
Quickselect is often used to solve problems of the form "find kth something": kth smallest, kth largest, kth most frequent, kth less frequent, etc.
The idea is to pick a pivot index and do a partial sort of the array. Items less than the pivot are on the left of the pivot. Items greater or equal than the pivot are on the right. After the sort the pivot index is in its correct position.
Note that the items to the left and right are NOT sorted. However, this does not matter in the "find kth" scenario. What is important is whether the pivot is at the kth element. If it is than we've found the solution. The pivot and anything to the right of it will be the kth greatest. The pivot and anything to the left will be the kth least.
If the pivot is not at the correct position then another sort is needed. But, the sort is only done on the part of the array where the kth position will be found. If the pivot is at a position greater than k, then only the left side of the array is sorted. So, the left index to the pivot - 1 position will be sorted. A new pivot based on those indexes will be used. If the pivot is less than k then the kth element will be found on the right side. So, pivot + 1 to the right index will be sorted.
Choosing a pivot index
There are various schemes to choose a pivot point. Some are:
Choose the left or right most element - this can lead to a poor pivot though, if the array is already sorted
a random index
the midpoint
median of 3 (median of first, middle and last element of the partition) - this is a good strategy.
and median of medians - not used in production. Theoretically works but is too slow and doesn't handle duplicates.
Partition Strategy
There are different algorithms to do the partition. Below is the Lomuto Partition Scheme as it is simpler to implement than the original Hoare Partition Scheme. However, it is less efficient than Hoare's partition when all the elements are the same. The Lomuto partition does not work well on arrays that contain many repeated elements.
The steps for the Lomuto Partition are:
get the pivot index
swap the the pivot with the right most element
set a pointer to the left index
loop through the left to right index range
if the current element is less than the pivot than swap the current element with the pointer
advance the pointer
after the loop completes swap the pointer with the right most element (which holds the pivot)
the pivot will now be in its correct position
Example:
Time Complexity
The time complexity is O(n) with a worst case of O(n^2). The worst case occurs when a poor pivot is chosen which repeatedly reduces the partition size only by one element. This occurs when the array is already sorted. This can be mitigated by choosing a good pivot index. One strategy is to use the median-of-3 (https://en.wikipedia.org/wiki/Quicksort#Choice_of_pivot).
Last updated