For this blog post I want to talk about the sorting algorithm QuickSort.
QuickSort is a Divide and Conquer algorithm like MergeSort, which means that they recursively divide the arrays into smaller and smaller sorted subarrays until the array is finally sorted. They also both have a run-time of N log N, which is the fastest run time for any compare based sorting algorithm.
One key difference between MergeSort and QuickSort is that MS uses an auxiliary array for each recursion (using extra space proportional to N), while QS does not. This makes QS a better algorithm in cases where memory space is limited.
- N log N run time
- Simple “inner loop,” which results in a faster run time than other N log N sorting algorithms
- Array manipulation happens in place without extra auxiliary memory
- Quick Sort arbitrarily picks the first element in the array as “pivot”
- It “partitions” the array into two divisions: The left side of the array is less than the “pivot”, and the right side of the array is greater than “pivot”
- It does this recursively until the entire array is sorted.
This picture shows a big picture overview of how the algorithm works for an array of chars from the string “QUICKSORTEXAMPLE.” First, there is a shuffling of the values — this makes QS more efficient, which we will get into later.
In the first “partition,” K is the pivot and the array is divided into a left section (elements less than the pivot) and right section (elements greater than the pivot). This happens recursively until we have successfully partitioned each subsection, resulting in a sorted array.
How does partitioning work?
This picture shows how the first partition happens.
The first element “K” is arbitrarily selected as the “pivot.” There are two indices, one scanning from the left and one scanning from the right. These two indices are scanning for two elements that need to be swapped so that the left side remains less than the pivot and the right side remains greater than. In this example, R and C are the first two elements that need to be exchanged.
The two indices scan and exchange elements until they pass each other. After the elements have been swapped into their correct partitions, the pivot gets swapped into the index between the two sections.
The end result is [less than pivot subsection] [pivot] [greater than pivot subsection].
What does the code look like?
quick_sort shuffles and calls sort() on entire array
sort() is the important recursive function. Given a valid section of an array, it will partition the subarray, and set j = the index of the pivot point. It then calls itself on the two newly partitioned sections.
This the partition method. First, it creates the two indices, one from the left and one from the right (note: left index starts at pivot and right starts at the last element plus 1). It sets the pivot, v.
The outer while loop executes until the two scanning indices have met or passed.
while true do #outer while loop
break if i >= j
The two inner while loops are scanning for the next elements that should be swapped in each partition — remember that the left side needs to remain less than and right side needs to stay greater than the pivot. Then, once both elements are found they are swapped, and this process is repeated until the indices pass or meet.
The two inner loops are also controlling for when the indices reach the end. I strongly suggest running through the partition with some smaller arrays like [4, 2], [2, 4], [2, 6, 1] to see how the partitions and indices resolve when the while loops terminate. ***Notice how j always ends up being the index of the last element in the left partition.***
Finally, once all the elements are on their proper side, the pivot is put into place by swapping it with index j. Partition returns the new index of partition.
The sort() method then calls itself on each division recursively until the entire array has been sorted.
Avoiding Worst Case Scenario of QuickSort
In the best-case scenario of QuickSort, we want the partitions to be divided in the middle of the array. This would result in N log N run time. Although things do not always go this well, it is true that the partition falls in the middle on average.
On the other hand, the worst-case scenario occurs when partitions are unbalanced. We can imagine a situation where we have the pivot be the smallest number for each partition, creating an unnecessarily large amount of partitions.
[1, 2, 3, 4, 5, 6, 7, 5, 10]
Pivot 1, Partitions , [2,3,4,5,6,7,5,10]
Pivot 2, Partitions , [3,4,5,6,7,5,10]
… and so on
When we have consistently a partition where one subarray is empty, the runtime becomes quadratic (N²) and the space required to handle the recursion also becomes linear (N).
We can avoid this situation by shuffling the array before quicksorting.
A simple inner loop, a relatively small number of compares, and in-place sorting (no auxiliary array) make QuickSort the most efficient compared based sorting algorithm today.
I hope you found this post about QuickSorting helpful!