Sorting Lists
We have already seen how to work with lists in a functional language. The focus here will be on functional approaches to sorting lists; for simplicity we will work with lists of integers, although the techniques extend to any type with a total ordering relation.
Insertion Sort
One of the simplest methods for sorting a list is based on the insert
function, which adds one element in the correct position within an already sorted list.
Suppose that nums
is a (possibly empty) list of numbers, arranged in non-decreasing order, and we wish to insert the number n
.
If nums
is empty, then the result will just be the list [n]
; otherwise, split the list into a head
element (which must be the smallest) and a tail
list.
If n
is no larger than head
, then n
will be the smallest number overall, and we may insert it in front of head
.
On the other hand, if n
is larger than head
, then head
is still the smallest and must come first, while n
must be inserted somewhere in the tail
.
However, inserting n
into the tail
is the same problem we started with (insert a number into a sorted list), except the list is smaller—this is exactly the setup we need to make a recursive call to finish the job.
In code, here is the procedure we just described:
Once we have the insert
function, it is easy to build a sorting function based on it.
If we want to sort an empty list, then we are done.
Otherwise, we have a non-empty list that may be split (using pattern matching) into a head and a tail.
Recursively sorting the tail will give us a sorted list, and then we just need to insert the head into it:
Although this is simple, the call to insertion_sort
is not a tail-recursive call—that is, the recursive call is not the last thing done, because it still needs to call insert
after the recursion finishes.
That can cause problems when the list is large, because each recursive call needs to be saved on the function-call stack and we run the danger of overflowing the stack.
Another approach is to traverse the list from left to right, building up a sorted list by inserting each successive element (thus the sorted list is an accumulator—an extra argument that collects the result as the calculation proceeds; this is a common trick when making a function tail-recursive).
Because the last action of the aux
function in the recursive case is to call itself, this solution is properly tail-recursive1
and the ReasonML compiler will be able to produce code that doesn't overflow the stack no matter how many numbers we are sorting.2
Note that the helper function here (aux
) is essentially the same as the reduce
function we saw in the MapReduce section.
The ReasonML standard library provides the reduce
function under the name List.fold_left
, so we can also define insertion sort as:
Using currying, we can simplify this to let insertion_sort_left2 = List.fold_left(insert, [])
.
Selection Sort
Instead of having to go through the list to find the correct insertion point each time we want to add a number to the sorted list, we might try to do more work "up front" and choose the numbers to be added in the correct order.
This is the idea behind selection sort: given a non-empty list of numbers, if we can separate the smallest number from the rest, then that number will definitely be the head of the resulting sorted list and we will never have to adjust its position during the rest of the sorting operation.
So, our first goal is to write a select
function that takes a non-empty list and splits it into the smallest element plus a list of all of the other elements.
We may approach this using recursion: to find the smallest element in the list, first find the smallest element in the tail of the list (if any), then compare it against the head and keep the smaller of the two. The result should be a pair of the smallest element plus a list of all of the other elements. This leads to the following code:
Note that it only makes sense to call select
on a non-empty list, so we must make sure that we satisfy that precondition when making the recursive call.
ReasonML warns us that we are not handling all of the cases with our patterns, but we may ignore the warning here.
The select
function is not tail-recursive, so again we might look for an alternate way to compute it using an accumulator and a helper function.
Think of the usual process of searching a list for the smallest element: choose the first element as our initial guess, and then examine each of the rest of the numbers, refining our guess whenever we find a smaller element.
If our accumulator is the pair consisting of the smallest element seen so far (our current guess), plus a list of all of the other numbers examined, then we can view the selection process as a left-to-right reduction:
Once we have a selection function, the selection sort is easy to write recursively:
One of the exercises asks you to find a way to write this using tail-recursion.
Running Time of Insertion and Selection Sort
The insert
and select
functions each make a pass through all of the elements in their list argument, at least in the worst case (insertion will stop early if it finds the correct position, but selection needs to examine the entire list).
When doing an insertion sort, the insert
function is called on successively longer lists, so its running time will be proportional to the sum , where is the size of the list to be sorted. We have seen that this sum is , which is quadratic in (i.e., we say that it is ). Similarly, when doing a selection sort, the select
function is initially called on a list of size , and then on successively smaller lists: , which is the same sum in reverse.
Therefore, the worst-case running time of both sorting algorithms is , and they are known as quadratic sorts. The average case for insertion sort might seem to be better, because on average the insertion process will only have to traverse half of the list to find the correct insertion point (as long as the input is random). However, that gives us a running time proportional to , which is still quadratic. For lists with millions of items, a quadratic sort will perform trillions of operations, which might be too long to wait (even at a billion operations per second, that could easily take on the order of an hour). So, we might wonder if there is a faster approach to such a fundamental operation as sorting.
Merge Sort
Instead of inserting one element at a time into a sorted list, the strategy for merge sort is to insert an entire sorted collection of items in a single operation. Since the items are in order, we know that after finding the correct insertion position for one item we can start looking from there for the next item's position, without having to start back at the beginning. If we are merging two sorted lists, each of size , then the total time taken will be instead of the required to insert items one at a time. Of course, we still need to sort each of the lists being merged, but since these are each only half the size of the total collection of elements we can treat that as a smaller problem to be solved recursively.
First, here is a recursive merge
function:
In words, if either list is empty, then the result is the other list. Otherwise, look at the heads of each list; the resulting merged list will start with the smaller of the two heads, followed by a merge of the tail following the smaller head with the entire other list.
(As a challenge, see if you can write a tail-recursive version of merge
; as a hint, you might want to make use of the reverse
function discussed in the section on accumulators).
The first part of the merge sort algorithm is to split the input list into two halves (provided there is more than one element), then recursively sort the halves to prepare for merging. One way to split a list into halves is to calculate the size of the list, then count off the first half of them into one list and the rest into another (an exercise below asks you to implement this). However, because we do not care about the order of the incoming data, a more efficient approach is to walk through the list, alternately putting elements into one or the other of the halves (which are built up in an accumulator containing a pair of lists):
Putting the pieces together then, we get the following merge_sort
function:
Note that merge_sort
makes two recursive calls to itself, so there is no easy way to make it entirely tail-recursive.
This is not a problem, though, because we know that the number of times that a list of elements can be split in two before hitting the base case is just ;
even if is one million, this is only around 20, so the stack will never get very deep with recursive calls to merge_sort
.
This observation allows us to compute the expected running time of merge sort: splitting and merging items can be done in time, and there will be levels of recursion (at the second level, there will be two lists of size to split and merge, at the third there will be four lists of size , etc., but that is still overall at each level), so the total expected running time is , which is considerably better than . By comparison, when is around a million, is only twenty million, while is a trillion.
Quicksort
We saw with insertion and selection sort that we could either do the hard word after the recursion, when we insert into an already-sorted sublist, or before the recursion, when we select and remove the smallest element. The merge sort algorithm feels more like insertion sort, because the hard work, the merging, is done after the recursion that sorts the halves. This should suggest that we look for an algorithm that will fill in the missing spot in this grid:
Subproblem Size | Easy Split, Hard Join | Hard Split, Easy Join |
---|---|---|
Insertion Sort | Selection Sort | |
Merge Sort | ? |
As you should know from the title of this section, the missing algorithm is Quicksort. The idea is that we will look at the values while splitting the list in two, in such a way that it will be trivial to join the parts back together after they are recursively sorted. This implies that all of the elements in the first part need to be less than the elements in the second part. The easiest way to ensure this is to choose a pivot element: elements less than the pivot go in the first part, and elements greater than the pivot go in the second part. Elements that are equal to the pivot may go in either part. A simple (and not very good…) choice for the pivot is the first element of the list; after using it to partition the rest of the list into two parts, we recursively sort those parts and then append them back together with the pivot element in between.
We may use the standard library function List.filter
to do the partitioning:
Now the quicksort function is very easy (making use of the ReasonML list append operator, @
):
The analysis of Quicksort is somewhat more difficult that the other sorts, however, because we do not know the sizes of the two parts that partition
will give us.
If the pivot is chosen at random, then we expect it to fall somewhere in the middle of the list, so the two parts will be roughly equal in size, and we get an running time just as for merge sort.
However, if we consistently choose a pivot that is either very small or very large (for example, if the list was already sorted, and we choose the head element, then the pivot will always be the smallest value!), then one of the parts will only have a few items while the other contains almost all of them.
In this worst case, the behavior of Quicksort approaches selection sort (which can be seen as the limiting case where we deliberately pivot on the smallest element), and it will run in time.
It is possible to show that, with a better way to choose the pivot, the quadratic worst case becomes very unlikely, and Quicksort will almost always run in time, but we will not do this here.
Exercises
Explain what will happen to the running time of each of the above sorting algorithms if the input is already sorted.
Answer
insertion_sort
will run in time, because each insertion will put its new element at the head of the result.insertion_sort_left
will run in time, because each insertion will have to put its new element all the way at the end of the result. Selection sort does not depend on the order of the input, because it has to examine the entire list to find the smallest on each pass. Merge sort will always run in time, because the best case for a merge operation is when everything in one list is smaller than everything in the other, and that only cuts the time by a factor of 2. Quicksort with the pivot taken as the head of the list exhibits its worst-case behavior, , as shown above.Give an example of an input list that will cause the simple recursive version of
insertion_sort
to exhibit its worst-case running time. That is, every insertion should need to traverse the entire list to find the correct insertion point.Answer
Any list sorted in reverse order will do.
Rewrite the
select_left
function as an application ofList.fold_left
to an appropriate reduction function. Whenselect_left(nums)
is called on a non-empty listnums
, the initial value passed into the reduction should be the pair(head, [])
, representing the initial guess that the head ofnums
is the smallest number, with an empty list of other numbers examined so far.Answer
let select_left2 = nums => {let aux = ((small, rest), n) => {if (n < small) {(n, [small, ...rest])} else {(small, [n, ...rest])}};switch (nums) {| [head, ...tail] => List.fold_left(aux, (head, []), tail)}};Find a way to write the selection sort algorithm using only tail-recursive functions. Hint: Instead of selecting the smallest element, modify
select_left
to separate out the largest element, then write a sorting function that accumulates the sorted list from back to front.Answer
/* Precondition: nums is non-empty */let select_max_left = nums => {let rec aux = (nums, accum) => {switch (nums) {| [] => accum| [head, ...tail] => {let (large, rest) = accum;if (head > large) {aux(tail, (head, [large, ...rest]))} else {aux(tail, (large, [head, ...rest]))}}}};switch (nums) {| [head, ...tail] => aux(tail, (head, []))}};let ssort = nums => {let rec aux = (nums, sorted) => {switch (nums) {| [] => sorted| _ => {let (large, rest) = select_max_left(nums);aux(rest, [large, ...sorted])}}};aux(nums, [])};Find a way to implement the
insert
function using only tail-recursive functions. Hint: First write a tail-recursive functionreverse_append(a, b)
that appends the reverse of lista
to the front of listb
, then use a helper function to insert a number into a sorted list by collecting the front half of the list (the elements less than the number) in an accumulator; when the correct position is found for the number, usereverse_append
to move the accumulated elements back onto the front of the result.Answer
let rec reverse_append = (a, b) => {switch (a) {| [] => b| [head, ...tail] => reverse_append(tail, [head, ...b])}};/* Precondition: nums is sorted in non-decreasing order */let insert_TR = (nums, n) => {let rec aux = (sorted, acc) => {switch (sorted) {| [] => reverse_append([n, ...acc], [])| [head, ...tail] => if (n <= head) {reverse_append(acc, [n, ...sorted])} else {aux(tail, [head, ...acc])}}};aux(nums, [])};Implement the merge sort
split
function by first computing the size of the list, then passing half that size to a function that takes a number, n, and a list and returns a pair with the first n elements of the list as the first component, and the rest of the list as the other component.Answer
let split = nums => {let rec size = nums => {switch (nums) {| [] => 0| [_, ...tail] => 1 + size(tail)}};let rec aux = (n, nums) => {switch (n, nums) {| (0, _) => ([], nums)| (_, []) => ([], [])| (_, [head, ...tail]) => {let (left, right) = aux(n - 1, tail);([head, ...left], right)}}};let n = size(nums) / 2;aux(n, nums)};The
partition
implementation for Quicksort shown above makes two passes over the list, once for each call toList.filter
. Write a version ofpartition
that does the job in just one pass, accumulating a pair of the two parts as it traverses the list.Answer
let partition = (pivot, nums) => {List.fold_left(((first, second), n) =>if (n < pivot) {([n, ...first], second)} else {(first, [n, ...second])},([], []),nums)};
- Well, it would be if we also modified the
insert
function itself to be tail-recursive; this is left as an exercise.↩ - In fact, when the compiler sees a tail-recursive call, it can essentially turn the recursive call into a loop back up to the top of the function, without needing to push a new function call on the stack. It is able to do this because it knows there is nothing left to do in the original call, so there is no need to return to where we left off.↩