An Introduction to Haskell

Haskell is a modern ``purely functional'' programming language. A functional language is one in which functions are as easy to manipulate as the more common kinds of values, such as numbers and strings. By being purely functional, Haskell does not allow the operations which make functions difficult to work with in traditional imperative languages: assignment to global variables or reference parameters. As a result, calling a function will not have any ``side-effects''--that is, it will not change the values of any non-local variables--and we are free to treat them as pure functions (Carrano & Prichard refer to these as ``valued functions'' in Chapter 1 of our text).

As a higher-level language than C++ or Java, Haskell enables programmers to be more productive and write shorter, clearer, and more maintanable code. Unfortunately, industry practice has not yet embraced functional programming, in part because of the enormous effort it would take to retrain current programmers and redesign all the support tools. There are signs that the industry will gradually move to higher-level languages: for example, the next version of Java will contain several new language features influenced by the design of Haskell. In the meantime, we will study Haskell because it provides a convenient and safe (you will appreciate this safety when we talk about pointers in C++) setting in which to explore recursive functions and data structures.

Here is an example of Haskell code to perform the recursive Mergesort described in class:

merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) = if x <= y
                      then x : merge xs (y:ys)
                      else y : merge (x:xs) ys

mergesort [] = []
mergesort [x] = [x]
mergesort xs = let (as, bs) = splitAt (length xs `quot` 2) xs
               in merge (mergesort as) (mergesort bs)

After defining these two functions, merge and mergesort, one can sort a list by entering a command such as mergesort [3,1,4,1,5,9,2,6,5]; the result is quickly printed: [1,1,2,3,4,5,5,6,9].

Even without knowing any details yet of the syntax of Haskell, it should be easy to recognize the structure of Mergesort in this code. The mergesort function is specified by three rules; the first two handle the base cases of an empty list or a one-element list, while the third rule handles the recursive case by splitting the data into two halves, calling mergesort on each half, and then merging the results with merge. The merge function is also pretty easy to read: if either of the lists to be merged is empty, then it just returns the other list (first two rules), otherwise it chooses the smaller of the two first elements and puts it first in the result, followed by another call to merge on the remainder of the data. Compare this to the one-and-a-half pages of code used for a C++ version of Mergesort in Chapter 9 of Carrano & Prichard (which only works on arrays of data up to a fixed maximum size--we will see a way around this in Chapter 4, but it adds several more lines to the code), and you will appreciate the power of higher-level programming provided by Haskell.

There are several points to observe about the Mergesort program as we start to learn the details of Haskell. First, the program is organized as a sequence of equations defining the functions. Unlike the assignment statement in C++, these are truly meant to be read as equations. This means that the familiar sort of algebraic reasoning by ``substituting equals for equals'' will work for Haskell programs. For example, consider the following definition of the factorial function:

factorial 0 = 1
factorial n = n * factorial (n-1)

(You are encouraged to try this now in Hugs, as described in class and in the companion handout.) We may evaluate the result of factorial 4 by using the equations to expand the expression as follows:

factorial 4 = 4 * factorial (4-1)
            = 4 * factorial 3
            = 4 * 3 * factorial 2
            = 4 * 3 * 2 * factorial 1
            = 4 * 3 * 2 * 1 * factorial 0
            = 4 * 3 * 2 * 1 * 1
            = 24

The second point to observe is that the computer will choose the first equation that matches the arguments in a particular function call. That is, when it evaluates factorial 0, it uses the first equation rather than trying to evaluate 0 * factorial (-1). This means that we will usually want to handle the special ``base'' case(s) of functions first in our programs, followed by the general recursive case.

One minor observation about the code seen so far is that function arguments do not need to be in parentheses. This reflects the basic role that functions play in Haskell--applying a function to an argument occurs so often that the designers of the language decided not to clutter up the syntax with lots of extra parentheses. The only case in which a function parameter (either formal or actual) needs to be surrounded by parentheses is when it is an expression such as (n-1) or (x:xs). However, if it makes you more comfortable to write factorial(4) instead of factorial 4, go ahead.

Another observation about the Mergesort program is that it works on lists of data, rather than the arrays common in C++. Tied as it is to a specific layout in memory (and the corresponding convention of avoiding copying that block of memory by passing arrays by reference, which we have already seen is a no-no in a purely functional language), the array is too low-level a data structure for ordinary use in Haskell. As we will see starting in Chapter 3 of Carrano & Prichard, switching to the more abstract concept of a list of data allows an additional level of flexibility in our programming. Here are the operations on lists that we need for Mergesort:

Picking off the first element (the ``head'') of a list. This is commonly performed by matching a function argument against a pattern such as (x:xs)--read this as matching a list composed of the first element x followed by the remainder of the list (the other ``x's''). If you need to grab the first element of a list in an expression without introducing a new function to use pattern-matching, the head function will do the job: head [1,2,3] is 1.
Adding a new element at the head of a list. As suggested by the pattern form above, this uses the colon (:) operator. Therefore, 1 : [2,3] is the list [1,2,3]. Note the asymmetrical nature of this operation: the left operand is a single value, while the right operand is a list of values. If you want to concatenate two lists together, use the ++ operator: [1,2] ++ [3,4] produces [1,2,3,4].
Constructing an empty list or a single-element list. The empty list is written [], while a list consisting of just the value x is written [x]. As we have seen, this notation extends to arbitrarily long lists by putting a comma-separated list of values between the brackets. This is all an abbreviation; what the computer is really thinking when you write [1,2,3] is 1 : 2 : 3 : []--that is (since the colon operator groups from right-to-left), start with the empty list and successively prepend the values 3, then 2, and finally 1 at the head.
Getting the length of a list and splitting it into two sublists. These operations are not directly built into the language, but rather are provided as part of the ``Standard Prelude'' of commonly used functions (think of the common functions defined in C++ headers such as iostream or string). We could write these functions ourselves--for example, here is a definition of length:
```
length [] = 0
length (x:xs) = 1 + length xs
```
--but they are used so frequently that people have assembled a standard collection of functions that are always available (without even needing the equivalent of a #include...).