OverviewScheduleResourcesAssignmentsHome

CSC 122: Computer Science II, Fall 2005

An Introduction to Haskell

Haskell is a modern ``purely functional'' programming language. A functional language is one in which functions are as easy to manipulate as the more common kinds of values, such as numbers and strings. By being purely functional, Haskell does not allow the operations which make functions difficult to work with in traditional imperative languages: assignment to global variables or reference parameters. As a result, calling a function will not have any ``side-effects''--that is, it will not change the values of any non-local variables--and we are free to treat them as pure functions (some texts refer to these as ``valued functions'').

As a higher-level language than C++ or Java, Haskell enables programmers to be more productive and write shorter, clearer, and more maintanable code. Unfortunately, industry practice has not yet embraced functional programming, in part because of the enormous effort it would take to retrain current programmers and redesign all the support tools. There are signs that the industry will gradually move to higher-level languages: for example, the next version of Java will contain several new language features influenced by the design of Haskell. In the meantime, we will study Haskell because it provides a convenient and safe (you will appreciate this safety when we talk about pointers in C++) setting in which to explore recursive functions and data structures.

Here is an example of Haskell1 code to perform the recursive Mergesort described in class:

merge([ ], ys) = ys
merge(xs, [ ]) = xs
merge([x:xs], [y:ys]) = if x <= y
                        then [ x : merge(xs, [y:ys]) ]
                        else [ y : merge([x:xs], ys) ]

mergesort([ ]) = [ ]
mergesort([x]) = [x]
mergesort(xs)  = let (as, bs) = splitAt(length(xs) # 2, xs)
                 in merge(mergesort(as), mergesort(bs))

After defining these two functions, merge and mergesort, one can sort a list by entering a command such as mergesort([3,1,4,1,5,9,2,6,5]); the result is quickly printed: [1,1,2,3,4,5,5,6,9].

Even without knowing any details yet of the syntax of Haskell, it should be easy to recognize the structure of Mergesort in this code. The mergesort function is specified by three rules; the first two handle the base cases of an empty list or a one-element list, while the third rule handles the recursive case by splitting the data into two halves, calling mergesort on each half, and then merging the results with merge. The merge function is also pretty easy to read: if either of the lists to be merged is empty, then it just returns the other list (first two rules), otherwise it chooses the smaller of the two first elements and puts it first in the result, followed by another call to merge on the remainder of the data. Compare this to the one-and-a-half pages of code used for a C++ version of Mergesort in Chapter 9 of Carrano & Prichard (which only works on arrays of data up to a fixed maximum size--we will see a way around this later, but it adds several more lines to the code), and you will appreciate the power of higher-level programming provided by Haskell.

There are several points to observe about the Mergesort program as we start to learn the details of Haskell. First, the program is organized as a sequence of equations defining the functions. Unlike the assignment statement in C++, these are truly meant to be read as equations. This means that the familiar sort of algebraic reasoning by ``substituting equals for equals'' will work for Haskell programs. For example, consider the following definition of the factorial function:

factorial(0) = 1
factorial(n) = n * factorial(n-1)

(You are encouraged to try this now in Funnie, as described in class and in the companion handout.) We may evaluate the result of factorial(4) by using the equations to expand the expression as follows:

factorial(4) = 4 * factorial(4-1)
             = 4 * factorial(3)
             = 4 * 3 * factorial(2)
             = 4 * 3 * 2 * factorial(1)
             = 4 * 3 * 2 * 1 * factorial(0)
             = 4 * 3 * 2 * 1 * 1
             = 24

In Funnie, you may open a ``Stepper Window'' to see this sequence of expansions.

The second point to observe is that the computer will choose the first equation that matches the arguments in a particular function call. That is, when it evaluates factorial(0), it uses the first equation rather than trying to evaluate 0 * factorial(-1). This means that we will usually want to handle the special ``base'' case(s) of functions first in our programs, followed by the general recursive case.

Another observation about the Mergesort program is that it works on lists of data, rather than the arrays common in C++. Tied as it is to a specific layout in memory (and the corresponding convention of avoiding copying that block of memory by passing arrays by reference, which we have already seen is a no-no in a purely functional language), the array is too low-level a data structure for ordinary use in Haskell. As we will see, switching to the more abstract concept of a list of data allows an additional level of flexibility in our programming. Here are the operations on lists that we need for Mergesort:



Footnotes

... Haskell1
In fact, this is not quite standard Haskell; we will be using a modified subset developed here at DePauw called HasCl.
OverviewScheduleResourcesAssignmentsHome

Valid HTML 4.01!Valid CSS!DePauw University, Computer Science Department, Fall 2005
Maintained by Brian Howard (bhoward@depauw.edu). Last updated