Programming with HasCl

This handout covers the basics of programming in HasCl, which is essentially a subset of the functional language Haskell. Although the details are specific to the Funnie environment, most of it should apply to other Haskell systems such as Hugs or GHC (see http://www.haskell.org/ for more details on these and other implementations of the language). In comparison with other tutorials available on the web, the focus here will be on learning to work with recursion and some of the elementary data structures traditionally encountered in Computer Science II; it is not specifically about exploring the power of Haskell, which has many advanced features that we will not discuss.

Interacting with Funnie

When you start a session with Funnie you should see an expression window, plus a browseable list of pre-defined functions. This means that the system is ready to evaluate expressions, calculator-style, in the context of the standard collection of definitions known as the prelude. The prelude defines many powerful operations on numbers, strings, and lists. Try entering each of the following expressions at the prompt, clicking the evaluate button after each (or hitting Enter), and figure out what the responses mean; then test yourself by trying similar expressions and checking that you can predict what the responses will be.

1+2*3
1+2^3
(1+2)^3
product([1, 2, 3, 4, 5, 6])
product([1 .. 6])
sum([1 .. 6]) / length([1 .. 6])
sum([1 .. 6]) # length([1 .. 6])
22/7
22.0/7
22#7
22%7
[1 .. 6] ++ reverse([1 .. 6])
"Hello" ++ "World!"
toUpper('a')
map(toUpper, "Hello!")
map(sqrt, [1 .. 6])
map(odd, [1 .. 6])
head([1 .. 6])
tail([1 .. 6])
(head([1 .. 6]), head("Hello!"))
zip([1 .. 6], "Hello!")

To use functions beyond those defined in the prelude, you will need to enter them in a definition window. From the Window menu, select ``New Definition Window,'' then click in the window to start typing. When the function is complete, click on the compile button; if the definition does not have any syntax errors, it will be added to the list of locally-defined functions in the top half of the function browser. Otherwise, read the message in the error window and try again.

For practice, enter the following definition:

fact(0) = 1
fact(n) = n * fact(n-1)

Once you have successfully entered a definition, it will be available for use in the expression window just like the functions in the prelude. With the above function definition, you should now be able to enter an expression such as fact(5) to compute the factorial of 5 (5!). When you try this, you should also click on the ``Open Stepper'' button to bring up a stepper window, in which you can trace the individual evaluation steps.

Basic Types

If you worked through the expressions in the previous section, you have already encountered most of the standard built-in types of values available in HasCl. Now we will look at each in more detail.

Num

HasCl has a very rich collection of numeric values, but as opposed to many programming languages, it does not have separate types for integers and floating-point numbers; instead, all numbers belong to the type Num. Unlike the type int of C++, there is no upper limit to the size of a Num. For example, we may use the fact function defined above to find 52! (which counts the number of ways to arrange a deck of 52 playing cards):

fact(52) => 80658175170943878571660636856403766975289505440883277824000000000000

This 68-digit number is a value of type Num. HasCl will happily work with numbers having thousands of digits, up to the limits of your computer's memory. Arithmetic on large numbers is important in applications such as cryptography, where schemes such as RSA encryption are based on the ease of multiplying multi-hundred-digit numbers (and the corresponding difficulty of factoring the result back into the original numbers).

All of the usual arithmetic operations are available on Nums: +, -, *, and /. In general, division of two integers will produce a rational number (for example, try 22 / 8). To divide a by b and get an integer result, use the # operator: a # b. This gives the quotient; to get the remainder, use a % b (this is the same as in C++). Exponentiation, which is not a built-in operator in C++, is written with the caret operator, ^; that is, a^b is written a^b.

We sometimes need to use an operator like a function. By surrounding any operator in parentheses you can do just that. For example, (+) is a function which takes two arguments, so (+)(1, 2) is the same as 1 + 2.

Question: Convert the expressions 1+(2*3) and (1+2)*3 into equivalent calls to the functions (+) and (*). What can you discover about the precedence rules for the arithmetic operators in HasCl (that is, which of these two expressions is the same as 1+2*3)?

Bool

The type Bool represents truth conditions, either true or false. It is essentially equivalent to the bool type of C++. Comparison operators such as ==, != (not equal), <, >=, ..., return values of type Bool. The standard logical operations are available: 1 < x && x < 10 is true if x is both greater than 1 and less than 10; x == 2 || odd(x) is true if x is either equal to 2 or odd; and not(odd(x)) is true if x is not odd (this can also be written even(x)).

Given a boolean value, the natural way to use it is to make a decision between two choices. HasCl has a conditional expression similar to that found in most languages: if b has type Bool and x and y are expressions of the same type, then if b then p else q is an expression that evaluates to p when b is true and q when b is false. For example, if 1 < x && x < 10 then "OK" else "Out of Range" will evaluate to the string "OK" whenever x is strictly between 1 and 10, and "Out of Range" otherwise.

Question: Write an expression using the variable x which evaluates to "Might be prime" if x is either odd or equal to 2, and evaluates to "Not prime" otherwise. To turn this into a function named primeTest, enter the following in a definition window and press Compile:

primeTest(x) = <your expression here>

(where <your expression here> is replaced by the expression). Test it on cases like primeTest(2), primeTest(6), and primeTest(15).

Question: Write a function named myAnd which takes two arguments, x and y. Using if-then-else, the constants true and false, and the variables x and y, it should produce the same result as the expression x && y. That is, it should produce true when x and y are both true, and false otherwise, but you may not use the built-in && operator.

Char

The HasCl type Char can represent any single character. It is essentially identical to the char type in C++, even using the same single-quote syntax: 'a' is the character a. If you want a single-quote character, HasCl uses the same backslash escape mechanism as C++: '\''. Here are some more escape codes for characters that are otherwise hard to type:

`\n`	newline	`\r`	carriage return
`\t`	tab	`\v`	vertical tab
`\a`	bell	`\f`	form feed
`\b`	backspace	`\\`	backslash
`\'`	single-quote	`\"`	double-quote

Unlike C++, a Char does not automatically convert to an integer. If you want the ASCII code for a character, use the ord function. To convert from an ASCII code back to a character, use the chr function. For example, ord('A') returns 65, while chr(48) gives '0'.

Question: Given that the ASCII codes of the digits are consecutive numbers from 48 for '0' to 57 for '9', write a function named digitToNum that takes a digit d of type Char and produces the corresponding Num. For example, digitToNum('7') should produce the result 7 (note that the result is the number 7, not the character '7').

String

As with many other parts of the HasCl language, the syntax for strings is essentially the same as in C++. That is, a string constant is a sequence of characters surrounded by double-quotes: "Like this". The escape codes listed above may be used to embed ``difficult'' characters in strings: "My favorite string is \"Hello World\"." However, unlike C and C++, where strings are arrays of characters (reflecting the importance of the array in those languages), in HasCl a String is a list of elements of type Char. We will see more about lists below, but one of the implications for strings is that the common operation of appending two strings is achieved by using the list append operator, ++. For example, "Hello" ++ "World" evaluates to "HelloWorld".

Question: Write a function named echo that takes a string s and produces a new string with s followed by s again in parentheses. For example, echo("Hello") should evaluate to "Hello (Hello)". Be careful to get the correct spacing.

Pairs and Tuples

Several values may be packaged into a single data item by wrapping them in a tuple. A tuple is written as a comma-separated sequence of values in parentheses: (42, 'a', "Hello", true). There may be any number of values (even none at all) in a tuple; the special case of two values is called a pair. The order of the values matters, so the pair (1, 2) is different from the pair (2, 1). The type of a tuple is written as a tuple of types, so these pairs each have type (Num, Num), while the first tuple above has type (Num, Char, String, Bool).

Question: Invent an expression which has the type (Bool, Char, String). Now find an expression whose type is ((Bool, Char), String) (note the extra parentheses). Recall that the type of an expression is displayed after the colons (::) when it is evaluated in an expression window.

Lists

Tuples can hold a variety of types of values; they are roughly analogous to structs in C++ (although the fields are accessed only by position, not by name). One disadvantage of using a tuple is that a particular tuple type will always contain the same number of values. If we want to store a varying number of values in a single data object, we use a list. In this sense, lists are the HasCl equivalent of arrays in C++; however, lists are a more abstract concept than arrays (since an array is tied to a specific storage mechanism in memory), and HasCl provides many powerful operations on lists that have no direct equivalents in C++ arrays. The tradeoff for this added power over tuples is that all of the data items in a list have to be of the same type; it is not possible to mix numbers with characters or strings in a single list (C++ arrays have this same restriction).

The basic way to write a list of values is to enclose them in square brackets, separated by commas. For example, [1, 2, 3, 4, 5] is a list of five numbers, starting with 1 at the head of the list. Just as with tuples, the order matters, so [2, 5, 3, 1, 4] is a different list, even though it contains the same values. Since each of these values is of type Num, we write the type of the list as [Num]. There is an abbreviation for lists which consist of a regular sequence of values: [1 .. 5] gives the list [1, 2, 3, 4, 5].

As mentioned above, a String is just a list of Chars. The notation "Hello" gives exactly the same list as ['H', 'e', 'l', 'l', 'o']. Note that a list of Strings is not the same; ["Hello", "World"] is a list with two elements, each of which is a list of characters (coincidentally, each one is five characters long, but recall that a given type of list can have any number of elements). The type of ["Hello", "World"] is [String]; since String is a synonym for [Char], this can also be written [[Char]] (a list of lists of characters).

The fundamental operations on lists allow us to take them apart and put them together. Given a list, we may remove the first element with head, and obtain the list of all except the first element with tail: head([1, 2, 3, 4, 5]) is 1, while tail([1, 2, 3, 4, 5]) is [2, 3, 4, 5]. A new list may be formed from a head element and a tail list with the colon operator: for example, [1 : [2, 3, 4, 5]] produces [1, 2, 3, 4, 5]. An entire list may be put together in this way, with the initial tail list being the empty list, [ ]. That is, [1, 2, 3, 4, 5] is equivalent to [1 : [2 : [3 : [4 : [5 : [ ]]]]]]. If you ask the type of [ ], the system will say [ ] :: [a], which is read ``[ ] has the type list of a, where a can be any type.'' Trying to take the head or tail of an empty list produces an error (try head(tail([1]))).

We have seen a number of other operations on lists already. The operator ++ will append two lists of the same type, so [1, 2] ++ [3, 4, 5] produces [1, 2, 3, 4, 5]. The reverse function produces a list with all the same elements as its argument, but in the opposite order: reverse("Hello") gives "olleH". The length function counts how many elements are in a list; if you do length([ ]), the answer is 0, while length(["Hello", "World"]) is 2 (and length(head(["Hello", "World"])) is 5). Given lists of numbers, sum and product will add or multiply all of the numbers together.

A more interesting operation is map, which takes two arguments. The first is a one-argument function and the second is a list; map applies the function to each of the elements of the list and returns the list of results. For example, map(sqrt, [1 .. 10]) produces a list of the square roots of the numbers from 1 to 10; it is equivalent to evaluating [sqrt(1), sqrt(2), ..., sqrt(10)] (the ... here is not in the syntax of HasCl; I just didn't feel like typing all ten terms).

One more function on lists that we have seen is zip. When zip is applied to a pair of lists, it creates a list of pairs of corresponding elements from the two lists, until one or both of the lists is exhausted. That is, zip([1, 2, 3], ["Hello", "World"]) produces the list [(1, "Hello"), (2, "World")]; the 3 is ignored, because there was no matching element in the second list.

Lists may be compared for equality; two lists are equal if they have the same length and if corresponding elements are equal.

Question: Write a function that tests whether a string s is a palindrome (that is, it reads the same forwards as backwards). For example, isPalindrome("radar") should return true, while isPalindrome("Madam, I'm Adam") should return false (that is, the spaces and punctuation matter, as does capitalization).

Functions

As seen in the map example above, HasCl allows functions to be used as values and passed as arguments to other functions. HasCl is known as a ``functional'' language, partly because of this treatment of functions as first-class values (that is, they are treated just like the other basic types of the language). The type of a function which takes an argument of type a and produces a result of type b is (a) -> b, which is an ASCII approximation of the usual mathematical notation a $\rightarrow$ b (the parentheses around the argument type a are to remind us of the parentheses around the argument in a function call; this is similar to C++, where we say that if f is a function that takes an argument of type a and produces a result of type b, then the prototype for f is b f(a);). To express the type of map, note that map takes a function of any type (a) -> b, as well as a list of type [a]; the result, after applying the function to each element of the list, will be of type [b]. Therefore, the system will inform us that map :: ((a) -> b, [a]) -> [b] (try it).

Question: What is the type of zip?

User-Defined Types and Functions

Apart from some of the syntactic abbreviations mentioned above (for example, string constants), there is nothing special about the built-in types described above. HasCl has a very powerful system for defining new types and operations. The basis of the means of defining new types is the data definition, which names a new type and lists all of the fundamental ways of constructing new values of the type. The standard way of defining new operations is then to give a series of rules that cover each case of how an argument could have been constructed; since all of the constructors are listed in the data definition, it is usually sufficient to provide one rule for each constructor.

Enumerated Types

A simple example is a type which is just an enumeration of all the possible values. Consider a Direction type, which may be North, East, South, or West. Here is an appropriate definition:

data Direction = North | East | South | West

This says that the only way to construct a value of type Direction is to give one of the four main compass directions; these four constructors name the four values of the type. Now suppose that we want to define a function degrees which converts a Direction into the corresponding numerical compass heading, where 0^o is north, 90^o is east, etc. The following set of rules will cover all the cases:

degrees(North) = 0
degrees(East)  = 90
degrees(South) = 180
degrees(West)  = 270

That's it. Enter these lines in a New Definition window and compile them. Now when the system encounters an expression such as degrees(West), it can evaluate it by looking at the rules, producing the result 270. We have just defined a new function degrees :: (Direction) -> Num.

When you enter a new function definition, it is a good idea to declare the expected type of the function on the first line, followed by a descriptive comment (this is no different from the good practice of providing function prototypes and comments in C++). For example, the function above could also have been entered as

degrees :: (Direction) -> Num
// Convert a compass direction into numeric degrees
degrees(North) = 0
...

The type declaration provides extra documentation about the use of the function, as well as a check that the function really does have the desired type (Funnie will give you an error message when compile the function if you declare the wrong type). Incidentally, a current limitation of Funnie is that function definitions may not be deleted, and all definitions of a given function must have the same name. If you define a function incorrectly and it compiles with the wrong type, you will need to make up a new name, such as degrees2, for the correct function (fixing this problem is on the list of improvements for the summer).

Another example is the type Bool; it acts as if it were defined as data Bool = false | true. Here is the definition of the not function from the standard library:

not :: (Bool) -> Bool
// Logical negation
not(false) = true
not(true) = false

As you can see, it is very similar to the Direction type. There are exactly two values of type Bool, false and true. The not function has a correspondingly simple definition, since it only has to consider two cases for its argument. The Char type is also handled conceptually as an enumeration, although it has many more cases.

Question: Define a function rotateDirLeft :: (Direction) -> Direction which will take North to West, East to North, etc.

Constructors with Arguments

Constructors in a data definition may also take arguments. For example, suppose we want a type to represent colors. We might use the following (a slightly different type Color is already defined in the Graphics module in Funnie, so I will use the British spelling here to avoid a conflict):

data Colour = Red | Green | Blue
            | Yellow | Cyan | Magenta
            | White | Black
            | rgb(Num, Num, Num)

This is like an enumeration, except the last constructor can take three Nums to represent an arbitrary color by giving its red, green, and blue components (each going from 0 to 255). We could define a function getRGB of type (Colour) -> (Integer, Integer, Integer) which converts any value of type Colour into a triple of red, green, and blue components. Here is an appropriate list of rules:

getRGB(Red)        = (255,   0,   0)
getRGB(Green)      = (  0, 255,   0)
getRGB(Blue)       = (  0,   0, 255)
getRGB(Yellow)     = (255, 255,   0)
getRGB(Cyan)       = (  0, 255, 255)
getRGB(Magenta)    = (255,   0, 255)
getRGB(White)      = (255, 255, 255)
getRGB(Black)      = (  0,   0,   0)
getRGB(rgb(r,g,b)) = (  r,   g,   b)

The name for this kind of function definition by giving rules is a ``pattern-matching'' definition. Each rule gives a pattern that will be matched against an argument; if the match is successful, then the rule is used; otherwise, the next rule in the list is tried. For no-argument constructors, such as Red, the name of the constructor is the entire pattern. For constructors taking arguments, the pattern is formed by giving the constructor name followed by enough variables to match whatever argument values come along with that constructor. For example, if we evaluate getRGB(rgb(64, 128, 192)), the ninth rule will succeed by matching r to 64, g to 128, and b to 192. These variable matches, also known as bindings, are roughly equivalent to associating actual arguments with formal parameters in calling a function in C++; for the course of the execution of the function, the variables will contain the values passed in from the caller (in C++ this has to be qualified to say ``unless the variables are assigned different values in the course of execution''; HasCl has no notion of changing the value assigned to a variable--this is part of what makes the functions so pure, since we don't have to worry about changing the state of variables--so this qualification is not necessary). Therefore, in evaluating the right-hand-side of the rule, the expression (r, g, b) becomes (64, 128, 192); this is the result of the function.

Although they depend on some special syntax, the built-in tuple types are an instance of this kind of data type. For example, we may define a new type that is essentially equivalent to the type (Bool, Char) as follows:

data BoolChar = BoolChar(Bool, Char)

For convenience, we may also define functions to extract individual components from a BoolChar value:

first :: (BoolChar) -> Bool
first(BoolChar(b,c)) = b

second :: (BoolChar) -> Char
second(BoolChar(b,c)) = c

These mirror the built-in functions fst and snd which extract the two components of a pair. Thus, we could write an expression such as first(BoolChar(false, 'x')) to mirror the standard expression fst(false, 'x'); each evaluates to false.

The prelude does not provide functions analogous to fst and snd for tuple types with more values, because it is more common to extract the parts of a tuple by pattern matching. For example, suppose we want to write a function brightness :: (Num, Num, Num) -> Num which takes a triple of Nums, as produced by the getRGB function above, and returns the average of the three components. It may be defined as follows:

brightness((r, g, b)) = (r + g + b) # 3

Evaluating brightness(getRGB(Cyan)) first applies the getRGB function to Cyan, producing the triple (0, 255, 255). Then it applies the brightness function to this result: brightness((0, 255, 255)). This matches the rule for brightness, binding 0 to r, 255 to g, and 255 to b. Finally, it plugs these values into the right-hand-side, producing (0 + 255 + 255) # 3, which gives 170 as the final result.

Question: Give a direct definition of a function colourBrightness :: Colour -> Num, such that colourBrightness(c) produces the same result as brightness(getRGB(c)) for any Colour value c (but without using the brightness or getRGB functions).

Recursive Data Types

At this point it is a minor step to allow recursive data types--types where a constructor may take another value of the type being constructed as an argument. For example, here is the definition of a type Path, where a path is either empty or it consists of a segment going right, left, up, or down followed by another path:

data Path = End | R(Path) | L(Path) | U(Path) | D(Path)

An example of a value of type Path is R(U(R(D(End)))), which may be drawn as

$\begin{figure}\setlength{\unitlength}{2mm}\begin{center}\begin{picture}(20,10) \... ...\vector(0,-1){5}} \put(15,2){\circle*{.5}} \end{picture}\end{center}\end{figure}$

The only extra ability needed to support this is to allow a similar recursion when defining a function--we will allow the use of the function being defined on the right-hand-side of a rule, usually applied to a variable that was bound to another instance of the type when matching one of the recursive constructors. For example, here is a rotateLeft function on Paths, which changes all of the path segments to be rotated 90^o counter-clockwise:

rotateLeft(End)  = End
rotateLeft(R(p)) = U(rotateLeft(p))
rotateLeft(L(p)) = D(rotateLeft(p))
rotateLeft(U(p)) = L(rotateLeft(p))
rotateLeft(D(p)) = R(rotateLeft(p))

For each of the recursive constructors, which build a longer path by adding a segment onto the beginning of another path, we give a rule that says how to rotate the added segment, and then we make a recursive call to rotateLeft to rotate the rest of the path. The base case is where the recursion bottoms out, in this case at the empty path End.

The canonical example of a recursive data type is the built-in list type. Indeed, we can frequently ``code up'' other recursive types by representing them as lists--you should be able to imagine using [Direction] as a replacement for Path, where we might write [East, North, East, South] instead of R(U(R(D(End)))).

Question: How would you define rotateDirListLeft :: ([Direction]) -> [Direction] to perform the analogous operation to rotateLeft :: (Path) -> Path? Try to use functions we have already defined.

Although the syntax is not legal HasCl, you should imagine the list type [a] as being defined by

data [a] = [ ] | [a : [a]]

That is, a list of values of type a is either an empty list, [ ], or the result of using the colon operator to add a value of type a onto the front of another list of values of type a.

Here is an example of defining our own version of the length function:

len([ ])      = 0
len([x : xs]) = 1 + len(xs)

Note how the structure of the function definition exactly follows the structure of the data definition. To find the length of a list with one element (x) added to it, we add one to the length of the shorter list, obtained by recursively calling the len function on the tail (xs). That rule, plus the base case for the empty list, is all we need to define the function.

Question: Write a function countVertical :: ([Direction]) -> Num which counts how many ``vertical'' segments (North or South) are in a Direction list. For example, evaluating the expression

countVertical([North, East, North, South, West])

should produce 3.

For another example, here is the definition of a listMap function which is equivalent to the built-in map function:

listMap(f, [ ])      = [ ]
listMap(f, [x : xs]) = [f(x) : listMap(f, xs)]

Reading this off in words, it says that the result of mapping a function f over an empty list is just an empty list, while mapping f over a list that starts with x is the list with f(x) at the head and the result of mapping f over the rest of the list as the tail.

Question: Define your own version of the zip function.