CSC 222: Data Structures and Algorithms, Spring 2008

Programming Project 2

This project will be due by 5 pm, Thursday, April 17. To submit your project, save it in a directory named pp2 under your directory in /home/libs/dataStr/students/ on one of the department's Linux machines (this is a shared directory, so you should be able to reach it from any of the machines, including oort).

You may work on this project by yourself or in a group of two or three. If you work in a group, be sure to include everyone's name in the comments at the top of each file. Also, be sure you understand the guidelines for group work listed in the syllabus--the fundamental idea is that everyone in the group should understand what everyone else has done, and be able to recreate it on their own.

You should start by creating your pp2 directory and copying the starter files into it (the following commands assume that you start in your directory under /home/libs/dataStr/students/):

mkdir pp2
cd pp2
cp /home/libs/dataStr/pp2/* .

The project is to create an interpreter for a tiny programming language reminiscent of (old-style) BASIC. Given a program such as this:

10 X = 10
15 foo = 17
20 a = X * X - foo
25 X = ( X + 1 ) / 3

your interpreter will execute the statements in sequence, then print out the final values of all the variables:

X = 3
a = 83
foo = 17

This is a very simple subset of BASIC; in a few weeks, we will discuss how to extend this subset to handle statements other than assignment (such as INPUT and PRINT, to allow some kind of I/O, and IF and GOTO, to allow non-linear program flow...).

In more detail, your program will present the user with an "interpreter prompt," where they can either enter/modify program statements or give commands such as RUN to execute the program, or LIST to display the current program. If a command starts with a positive integer, it will cause that numbered line to be added to the program (possibly replacing a previous line with the same number). Otherwise, it will be treated as an immediate command: you will need to handle RUN and LIST, as mentioned, plus QUIT to exit the interpreter (this is already done in the provided code); for convenience, you may also want to handle commands such as LOAD "file" and SAVE "file", so that you don't have to keep typing in lots of statements when testing. Here is a sample interaction (the user's input is after the > prompt):

> 10 X = 10
> 20 a = X * X - 17
> RUN
X = 10
a = 83
Done
> 20 a = X * X - foo
> 15 foo = 17
> LIST
10 X = 10
15 foo = 17
20 a = ( ( X * X ) - foo )
> RUN
X = 10
a = 83
foo = 17
Done
> 25 X = ( X + 1 ) / 3
> RUN
X = 3
a = 83
foo = 17
Done
> QUIT

Here are some comments about the above interaction:

The line numbers enable a crude form of program editing, since you can go back and insert statements (such as 15) between existing lines, as long as there is a gap between their numbers, and you can modify lines (such as 20) by replacing the old version.
At the end of running the program, all variables assigned during execution are printed out along with their final values. Variables are printed in alphabetical order, according to the ASCII rule where all upper-case letters precede lower-case (so capital 'X' comes before little 'a').
Each "token" (variable, constant, operator, or punctuation) must be surrounded by spaces. Variables are words starting with an alphabetic letter (A-Z and a-z), constants start with a digit (0-9), and everything else is an operator or punctuation (+, -, *, /, =, and parentheses).
Arithmetic is done entirely on integers; for example, 11 / 3 produces the result 3.
Listing the program includes superfluous parentheses, partly because it is easier (hint: write an inorder traversal which prints "(", the left subexpression, the operator, the right subexpression, and ")"; in the base case, just print the variable or constant), and partly so that you can make sure your parser is getting the precedences correct.

There are two main parts to the project: parsing and execution. The following sections deal with each in more detail.

Parsing

The constructor for the Statement class is passed a buffer of strings containing the current line (excluding the line number, which has already been extracted). A buffer is simple an STL queue, so you may use the front() and pop() methods as usual to look at and erase the next available word.

For an assignment statement, you may assume that the first word in the buffer will be the name of the variable to be assigned, and the second word will be an equal sign. For this project, you may ignore error cases (for extra credit, augment your interpreter with appropriate handling of the various syntactic errors that might occur). The remaining words will be the right-hand-side expression. The Statement constructor should record the variable name and the expression tree (see below) in private fields.

Parsing the expression will result in an expression tree. This will be a binary tree where each of the interior nodes is labeled by an operator, whose children are the operands of that operator. Leaf nodes will be variable names or constants (stored as strings, just as they are read in from the buffer). For example, the expression ( X + 1 ) / 3 will turn into the following tree:

    "/"
    / \
  "+" "3"
  / \
"X" "1"

Here is pseudocode for the operator-precedence parsing algorithm. It makes use of two stacks: a "nouns" stack which contains pointers to tree nodes, and a "verbs" stack which contains strings.


for each token from the buffer:
  if the token is a variable or constant,
    push a new leaf node containing the token on the nouns stack;
  else if the token is a left parenthesis,
    push a left parenthesis on the verbs stack;
  else if the token is a right parenthesis,
    while the top of the verbs stack is not a left parenthesis:
      REDUCE;
    pop the left parenthesis from the verbs stack;
  else:
    while the verbs stack is not empty and the top of the verbs stack has precedence >= the token:
      REDUCE;
    push the token on the verbs stack;
while the verbs stack is not empty:
  REDUCE;
return the top of the nouns stack.

The REDUCE operation pops an operator from the verbs stack and two operands from the nouns stack (right operand first, then left), then pushes a new tree node back on the nouns stack containing that operator and operands. The precedence levels should be set so that left parenthesis has the lowest precedence, then the additive operators (+ and -) have the next level, and the multiplicative operators (* and /) have the highest. Of course, you may add more operators and give them corresponding precedence levels; this is just the minimum you should support.

Execution

Each Statement object has an execute method which takes a context and a program iterator. In the provided code, all that it does is increment the iterator and return. You will need to add code to evaluate the expression tree and store the resulting value in the context.

The context is an STL map from strings (variable names) to integers (their current values). Therefore, you may treat it as an array indexed by the variable names, so context["X"] gives you the current value of X (or 0, if X has not yet been assigned), and context["foo"] = 17; enters 17 as the new value for foo.

Evaluating an expression tree is simply a postorder traversal of the tree: evaluate the left child to get an integer, evaluate the right child to get another integer, and then perform the operation specified in the parent node (just use a series of if statements to match the various operators) on those two integers to produce the final result. Evaluating a leaf node either looks up the variable in the context or converts the constant (if it starts with a digit) to an integer -- use the atoi function from the cstdlib header, as done in basic.cc to process the line number.

At the end of execution, add code to basic.cc which will print out the contents of the context, as shown above. Since an STL map is sorted by key, you will simply need to iterate through the map and print out each entry in an appropriate format.

Overview

Schedule

Resources

Assignments

Home

DePauw University, Computer Science Department, Spring 2008
Maintained by Brian Howard (bhoward@depauw.edu). Last updated