Practice Exam 2 - CSC 222: Data Structures and Algorithms, Fall 2008

Practice Exam 2

This exam is open-book and open-note. Please allow some time to check your work. If you need extra space, write on the back.

Show the result after each pass of Insertion Sort on the input E A S Y O N E, where the letters will be sorted in alphabetic order. Fill in only the letters which have moved to a new position from the previous row.

	0	1	2	3	4	5	6
input:	E	A	S	Y	O	N	E
pass 1:	A	E
pass 2:			S
pass 3:				Y
pass 4:			O	S	Y
pass 5:			N	O	S	Y
pass 6:			E	N	O	S	Y

For each of the following questions, be sure to justify your answer. Just saying the name of the algorithm without giving a reason will not get full credit.

Which of the sorting algorithms discussed in class would be the best choice for re-sorting a telephone directory containing 100,000 entries, after adding 100 new entries at the end (that is, the 100 new entries are not in their correct sorted positions yet, although the rest of the directory is already in order)?
Insertion sort -- on nearly-sorted data, it is O(N).
Which of the sorting algorithms discussed in class would be the best choice for sorting 100 complex objects, where exchanging a pair of objects requires moving thousands of bytes of data?
Selection sort -- it only does (at most) N-1 exchanges.
Answer the previous question for 100,000 objects instead of 100.
Heapsort or Quicksort -- either does O(N log N) compares and exchanges, which is much better than the O(N²) compares for either insertion or selection, and not much worse than the O(N) exchanges for selection sort. Merge or Tree sort would also be O(N log N), but do more copying of data than the other two.

(4 points) Suppose we have a hash table using linear probe open addressing where the key values are integers and the hash function is simply the identity function. Therefore, if the table has size m, the key k will hash to the index k mod m (that is, k % m). Let m = 10 and give an example of a sequence of five distinct keys that will require the maximum amount of probing to insert in an initially empty table.

10, 20, 30, 40, 50 (or any sequence where all the keys have the same value mod 10)

What is the big-O running time of the worst case for inserting N values in an open-addressed hash table of size 2N?

In the worst case, each insertion will require a number of probes equal to the number of entries already in the table, plus one. Therefore, we need the sum 1 + 2 + 3 + ... + N, which is O(N²).

(4 points) Show the 2-3-4 tree that results from inserting the values 42, 19, 37, 65, 55, 49, and 51, in order.

     37,55
   /   |    \
  /    |     \
19  42,49,51  65

(2 points) Draw a red-black tree that corresponds to the result of the previous question (mark the red nodes with a star or some other special notation).

(3 points) Show the result of inserting the number 45 in the above red-black tree.

      49
     /  \
  37*    55*
 /  \   /  \
19  42 51  65
      \
       45*

The easiest way to see this is to work with the corresponding 2-3-4 tree, where the 49 had to split up into the root node:

    37,49,55
   /  |   \  \
19  42,45  51  65

Show the binary (minimum) heap that results from inserting the values 42, 19, 37, 65, 55, 49, and 51, in order, into an initially empty heap.

      19
     /  \
   42    37
  /  \  /  \
65  55  49  51

(This was unintentionally easy...)

Given the resulting heap from the previous question, which value will be popped off first? Show the resulting heap after removing this element.

The 19 will be popped off, leaving the following heap:

Show the result of applying the "build heap" operation to an array containing the values 42, 19, 37, 65, 55, 49, and 51.

      19
     /  \
   42    37
  /  \  /  \
65  55  49  51

(again, this was supposed to be more interesting, and different from the result of individual insertions; sorry.)
This corresponds to the array 19, 42, 37, 65, 55, 49, 51.

Consider the following prefix encoding trie:

       *
   0 /   \ 1
    /     \
   *       *
0 / \ 1 0 / \ 1
 S   I   M   P

Use this trie to encode the word "MISS". 10010000
How many bits will be needed to encode "MISSISSIPPI"? 22 bits, since each letter needs two bits.
Is there a better prefix encoding trie for "MISSISSIPPI"? If not, explain why not; if so, give an example.
There is a better encoding: using Huffman's algorithm, we can get the following trie:
```
   *
0 / \ 1
 S   *
  0 / \ 1
   I   *
    0 / \ 1
     M   P
```
(other tries are possible, depending on the order in which equal-frequency items come off the queue.)
The encoding for "MISSISSIPPI" with this trie is 110100010001011111110, which is only 21 bits.

Trace the operation of the quickselect algorithm in finding the median of the values 42, 19, 37, 65, 55, 49, and 51. Use the naive choice of the first element as pivot.

42	19	37	65	55	49	51	Choose 42 as pivot
37	19	42	65	55	49	51	Partitioned around 42; median is in right partition
			65	55	49	51	Ignore left partition and pivot; choose 65 as pivot
			51	55	49	65	Partitioned around 65; median is in left partition
			51	55	49		Ignore pivot and right partition; choose 51 as pivot
			49	51	55		Partitioned around 51; median is in left partition
			49				Ignore pivot and right partition; 49 is the median