Week 5. Trees
Projects and Exercises
6.1 Projects and Exercises
6.1.1 Warm-up
Problem 6.1 Goodrich et al R-8.1
Problem 6.2 Design an algorithm to calculate the height of a given node (position) in a tree . What is the worst case time complexity?
6.1.2 Compulsory Assignment
Problem 6.3 (Freely rephrased from Goodrich et al C-8.50) Given a tree , and two nodes and . We define an ancestor in the natural way, i.e. is an ancestor of if either is equal to , or is the parent of some ancestor of . If is an ancestor of both and , then is a common ancestor. The lowest common ancestor (LCA) of and is a common ancestor of and , such that there is no other common ancestor of and where is also an ancestor of .
Devise an algorithm to find the LCA for two given nodes and . What is the running time?
Problem 6.4 (Freely rephrased from Goodrich et al C-8.51) Consider the problem of finding the LCA of and as in the previous problem, when the positions of and are numbered in the usualy way and .
Devise an algorithm to find where is the LCA for two given nodes and , without actually determining . What is the running time of your algorithm?
Problem 6.5 Goodrich et al C-8.53. The difficult step is to read the infix expression into the tree representation discussed in the chapter.
Solution 6.1 Many parser problems are solved by representing syntax elements in a tree structure, and this is possible also in this case.
Each arithmetic operation is a node with two child subtrees representing the two arguments. This tree captures the semantic meaning of the expression exactly, saying exactly how the expression is calculated.
Thus our algorithm consits of two steps. First, the tree is generated from the infix expression. Then the postfix expression is output from the tree.
The second step is simply a post-order traversal of the tree, outputting the two subtrees (arguments) before the parent node (operation).
The first step is harder. As a preliminary solution, we will assume that the expression is fully parenthesised. Thus every parenthesis has the form where and may be values or parenthesised expressions. It is not possible to see . When this is the case, we know that when we first see a closing parenthesis, we have three tokens which can be organised as a small tree. Every time we find a closing parenthesis, we build a tree which is subsequently treated as a single token.
We need to use a tree structure where existing trees can be added easily as child subtrees to any leaf node. This probably means a node linked implementation.
The following is a possible algorithm, where is the list of symbols.
2
3 while is not empty
4
5 if is closing parenthesis,
6
7
8
9 assert is opening parenthesis
10
11 else
12
13
14 if is empty,
15 return
16 else
17
18
19 return
The auxiliary routine newTree creates a new tree with as root, and and as left and right subtree respectively.
Now, we consider the case where the expression may not be fully parenthesised. This means that we need to introduce the concept of precedence of operators. Parentheses has the highest precedence, and was handled in the first parse, that we just described. The next in order is multiplication and division, and lastly we have addition and subtraction.
Without parentheses, we may find, in Line 9, another operator instead of the parenthesis, and we need to keep popping until we find the opening parenthesis, and handle multiplication/division before addition/subtraction.
Let us replace that inner block with another auxiliary, as follows.
2
3 while is not empty
4
5 if is closing parenthesis,
6
7 else
8
9
10 if is empty,
11 return
12 else
13
14
15 return
The innerParse routine parses from the stack, as follows.
2
3
4 while
5
6 if is opening parenthesis
7
8 else
9
10 if is empty,
11
12 while
13
14 if is division or multiplication,
15
16
17
18
19
20
21
22 while
23
24 if is addition or subtraction,
25
26
27
28
29
30
31
32 assert is empty
33 return
In the inner parser, we construct the list so that the tokens appear in the original order. The top element from the stack came last from the string, and is entered first into the list. It ends up last, since subsequent elements are added before it in the list.
We traverse this list twice. Once for multiplication/division and once for addition subtraction. Operations at the same precedence, are processed left to right. Hence, we traverse the list from head to tail, and whenever we find an operation of the right precedence, we apply it to its two neighbours, and combine these three tokens into a tree, which replaces the three tokens.
If the input is valid, we get operators and operands alternately when we pop the stack. After two traversals, all operators have been removed, with the adjacent operands collapsing into a single operand. After the second traversal there should be a single operand left in the list, and this operand is returned.
We complete the translation by post-order traversal of the tree.
where
The time complexity is linear. The input is traversed several times, but there is only a constant number of traversals, and the work per node is constant in each traversal. The traversals are (1) from input to stack, (2) from stack to list, (3–4) two traversals of the list, and finally the (5) traversal of the tree.
Remark 6.1 The preliminary solution given above, although incomplete, is a good start and more than halfway to a perfect answer. You should not try to skip this preliminary step, but rather embrace the partial solution.
The typed solution is not satisfactory. It should have been written by hand, with figures to demonstrate the key points. Text only is hard to read, and I have only typed it because of popular demand. Figures are important.
The argument shows the sequence of discovery. If you have the slightest doubt about the solution, this is the way to go. If you cut down on the presentation, you need a much more formal proof with strict logic.
If you read compiler theory and formal languages, you may find better ways to solve this problem. The solution given is within reach with the bare minimum of theory taught in this module. It is a difficult problem though.
6.1.3 Exercises
Problem 6.6 Goodrich et al C-8.29