Algorithms and Data Structures 2020

Week 5. Trees

Projects and Exercises

6.1 Projects and Exercises

  6.1.1 Warm-up
  6.1.2 Compulsory Assignment
  6.1.3 Exercises

6.1.1 Warm-up

Problem 6.1 Goodrich et al R-8.1

Problem 6.2 Design an algorithm to calculate the height of a given node (position) p in a tree T. What is the worst case time complexity?

6.1.2 Compulsory Assignment

Problem 6.3 (Freely rephrased from Goodrich et al C-8.50) Given a tree T, and two nodes a and b. We define an ancestor in the natural way, i.e. c is an ancestor of a if either c is equal to a, or c is the parent of some ancestor of a. If c is an ancestor of both a and b, then c is a common ancestor. The lowest common ancestor (LCA) of a and b is a common ancestor c of a and b, such that there is no other common ancestor dc of a and b where c is also an ancestor of d.

Devise an algorithm to find the LCA for two given nodes a and b. What is the running time?

Problem 6.4 (Freely rephrased from Goodrich et al C-8.51) Consider the problem of finding the LCA of a and b as in the previous problem, when the positions of a and b are numbered in the usualy way f(a) and f(b).

Devise an algorithm to find f(c) where c is the LCA for two given nodes a and b, without actually determining c. What is the running time of your algorithm?

Problem 6.5 Goodrich et al C-8.53. The difficult step is to read the infix expression into the tree representation discussed in the chapter.

Solution 6.1 Many parser problems are solved by representing syntax elements in a tree structure, and this is possible also in this case.

Each arithmetic operation is a node with two child subtrees representing the two arguments. This tree captures the semantic meaning of the expression exactly, saying exactly how the expression is calculated.

Thus our algorithm consits of two steps. First, the tree is generated from the infix expression. Then the postfix expression is output from the tree.

The second step is simply a post-order traversal of the tree, outputting the two subtrees (arguments) before the parent node (operation).

The first step is harder. As a preliminary solution, we will assume that the expression is fully parenthesised. Thus every parenthesis has the form (xopy) where x and y may be values or parenthesised expressions. It is not possible to see (xop1yop2z). When this is the case, we know that when we first see a closing parenthesis, we have three tokens which can be organised as a small tree. Every time we find a closing parenthesis, we build a tree which is subsequently treated as a single token.

We need to use a tree structure where existing trees can be added easily as child subtrees to any leaf node. This probably means a node linked implementation.

The following is a possible algorithm, where L is the list of symbols.

1            Algorithm ArithmeticTree(L
2            S:=emptystack 
3            while L is not empty 
4               y:=L.getNext() 
5               if y is closing parenthesis, 
6                  o:=S.pop() 
7                  x:=S.pop() 
8                  p:=S.pop() 
9                  assert p is opening parenthesis 
10                  S.push(newTree(o,x,y)) 
11               else 
12                  S.push(y) 
13            y:=S.pop() 
14            if S is empty, 
15               return y 
16            else 
17               o:=S.pop() 
18               x:=S.pop() 
19               return newTree(o,x,y)

The auxiliary routine newTree creates a new tree with o as root, and x and y as left and right subtree respectively.

Now, we consider the case where the expression may not be fully parenthesised. This means that we need to introduce the concept of precedence of operators. Parentheses has the highest precedence, and was handled in the first parse, that we just described. The next in order is multiplication and division, and lastly we have addition and subtraction.

Without parentheses, we may find, in Line 9, another operator instead of the parenthesis, and we need to keep popping until we find the opening parenthesis, and handle multiplication/division before addition/subtraction.

Let us replace that inner block with another auxiliary, as follows.

1            Algorithm ArithmeticTree(L
2            S:=emptystack 
3            while L is not empty 
4               y:=L.getNext() 
5               if y is closing parenthesis, 
6                  T:=innerParse(S) 
7               else 
8                  S.push(y) 
9            y:=S.pop() 
10            if S is empty, 
11               return y 
12            else 
13               o:=S.pop() 
14               x:=S.pop() 
15               return newTree(o,x,y)

The innerParse routine parses from the stack, as follows.

1            Algorithm InnerParse(S
2            Q:=emptyLinkedList 
3            c:=True 
4            while c 
5               x:=S.pop() 
6               if x is opening parenthesis 
7                  c:=False 
8               else 
9                  Q.addHead(x) 
10                  if S is empty, c:=False 
11            p:=Q.first() 
12            while pnull 
13              o:=Q.get(p) 
14              if o is division or multiplication, 
15                x:=Q.before(p) 
16                y:=Q.after(p) 
17                Q.set(p,newTree(o,Q.get(x),Q.get(y))) 
18                Q.remove(x) 
19                Q.remove(y) 
20              p:=Q.after(p) 
21            p:=Q.last() 
22            while pnull 
23              o:=Q.get(p) 
24              if o is addition or subtraction, 
25                x:=Q.before(p) 
26                y:=Q.after(p) 
27                Q.set(p,newTree(o,Q.get(x),Q.get(y))) 
28                Q.remove(x) 
29                Q.remove(y) 
30              p:=Q.after(p) 
31            p:=Q.first() 
32            assert Q is empty 
33            return Q.get(p)

In the inner parser, we construct the list Q so that the tokens appear in the original order. The top element from the stack came last from the string, and is entered first into the list. It ends up last, since subsequent elements are added before it in the list.

We traverse this list twice. Once for multiplication/division and once for addition subtraction. Operations at the same precedence, are processed left to right. Hence, we traverse the list from head to tail, and whenever we find an operation of the right precedence, we apply it to its two neighbours, and combine these three tokens into a tree, which replaces the three tokens.

If the input is valid, we get operators and operands alternately when we pop the stack. After two traversals, all operators have been removed, with the adjacent operands collapsing into a single operand. After the second traversal there should be a single operand left in the list, and this operand is returned.

We complete the translation by post-order traversal of the tree.

1            Algorithm infixToPostfix(L
2            T:=ArithmeticTree(L) 
3            return postfixOutput(T)

where

1            Algorithm postfixOutput(T
2            s:= 
3            s.append(postfixOutput(T.leftSubtree())) 
4            s.append(space) 
5            s.append(postfixOutput(T.rightSubtree())) 
6            s.append(space) 
7            s.append(T.root()) 
8            return s

The time complexity is linear. The input is traversed several times, but there is only a constant number of traversals, and the work per node is constant in each traversal. The traversals are (1) from input to stack, (2) from stack to list, (3–4) two traversals of the list, and finally the (5) traversal of the tree.

Remark 6.1 The preliminary solution given above, although incomplete, is a good start and more than halfway to a perfect answer. You should not try to skip this preliminary step, but rather embrace the partial solution.

The typed solution is not satisfactory. It should have been written by hand, with figures to demonstrate the key points. Text only is hard to read, and I have only typed it because of popular demand. Figures are important.

The argument shows the sequence of discovery. If you have the slightest doubt about the solution, this is the way to go. If you cut down on the presentation, you need a much more formal proof with strict logic.

If you read compiler theory and formal languages, you may find better ways to solve this problem. The solution given is within reach with the bare minimum of theory taught in this module. It is a difficult problem though.

6.1.3 Exercises

Problem 6.6 Goodrich et al C-8.29