# Tree

## Graph

Before diving into tree, let’s understand what is a graph

A graph is a data structure that consists of following two components: - A finite set of **vertices** also called as **nodes**. - A finite set of ordered pair of the form (u, v) called as **edge**. - The pair is ordered because (u, v) is not same as (v, u) in case of directed graph(di-graph). - The pair of form (u, v) indicates that there is an edge from vertex u to vertex v. - The edges may contain weight/value/cost.

### Directed Graph

- Nodes: \{A, B, C, D\}
- Edges: \{(A, B), (A, C), (B, D), (C, D), (D, A)\}

Notice that (A, B) is not same as (B, A). That’s because the graph is **directed**

### Undirected Graph

In the above graph, (A, B) is same as (B, A). That’s because the graph is **undirected**

### Cycle

A cycle is a path of edges and vertices wherein a vertex is reachable from itself

There is a cycle in the above graph. The cycle is \{A, B, C, D\}

There is also a cycle in the above graph. The cycle is \{A, B, C\}

## Graph Representation

There are two common ways to represent a graph - Adjacency Matrix - Adjacency List

### Adjacency Matrix

An adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.

```
import networkx as nx
import numpy as np
# graph using adjacency matrix, 4 nodes
= [[0, 1, 1, 1],
graph 1, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 0, 1]]
[
# Create a graph from the adjacency matrix
= nx.DiGraph(np.array(graph))
G
# Draw the graph
=True, node_color='lightblue', node_size=500) nx.draw(G, with_labels
```

### Adjacency List

An adjacency list is a collection of unordered lists used to represent a finite graph. Each list describes the set of neighbors of a vertex in the graph.

The above graph can be represented as follows:

```
= [
graph 1, 2, 3],
[0],
[0, 1],
[0, 3]
[ ]
```

### Edge List

Edge list is a collection of unordered pairs used to represent a finite graph. Each pair describes the edge between two vertices in the graph.

`= [(0, 1), (0, 2), (0, 3), (1, 0), (2, 0), (2, 1), (3, 0), (3, 2)] graph `

### Adjacency Matrix vs Adjacency List

Adjacency Matrix | Adjacency List |
---|---|

Takes O(V^2) space | Takes O(V + E) space |

Checking if two vertices are connected takes O(1) time | Checking if two vertices are connected takes O(V) time |

Iterating over all the edges takes O(V^2) time | Iterating over all the edges takes O(V + E) time |

Additional Notes: When the graph is sparse, adjacency list is generally preferred. When the graph is dense, adjacency matrix is generally preferred.

### Implicit Graph

An implicit graph is a graph that is not represented by an adjacency list or an adjacency matrix. Instead, the graph is defined by a set of states and transitions between those states.

Example: A chess board is an implicit graph. The states are the positions of the pieces on the board. The transitions are the legal moves that can be made by each piece.

# Tree

Tree is a special type of graph where: - There is only one path between any two nodes - There are no cycles

## Tree Representation

### Adjacency Matrix

Tree can be represented using adjacency matrix. However, it is not a good idea to do so. That’s because adjacency matrix takes O(V^2) space. Since tree is a special type of graph (sparse), it is better to use adjacency list to represent it.

### Adjancency List

Tree can be represented as a HashMap, where the key is the node and the value is the list of children.

```
= {}
tree
def add_node(parent, value):
if parent not in tree:
= []
tree[parent]
tree[parent].append(value)
1, 2)
add_node(1, 3)
add_node(1, 4)
add_node(4, 5)
add_node(4, 6)
add_node(
print(tree)
```

`{1: [2, 3, 4], 4: [5, 6]}`

### Node Class

We can create a class to represent a node. Each node has a value and a list of children nodes.

```
class Node:
list["Node"]
children: int
value:
def __init__(self, value):
self.value = value
self.children = []
def add_child(self, child):
self.children.append(child)
def __repr__(self):
return f"Node({self.value}, {self.children})"
= Node(1)
root 2))
root.add_child(Node(3))
root.add_child(Node(
= Node(4)
child 5))
child.add_child(Node(6))
child.add_child(Node(
root.add_child(child)
print(root)
```

`Node(1, [Node(2, []), Node(3, []), Node(4, [Node(5, []), Node(6, [])])])`

## Use Cases

### Directory

Tree can be used to represent a directory. Each node is a directory and the children are the files and subdirectories.

```
.
├── _publish.yml
├── _quarto.yml
├── _site
│ ├── agenda
│ │ └── 00-big-o-set.html
│ ├── algorithm
│ │ └── recursion.html
│ ├── cryptography
│ │ ├── 00_cryptographic_hash.html
│ │ └── cryptographic_hash.html
```

It’s a tree, isn’t it? :)

```
= {}
directory
def add_subpath(parent, child):
if parent not in directory:
= []
directory[parent]
directory[parent].append(child)
"/", "/home")
add_subpath("/", "/usr")
add_subpath("/home", "/home/user")
add_subpath("/home", "/home/tmp")
add_subpath(
print(directory)
```

`{'/': ['/home', '/usr'], '/home': ['/home/user', '/home/tmp']}`

### HTML

Well, HTML is a tree. Each node is a tag and the children are the tags inside the tag.

```
<html>
<head>
<title>Tree</title>
</head>
<body>
<h1>Tree</h1>
<p>Tree is a special type of graph</p>
</body>
</html>
```

```
class Tag:
str
name: dict
attributes: list
children:
def __init__(self, name, attributes=None, children=None):
self.name = name
self.attributes = attributes or {}
self.children = children or []
def __str__(self) -> str:
# print in HTML format
= "".join([str(child) for child in self.children])
childrens = " ".join([f'{key}="{value}"' for key, value in self.attributes.items()])
attributes return f"<{self.name} {attributes}>{childrens}</{self.name}>"
= Tag("html", children=[
tag "head", children=[
Tag("title", children=[
Tag("text", children=["Hello, world!"])
Tag(
])
]),"body", children=[
Tag("div", attributes={"class": "container"}, children=[
Tag("h1", children=[
Tag("text", children=["Hello, world!"])
Tag(
])
])
])
])
print(str(tag))
```

`<html ><head ><title ><text >Hello, world!</text></title></head><body ><div class="container"><h1 ><text >Hello, world!</text></h1></div></body></html>`

### Abstract Syntax Tree

Abstract Syntax Tree (AST) is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node is an operator or an operand and the children are the operands of the operator.

That’s the AST for

```
while b ≠ 0:
if a > b:
:= a - b
a else:
:= b - a
b return a
```

Source: Wikipedia

### Binary Search Tree

Binary Search Tree is a tree where each node has at most two children. The left child is smaller than the parent and the right child is greater than the parent.

```
class BST:
int
value: "BST"
left: "BST"
right:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def insert(self, value):
if value < self.value:
if self.left is None:
self.left = BST(value)
else:
self.left.insert(value)
else:
if self.right is None:
self.right = BST(value)
else:
self.right.insert(value)
def __str__(self):
return f"BST({self.value}, {self.left}, {self.right})"
= BST(10)
tree 5)
tree.insert(15)
tree.insert(13)
tree.insert(7)
tree.insert(3)
tree.insert(17)
tree.insert(
print(tree)
```

`BST(10, BST(5, BST(3, None, None), BST(7, None, None)), BST(15, BST(13, None, None), BST(17, None, None)))`

Well, it’s hard to read the above graph. Let’s draw it in a better way. Let’s learn tree traversal.

## Tree Traversal

### Inorder Traversal

Inorder traversal is a type of depth-first traversal. In inorder traversal, we first visit the left subtree, then the root node, and finally the right subtree.

As the name implies, the output of inorder traversal in a BST is sorted.

```
def inorder(node):
if node is None:
return
yield from inorder(node.left)
yield node.value
yield from inorder(node.right)
print(list(inorder(tree)))
```

`[3, 5, 7, 10, 13, 15, 17]`

### Preorder Traversal

Preorder traversal is a type of depth-first traversal. In preorder traversal, we first visit the root node, then the left subtree, and finally the right subtree.

```
def preorder(node):
if node is None:
return
yield node.value
yield from preorder(node.left)
yield from preorder(node.right)
print(list(preorder(tree)))
```

`[10, 5, 3, 7, 15, 13, 17]`

### Postorder Traversal

Postorder traversal is a type of depth-first traversal. In postorder traversal, we first visit the left subtree, then the right subtree, and finally the root node.

```
def postorder(node):
if node is None:
return
yield from postorder(node.left)
yield from postorder(node.right)
yield node.value
print(list(postorder(tree)))
```

`[3, 7, 5, 13, 17, 15, 10]`

## BST

### Drawing BST

Having learned tree traversal, we can now draw BST in a better way.

```
def draw(node, level=0):
if node is None:
return
+ 1)
draw(node.right, level print(" " * 4 * level + str(node.value))
+ 1)
draw(node.left, level
draw(tree)
```

```
17
15
13
10
7
5
3
```

You need to rotate your head to see the tree ;)

Let’s draw in a directory style:

```
def draw(node, level=0):
if node is None:
return
print(" " * level + "├──" + str(node.value))
+ 1)
draw(node.left, level + 1)
draw(node.right, level
draw(tree)
```

```
├──10
├──5
├──3
├──7
├──15
├──13
├──17
```

### Search on BST

The power of BST is that we can search in O(\log n) time. Let’s see how.

```
= BST(10)
tree 5)
tree.insert(15)
tree.insert(13)
tree.insert(7)
tree.insert(3)
tree.insert(17)
tree.insert(
draw(tree)
```

```
├──10
├──5
├──3
├──7
├──15
├──13
├──17
```

To search for 7:

- Start from the root node 10
- Since 7 < 10, go to the left child 5 (we can ignore the
**whole**right subtree -> eliminating half* of the tree) - Since 7 > 5, go to the right child 7
- Since 7 = 7, we found it!

The code is pretty simple, remember recursive? :)

\begin{aligned} \text{search}(x, k) &= \begin{cases} \text{False} & \text{if } x = \text{None} \\ \text{True} & \text{if } x.\text{value} = k \\ \text{search}(x.\text{left}, k) & \text{if } k < x.\text{value} \\ \text{search}(x.\text{right}, k) & \text{if } k > x.\text{value} \\ \end{cases} \end{aligned}

```
def search(node, value):
if node is None:
return False
if node.value == value:
return True
if value < node.value:
return search(node.left, value)
else:
return search(node.right, value)
print(search(tree, 7))
print(search(tree, 8))
```

```
True
False
```

### The problem of BST

BST is a great data structure. It is used in many places. However, it has a problem. The problem is that the height of the tree can be O(n).

It happens when the tree is skewed. For example, if we insert the numbers in ascending order, the tree will be skewed to the right.

```
= BST(10)
tree for i in range(1, 10):
tree.insert(i)
draw(tree)
```

```
├──10
├──1
├──2
├──3
├──4
├──5
├──6
├──7
├──8
├──9
```

That’s literally a linked list!

### Solution

WE NEED TO BALANCE THE TREE!

How? We will learn in the next class :)