Chapter 9: Binary Trees

Subtopic: Tree Concepts

Definition

A tree is a data structure that organizes data items in a hierarchical manner, where items have ancestors and descendants.

Causes

Arises when data cannot be efficiently represented in a strict linear order.
Needed when data naturally fits into groups and subgroups.

Goals / Objectives

To provide a structure for hierarchical classification of data.
To allow efficient representation of relationships such as parent-child or ancestor-descendant.

Importance

Trees allow more flexible and rich organization of data compared to linear structures like arrays or linked lists.
Useful for representing real-world hierarchies such as family trees, organizational charts, and file directories.

Procedures

Data items are placed at various levels in the hierarchy.
Each node is connected by edges (links) to show relationships.
The root is the top-level node, and all others are arranged below it.

Advantages & Disadvantages

Advantages:

Represents hierarchical data naturally.
Allows efficient access to parent/child relationships.
More powerful than simple linear ADTs.

Disadvantages:

More complex to implement than arrays or linked lists.
Traversal and manipulation can require recursive algorithms, which are harder for beginners.

Impact / Effect

Enables representation of complex systems (e.g., directory structures, decision-making processes).
Provides the foundation for specialized trees like binary trees, expression trees, and decision trees.

Examples

Family trees: show ancestors and descendants.
Organization charts: represent reporting structures.
File directories: show folders and subfolders in a computer.

Key Takeaways

A tree organizes data hierarchically, unlike linear ADTs.
Nodes have parent-child relationships, forming multiple levels.
Real-world examples include family trees, org charts, and file systems.
Trees form the basis for advanced structures like binary trees.

Subtopic: Tree Terminology

Definition

A tree is a collection of nodes connected by edges (links).
Nodes are arranged in levels that represent hierarchy.
The root is the topmost node (the only node with no parent).

Causes

Not applicable to this topic.

Goals / Objectives

To define and describe the roles and relationships of nodes in a tree.
To provide a foundation for understanding how trees are structured.

Importance

Understanding terminology is essential for learning tree algorithms and for distinguishing between node roles (root, parent, child, sibling, leaf).

Procedures

Root node: Topmost node with no parent.
Parent node: A node that has children.
Child nodes: Nodes directly connected below a parent.
Siblings: Nodes that share the same parent.
Leaf node: Node with no children.
Path: A sequence of edges leading from the root to a node.
Path length: Number of edges in a path.
Height of a tree: Number of levels (or 1 + height of the tallest subtree).
Subtree: A tree rooted at a child of a node.

Advantages & Disadvantages

Advantages:

Clear terminology helps in standardizing understanding across algorithms and structures.
Makes it easier to visualize and implement tree operations.

Disadvantages:

Many terms can confuse beginners if not linked to examples.

Impact / Effect

Correct use of terminology allows efficient communication about tree structures and algorithms.
Misunderstanding terms may cause errors in algorithm design and implementation.

Examples

In a sample tree with nodes A, B, C, D, E:
- A = root
- B, C, D, E = children of A
- B, C, D, E = siblings
- If nodes N, O, P have no children, they are leaf nodes.

Key Takeaways

Root: top node, no parent.
Parent–child–sibling terms describe relationships.
Leaf = no children, path = route from root to a node.
Height = number of levels, subtree = tree within a tree.

Subtopic: Binary Trees

Definition

A binary tree is a tree in which each node has at most two children (commonly referred to as the left child and the right child).

Causes

Not specified in notes.

Goals / Objectives

To structure hierarchical data such that each node has 0, 1, or 2 children.
To serve as the foundation for binary search trees, expression trees, and decision trees.

Importance

Provides a more restrictive but powerful form of hierarchical organization.
Essential for implementing searching, sorting, and expression evaluation.

Procedures

A binary tree can be:
- Empty (no nodes), or
- Consist of a root node with two subtrees:
  - Tleft (left subtree)
  - Tright (right subtree)
The height of a binary tree is defined as: Height(T) = 1 + height of the tallest subtree of T.

Advantages & Disadvantages

Advantages:

Simpler and more efficient than general trees.
Enables powerful algorithms like binary search.
Useful for representing hierarchical structures with ordered relationships.

Disadvantages:

Becomes inefficient if unbalanced (resembles a linked list).
Requires extra work (balancing techniques) for efficiency.

Impact / Effect

Enables efficient search, insertion, and deletion operations when structured properly.
Forms the basis for advanced data structures like binary search trees, AVL trees, and B-trees.

Examples

Fig. 25-6(b): A binary tree where each node has at most 2 children.
An empty binary tree is also a valid binary tree.

Key Takeaways

A binary tree restricts each node to at most two children.
The tree can be empty or composed of a root with left and right subtrees.
Height depends on the tallest subtree.
Efficiency depends heavily on whether the tree is balanced.

Subtopic: Traversals of a Tree

Definition

Tree traversal is the process of visiting all the nodes in a tree exactly once.

To visit a node means to process the data stored in that node.

Causes

Not specified in notes.

Goals / Objectives

To ensure every node in the tree is accessed and processed.
To define systematic methods of navigating a tree.

Importance

Essential for performing operations like searching, evaluating expressions, copying, or deleting trees.
Provides different ordering strategies depending on application needs.

Procedures

Four main types of traversals:

Preorder Traversal (Depth-first)
- Visit the root
- Visit all nodes in the left subtree
- Visit all nodes in the right subtree
Inorder Traversal (Depth-first)
- Visit all nodes in the left subtree
- Visit the root
- Visit all nodes in the right subtree
Postorder Traversal (Depth-first)
- Visit all nodes in the left subtree
- Visit all nodes in the right subtree
- Visit the root
Level-order Traversal (Breadth-first)
- Begin at the root
- Visit nodes level by level from top to bottom

Advantages & Disadvantages

Advantages:

Different traversals serve different purposes (e.g., inorder for sorted output, postorder for deletion).
Systematic approach avoids missing nodes.

Disadvantages:

Recursive traversals may be less efficient for very deep trees.
Choice of traversal must match the intended application.

Impact / Effect

Preorder, inorder, and postorder are depth-first strategies.
Level-order is a breadth-first strategy.
Different traversals yield different node visiting sequences → critical in algorithms like expression tree evaluation.

Examples

Exercise 9.1:
- Preorder → 11, 8, 3, 2, 1, 5, 4, 6, 10, 9, 7
- Inorder → 2, 3, 1, 8, 4, 5, 6, 11, 9, 10, 7
- Postorder → 2, 1, 3, 4, 6, 5, 8, 9, 7, 10, 11
Exercise 9.2:
- Reconstruct tree using:
  - Preorder: 6, 4, 2, 1, 3, 5, 8, 7, 9, 10, 11
  - Inorder: 1, 2, 3, 4, 5, 6, 7, 9, 8, 10, 11

Key Takeaways

Traversal = visiting all nodes once.
Four main methods: preorder, inorder, postorder (depth-first), and level-order (breadth-first).
Traversal order affects the output sequence.
Used in searching, sorting, evaluation, and deletion tasks.

Subtopic: Building a Binary Tree

Definition

Building a binary tree means constructing the tree structure step by step using the BinaryTree class, which implements the BinaryTreeInterface.

Causes

Not specified in notes.

Goals / Objectives

To demonstrate how to construct a binary tree programmatically.
To show how smaller trees (subtrees) can be combined to form larger ones.

Importance

Understanding construction is necessary before performing traversals, searching, or manipulation of binary trees.
Serves as a foundation for building more complex structures like expression trees and decision trees.

Procedures

Steps to build the binary tree (example given in Fig. 25-13):

Start with leaf nodes → represent each leaf (e.g., D, F, G, H) as one-node trees.
Move upward → use method setTree() to form higher-level subtrees:
- Combine leaves into subtrees (E, B, C).
Form the final tree → use setTree() again to combine subtrees and form the root (A).

Code outline:

public interface BinaryTreeInterface<T> {
    public T getRootData();
    public boolean isEmpty();
    public void clear();
    public void setTree(T rootData);
    public void setTree(T rootData, 
        BinaryTreeInterface<T> leftTree,
        BinaryTreeInterface<T> rightTree);
}

Advantages & Disadvantages

Advantages:

Modular: build trees bottom-up from leaves to root.
Reusable: small subtrees can be reused in different trees.

Disadvantages:

Manual building can be tedious for very large trees.
Requires understanding of recursive structure.

Impact / Effect

Enables flexible programmatic creation of binary trees.
Once built, the tree can be traversed, searched, or used in applications like expression evaluation.

Examples

Fig. 25-13: A binary tree with nodes labeled A–H.
Example construction:
- Leaves → D, F, G, H
- Intermediate nodes → E, B, C
- Final root → A

Key Takeaways

Build binary trees bottom-up starting from leaves.
Use setTree() method to join subtrees.
Construction process is crucial for applications like expression and decision trees.
Trees are modular: small parts combine into a larger whole.

Subtopic: Expression Trees

Definition

An expression tree is a binary tree that represents an algebraic expression where the operators are stored in internal nodes and the operands (variables or constants) are stored in leaf nodes.

Causes

Not specified in notes.

Goals / Objectives

To represent algebraic expressions in a structured tree form.
To allow evaluation of expressions using different traversal methods.

Importance

Provides a systematic way to parse, evaluate, and convert algebraic expressions.
Links binary trees to mathematical computation.

Procedures

Construct a binary tree where:
- Internal nodes = operators (e.g., +, −, ×, ÷).
- Leaf nodes = operands (e.g., variables x, y, or numbers).
Traversals yield different expression formats:
- Inorder traversal → infix expression (standard algebraic notation).
- Preorder traversal → prefix expression (Polish notation).
- Postorder traversal → postfix expression (Reverse Polish notation).

Algorithm evaluate(expressionTree):

if (expressionTree is empty)
    return 0
else:
    firstOperand = evaluate(left subtree)
    secondOperand = evaluate(right subtree)
    operator = root of expressionTree
    return result of (firstOperand operator secondOperand)

Advantages & Disadvantages

Advantages:

Easy evaluation using recursive algorithms.
Traversal gives multiple forms of the same expression.

Disadvantages:

Tree must be properly structured (incorrect tree leads to wrong results).
Slightly more complex to build compared to linear expression parsing.

Impact / Effect

Expression trees are widely used in compilers, interpreters, and calculators.
Provide the link between binary tree structure and mathematical computation.

Examples

Fig. 25-14: Expression trees for different algebraic expressions.
Traversal results:
- Inorder → expression in original form.
- Preorder → prefix form.
- Postorder → postfix form.

Key Takeaways

Expression trees represent algebraic formulas as binary trees.
Operators are internal nodes, operands are leaf nodes.
Different traversals produce infix, prefix, or postfix notations.
Recursive evaluation processes the tree to compute results.

Subtopic: Decision Trees

Definition

A decision tree is a binary tree used to help users make decisions or solve problems. Each internal node represents a decision/question, and each branch represents a possible outcome, leading to further decisions or final results.

Causes

Not specified in notes.

Goals / Objectives

To support decision-making by structuring choices step by step.
To provide the basis for building expert systems that guide users through complex problems.

Importance

Widely used in problem-solving, AI, and knowledge-based systems.
Makes the decision-making process transparent and easy to follow.

Procedures

Start with a root decision node.
Branch into two possible outcomes (yes/no, true/false, or option A/option B).
Continue branching until reaching leaf nodes, which represent final outcomes or solutions.
Decision trees may grow as new facts are acquired.

Advantages & Disadvantages

Advantages:

Intuitive and easy to understand.
Can model complex decisions step by step.
Useful in AI and data classification tasks.

Disadvantages:

Trees can become very large and complex.
May require updating whenever new information is introduced.

Impact / Effect

Forms the basis of expert systems that guide users in problem solving.
Expands as more facts are included, improving accuracy but increasing complexity.

Examples

Fig. 25-15: Portion of a binary decision tree.
Guess-the-Country game (Fig. 25-16, Fig. 25-17):
- Start with a yes/no question about a country.
- Refine the decision tree as more facts are added to improve accuracy.

Key Takeaways

Decision trees model decisions and outcomes using a binary tree structure.
Each internal node = decision, each leaf = result.
Widely used in AI, problem solving, and expert systems.
Trees can grow and evolve as new knowledge is added.

Subtopic: Implementing Tree Traversals

Definition

Implementing tree traversals means writing methods or iterators to systematically visit all nodes in a binary tree in a specified order (preorder, inorder, postorder, or level-order).

Causes

Not specified in notes.

Goals / Objectives

To provide flexible and reusable methods for traversing binary trees.
To allow traversal results to be used beyond just displaying data (e.g., storing, processing).

Importance

Enables programmers to process tree data effectively.
Iterators give users control over when and how nodes are accessed, increasing flexibility.

Procedures

Recursive Traversal Example (Postorder):

public void postorderTraverse() {
    postorderTraverse(root);
}
private void postorderTraverse(BinaryNodeInterface<T> node) {
    if (node != null) {
        postorderTraverse(node.getLeft());
        postorderTraverse(node.getRight());
        System.out.print(node.getData() + " ");
    }
}

Visits left subtree → right subtree → root.
Prints node data directly.

Using Iterators (for flexibility):
- Iterators are based on java.util.Iterator.
- Methods:
  - hasNext() → check if more nodes exist.
  - next() → return the next node’s data.
- Different iterators can be implemented for preorder, inorder, and postorder traversals.
Example: Inorder Iterator (inner class):
- Constructor calls a private recursive method inorder(root).
- Traversal results are enqueued in a Queue.
- The next() method dequeues and returns the next entry.

Advantages & Disadvantages

Advantages:

Recursive methods are simple and easy to write.
Iterators provide more control and flexibility.
Clients can perform custom operations during traversal.

Disadvantages:

Recursive methods can cause stack overflow for very deep trees.
Iterator implementation is more complex (requires queues and inner classes).

Impact / Effect

Recursive traversals are good for quick output (e.g., printing).
Iterators enable more advanced operations (e.g., storing traversal order, delayed execution).

Examples

Recursive postorder traversal → directly prints values.
Inorder Iterator with Queue → stores traversal sequence and returns values one by one.

Key Takeaways

Traversals can be implemented recursively or using iterators.
Recursive methods are simpler but less flexible.
Iterators give control and allow multiple traversal strategies.
Useful in real applications where tree data must be accessed in different ways.

Subtopic: Tree Implementation

Definition

Tree implementation refers to the way a tree’s nodes and connections are represented in memory. The most common method is using a linked structure, where each node object stores both its data and references to its children.

Causes

Not specified in notes.

Goals / Objectives

To create a practical representation of a binary tree in programming.
To enable efficient storage, traversal, and manipulation of nodes.

Importance

The implementation determines how easily we can add, remove, and search nodes.
Forms the foundation for building more advanced tree-based structures like binary search trees.

Procedures

Use a node-based structure:
- Each node contains:
  - Data field → stores the element.
  - Left reference → points to the left child.
  - Right reference → points to the right child.
- If a node has no children, its references are null.

Example (Java-style pseudocode):

private class Node<T> {
    private T data;
    private Node<T> left;
    private Node<T> right;
    ...
}

The BinaryTree class manages these nodes and implements operations defined in the BinaryTreeInterface.

Advantages & Disadvantages

Advantages:

Flexible (nodes can be dynamically added/removed).
Efficient for representing hierarchical structures.

Disadvantages:

Requires more memory (each node stores two references).
Slightly more complex to implement than arrays.

Impact / Effect

Provides the base structure for binary trees and binary search trees.
Enables algorithms like traversals, insertions, deletions, and searches.

Examples

Fig. 26-1: Node with data, left child, and right child.
A leaf node is a node where both left and right references are null.

Key Takeaways

Binary trees are commonly implemented using a linked structure.
Each node stores data + left child + right child.
This design allows flexible and dynamic growth of trees.
Implementation forms the foundation for binary search trees.

Subtopic: Binary Search Trees (BST)

Definition

A binary search tree (BST) is a type of binary tree in which:

The data in the left subtree of a node is less than the node’s data.
The data in the right subtree of a node is greater than the node’s data.
All nodes contain Comparable objects.

Causes

Not specified in notes.

Goals / Objectives

To organize data so that searching, insertion, and deletion can be performed efficiently.
To enable logarithmic-time operations when the tree is balanced.

Importance

Provides a foundation for efficient search algorithms.
Serves as a base for advanced data structures like AVL trees, B-trees, and Red-Black trees.

Procedures

Searching:
- Compare the desired value with the root.
- If smaller → search the left subtree.
- If larger → search the right subtree.
- Repeat until found or subtree is empty.
Efficiency:
- Searching a BST of height h has complexity O(h).
- Height depends on the balance of the tree:
  - Balanced tree → height ≈ log₂(n), efficient searches.
  - Unbalanced tree → height ≈ n, inefficient searches (like a linked list).

Recursive Search Algorithm (bstSearch):

if (tree is empty) → return false
if (desiredObject == root) → return true
if (desiredObject < root) → search left subtree
else → search right subtree

Operations:
- add(newEntry) → inserts new value.
- contains(entry) → checks existence.
- getEntry(entry) → retrieves a matching entry.
- remove(entry) → deletes an entry from the tree.

Advantages & Disadvantages

Advantages:

Fast searching, insertion, and deletion in O(log n) (when balanced).
Naturally recursive, simple to implement.

Disadvantages:

Performance degrades to O(n) if tree becomes skewed.
Requires balancing techniques (not handled automatically in basic BST).

Impact / Effect

Improves efficiency of data handling compared to arrays or linked lists.
Height directly affects worst-case performance.
The concept of balance is critical for maintaining efficiency.

Examples

Fig. 25-18: Binary search tree of names.
Exercise 9.3: Construct BST with values: M, E, Q, W, C, H, B, D, G, L, A.
- Preorder → M, E, C, B, A, D, H, G, L, Q, W
- Inorder → A, B, C, D, E, G, H, L, M, Q, W
- Postorder → A, B, D, C, G, L, H, E, W, Q, M

Key Takeaways

A BST enforces an ordering property: left < root < right.
Efficient searching depends on the height of the tree.
Balanced BST → O(log n) operations; skewed BST → O(n) operations.
Supports key operations: search, add, contains, getEntry, remove.

Subtopic: Removing an Entry in a Binary Search Tree

Definition

Removing an entry from a binary search tree (BST) means deleting a node that contains the specified value, while maintaining the BST property (left < root < right).

Causes

Not specified in notes.

Goals / Objectives

To delete a given entry from the BST.
To ensure the tree structure remains valid and follows BST rules.

Importance

Removing entries is as essential as inserting or searching.
Ensures trees remain usable in dynamic applications where data changes frequently.

Procedures

There are three main cases for removal:

Removing a Leaf Node
- If the node has no children → simply set its parent’s reference to null.
Removing a Node with One Child
- Replace the node with its only child (adjust parent’s reference).
Removing a Node with Two Children
- Cannot directly delete the node.
- Instead:
  - Find the rightmost node in the left subtree (inorder predecessor) OR the leftmost node in the right subtree (inorder successor).
  - Replace the node’s data with that value.
  - Delete that replacement node (which will be a leaf or have one child).

Special Case – Removing the Root

When the root is removed, the algorithm ensures a new root is chosen.
Implemented by removeEntry() method returning a new root reference.
A helper class ReturnObject is used to hold the removed entry’s data.

Java-style method outline:

public T remove(T entry) {
    ReturnObject oldEntry = new ReturnObject(null);
    root = removeEntry(root, entry, oldEntry);
    return oldEntry.get();
}

Advantages & Disadvantages

Advantages:

Maintains BST ordering even after deletions.
Supports dynamic datasets.

Disadvantages:

Deletion with two children is more complex.
Removing many nodes can lead to unbalanced trees, reducing efficiency.

Impact / Effect

Affects the balance of the tree, which influences efficiency.
Incorrect deletion can break the BST property and lead to incorrect searches.

Examples

Exercise 9.4: Remove values A, Q, and E in sequence.
1. Remove A → A is a leaf node → set parent’s reference to null.
2. Remove Q → Q has one child (W) → replace Q with W.
3. Remove E → E has two children → replace E with the largest value in its left subtree (D), then remove D.

Key Takeaways

Removal has 3 cases: leaf, one child, or two children.
The two-child case uses inorder predecessor/successor replacement.
Removing root requires special handling to maintain references.
Deletion impacts tree balance, which in turn affects efficiency.

Subtopic: Efficiency of Operations

Definition

The efficiency of operations in a binary search tree (BST) refers to the time complexity of searching, adding, and removing nodes, which depends on the height of the tree.

Causes

Efficiency is affected by whether the tree is balanced or unbalanced.
Insertion order plays a major role: inserting sorted data leads to a skewed (inefficient) tree.

Goals / Objectives

To analyze the performance of BST operations.
To understand how tree height influences complexity.

Importance

Helps determine whether BSTs are suitable for large datasets.
Provides a foundation for why balanced trees (AVL, B-trees, etc.) are necessary.

Procedures

Basic Operations (add, remove, getEntry):
- Require searching from the root down to a node.
- Complexity = O(h), where h is the height of the tree.
Height considerations:
- Worst case (skewed tree): height h = n → operations take O(n) (like a linked list).
- Best case (balanced tree): height h = log₂(n) → operations take O(log n).
- Full binary tree: shortest possible height for n nodes ≈ log₂(n+1).

Advantages & Disadvantages

Advantages:

Balanced trees allow fast searching, insertion, and deletion in O(log n).
Efficient storage and retrieval for hierarchical data.

Disadvantages:

Unbalanced trees degrade to O(n).
Maintaining balance requires additional algorithms (not present in simple BSTs).

Impact / Effect

Efficiency depends on tree balance:
- Balanced tree → O(log n) performance.
- Unbalanced tree → O(n) performance.
The balance of subtrees at each node directly impacts searching speed.

Examples

Fig. 27-13: Two BSTs with the same data but different heights:
- Shorter tree = O(log n).
- Taller (skewed) tree = O(n).
Fig. 27-14: Examples of height-balanced trees.
Notes:
- Avoid inserting already sorted data into an empty tree (produces unbalanced structure).
- Random insertion order leads to a more balanced tree.

Key Takeaways

BST operation efficiency = O(h), where h is tree height.
Balanced trees → O(log n); unbalanced trees → O(n).
Tree balance is critical for performance.
Advanced balanced search trees (AVL, 2-3 Trees, B-Trees) solve unbalance issues.

Note51

Explorer

0.0 by Claude

Chapter 9: Binary Trees

Subtopic: Tree Concepts

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Tree Terminology

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Binary Trees

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Traversals of a Tree

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Building a Binary Tree

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Expression Trees

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Decision Trees

Definition

Causes

Goals / Objectives

Importance

Procedures

Advantages & Disadvantages

Impact / Effect

Examples

Key Takeaways

Subtopic: Implementing Tree Traversals

Definition

Causes

Goals / Objectives

Importance

Procedures