Chapter 9: Binary Trees
Subtopic: Tree Concepts
Definition
A tree is a data structure that organizes data items in a hierarchical manner, where items have ancestors and descendants.
Causes
- Arises when data cannot be efficiently represented in a strict linear order.
- Needed when data naturally fits into groups and subgroups.
Goals / Objectives
- To provide a structure for hierarchical classification of data.
- To allow efficient representation of relationships such as parent-child or ancestor-descendant.
Importance
- Trees allow more flexible and rich organization of data compared to linear structures like arrays or linked lists.
- Useful for representing real-world hierarchies such as family trees, organizational charts, and file directories.
Procedures
- Data items are placed at various levels in the hierarchy.
- Each node is connected by edges (links) to show relationships.
- The root is the top-level node, and all others are arranged below it.
Advantages & Disadvantages
Advantages:
- Represents hierarchical data naturally.
- Allows efficient access to parent/child relationships.
- More powerful than simple linear ADTs.
Disadvantages:
- More complex to implement than arrays or linked lists.
- Traversal and manipulation can require recursive algorithms, which are harder for beginners.
Impact / Effect
- Enables representation of complex systems (e.g., directory structures, decision-making processes).
- Provides the foundation for specialized trees like binary trees, expression trees, and decision trees.
Examples
- Family trees: show ancestors and descendants.
- Organization charts: represent reporting structures.
- File directories: show folders and subfolders in a computer.
Key Takeaways
- A tree organizes data hierarchically, unlike linear ADTs.
- Nodes have parent-child relationships, forming multiple levels.
- Real-world examples include family trees, org charts, and file systems.
- Trees form the basis for advanced structures like binary trees.
Subtopic: Tree Terminology
Definition
- A tree is a collection of nodes connected by edges (links).
- Nodes are arranged in levels that represent hierarchy.
- The root is the topmost node (the only node with no parent).
Causes
Not applicable to this topic.
Goals / Objectives
- To define and describe the roles and relationships of nodes in a tree.
- To provide a foundation for understanding how trees are structured.
Importance
- Understanding terminology is essential for learning tree algorithms and for distinguishing between node roles (root, parent, child, sibling, leaf).
Procedures
- Root node: Topmost node with no parent.
- Parent node: A node that has children.
- Child nodes: Nodes directly connected below a parent.
- Siblings: Nodes that share the same parent.
- Leaf node: Node with no children.
- Path: A sequence of edges leading from the root to a node.
- Path length: Number of edges in a path.
- Height of a tree: Number of levels (or
1 + height of the tallest subtree). - Subtree: A tree rooted at a child of a node.
Advantages & Disadvantages
Advantages:
- Clear terminology helps in standardizing understanding across algorithms and structures.
- Makes it easier to visualize and implement tree operations.
Disadvantages:
- Many terms can confuse beginners if not linked to examples.
Impact / Effect
- Correct use of terminology allows efficient communication about tree structures and algorithms.
- Misunderstanding terms may cause errors in algorithm design and implementation.
Examples
-
In a sample tree with nodes A, B, C, D, E:
- A = root
- B, C, D, E = children of A
- B, C, D, E = siblings
- If nodes N, O, P have no children, they are leaf nodes.
Key Takeaways
- Root: top node, no parent.
- Parent–child–sibling terms describe relationships.
- Leaf = no children, path = route from root to a node.
- Height = number of levels, subtree = tree within a tree.
Subtopic: Binary Trees
Definition
A binary tree is a tree in which each node has at most two children (commonly referred to as the left child and the right child).
Causes
Not specified in notes.
Goals / Objectives
- To structure hierarchical data such that each node has 0, 1, or 2 children.
- To serve as the foundation for binary search trees, expression trees, and decision trees.
Importance
- Provides a more restrictive but powerful form of hierarchical organization.
- Essential for implementing searching, sorting, and expression evaluation.
Procedures
-
A binary tree can be:
-
Empty (no nodes), or
-
Consist of a root node with two subtrees:
Tleft(left subtree)Tright(right subtree)
-
-
The height of a binary tree is defined as:
Height(T) = 1 + height of the tallest subtree of T.
Advantages & Disadvantages
Advantages:
- Simpler and more efficient than general trees.
- Enables powerful algorithms like binary search.
- Useful for representing hierarchical structures with ordered relationships.
Disadvantages:
- Becomes inefficient if unbalanced (resembles a linked list).
- Requires extra work (balancing techniques) for efficiency.
Impact / Effect
- Enables efficient search, insertion, and deletion operations when structured properly.
- Forms the basis for advanced data structures like binary search trees, AVL trees, and B-trees.
Examples
- Fig. 25-6(b): A binary tree where each node has at most 2 children.
- An empty binary tree is also a valid binary tree.
Key Takeaways
- A binary tree restricts each node to at most two children.
- The tree can be empty or composed of a root with left and right subtrees.
- Height depends on the tallest subtree.
- Efficiency depends heavily on whether the tree is balanced.
Subtopic: Traversals of a Tree
Definition
Tree traversal is the process of visiting all the nodes in a tree exactly once.
- To visit a node means to process the data stored in that node.
Causes
Not specified in notes.
Goals / Objectives
- To ensure every node in the tree is accessed and processed.
- To define systematic methods of navigating a tree.
Importance
- Essential for performing operations like searching, evaluating expressions, copying, or deleting trees.
- Provides different ordering strategies depending on application needs.
Procedures
Four main types of traversals:
-
Preorder Traversal (Depth-first)
- Visit the root
- Visit all nodes in the left subtree
- Visit all nodes in the right subtree
-
Inorder Traversal (Depth-first)
- Visit all nodes in the left subtree
- Visit the root
- Visit all nodes in the right subtree
-
Postorder Traversal (Depth-first)
- Visit all nodes in the left subtree
- Visit all nodes in the right subtree
- Visit the root
-
Level-order Traversal (Breadth-first)
- Begin at the root
- Visit nodes level by level from top to bottom
Advantages & Disadvantages
Advantages:
- Different traversals serve different purposes (e.g., inorder for sorted output, postorder for deletion).
- Systematic approach avoids missing nodes.
Disadvantages:
- Recursive traversals may be less efficient for very deep trees.
- Choice of traversal must match the intended application.
Impact / Effect
- Preorder, inorder, and postorder are depth-first strategies.
- Level-order is a breadth-first strategy.
- Different traversals yield different node visiting sequences → critical in algorithms like expression tree evaluation.
Examples
-
Exercise 9.1:
- Preorder → 11, 8, 3, 2, 1, 5, 4, 6, 10, 9, 7
- Inorder → 2, 3, 1, 8, 4, 5, 6, 11, 9, 10, 7
- Postorder → 2, 1, 3, 4, 6, 5, 8, 9, 7, 10, 11
-
Exercise 9.2:
-
Reconstruct tree using:
- Preorder: 6, 4, 2, 1, 3, 5, 8, 7, 9, 10, 11
- Inorder: 1, 2, 3, 4, 5, 6, 7, 9, 8, 10, 11
-
Key Takeaways
- Traversal = visiting all nodes once.
- Four main methods: preorder, inorder, postorder (depth-first), and level-order (breadth-first).
- Traversal order affects the output sequence.
- Used in searching, sorting, evaluation, and deletion tasks.
Subtopic: Building a Binary Tree
Definition
Building a binary tree means constructing the tree structure step by step using the BinaryTree class, which implements the BinaryTreeInterface.
Causes
Not specified in notes.
Goals / Objectives
- To demonstrate how to construct a binary tree programmatically.
- To show how smaller trees (subtrees) can be combined to form larger ones.
Importance
- Understanding construction is necessary before performing traversals, searching, or manipulation of binary trees.
- Serves as a foundation for building more complex structures like expression trees and decision trees.
Procedures
Steps to build the binary tree (example given in Fig. 25-13):
-
Start with leaf nodes → represent each leaf (e.g.,
D, F, G, H) as one-node trees. -
Move upward → use method
setTree()to form higher-level subtrees:- Combine leaves into subtrees (
E, B, C).
- Combine leaves into subtrees (
-
Form the final tree → use
setTree()again to combine subtrees and form the root (A).
Code outline:
public interface BinaryTreeInterface<T> {
public T getRootData();
public boolean isEmpty();
public void clear();
public void setTree(T rootData);
public void setTree(T rootData,
BinaryTreeInterface<T> leftTree,
BinaryTreeInterface<T> rightTree);
}Advantages & Disadvantages
Advantages:
- Modular: build trees bottom-up from leaves to root.
- Reusable: small subtrees can be reused in different trees.
Disadvantages:
- Manual building can be tedious for very large trees.
- Requires understanding of recursive structure.
Impact / Effect
- Enables flexible programmatic creation of binary trees.
- Once built, the tree can be traversed, searched, or used in applications like expression evaluation.
Examples
-
Fig. 25-13: A binary tree with nodes labeled
A–H. -
Example construction:
- Leaves →
D, F, G, H - Intermediate nodes →
E, B, C - Final root →
A
- Leaves →
Key Takeaways
- Build binary trees bottom-up starting from leaves.
- Use
setTree()method to join subtrees. - Construction process is crucial for applications like expression and decision trees.
- Trees are modular: small parts combine into a larger whole.
Subtopic: Expression Trees
Definition
An expression tree is a binary tree that represents an algebraic expression where the operators are stored in internal nodes and the operands (variables or constants) are stored in leaf nodes.
Causes
Not specified in notes.
Goals / Objectives
- To represent algebraic expressions in a structured tree form.
- To allow evaluation of expressions using different traversal methods.
Importance
- Provides a systematic way to parse, evaluate, and convert algebraic expressions.
- Links binary trees to mathematical computation.
Procedures
-
Construct a binary tree where:
- Internal nodes = operators (e.g., +, −, ×, ÷).
- Leaf nodes = operands (e.g., variables
x, y, or numbers).
-
Traversals yield different expression formats:
- Inorder traversal → infix expression (standard algebraic notation).
- Preorder traversal → prefix expression (Polish notation).
- Postorder traversal → postfix expression (Reverse Polish notation).
Algorithm evaluate(expressionTree):
if (expressionTree is empty)
return 0
else:
firstOperand = evaluate(left subtree)
secondOperand = evaluate(right subtree)
operator = root of expressionTree
return result of (firstOperand operator secondOperand)
Advantages & Disadvantages
Advantages:
- Easy evaluation using recursive algorithms.
- Traversal gives multiple forms of the same expression.
Disadvantages:
- Tree must be properly structured (incorrect tree leads to wrong results).
- Slightly more complex to build compared to linear expression parsing.
Impact / Effect
- Expression trees are widely used in compilers, interpreters, and calculators.
- Provide the link between binary tree structure and mathematical computation.
Examples
-
Fig. 25-14: Expression trees for different algebraic expressions.
-
Traversal results:
- Inorder → expression in original form.
- Preorder → prefix form.
- Postorder → postfix form.
Key Takeaways
- Expression trees represent algebraic formulas as binary trees.
- Operators are internal nodes, operands are leaf nodes.
- Different traversals produce infix, prefix, or postfix notations.
- Recursive evaluation processes the tree to compute results.
Subtopic: Decision Trees
Definition
A decision tree is a binary tree used to help users make decisions or solve problems. Each internal node represents a decision/question, and each branch represents a possible outcome, leading to further decisions or final results.
Causes
Not specified in notes.
Goals / Objectives
- To support decision-making by structuring choices step by step.
- To provide the basis for building expert systems that guide users through complex problems.
Importance
- Widely used in problem-solving, AI, and knowledge-based systems.
- Makes the decision-making process transparent and easy to follow.
Procedures
- Start with a root decision node.
- Branch into two possible outcomes (yes/no, true/false, or option A/option B).
- Continue branching until reaching leaf nodes, which represent final outcomes or solutions.
- Decision trees may grow as new facts are acquired.
Advantages & Disadvantages
Advantages:
- Intuitive and easy to understand.
- Can model complex decisions step by step.
- Useful in AI and data classification tasks.
Disadvantages:
- Trees can become very large and complex.
- May require updating whenever new information is introduced.
Impact / Effect
- Forms the basis of expert systems that guide users in problem solving.
- Expands as more facts are included, improving accuracy but increasing complexity.
Examples
-
Fig. 25-15: Portion of a binary decision tree.
-
Guess-the-Country game (Fig. 25-16, Fig. 25-17):
- Start with a yes/no question about a country.
- Refine the decision tree as more facts are added to improve accuracy.
Key Takeaways
- Decision trees model decisions and outcomes using a binary tree structure.
- Each internal node = decision, each leaf = result.
- Widely used in AI, problem solving, and expert systems.
- Trees can grow and evolve as new knowledge is added.
Subtopic: Implementing Tree Traversals
Definition
Implementing tree traversals means writing methods or iterators to systematically visit all nodes in a binary tree in a specified order (preorder, inorder, postorder, or level-order).
Causes
Not specified in notes.
Goals / Objectives
- To provide flexible and reusable methods for traversing binary trees.
- To allow traversal results to be used beyond just displaying data (e.g., storing, processing).
Importance
- Enables programmers to process tree data effectively.
- Iterators give users control over when and how nodes are accessed, increasing flexibility.
Procedures
-
Recursive Traversal Example (Postorder):
public void postorderTraverse() { postorderTraverse(root); } private void postorderTraverse(BinaryNodeInterface<T> node) { if (node != null) { postorderTraverse(node.getLeft()); postorderTraverse(node.getRight()); System.out.print(node.getData() + " "); } }- Visits left subtree → right subtree → root.
- Prints node data directly.
-
Using Iterators (for flexibility):
-
Iterators are based on
java.util.Iterator. -
Methods:
hasNext()→ check if more nodes exist.next()→ return the next node’s data.
-
Different iterators can be implemented for preorder, inorder, and postorder traversals.
-
-
Example: Inorder Iterator (inner class):
- Constructor calls a private recursive method
inorder(root). - Traversal results are enqueued in a
Queue. - The
next()method dequeues and returns the next entry.
- Constructor calls a private recursive method
Advantages & Disadvantages
Advantages:
- Recursive methods are simple and easy to write.
- Iterators provide more control and flexibility.
- Clients can perform custom operations during traversal.
Disadvantages:
- Recursive methods can cause stack overflow for very deep trees.
- Iterator implementation is more complex (requires queues and inner classes).
Impact / Effect
- Recursive traversals are good for quick output (e.g., printing).
- Iterators enable more advanced operations (e.g., storing traversal order, delayed execution).
Examples
- Recursive postorder traversal → directly prints values.
- Inorder Iterator with Queue → stores traversal sequence and returns values one by one.
Key Takeaways
- Traversals can be implemented recursively or using iterators.
- Recursive methods are simpler but less flexible.
- Iterators give control and allow multiple traversal strategies.
- Useful in real applications where tree data must be accessed in different ways.
Subtopic: Tree Implementation
Definition
Tree implementation refers to the way a tree’s nodes and connections are represented in memory. The most common method is using a linked structure, where each node object stores both its data and references to its children.
Causes
Not specified in notes.
Goals / Objectives
- To create a practical representation of a binary tree in programming.
- To enable efficient storage, traversal, and manipulation of nodes.
Importance
- The implementation determines how easily we can add, remove, and search nodes.
- Forms the foundation for building more advanced tree-based structures like binary search trees.
Procedures
-
Use a node-based structure:
-
Each node contains:
- Data field → stores the element.
- Left reference → points to the left child.
- Right reference → points to the right child.
-
If a node has no children, its references are
null.
-
Example (Java-style pseudocode):
private class Node<T> {
private T data;
private Node<T> left;
private Node<T> right;
...
}- The BinaryTree class manages these nodes and implements operations defined in the
BinaryTreeInterface.
Advantages & Disadvantages
Advantages:
- Flexible (nodes can be dynamically added/removed).
- Efficient for representing hierarchical structures.
Disadvantages:
- Requires more memory (each node stores two references).
- Slightly more complex to implement than arrays.
Impact / Effect
- Provides the base structure for binary trees and binary search trees.
- Enables algorithms like traversals, insertions, deletions, and searches.
Examples
- Fig. 26-1: Node with
data,left child, andright child. - A leaf node is a node where both
leftandrightreferences arenull.
Key Takeaways
- Binary trees are commonly implemented using a linked structure.
- Each node stores data + left child + right child.
- This design allows flexible and dynamic growth of trees.
- Implementation forms the foundation for binary search trees.
Subtopic: Binary Search Trees (BST)
Definition
A binary search tree (BST) is a type of binary tree in which:
- The data in the left subtree of a node is less than the node’s data.
- The data in the right subtree of a node is greater than the node’s data.
- All nodes contain Comparable objects.
Causes
Not specified in notes.
Goals / Objectives
- To organize data so that searching, insertion, and deletion can be performed efficiently.
- To enable logarithmic-time operations when the tree is balanced.
Importance
- Provides a foundation for efficient search algorithms.
- Serves as a base for advanced data structures like AVL trees, B-trees, and Red-Black trees.
Procedures
-
Searching:
- Compare the desired value with the root.
- If smaller → search the left subtree.
- If larger → search the right subtree.
- Repeat until found or subtree is empty.
-
Efficiency:
-
Searching a BST of height
hhas complexity O(h). -
Height depends on the balance of the tree:
- Balanced tree → height ≈ log₂(n), efficient searches.
- Unbalanced tree → height ≈ n, inefficient searches (like a linked list).
-
-
Recursive Search Algorithm (bstSearch):
if (tree is empty) → return false if (desiredObject == root) → return true if (desiredObject < root) → search left subtree else → search right subtree -
Operations:
add(newEntry)→ inserts new value.contains(entry)→ checks existence.getEntry(entry)→ retrieves a matching entry.remove(entry)→ deletes an entry from the tree.
Advantages & Disadvantages
Advantages:
- Fast searching, insertion, and deletion in O(log n) (when balanced).
- Naturally recursive, simple to implement.
Disadvantages:
- Performance degrades to O(n) if tree becomes skewed.
- Requires balancing techniques (not handled automatically in basic BST).
Impact / Effect
- Improves efficiency of data handling compared to arrays or linked lists.
- Height directly affects worst-case performance.
- The concept of balance is critical for maintaining efficiency.
Examples
-
Fig. 25-18: Binary search tree of names.
-
Exercise 9.3: Construct BST with values: M, E, Q, W, C, H, B, D, G, L, A.
- Preorder → M, E, C, B, A, D, H, G, L, Q, W
- Inorder → A, B, C, D, E, G, H, L, M, Q, W
- Postorder → A, B, D, C, G, L, H, E, W, Q, M
Key Takeaways
- A BST enforces an ordering property: left < root < right.
- Efficient searching depends on the height of the tree.
- Balanced BST → O(log n) operations; skewed BST → O(n) operations.
- Supports key operations: search, add, contains, getEntry, remove.
Subtopic: Removing an Entry in a Binary Search Tree
Definition
Removing an entry from a binary search tree (BST) means deleting a node that contains the specified value, while maintaining the BST property (left < root < right).
Causes
Not specified in notes.
Goals / Objectives
- To delete a given entry from the BST.
- To ensure the tree structure remains valid and follows BST rules.
Importance
- Removing entries is as essential as inserting or searching.
- Ensures trees remain usable in dynamic applications where data changes frequently.
Procedures
There are three main cases for removal:
-
Removing a Leaf Node
- If the node has no children → simply set its parent’s reference to
null.
- If the node has no children → simply set its parent’s reference to
-
Removing a Node with One Child
- Replace the node with its only child (adjust parent’s reference).
-
Removing a Node with Two Children
-
Cannot directly delete the node.
-
Instead:
- Find the rightmost node in the left subtree (inorder predecessor) OR the leftmost node in the right subtree (inorder successor).
- Replace the node’s data with that value.
- Delete that replacement node (which will be a leaf or have one child).
-
Special Case – Removing the Root
- When the root is removed, the algorithm ensures a new root is chosen.
- Implemented by
removeEntry()method returning a new root reference. - A helper class
ReturnObjectis used to hold the removed entry’s data.
Java-style method outline:
public T remove(T entry) {
ReturnObject oldEntry = new ReturnObject(null);
root = removeEntry(root, entry, oldEntry);
return oldEntry.get();
}Advantages & Disadvantages
Advantages:
- Maintains BST ordering even after deletions.
- Supports dynamic datasets.
Disadvantages:
- Deletion with two children is more complex.
- Removing many nodes can lead to unbalanced trees, reducing efficiency.
Impact / Effect
- Affects the balance of the tree, which influences efficiency.
- Incorrect deletion can break the BST property and lead to incorrect searches.
Examples
-
Exercise 9.4: Remove values
A,Q, andEin sequence.- Remove
A→ A is a leaf node → set parent’s reference tonull. - Remove
Q→ Q has one child (W) → replace Q with W. - Remove
E→ E has two children → replace E with the largest value in its left subtree (D), then removeD.
- Remove
Key Takeaways
- Removal has 3 cases: leaf, one child, or two children.
- The two-child case uses inorder predecessor/successor replacement.
- Removing root requires special handling to maintain references.
- Deletion impacts tree balance, which in turn affects efficiency.
Subtopic: Efficiency of Operations
Definition
The efficiency of operations in a binary search tree (BST) refers to the time complexity of searching, adding, and removing nodes, which depends on the height of the tree.
Causes
- Efficiency is affected by whether the tree is balanced or unbalanced.
- Insertion order plays a major role: inserting sorted data leads to a skewed (inefficient) tree.
Goals / Objectives
- To analyze the performance of BST operations.
- To understand how tree height influences complexity.
Importance
- Helps determine whether BSTs are suitable for large datasets.
- Provides a foundation for why balanced trees (AVL, B-trees, etc.) are necessary.
Procedures
-
Basic Operations (add, remove, getEntry):
- Require searching from the root down to a node.
- Complexity = O(h), where
his the height of the tree.
-
Height considerations:
- Worst case (skewed tree): height
h = n→ operations take O(n) (like a linked list). - Best case (balanced tree): height
h = log₂(n)→ operations take O(log n). - Full binary tree: shortest possible height for
nnodes ≈log₂(n+1).
- Worst case (skewed tree): height
Advantages & Disadvantages
Advantages:
- Balanced trees allow fast searching, insertion, and deletion in O(log n).
- Efficient storage and retrieval for hierarchical data.
Disadvantages:
- Unbalanced trees degrade to O(n).
- Maintaining balance requires additional algorithms (not present in simple BSTs).
Impact / Effect
-
Efficiency depends on tree balance:
- Balanced tree → O(log n) performance.
- Unbalanced tree → O(n) performance.
-
The balance of subtrees at each node directly impacts searching speed.
Examples
-
Fig. 27-13: Two BSTs with the same data but different heights:
- Shorter tree = O(log n).
- Taller (skewed) tree = O(n).
-
Fig. 27-14: Examples of height-balanced trees.
-
Notes:
- Avoid inserting already sorted data into an empty tree (produces unbalanced structure).
- Random insertion order leads to a more balanced tree.
Key Takeaways
- BST operation efficiency = O(h), where
his tree height. - Balanced trees → O(log n); unbalanced trees → O(n).
- Tree balance is critical for performance.
- Advanced balanced search trees (AVL, 2-3 Trees, B-Trees) solve unbalance issues.