1. Abstraction, Encapsulation, Modularization, Data Structures, ADTs & Algorithms

1.1 Abstraction

  • Definition: The process of hiding unnecessary details and exposing only the essential features of a concept or system.
  • Purpose: Focus on what a system does, not how it does it.
  • Example:
    • Using Integer.parseInt("123") without knowing the internal parsing logic.
  • Contribution to Software Quality:
    • Simplifies complexity → easier to design and understand.
    • Promotes reusability since abstract definitions can be applied in multiple contexts.

1.2 Encapsulation

  • Definition: Bundling data and methods that operate on that data into a single unit, while restricting direct access to some components.
  • Key Mechanism: Information hiding (e.g., private fields, public methods).
  • Example:
    • A BankAccount class with private balance and public methods deposit() and withdraw().
  • Contribution to Software Quality:
    • Enhances maintainability: internal changes don’t affect external code.
    • Improves security: prevents unintended interference with internal state.

1.3 Modularization

  • Definition: Dividing a software system into independent, interchangeable modules.
  • Example:
    • Separating a program into modules like UI, Database, and BusinessLogic.
  • Contribution to Software Quality:
    • Easier debugging and testing.
    • Promotes team productivity: different teams can work on different modules.
    • Supports reuse of modules across projects.

1.4 Data Structures

  • Definition: Concrete implementations of ways to store and organize data efficiently.
  • Examples: Arrays, Linked Lists, Stacks, Queues, Trees, Graphs.
  • Contribution:
    • Provide efficient access and manipulation of data.
    • Enable scalable and optimized solutions.

1.5 Abstract Data Types (ADTs)

  • Definition: A logical description of a data type defined by its behavior (operations) rather than its implementation.
  • Examples:
    • List: ordered collection with operations like add, remove, get.
    • Stack: LIFO structure with push, pop, peek.
    • Queue: FIFO structure with enqueue, dequeue.
  • Contribution:
    • Promotes abstraction: users focus on operations, not implementation.
    • Supports reusability: same ADT can be implemented differently (e.g., stack via array or linked list).

1.6 Algorithms

  • Definition: A step-by-step procedure to solve a problem.
  • Qualities of Good Algorithms:
    • Correctness: Produces the right output.
    • Efficiency: Optimized in terms of time and space.
    • Clarity: Easy to understand and implement.
  • Contribution:
    • Efficient algorithms improve performance.
    • Reusable algorithms (e.g., sorting, searching) save development time.

1.7 How They Work Together for Software Quality

ConceptContribution to Quality & Productivity
AbstractionSimplifies complexity, improves reuse.
EncapsulationProtects data, eases maintenance.
ModularizationEnables parallel development, reuse.
Data StructuresEfficient data organization.
ADTsStandardized, reusable interfaces.
AlgorithmsEfficient problem-solving.

2. Hashing, Collisions & Collision Resolution

2.1 Hashing

  • Definition: A technique to map data (keys) into fixed-size indices in a hash table using a hash function.
  • Goal: Achieve near O(1) time complexity for insertion, deletion, and search.
  • Example:
    • Hash function for phone numbers: h(key) = last 4 digits % tableSize.

2.2 Collisions

  • Definition: When two different keys map to the same hash index.
  • Cause: Imperfect hash functions or limited table size.
  • Problem: Multiple values competing for the same slot.

2.3 Collision Resolution Methods

2.3.1 Open Addressing

  • Idea: Find another open slot in the table.
  • Techniques:
    • Linear Probing: Check next slots sequentially (k+1, k+2, …).
      • Advantage: Simple.
      • Disadvantage: Causes primary clustering.
    • Quadratic Probing: Use quadratic increments (k+1², k+2², …).
      • Advantage: Reduces clustering.
      • Disadvantage: More complex, still subject to secondary clustering.
    • Double Hashing: Use a second hash function to determine step size.
      • Advantage: Avoids both primary & secondary clustering.
      • Disadvantage: Requires two good hash functions.

2.3.2 Separate Chaining

  • Idea: Each table index stores a bucket (linked list or dynamic array) of entries.
  • Advantages:
    • Simple and flexible.
    • Load factor can exceed 1.
  • Disadvantages:
    • Requires extra memory for chains.
    • Search time depends on chain length.

2.4 Efficiency Considerations

  • Load Factor (λ) = (Number of entries) ÷ (Table size).
  • Guidelines:
    • Keep λ < 0.5 for open addressing.
    • Separate chaining tolerates higher λ but increases chain length.

3. Recursion

3.1 Concept

  • Definition: A problem-solving technique where a method calls itself to solve smaller subproblems until reaching a base case.
  • Structure:
    • Base Case: Direct solution (stopping condition).
    • Recursive Case: Problem reduced to a smaller version of itself.

3.2 Examples

  • Factorial:
    • n! = n × (n-1)! with base case 0! = 1.
  • Fibonacci:
    • F(n) = F(n-1) + F(n-2) with base cases F(0)=1, F(1)=1.
  • Towers of Hanoi: Classic recursive puzzle.

3.3 Advantages of Recursion

  • Simplicity: Solutions are often shorter and easier to understand.
  • Natural Fit: Some problems are inherently recursive (e.g., tree traversal, divide-and-conquer algorithms).
  • Avoids Complex Iteration: Eliminates deeply nested loops.

3.4 Disadvantages

  • Overhead: Each recursive call consumes stack memory.
  • Efficiency: Some recursive solutions (e.g., naive Fibonacci) are inefficient (O(2ⁿ)).
  • Risk: Poorly defined base cases can cause infinite recursion.

3.5 When to Use Recursion

  • Use recursion when:
    • The problem has a natural recursive structure (trees, graphs, divide-and-conquer).
    • The recursive solution is clearer than the iterative one.
  • Prefer iteration when:
    • Efficiency is critical and recursion adds unnecessary overhead.

✅ Final Summary

  • Abstraction, Encapsulation, Modularization, ADTs, Data Structures, and Algorithms → Together, they improve software quality by enhancing maintainability, reusability, and productivity.
  • Hashing → Provides fast data access but requires collision resolution (linear probing, quadratic probing, double hashing, separate chaining).
  • Recursion → A powerful problem-solving tool that simplifies solutions but must be used carefully to avoid inefficiency.