DAA Portfolio

Design and Analysis of Algorithms - Core Concepts

The DAA course equips us with tools to design efficient algorithms, analyze their performance, and solve computational problems using different strategies. This comprehensive overview explores the fundamental paradigms and techniques that form the backbone of modern algorithm design.

1. Introduction to Algorithms & Basic Techniques

We started with understanding what algorithms are and how to evaluate their efficiency using recurrence relations and Big-O notation.

Use case: Linear Search remains useful for small or unsorted datasets.

Insertion Sort Visualization

Consider array: [5, 2, 4, 6, 1, 3]
Pass 1: [2, 5, 4, 6, 1, 3]
Pass 2: [2, 4, 5, 6, 1, 3]
Pass 3: [2, 4, 5, 6, 1, 3]
Pass 4: [1, 2, 4, 5, 6, 3]
Pass 5: [1, 2, 3, 4, 5, 6]

2. Divide and Conquer

This paradigm breaks a problem into smaller parts, solves them recursively, and merges results. The "divide and conquer" technique leads to efficient algorithms for a wide range of problems.

Real-world: QuickSort is widely used in libraries due to its average-case speed.

Merge Sort vs Quick Sort

Merge Sort guarantees O(n log n) performance in all cases but requires extra memory.

Quick Sort is typically faster in practice due to better cache locality and less overhead, but can degrade to O(n²) in worst case scenarios.

Partition example for Quick Sort with array [3, 8, 2, 5, 1, 4, 7, 6] and pivot=4:

After partition: [3, 2, 1, 4, 8, 5, 7, 6]

3. Greedy Algorithms

Make the best choice at each step hoping for an overall optimal solution. While not always guaranteed to find the global optimum, greedy algorithms are efficient for many problems.

Use case: Network design, resource allocation, and scheduling.

Fractional Knapsack Example

Items: [(value=60, weight=10), (value=100, weight=20), (value=120, weight=30)]

Knapsack capacity: 50

Value-to-weight ratios: [6, 5, 4]

Solution: Take all of item 1, all of item 2, and 2/3 of item 3

Total value: 60 + 100 + (120 × 2/3) = 240

4. Dynamic Programming

Problems with overlapping subproblems and optimal substructure are solved by storing and reusing past results (memoization or tabulation). DP is powerful for optimization problems where brute force would be exponential.

Real-world: Used in DNA sequence alignment, text comparison, and route optimization.

LCS Example: "ABCBDAB" and "BDCABA"

DP Table (Partial):

   | "" | B  | D  | C  | A  | B  | A
---+----+----+----+----+----+----+----
"" | 0  | 0  | 0  | 0  | 0  | 0  | 0
A  | 0  | 0  | 0  | 0  | 1  | 1  | 1
B  | 0  | 1  | 1  | 1  | 1  | 2  | 2
C  | 0  | 1  | 1  | 2  | 2  | 2  | 2

Result: The LCS is "BCBA" with length 4

5. Backtracking

Systematically tries all options and backtracks if a solution fails. Backtracking is essential for solving constraint satisfaction problems and combinatorial optimization.

Key idea: Depth-first search + undo invalid decisions when hitting constraints.

4-Queens Backtracking Solution

. Q . .  
. . . Q  
Q . . .  
. . Q .  

This is one of the two valid solutions for the 4-Queens problem.

Backtracking places one queen per row, trying each column in turn, and backtracks to the previous row when an attack is detected.

6. Graph Algorithms

Graphs model complex systems like networks, cities, and social connections. From traversal to pathfinding and network flow, graph algorithms are foundational in computer science.

Graph Traversals Comparison

Depth-First Search (DFS): Explores as far as possible along a branch before backtracking. Useful for topological sorting, detecting cycles, and maze generation.

Breadth-First Search (BFS): Explores all neighbors at the current depth before moving to vertices at the next depth level. Ideal for finding shortest paths in unweighted graphs.

For the same graph, DFS and BFS may produce different traversal orders!

7. Randomized Algorithms & Complexity Theory

Random choices lead to simpler or faster average-case behavior. Complexity theory classifies problems based on their inherent difficulty.

Fun fact: P vs NP is one of the seven Millennium Prize Problems, with a $1 million reward!

NP-Complete Problems

Well-known NP-Complete problems include:

  • Traveling Salesperson Problem (TSP)
  • Boolean Satisfiability Problem (SAT)
  • Knapsack Problem (decision version)
  • Hamiltonian Cycle Problem
  • Graph Coloring Problem

No polynomial-time algorithms are known for these problems, and they are widely believed to require super-polynomial time in the worst case.

8. String Algorithms

String processing algorithms are fundamental for text manipulation, pattern matching, and bioinformatics applications.

KMP Algorithm Visualization

Pattern: "ABABCABAB"

Prefix table: [0,0,1,2,0,1,2,3,4]

The prefix table helps skip unnecessary comparisons by leveraging previously matched characters, making KMP much faster than naive matching for repeated patterns.

Algorithm Complexity Comparison

Understanding time and space complexity is crucial for algorithm selection. Below is a comparison of major algorithms we've studied.

Algorithm           | Best Case  | Average Case      | Worst Case        | Space       | Stable
--------------------+------------+-------------------+-------------------+-------------+-------
Quick Sort          | O(n log n) | O(n log n)        | O(n²)             | O(log n)    | No
Merge Sort          | O(n log n) | O(n log n)        | O(n log n)        | O(n)        | Yes
Heap Sort           | O(n log n) | O(n log n)        | O(n log n)        | O(1)        | No
Dijkstra's          | -          | O((V+E) log V)    | O((V+E) log V)    | O(V)        | -
Dynamic Programming | -          | Problem-dependent | Problem-dependent | Often O(n²) | -

"Algorithmic thinking is not just about coding; it's about finding elegant solutions to complex problems through structured and analytical reasoning."

💻 Lab Exercises Overview

Throughout the lab sessions, we coded many algorithms using C, reinforcing concepts with hands-on experience. These included recursive and iterative approaches, visualizing sorting methods, understanding graph traversal via adjacency lists, and implementing complex algorithms from scratch.

Our lab work focused on empirical analysis where we measured actual running times, compared theoretical complexities with practical performance, and optimized implementations for real-world scenarios.