8. Algorithms
An algorithm is a step-by-step process, defined by a set of instructions to be executed sequentially to achieve a specified task producing a determined output.
Examples of common discrete mathematics algorithms include:
-
Searching Algorithms to search for an item in a data set or data structure like a tree.
-
Sorting Algorithms to sort items in a specific order.
-
Insertion and Deletion Algorithms to insert or delete item in a data structure such as a tree or list.
-
Division Algorithms such as a procedure for dividing two integers or the Euclidean Algorithm to determine the greatest common divisor between two integers.
-
Optimization algorithms such as finding the line of best fit of set of points, or the problem of finding the nearest neighbor in a set of points to a given point (here close could mean most similar according to some mathematically defined measure of closeness or similarity)
8.1. Structure of Algorithms in Pseudocode
While the algorithms we describe are implemented in the Python programming language, we also provide an overview of pseudocode syntax here. Pseudocode is used to provide high level description of a program or algorithm intended to be more human understandable than computer or program language. Its syntax preserves the structure of the program or algorithm while ignoring programming language specific details. It is meant to be easier to read and analyze. Often, in designing an algorithm for language specific implementation, a pseudocode implementation is obtained first. In addition to coming early in the design of a computer program, pseudocode also has two other important uses:
-
It can be used to help non-programmers understand what a program or algorithm does and how it works.
-
It can be used for debugging purposes when a programmer is trying to debug and solve a logic error in a computer program as it is closer to human language. Defects can be easier to find in a program implementation by analyzing the sequence of implementation steps in the pseudocode description.
Many of the terms in this section were also discussed in Introduction to Python . |
Recall that an algorithm is a process of finite sequences of instructions used to solve a problem or obtain a result of a computation. The key features of an algorithm include:
-
Input
-
Variables
-
Data and data types
-
Instructions or statements
-
Output
Pseudo Code | Python Code | ||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
8.2. Examples of Algorithms
We now proceed to develop some algorithms beginning with algorithms for common mathematical operations. Much of mathematical notation can be considered pseudocode.
8.2.1. Sum Notation
Consider the sum
\(Sum=a_1+a_2+\ldots+a_n=\sum_{i=1}^{n}a_i,\)
which is a description of adding using the index \(i\), the numbers \(a_i\), running from \(i=1\) to \(i=n\).
8.2.2. Exponentiation
Suppose we want to evaluate \(b^n\), with base \(b>0\), and exponent, \(n\) a positive integer. For example to evaluate \(5^6\), we could multiply iteratively, 5,
\(1, 5, 5\times5, 5\times5\times5, 5\times5\times5\times5,\) etc .
8.2.3. Factorial
The factorial of a positive integer \(n\) can be computed iteratively. For example \(4!\) can be calculated as
\(1, 1\times2, 1\times2\times3, 1\times2\times3\times4.\)
8.2.4. Find Minimum in a List of Integers
Consider an algorithm to determine the minimum element in a finite sequence or list of integers.
The algorithm would be constructed as follows:
-
We define a variable \(min\) and assign it to the first indexed element in the list.
-
Traverse along the list to the next indexed element and compare that indexed element in the list with the currently assigned value of the variable \(min\). If the inspected element is smaller than the currently assigned value of \(min\), then update the value of \(min\).
-
Repeat the previous step if there are more elements in the list to inspect and compare.
-
Stop when the entire list has been traversed and all elements in the list have been inspected and compared against the variable \(min\).
A Python implementation of finding the minimum in a set of \(n\) integers is given below.
8.3. The Linear Search Algorithm
A linear search algorithm involves searching for a target integer \(x\) in a list of distinct integers \( (a_1, a_2, ..., a_n)\), and returns the location \(i\) in the list that the target element \(x\) is found or returns a value indicating that the target element \(x\) is not in the list \( (a_1, a_2, ..., a_n)\).
A Python implementation of the linear search algorithm is given below.
8.4. The Bubble Sort Algorithm
The bubble sort algorithm is a simple sorting procedure. It is typically used to sort an array of \(n\) data elements in either increasing or decreasing order. We describe the bubble sort algorithm for arranging a list of \(n\) real numbers in increasing order.
-
The algorithm compares the first two elements of the array and swaps them if they are out of order.
-
It continues by traversing up the array comparing each pair of adjacent elements and swaps them if they are out of order until we reach the last entry in the array at location \(n\).
-
The last entry in the array will then be the largest element of the original list.
-
After the largest element has been sorted into position \(n\), the algorithm continues by again comparing the first two elements and swapping if they are out of order.
-
Continue traversing the array and comparing and swapping adjacent elements that are out of order until position \(n-1\) of the array, after which the 2nd largest element is in position \(n-1\). The elements, now at positions \(n\), and \(n-1\), are sorted.
-
Continue to sort at positions \(n-2\), then \(n-3\) until position \(1\).
A Python implementation of the bubble sort algorithm is given below.
8.5. The Insertion Sort Algorithm
The insertion sort works by working through a list and classifying two pieces as sorted and unsorted.
-
The insertion sort scans through each element of the list using an outer loop with a variable, say \(i\).
-
At each stage, the list is divided into a sorted section, say the left section, and a section that is not sorted, say the right.
-
The location up to which the list is sorted, is denoted by a pointer or index, called a key.
-
At the current stage, the next element from the unsorted section, on the right, is inserted into its appropriate position in the sorted section on the left.
-
The process of inserting smaller elements in the left involves shifting, larger elements to the right, using a variable, say \(j\).
A Python implementation of the Insertion Sort Algorithm is given below:
8.6. The Binary Search Algorithm
The binary search algorithm searches a sorted array of integers for a target value \(t\). The algorithms looks for \(t\) in the middle of the array. If it does not find \(t\) in the middle, it considers either the first half or the second half. It continues recursively splitting the search space in half until it either finds \(t\) or fails.
8.7. Algorithmic Complexity of Common Algorithms
In this section we give time complexity using big 0 notation of some of the important algorithms in this section. Recall big \(O\) notation is used to quantify the worst case an algorithm performs on a data input size of n.
8.7.1. The Linear Search Algorithm is \(O(n)\).
The linear search algorithm iterates across a list of \(n\) data elements. If the first element in the list is the target element, the algorithm stops. Otherwise, move to the next element and continue repeatedly until the target element is found or not. If the target element is not in the search list the algorithm exhaustively searches through every single element.
This is the worst case scenario with linear search in which the algorithm inspects every single element, either because the target element is the last element of the array, or the target element is not actually in the search list at all. The algorithm runs in \(O(n)\) time in the worst case.
8.7.2. The Bubble Sort and Insertion Sort Algorithms are \(O(n^2)\).
We analyze the bubble sort algorithm beginning with a concrete list of size \(n=5\) and generalize the analysis.
Consider the case of a list of size \(n=5\). The naive bubble sort algorithm in this case will involve 4 passes.
In the first pass, there will be 4 comparisons and up to 4 swaps, after which the element in position 5 is in its correct position.
In the second pass, there will be 3 comparisons and up to 3 swaps, after which the element in position 4 is in its correct position.
In the third pass, there will be 2 comparisons and up to 2 swaps, after which the element in position 3 is in its correct position.
In the fourth pass, there will be 1 comparison and one possible swap , after which both the elements in positions 1 and 2 are both in their correct positions.
Adding the comparisons from each pass we obtain,
\(4+3+2+1=1+2+3+4\).
In general, if the list is of size \(n\), there will be \(n-1\) passes with swaps,
\(n-1+n-2+...3+2+1 = 1+2+3+...+n-1\).
We will show later, that
\( 1+2+3+\cdots+n-1= \frac{(n-1)\cdot\ n}{2}=\frac{n^2}{2}-\frac{n}{2}\), which is \(O(n^2)\).
It is left as an exercise to verify that the insertion sort algorithm is \(O(n^2)\).
8.7.3. The Binary Search Algorithm is \(O(\log{n})\).
The binary search algorithm searches for a target element \(x\) in a list of \(n\) elements by comparing the middle element in the the sorted data set with the target \(x\). The algorithm stops if the middle element \(a_m\) is the target element. Otherwise the search continues with half the data set—the half to the left if the middle element is larger than the target \(x\) or the half to the right if the middle element is smaller than the target.
The number of steps in the binary search then is the number of times we have to split the data set until we locate the target element, or determine that the target element is not in the search list after splitting down to 1 element.
The number of times we need to split the data set of size \(n\), in the worst case then, is \(p\) which is found by solving the exponential equation,
\(2^p = n\).
The algorithm then is \(O(p)\).
The solution of the exponential equation, \(2^p = n\), is in log form,
\(p=\log_2{n}\).
The binary search algorithm then is \(O(p)=O(\log{n})\).
8.8. Exercises
-
Explain why the insertion sort algorithm is \(O(n^2)\).
-
Estimate the time complexity for each of the following two algorithms by first counting the number of steps, and then summarizing your findings using big O notation.
-
A procedure to find the sum divided by the product of \(n\) integers from \(1\) to \(n\).
procedure Sum_By_Product(n) begin sum = 0 product = 1 for i = 1 to n do sum = sum + i product = product * i end for return sum / product end
-
A procedure to create an \(n\) by \(n\) matrix \(A_{n,n}\) with every value set to \(m\).
procedure Create_Matrix(n, m) begin sum = 0 product = 1 for i = 1 to n do for j = 1 to n do Ai,j = m end for end for return A end
-
-
Explain how the binary search algorithm can be modified or used to insert a new integer element \(x\) into a sorted list of \(n\) integers.
-
Write a recursive procedure in pseudocode to implement the binary search algorithm.
-
A binary string of length n is a sequence of \(0^\prime s\) and \(1^\prime s\) of length \(n\). For example, the binary strings of length two are \(00, 01, 10,\) and \(11\), and the binary strings of length three are \(000, 001, 010, 011, 100, 101, 110,\) and \(111\). Write a recursive algorithm in pseudocode to generate all binary strings of length \(n\).
-
Write recursive procedures in pseudocode to generate the following sequences, all of which are variations of the Fibonacci sequence,
-
\(a_1 =1, a_2 = 1,\) and \(a_n = a_{n-1}+2a_{n-2}.\)
-
\(b_1 = 2, b_2 = 3,\) and \(b_n = b_{n-1}\cdot\ b_{n-2}\)
-
\(c_1 = 5\) and \(c_n = {2c}_n+n\)
-
-
Recall that the median of an ordered set of numbers \(\left\{a_1,a_2,\ldots,\ a_n\right\}\) is the number in the middle of the set. If the sorted set has an odd number of elements, then the median is \(a_M\), where \(M=\left\lceil \frac{n}{2}\right\rceil\). For example, the set \(a_1=10,\ a_2=10,\ a_3=7,\ a_4=6,\ a_5=2\) has median \(a_3=7\).
If the sorted set has an even number of elements, then the median \(a_M\) is the mean of \(a_\frac{n}{2}\) and \(a_{\frac{n}{2}+1}\). For example, the set \(a_1=10,\ a_2=10,\ a_3=7,\ a_4=6,\ a_5=2,\ a_6=0\) has median \(\frac{\left(a_3+a_4\right)}{2}=\frac{7+6}{2}=6.5.\)
Write an algorithm in pseudocode to calculate the median of a data set by first sorting in decreasing order.
-
The range of a set of integers \({a_1,a_2,\ldots, a_n}\) is the maximum value minus the minimum value, \(range={a}_{max}-{a}_{min}\).
-
Give an algorithm to find the range that uses a sorting algorithm and subtracts the first element from the last element in the sorted list.
-
Give an algorithm to find the range that modifies the linear search algorithm to find both \(a_{max}\) and \(a_{min}\), and then does the subtraction \({a}_{max}-{a}_{min}\).
-
Explain, using complexity analysis, which of the two algorithms is more efficient?
-
-
Consider the following lists of integers.
-
\(\{4,8,3,10,7,2,6,9,5,1\}\)
-
\(\{10,5,3,12,2,9,7,1,6,11,8,5\}\)
-
\(\{8,3,14,5,4,11,9,2,7,6,12,13,1,10\}\)
Sort the lists using:
-
The insertion sort algorithm, showing all the shifts.
-
The bubble sort algorithm, showing all the swaps in the case.
-
Give a count of the steps used in the both algorithms.
-
-
-
For the following lists,
-
\(\{6,31,46,49,49,55,56,59,65,82\}\)
-
\(\{9,14,16,25,26,33,44,45,52,55,57,68,72,72,84,94\}\)
-
\(\{11,14,23,29,31,36,41,41,44,47,47,50,65,70,82,85,88,89,92,96\}\)
Apply the binary search algorithm to search for the target \(x=65\). Give a count of the number of steps involved in each search.
-