 Lecture 4 February 24 2001
 Introduction to Algorithm Analysis

Definition:
An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output.

The algorithm is the sequence of steps that transforms the input to the output.

Consider the following problem definition:

Input: A sequence of n numbers $(a_1, a_2, \ldots, a_n)$
Output: A permutation (reordering) $(a'_1, a'_2, \ldots, a'_n)$
of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$

Given the following input: (31,52,23,12)
The algorithm returns the sequence: (12,23,31,52)

An algorithm is said to be correct if, for every problem instance, it halts with the correct output.
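For the sorting problem above, "correct output" can be made concrete with a small checker. This is only a sketch: the name `is_correct_sort` is hypothetical, and the check simply restates the two output conditions (same elements, ascending order).

```python
from collections import Counter

def is_correct_sort(inp, out):
    """Return True if out is a permutation of inp in ascending order."""
    # Same multiset of elements: out is a reordering of inp.
    same_elements = Counter(inp) == Counter(out)
    # Every adjacent pair is in non-decreasing order.
    ascending = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    return same_elements and ascending

print(is_correct_sort([31, 52, 23, 12], [12, 23, 31, 52]))  # True
```

A checker like this tests single instances only; correctness in the sense defined above requires the property to hold for every problem instance.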

Many sorting routines have been developed, each with its own performance characteristics. Choosing a particular sorting algorithm (or any algorithm, for that matter) depends on the requirements of the problem.

It is important to know how the time and space resources required by an algorithm vary as a function of input size.

Determining this is what it means to analyze the performance of the algorithm.

Analysis of Insertion Sort:

Given the folloWing sequence of numbers to be sorted:
5 2 4 6 1 3

Sort the numbers in ascending order using insertion sort.
The list of elements to the left of the key is always sorted after the first iteration of the algorithm.

According to lines 5,6 and 7 of the algorithm, the numbers pointed to by A[i] are moved to the right until i reaches zero or until A[i] is no longer larger than the key. The key is then inserted in the resulting space.
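The pseudocode that these line numbers refer to is not reproduced in this text. As a stand-in, here is a minimal Python sketch of insertion sort, assuming the usual textbook formulation. The function name `insertion_sort` and the 0-based indexing are choices made here, so the inner loop stops when i drops below zero, and the comments map only loosely onto lines 5, 6 and 7.

```python
def insertion_sort(a):
    """Sort the list a in place in ascending order and return it."""
    for j in range(1, len(a)):
        key = a[j]              # element to insert into the sorted prefix a[0..j-1]
        i = j - 1
        # Shift elements larger than key one position to the right
        # (the role played by lines 5, 6 and 7 in the text).
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key          # insert key into the gap that was opened
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```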

The time taken by insertion sort depends on the input.

Sorting 1000 numbers will take longer than sorting 10 numbers.

The algorithm may also take different amounts of time to sort sequences of the same size, depending on how nearly sorted those numbers are.

The running time of an algorithm on a particular input is the number of primitive operations or steps executed. Usually these steps coincide with the lines of code in the algorithm.

We can assume a constant cost for each step.
We must determine the number of times each step is executed.

Utilizing the pseudocode for insertion sort, we obtain the following:

For each $j = 2, 3, \ldots, n$, where $n = length[A]$, we let $t_j$ be the number of times the while loop test in line 5 is executed for that value of j.
For line 1, assume that there are n elements to process.

The running time T(n) of insertion sort is the sum of the products of the cost and the times columns.

$T(n) = c_1 n + c_2(n-1) + c_4(n-1) + c_5\sum_{j=2}^{n} t_j + c_6\sum_{j=2}^{n}(t_j - 1) + c_7\sum_{j=2}^{n}(t_j - 1) + c_8(n-1)$

The best case occurs if the array is already sorted. Hence, for each $j = 2, 3, \ldots, n$, we see that $A[i] \le key$ in line 5, and therefore $t_j = 1$.

Hence by substitution:

$T(n) = c_1 n + c_2(n-1) + c_4(n-1) + c_5(n-1) + c_8(n-1)$

$T(n) = (c_1 + c_2 + c_4 + c_5 + c_8)n - (c_2 + c_4 + c_5 + c_8)$

This is a linear function of n (i.e. it can be expressed as $an + b$).

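Consider the situation where the input is in reverse order (the worst-case scenario):
We must then compare each element A[j] with every element in the entire sorted subarray A[1..j-1]. Hence $t_j = j$ for $j = 2, 3, \ldots, n$.

Note that:

$\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1$

and

$\sum_{j=2}^{n} (j-1) = \frac{n(n-1)}{2}$

Hence the worst-case running time of insertion sort is:

$T(n) = c_1 n + c_2(n-1) + c_4(n-1) + c_5\left(\frac{n(n+1)}{2} - 1\right) + c_6\left(\frac{n(n-1)}{2}\right) + c_7\left(\frac{n(n-1)}{2}\right) + c_8(n-1)$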

This can be manipulated to produce an expression of the form:

$T(n) = an^2 + bn + c$

where a, b and c are constants.
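As a sketch of that manipulation, assume the usual cost model for insertion sort, in which the loop test on line 5 runs $\sum_{j=2}^{n} t_j$ times and lines 6 and 7 each run $\sum_{j=2}^{n} (t_j - 1)$ times, with $t_j = j$ in the worst case. Collecting powers of n then gives one reading of the constants a, b and c:

```latex
\begin{aligned}
T(n) &= c_1 n + c_2(n-1) + c_4(n-1) + c_5\left(\frac{n(n+1)}{2} - 1\right)
        + c_6\,\frac{n(n-1)}{2} + c_7\,\frac{n(n-1)}{2} + c_8(n-1) \\
     &= \underbrace{\frac{c_5 + c_6 + c_7}{2}}_{a}\, n^2
        + \underbrace{\left(c_1 + c_2 + c_4 + c_8 + \frac{c_5 - c_6 - c_7}{2}\right)}_{b}\, n
        \underbrace{-\,(c_2 + c_4 + c_5 + c_8)}_{c}
\end{aligned}
```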

Hence the running time is a quadratic function of n in the worst case.

The worst-case running time is an upper bound on the running time of the algorithm for any given input.
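The best-case and worst-case counts can be checked empirically by instrumenting the inner loop. The sketch below assumes the standard insertion sort with 0-based indexing; `count_loop_tests` is a hypothetical helper that returns the total number of while-loop tests, i.e. the sum of the $t_j$ values.

```python
def count_loop_tests(a):
    """Run insertion sort on a copy of a and return the total number of
    while-loop condition evaluations (the sum of t_j over all j)."""
    a = list(a)
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1                      # one evaluation of the loop test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests

n = 6
print(count_loop_tests(range(1, n + 1)))   # sorted input:   n - 1 = 5
print(count_loop_tests(range(n, 0, -1)))   # reversed input: n(n+1)/2 - 1 = 20
```

Any other input of the same size falls between these two counts, which is the sense in which the worst case is an upper bound.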

 Order of Growth Functions

Formally, we can use O-notation to provide an asymptotic upper bound for a given function.

Let T(n) be a function that expresses the running time of a given algorithm.

$T(n) = O(f(n))$ if there exists an integer $n_0$ and a constant $c > 0$ such that for all integers $n \ge n_0$, $T(n) \le c f(n)$.

Example:

To show that T(n) = O(f(n)) for the given functions T and f, we can apply the above definition.

Suppose T(0) = 1, T(1) = 4 and T(2) = 9, and in general $T(n) = (n+1)^2$, and let $f(n) = n^2$.

Choosing $n_0 = 1$ and $c = 4$, we need to prove that $(n+1)^2 \le 4n^2$ for all $n \ge 1$.

By expansion of terms we have:

$(n+1)^2 = n^2 + 2n + 1$

If $n \ge 1$, then $n \le n^2$ and $1 \le n^2$

Hence:

$n^2 + 2n + 1 \le n^2 + 2n^2 + n^2$

$\Rightarrow n^2 + 2n + 1 \le 4n^2$
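The choice of constants can be sanity-checked numerically. The sketch below uses a hypothetical helper `bound_holds` to test $T(n) \le c f(n)$ with $T(n) = (n+1)^2$ and $f(n) = n^2$ over a finite range; this is an illustration only, not a substitute for the proof.

```python
def bound_holds(T, f, c, n0, limit=10_000):
    """Check T(n) <= c * f(n) for every integer n with n0 <= n < limit."""
    return all(T(n) <= c * f(n) for n in range(n0, limit))

def T(n):
    return (n + 1) ** 2

def f(n):
    return n ** 2

print(bound_holds(T, f, c=4, n0=1))  # True
print(bound_holds(T, f, c=4, n0=0))  # False: T(0) = 1 but 4 * f(0) = 0
```

The second call shows why the definition only requires the inequality from some $n_0$ onwards.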

In general:

If T(n) is a polynomial of the form:

$T(n) = a_k n^k + a_{k-1} n^{k-1} + \cdots + a_2 n^2 + a_1 n + a_0$, where the leading coefficient $a_k$ is positive,

we can ignore the lower-order terms and the coefficient of the leading term:

$T(n) = O(n^k)$
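One way to see why the lower-order terms can be dropped, sketched under the definition of O-notation given earlier: for $n \ge 1$, every power $n^i$ with $i \le k$ is at most $n^k$. The constants $c = \sum_{i=0}^{k} |a_i|$ and $n_0 = 1$ are one convenient choice made here, not the only one.

```latex
T(n) = \sum_{i=0}^{k} a_i n^i
     \le \sum_{i=0}^{k} |a_i|\, n^i
     \le \left(\sum_{i=0}^{k} |a_i|\right) n^k
     \quad \text{for all } n \ge 1 .
```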

If $T_1(n) = O(f(n))$ and $T_2(n) = O(g(n))$ then the following are true:

(a) $T_1(n) + T_2(n) = O(\max(f(n), g(n)))$

(b) $T_1(n) \cdot T_2(n) = O(f(n) \cdot g(n))$
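A small worked example may help. The two functions are hypothetical, chosen only for illustration; rule (a) is read here as "the sum is bounded by the larger of the two bounds".

```latex
\begin{aligned}
T_1(n) &= 3n^2 + 2 = O(n^2), \qquad T_2(n) = 5n = O(n) \\
T_1(n) + T_2(n) &= 3n^2 + 5n + 2 = O(n^2) \\
T_1(n) \cdot T_2(n) &= 15n^3 + 10n = O(n^3)
\end{aligned}
```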

$\log^k n = O(n)$ for any constant k. Logarithmic functions grow very slowly: n grows faster than any power of log n.

 Definition of Ω-notation

Alternatively, we can use Ω-notation to provide an asymptotic lower bound for a given function.

Let T(n) be a function that expresses the running time of a given algorithm.

$T(n) = \Omega(f(n))$ if there exists an integer $n_0$ and a constant $c > 0$ such that for all integers $n \ge n_0$, $T(n) \ge c f(n)$.

For example, if $T(n) = (n+1)^2$ then $T(n) = \Omega(n)$, since $(n+1)^2 \ge n$ for all $n \ge 0$ (take $c = 1$ and $n_0 = 0$).

 Θ-notation

The Θ-notation bounds a function asymptotically from both above and below.

Hence for any two functions f(n) and g(n), $f(n) = \Theta(g(n))$ if and only if

$f(n) = \Omega(g(n))$ and $f(n) = O(g(n))$

This is an asymptotically tight bound.

For example, $(n+1)^2 = \Theta(n^2)$
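This example can be justified by sandwiching $(n+1)^2$ between two multiples of $n^2$. The constants below are one possible choice: the left inequality gives $\Omega(n^2)$ with $c = 1$, and the right inequality gives $O(n^2)$ with $c = 4$, both with $n_0 = 1$.

```latex
n^2 \;\le\; (n+1)^2 \;\le\; 4n^2 \qquad \text{for all } n \ge 1
```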

Some Common Big-O running times

| Big-Oh | Informal name |
|---|---|
| O(1) | Constant |
| O(log n) | Logarithmic |
| O(n) | Linear |
| O(n log n) | n log n |
| $O(n^2)$ | Quadratic |
| $O(n^3)$ | Cubic |
| $O(2^n)$ | Exponential |
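To get a feel for how far apart these classes are, the sketch below evaluates one representative function per class at a few input sizes. The names `GROWTH` and `growth_row` are hypothetical, and the use of base-2 logarithms is an assumption (the base only changes a constant factor).

```python
import math

# One representative function for each class in the table.
GROWTH = [
    ("O(1)", lambda n: 1),
    ("O(log n)", lambda n: math.log2(n)),
    ("O(n)", lambda n: n),
    ("O(n log n)", lambda n: n * math.log2(n)),
    ("O(n^2)", lambda n: n ** 2),
    ("O(n^3)", lambda n: n ** 3),
    ("O(2^n)", lambda n: 2 ** n),
]

def growth_row(n):
    """Return the value of each representative function at input size n."""
    return [fn(n) for _, fn in GROWTH]

print([name for name, _ in GROWTH])
for n in (8, 16, 32, 64):
    print(n, [round(v) for v in growth_row(n)])
```

For small n the ordering need not match the table: at n = 8 the cubic value (512) still exceeds the exponential one (256), while at n = 16 the exponential is already larger (65536 versus 4096). Asymptotic notation describes behaviour only for sufficiently large n.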

## Graphical Illustration of the definition of Θ, O and Ω-notation

 Comparison of Functions

Transitivity:

$f(n) = \Theta(g(n))$ and $g(n) = \Theta(h(n))$ implies that $f(n) = \Theta(h(n))$

$f(n) = \Omega(g(n))$ and $g(n) = \Omega(h(n))$ implies that $f(n) = \Omega(h(n))$

$f(n) = O(g(n))$ and $g(n) = O(h(n))$ implies that $f(n) = O(h(n))$

Reflexivity:

$f(n) = \Theta(f(n))$

$f(n) = O(f(n))$

$f(n) = \Omega(f(n))$

Symmetry:

$f(n) = \Theta(g(n))$ if and only if $g(n) = \Theta(f(n))$

Transpose symmetry:

$f(n) = O(g(n))$ if and only if $g(n) = \Omega(f(n))$