Asymptotic Analysis in Data Structure

In this section of the tutorial, we will learn about the Asymptotic Analysis in Data Structure.

It is a mathematical form of algorithm analysis that evaluates the running time and memory of an algorithm in three cases, i.e. the best case, the average case, and the worst case.

Now, let's move on to the definition of Asymptotic Analysis.

What is Asymptotic Analysis?

Asymptotic Analysis/Notation is a set of mathematical expressions that evaluate the running time and memory of an algorithm. It has expressions for three cases, i.e. the best, average, and worst case. All three cases are described in terms of upper and lower bounds.

For representing the upper and lower bounds, we need a valid notation. For example, we can represent the running time of an algorithm as a function f(n), where n is the size of the input.

As this tutorial progresses, we will frequently use this function f(n) to represent the best, worst, and average cases of an algorithm, determine the efficiency of each case, and thereby find the optimal solution to our problem.

But first, let's understand the need for Asymptotic Analysis.

Working of Asymptotic Analysis

For analyzing the best case, worst case, and average case of an algorithm, we work with the upper bound (O), the lower bound (Ω), and the tight (average) bound (Θ). However, it is not always necessary to analyze the upper bound, lower bound, and average running time for a given function.

  • For instance, a complete evaluation of an algorithm would include its upper bound, its lower bound, and its average running time.
  • In practice, however, for the worst case and the average case we generally focus on the upper bound only, as the lower bound of an algorithm is of little practical significance.
  • In every case, for a given function f(n), we have to find a function g(n) that approximates f(n) at large values of n.
  • When plotted, these functions form a curve that we call the asymptotic curve.
  • In other words, the function g(n) is the asymptotic curve for the function f(n).
  • This is why we call this kind of algorithm analysis Asymptotic Analysis.

Concepts used in Asymptotic Analysis

If an algorithm's code is linear, i.e. it contains no loops or recursion, then its efficiency is simply a function of the number of instructions it contains. On the other hand, code that uses loops or recursion can deliver a wide range of efficiencies.

Hence, we analyze loops to compute the efficiency of an algorithm. The analysis depends essentially on the loops, since recursion can always be converted into loops.

To express the algorithm's efficiency as a function of the number of elements to be processed, we use this general form:

f(n) = efficiency

Now, we will discuss the different kinds of loops, each of which delivers a different efficiency.

Linear Loops

It includes simple loops. Here, we will evaluate how many times the loop iterates in the following code:

for (i = 0; i < 1000; i++)
      application code
  • Here, i is an integer, and the loop body executes 1000 times.
  • So we can say that the number of iterations is directly proportional to the loop factor, i.e. 1000.
  • The greater the factor, the more times the loop is repeated.
  • This also means that the algorithm's running time is directly proportional to the number of iterations.

Now, the function takes the form: f(n) = n

If we plot the number of iterations against n, we get a straight line, which is why such loops are called linear loops.
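
To make this concrete, here is a short C sketch (our own illustration; the loop factor n and the iteration counter are assumptions, not part of the tutorial's code) that counts the iterations of a linear loop:

#include <stdio.h>

int main(void)
{
    int n = 1000;   /* illustrative loop factor */
    int count = 0;

    for (int i = 0; i < n; i++)
        count++;    /* stands in for the application code */

    /* prints 1000: iterations grow in direct proportion to n */
    printf("linear loop: %d iterations for n = %d\n", count, n);
    return 0;
}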

Logarithmic Loops

In logarithmic loops, the loop variable is multiplied or divided by a constant in each iteration. Now we will see how many times the loop body is repeated in both kinds of operation, i.e. multiply loops and divide loops.

Multiply Loops
for (i = 1; i < 1000; i *= 2)
      application code
  • Here, the value of i doubles in each iteration: 1, 2, 4, ..., 512.
  • The loop body therefore executes 10 times, since 2^10 = 1024 is the first power of 2 to reach 1000.
  • Hence, the number of iterations is a function of the multiplier.
  • The loop continues to work as long as the condition i < 1000 remains true.

Now, the function takes the form: f(n) = logn

When the analysis is generalized, a loop that multiplies its variable by 2 until it exceeds n iterates about log2(n) times, which gives the form of the function above.

Divide Loops
for (i = 1000; i >= 1; i /= 2)
      application code
  • Here, the value of i is cut in half in each iteration: 1000, 500, 250, ..., 1.
  • The loop body therefore executes 10 times before i reaches 0.
  • Hence, the number of iterations is a function of the divisor.
  • The loop continues to work as long as the condition i >= 1 remains true.

As with the multiply loop, the function takes the form: f(n) = logn

When the analysis is generalized, a loop that divides its variable by 2 until it reaches 1 iterates about log2(n) times, which gives the form of the function above.
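
As a hedged sketch (the iteration counters and printouts are our own additions), the following C program counts the iterations of both logarithmic loops for a bound of 1000; each body executes 10 times:

#include <stdio.h>

int main(void)
{
    int count = 0;

    /* multiply loop: i = 1, 2, 4, ..., 512 -> 10 iterations */
    for (int i = 1; i < 1000; i *= 2)
        count++;
    printf("multiply loop: %d iterations\n", count);

    /* divide loop: i = 1000, 500, 250, ..., 1 -> 10 iterations */
    count = 0;
    for (int i = 1000; i >= 1; i /= 2)
        count++;
    printf("divide loop: %d iterations\n", count);

    return 0;
}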

Nested Loops

Loops within loops are known as nested loops. In nested loops, we have to evaluate how many iterations the outer and inner loops complete. The total number of iterations is then the product of the iterations of the inner loop and the outer loop.

Total iterations = outer loop iterations x inner loop iterations

Now, nested loops are divided into three classes of loop, i.e. Linear Logarithmic Loop, Quadratic Loop, and Dependent Quadratic Loop.

Linear Logarithmic Loop

In a linear logarithmic loop, the inner loop multiplies its variable in each iteration. To see the working of this multiply loop, look at the following code:

for (i = 0; i < 10; i++)
    for (j = 1; j < 10; j *= 2)
        application code
  • Here, the number of iterations of the inner loop is log(10): j takes the values 1, 2, 4, 8, so the body executes ceil(log2(10)) = 4 times.
  • Also, the inner loop is controlled by the outer loop.
  • Hence, the number of iterations of the inner loop is multiplied by the number of iterations of the outer loop.

So the total number of iterations is: 10log10

So, the general form of the evaluated output will be: f(n) = nlogn
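
The same counting trick works for this nested case; in the following C sketch (our own, with the inner loop starting at j = 1) the inner loop runs 4 times per outer iteration:

#include <stdio.h>

int main(void)
{
    int count = 0;

    /* outer loop: 10 iterations; inner loop: ceil(log2(10)) = 4 iterations */
    for (int i = 0; i < 10; i++)
        for (int j = 1; j < 10; j *= 2)
            count++;

    /* prints 40, the integer realization of 10log10 */
    printf("linear logarithmic loop: %d iterations\n", count);
    return 0;
}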

Quadratic Loop

In a quadratic loop, the number of iterations of the inner loop is equal to the number of iterations of the outer loop. To see the working of the quadratic loop, we have to look into the following code:

for (i = 0; i < 10; i++)
    for (j = 0; j < 10; j++)
        application code
  • Here, the outer loop executes 10 times.
  • For each iteration of the outer loop, the inner loop executes 10 times.
  • So the total number of iterations is the product of the outer and inner loop iterations.

So the total number of iterations is: 10 x 10 = 100

So, the general form of the evaluated output will be: f(n) = n^2
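
A minimal C sketch (the counter is our addition) confirms the product rule for the quadratic loop:

#include <stdio.h>

int main(void)
{
    int count = 0;

    for (int i = 0; i < 10; i++)      /* outer: 10 iterations */
        for (int j = 0; j < 10; j++)  /* inner: 10 iterations each time */
            count++;

    /* prints 100 = 10 x 10 */
    printf("quadratic loop: %d iterations\n", count);
    return 0;
}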

Dependent Quadratic Loop

In a dependent quadratic loop, the number of iterations of the inner loop depends on the execution of the outer loop. To see the working of the dependent quadratic loop, we have to look into the following code:

for (i = 0; i < 10; i++)
    for (j = 0; j <= i; j++)
        application code
  • Here, the inner loop is the same as the outer loop except that its bound depends on the outer loop variable, i.e. i.
  • In this way, the inner loop executes once in the first iteration of the outer loop, twice in the second iteration, three times in the third, and so on.

So the loop computes the total number of iterations as follows:

Output: 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55

So, the general form of the evaluated output will be: f(n) = n((n+1)/2)
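
Here is a matching C sketch (our own illustration) that verifies the n(n+1)/2 count for n = 10:

#include <stdio.h>

int main(void)
{
    int n = 10, count = 0;

    for (int i = 0; i < n; i++)
        for (int j = 0; j <= i; j++)  /* inner bound depends on i */
            count++;

    /* prints 55 = 1 + 2 + ... + 10 = n(n+1)/2 */
    printf("dependent quadratic loop: %d iterations\n", count);
    return 0;
}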

What is Rate of Growth?

The rate at which the running time increases as a function of the input is called the rate of growth: as the input grows, the running time grows with it.

For example, suppose we go to a marketplace to buy a number of items. We can represent the cost of each item as a function, and we ignore the terms with a lower rate of growth because a term with a greater rate of growth is present.
So we approximate the total by keeping only the term with the highest rate of growth.

Example:

Total cost = cost_of_pen + cost_of_book + cost_of_bag + cost_of_bulb
or
Total cost = 50n + 6n^2 + n^8 + 100n

Total cost =~ cost_of_bag (approximately)
or
Total cost =~ n^8 (approximately)

  • Here, we add the costs of the different items that we have purchased.
  • But we approximate: we keep only the highest cost (say cost_of_bag) and take its rate of growth, i.e. n^8, as the rate of growth of the total, as the sketch below illustrates.
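
The following C sketch (our own illustration; the cost terms are the tutorial's made-up functions, evaluated with pow from math.h, which may require linking with -lm) shows how thoroughly the n^8 term dominates even at n = 10:

#include <stdio.h>
#include <math.h>

int main(void)
{
    double n = 10.0;                 /* illustrative input size */

    double pen  = 50.0 * n;          /* 50n  */
    double book = 6.0 * pow(n, 2);   /* 6n^2 */
    double bag  = pow(n, 8);         /* n^8  */
    double bulb = 100.0 * n;         /* 100n */

    double total = pen + book + bag + bulb;

    /* the n^8 term alone accounts for over 99.99% of the total */
    printf("total = %.0f, bag (n^8) = %.0f (%.3f%% of total)\n",
           total, bag, 100.0 * bag / total);
    return 0;
}
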
Commonly used Rate of Growth

So now, we are familiar with the looping concepts used in Asymptotic Notation and with the rate of growth.

Below is a list of the commonly used rates of growth that we will come across while studying Asymptotic Notations.


Efficiency           Time Complexity   Estimated Time   Example
-------------------  ----------------  ---------------  -------------------------------------------
Logarithmic          logn              microseconds     Finding an element in a sorted array
Linear               n                 seconds          Finding an element in an unsorted array
Linear logarithmic   nlogn             seconds          Sorting n items by merge sort
Quadratic            n^2               minutes          Shortest path between two nodes in a graph
Exponential          c^n               intractable      The Towers of Hanoi problem


Big-O (O) Notation

The Big-O Notation describes an upper bound of a given function. It is written f(n) = O(g(n)). It means that, for large enough values of n, g(n) is a tight upper bound of f(n).

For example: f(n) = O(g(n))

Where, f(n) = n^6 + 50n^4 + 5n^2 + 25n + 6

Then g(n) is n^6, which means the function g(n) gives the maximum rate of growth of the function f(n) at larger inputs of n.

Rate of Growth in Big-O (O) Notation

Big-O Notation is defined as O(g(n)) = {f(n): there exist positive constants c and k such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k}.

  • Here, the function g(n) is an asymptotically tight upper bound for the function f(n).
  • The rate of growth of g(n) should be the smallest rate of growth that is still greater than or equal to the rate of growth of f(n).

Example: Find the upper bound for f(n) = n

n <= 1*n for all n >= 1
Therefore, n = O(n) with c = 1 and k = 1

Example: Find the upper bound for f(n) = n^2 + 1

n^2 + 1 <= 2n^2 for all n >= 1
Therefore, n^2 + 1 = O(n^2) with c = 2 and k = 1
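
As a numerical sanity check (not a proof; the constants c = 2 and k = 1 come from the worked example above), this C sketch samples several values of n and confirms f(n) <= c*g(n):

#include <stdio.h>

int main(void)
{
    /* check f(n) = n^2 + 1 <= c*g(n) = 2n^2 for sampled n >= k = 1 */
    for (long long n = 1; n <= 1000000; n *= 10) {
        long long f  = n * n + 1;
        long long cg = 2 * n * n;
        printf("n = %8lld: f(n) = %14lld <= c*g(n) = %14lld ? %s\n",
               n, f, cg, f <= cg ? "yes" : "no");
    }
    return 0;
}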

Omega (Ω) Notation

The Omega (Ω) Notation describes a lower bound of a given function. It is written f(n) = Ω(g(n)). It means that, for large enough values of n, g(n) is a tight lower bound of f(n).

For example: f(n) = Ω(g(n))

Where, f(n) = 50n^2 + 5n + 50

Then g(n) is n^2, which means the function g(n) gives the maximum rate of growth that is still less than or equal to the rate of growth of the function f(n).

Rate of Growth in Omega Notation

Omega Notation is defined as Ω(g(n)) = {f(n): there exist positive constants c and k such that 0 ≤ cg(n) ≤ f(n) for all n ≥ k}.

  • Here, the function g(n) is an asymptotically tight lower bound for the function f(n).
  • The rate of growth of g(n) should be the largest rate of growth that is still less than or equal to the rate of growth of f(n).

Example: Find the lower bound for f(n) = 5n^2

We need positive constants c and k such that 0 <= cn^2 <= 5n^2 for all n >= k; choosing c = 5 satisfies this.
Therefore, 5n^2 = Ω(n^2) with c = 5 and k = 1
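
The same style of check works for the lower bound (again an illustration, not a proof; any constant 0 < c <= 5 works, and we test c = 1 here):

#include <stdio.h>

int main(void)
{
    /* check c*g(n) = n^2 <= f(n) = 5n^2 for sampled n >= k = 1 */
    for (long long n = 1; n <= 1000000; n *= 10) {
        long long cg = n * n;       /* c = 1 */
        long long f  = 5 * n * n;
        printf("n = %8lld: c*g(n) = %14lld <= f(n) = %14lld ? %s\n",
               n, cg, f, cg <= f ? "yes" : "no");
    }
    return 0;
}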

Theta (Θ) Notation

The Theta (Θ) Notation decides whether the upper and lower bounds of a given function are the same. The average running time lies between the lower bound and the upper bound, and the theta notation yields a single rate of growth exactly when the upper bound and the lower bound give the same result.

For example: f(n) = 10n + n = 11n

Here g(n) = n, and f(n) = Θ(n), since the upper and lower bounds of 11n have the same (linear) rate of growth.

Rate of Growth in Theta Notation

Theta Notation is defined as Θ(g(n)) = {f(n): there exist positive constants c1, c2, and k such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ k}

  • Here, the function g(n) is an asymptotically tight bound for the function f(n).
  • The function f(n) then has the same order of growth as g(n).

Example: Find the tight (Θ) bound for f(n) = (n^2/2) - (n/2)

(n^2/5) <= (n^2/2) - (n/2) <= n^2 for all n >= 2
Therefore, (n^2/2) - (n/2) = Θ(n^2) with c1 = 1/5, c2 = 1, and k = 2
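
Both inequalities of the Θ bound can be sampled the same way (a sketch using the constants c1 = 1/5, c2 = 1, and k = 2 from the example; doubles are used to allow the fractional constant):

#include <stdio.h>

int main(void)
{
    /* check c1*n^2 <= f(n) = n^2/2 - n/2 <= c2*n^2 for sampled n >= k = 2 */
    for (double n = 2.0; n <= 2000000.0; n *= 10.0) {
        double f     = n * n / 2.0 - n / 2.0;
        double lower = n * n / 5.0;   /* c1 = 1/5 */
        double upper = n * n;         /* c2 = 1   */
        printf("n = %9.0f: %14.1f <= %14.1f <= %14.1f ? %s\n",
               n, lower, f, upper,
               (lower <= f && f <= upper) ? "yes" : "no");
    }
    return 0;
}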

Properties of Asymptotic Notation

  1. Transitivity: f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⇒ f(n) = Θ(h(n)). Valid for O and Ω as well.
  2. Reflexivity: f(n) = Θ(f(n)). Valid for O and Ω.
  3. Symmetry: f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
  4. Transpose symmetry: f(n) = O(g(n)) if and only if g(n) = Ω(f(n)).
  5. If f(n) is in O(k·g(n)) for any constant k > 0, then f(n) is in O(g(n)).
  6. If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)), then (f1 + f2)(n) is in O(max(g1(n), g2(n))).
  7. If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)), then f1(n)·f2(n) is in O(g1(n)·g2(n)).
We will discuss the very important concept of the Master Theorem in Data Structures in detail in the next section of this tutorial.