About The Article
This article covers the crucial foundational concepts of data structures and algorithms. It is the first article in the series “Data Structure and Algorithm (DSA)”; you’ll learn about data structures and algorithms in more detail in my coming articles, so stay tuned.
Read this article, then feel free to give your feedback.
The topics to be covered are
Introduction
- Data structures and their purpose
- Ways of implementing a data structure
- Understanding Abstract Data Type (ADT)
- Classification of data structure
- How we’ll study data structures
Analysis of Algorithms
- Efficiency of algorithms
- Time Complexity
- Space Complexity
- Measuring time complexity
- Empirical or Posteriori method
- Theoretical or Apriori method
- Apriori analysis with an example
- Asymptotic notations
- Time complexity using Big Oh (O)
- Categories of Time Complexity
- Comparison between different Time Complexities
- Dependence of Time Complexity
Data structure and its purpose
Problems of many different types surround us, and we want to solve them efficiently and effectively. In daily life, when we face a problem, we find a solution for it. Problem-solving typically has three phases.
- Design the solution
- Analyze the solution
- Implement the solution
The situation is the same in computer science. The solution we apply to a computer problem is called an algorithm. We have to design the algorithm, analyze it according to the business requirements, and then implement it to solve the problem. An algorithm is simply a procedure, a specific set of rules, for producing output from input. You’ll learn ways and techniques to analyze algorithms in this article.
The study of data structures is all about how to design, analyze, and implement “efficient” algorithms. Data structures are among the most fundamental building blocks in computer science for solving problems, and they are central to constructing program logic.
A data structure is defined as “a way to store and organize data in the computer so that the data can be used efficiently and effectively.”
For example, in a dictionary, every word is stored in sorted order. If the words were not sorted, it would be almost impossible to find any word in the dictionary. So we use data structures to store and organize data efficiently.
Ways of implementing a data structure
Simply put, to implement a data structure, we first define its ADT and then implement the data objects. In other words, we first build the mathematical/logical model and then implement it on concrete data objects. Let’s understand this in more detail.
Mathematical/logical model
In this step, we look at the abstract view: the high-level features and operations. For example, an abstract view of a TV is
- It can turn ON and OFF.
- It can receive signals from the antenna.
- It can produce sound and video.
In the above example, we aren't concerned with the circuits embedded in the TV. So this is the abstract view of the television.
Because we see only the abstract view, we call this an abstract data type (ADT).
Understanding Abstract Data Type (ADT)
The data and operations that a data structure comprises are called an Abstract Data Type (ADT). An ADT gives no implementation of the data objects and their operations; it provides us with data abstraction. An ADT focuses on “what a data structure does” rather than on “how a data structure does it”. As an example, suppose we want a list that can perform the following operations,
- Store a given number of elements of any data type.
- Read elements by their position in the list.
- Modify elements at any position in a list.
All the above features describe data and the operations on a specific data type, with no implementation details. So this is an abstract data type: we focus only on “what the list does” rather than “how the list does it”.
Now the question comes -- how do we implement this list? That is the second concept, the implementation: a concrete realization of the list in which you can store elements, read elements, and modify elements. Let’s say we implement the above list ADT using an array.
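To make this concrete, here is a minimal sketch of the list ADT above implemented with an array. The class and method names are illustrative choices of ours, not from the article; the point is that users of the ADT see only the operations, while the backing array stays hidden.

```python
class ArrayList:
    """A fixed-capacity list ADT backed by a plain array."""

    def __init__(self, capacity):
        # The backing array is an implementation detail hidden from users.
        self._data = [None] * capacity
        self._size = capacity

    def get(self, index):
        """Read the element at a given position in the list."""
        if not 0 <= index < self._size:
            raise IndexError("position out of range")
        return self._data[index]

    def set(self, index, value):
        """Modify the element at a given position in the list."""
        if not 0 <= index < self._size:
            raise IndexError("position out of range")
        self._data[index] = value
```

The same ADT could just as well be implemented with a linked structure; callers of `get` and `set` would never notice the difference, which is exactly the “what, not how” separation the ADT provides.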
Classification of data structure
We’ll study all types of data structures in this series. The typical classification of data structures is as follows.
Data structures fall into two types, linear and non-linear. Linear data structures arrange elements in a single sequence and are further classified into sequential and linked data structures, while non-linear data structures (such as trees and graphs) do not arrange elements in a single sequence.
We’ll study all the above-mentioned data structures in this series. For each of them, we’ll generally study the following aspects.
- Their logical view
- Operations on them
- Cost of operations on them
- Their implementation
Analysis of algorithms
You have learned what data structures are, along with their purpose and classification. As you know, the solution to a problem is an algorithm. Now it is time to learn methods and techniques for analyzing the efficiency of algorithms.
Efficiency of algorithms
There can be any number of possible solutions/algorithms for one problem in computer science. How can we decide which algorithm is best for our problem? There are two performance measures for the efficiency of algorithms
- Time Complexity
- Space Complexity
Time Complexity
The time complexity of an algorithm or program is a function describing its running time. The fastest algorithm for a problem is the one that takes the minimum time to perform its task.
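Two algorithms for the same problem can have very different time complexities. As an illustration (our example, not the article's), consider summing the first n natural numbers: a loop does one addition per number, while the closed-form formula n(n+1)/2 does a constant amount of work regardless of n.

```python
def sum_loop(n):
    # O(n): one addition per number from 1 to n.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_formula(n):
    # O(1): a constant number of operations, using the identity
    # 1 + 2 + ... + n = n * (n + 1) / 2.
    return n * (n + 1) // 2
```

Both functions return the same answer for every n, but for large inputs the formula version is dramatically faster.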
Space Complexity
The space complexity of an algorithm or program is a function describing the memory the algorithm needs to complete its task. The most space-efficient algorithm is the one that needs the least memory to complete its task.
In this article, you’ll learn how we can analyze an algorithm in terms of time complexity.
Measuring time complexity
There are the following two ways to measure time complexity.
- Empirical or Posteriori method
- Theoretical or A priori method
Let’s understand the difference between them.
| Empirical or Posteriori method | Theoretical or A priori method |
| --- | --- |
| In this approach, we implement and execute the complete algorithm on various computers for various instances of the problem, note the time taken in each case, and compare. The algorithm that takes the least time for its complete execution is considered the best. | In this approach, we mathematically determine the resources (time and space) needed by the algorithm, as a function of a parameter related to the problem instances considered. Normally, we use the parameter “size of input instances”. |
| This method depends on factors such as the machine on which the algorithm runs, the programming language in which the program is written, and the skill of the programmer who builds it. This is its main disadvantage. | This method is independent of the machine, the programming language, and the programmer. |
| It is not a good approach, because an algorithm should be tested on a large range of input sizes. Testing only on small to moderate input sizes can be misleading: the algorithm may lose its performance in a real-world environment. | It is the better approach, because we can analyze the efficiency of the algorithm for input instances of any size. |
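A small illustration of the empirical (posteriori) approach: run the same algorithm on inputs of growing size and record the wall-clock time. The algorithm and input sizes here are our choices; note that the measured times depend on the machine and interpreter, which is exactly the drawback described above.

```python
import time

def sum_loop(n):
    # A simple O(n) algorithm to time: sum of the first n numbers.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

for n in (10_000, 100_000, 1_000_000):
    start = time.perf_counter()
    sum_loop(n)
    elapsed = time.perf_counter() - start
    # Measured time grows with n, but the exact values vary per machine.
    print(f"n={n:>9}: {elapsed:.6f} s")
```

Running this on two different computers will give different numbers for the same algorithm, which is why the a priori method is preferred for machine-independent analysis.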
In this article, you’ll learn the theoretical or a priori method for analyzing time complexity.
Understanding A priori Analysis with an Example
As you have learned, in this approach, we mathematically determine the resources (time and space) needed by the algorithm, as a function of a parameter related to the instances of the problem considered. Often, we take a parameter as the size of the input instances. In this section, you’ll understand apriori analysis with the help of an example.
Let’s say you have a program as follows and you want to estimate the time its execution takes. For an a priori estimate, the following two factors matter
- The number of times a statement is executed in the program, called the frequency count of the statement.
- The time taken by the execution of a single statement.
As we know, the time taken to execute a single statement depends on the machine that runs it. But the a priori method is independent of the machine, the programming language, and the programmer's skills. So we consider only the first factor and estimate the efficiency of the algorithm from the total frequency count of the statements that make up the program.
Consider three programs, each with a different number of statements.
In program 1, the statement is executed only once, so the frequency count is 1.
In program 2, a for loop comes into action, so remember the following way to find the frequency counts for a for loop.
- for the loop header itself, the formula is
(upper-bound – lower-bound + 1) + 1
- and for every single statement inside the loop body, the formula is
(upper-bound – lower-bound + 1)
In program 3, a nested for loop comes into action. Let’s see the total frequency count for each program.
The total frequency counts of programs 1, 2, and 3, given by 1, (3n+1), and (3n^2+3n+1) respectively, can be expressed as O(1), O(n), and O(n^2) respectively. Let’s move on to understanding this O notation, one of the asymptotic notations.
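The three programs are shown as figures in the article; below is a reconstruction of ours that matches the stated totals of 1, (3n+1), and (3n^2+3n+1). The counts in the comments use the formulas above with lower bound 1 and upper bound n; only the commented statements contribute to the stated totals, and the statement bodies themselves are illustrative.

```python
def program1(x):
    return x + 1                 # executed 1 time -> total: 1

def program2(a):
    n = len(a)
    total = 0
    for i in range(n):           # loop header: (n - 1 + 1) + 1 = n + 1
        sq = a[i] * a[i]         # body statement: n
        total += sq              # body statement: n
    return total                 # total count: (n + 1) + n + n = 3n + 1

def program3(n):
    count = 0
    for i in range(n):           # outer header: n + 1
        count += 1               # outer body statement: n
        for j in range(n):       # inner header: n * (n + 1) = n^2 + n
            count += 1           # inner body statement: n^2
            count += 1           # inner body statement: n^2
    return count                 # total: 3n^2 + 3n + 1
```

Dropping the constants and lower-order terms from each total gives exactly the O(1), O(n), and O(n^2) classifications stated above.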
Asymptotic notations
Asymptotic notations are meaningful approximations of the functions that represent the time or space complexity of algorithms/programs.
The asymptotic notations are as follows.
- O − Big Oh
- Ω − Big omega
- θ − Big Theta
- o − Little Oh
- ω − Little omega
In this article, you’ll learn only the most popular one, the Big Oh (O) asymptotic notation.
Big Oh Notation (O)
A function f(n) is said to be of the order of g(n), written f(n) = O(g(n)), if there exist a positive integer n_0 and a positive constant C such that f(n) ≤ C·g(n) for all n ≥ n_0.
Example
- f(n) = 6n^3+8n^2+2, let’s say g(n)= n^3, hence f(n)=O(n^3).
- f(n) = 62n+2, let’s say g(n)= n, hence f(n)=O(n).
- f(n) = 2, let’s say g(n)=1, hence f(n)=O(1).
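The definition can be checked numerically. For the first example above, f(n) = 6n^3 + 8n^2 + 2 is O(n^3): picking C = 16 and n_0 = 1 (our choice; any valid pair works, since 6n^3 + 8n^2 + 2 ≤ 6n^3 + 8n^3 + 2n^3 = 16n^3 for n ≥ 1) satisfies f(n) ≤ C·g(n) for all n ≥ n_0.

```python
def f(n):
    return 6 * n**3 + 8 * n**2 + 2

def g(n):
    return n**3

# Verify f(n) <= C * g(n) for a range of n >= n0 with C = 16, n0 = 1.
C, n0 = 16, 1
assert all(f(n) <= C * g(n) for n in range(n0, 1000))
```

The check passes for every n tested, confirming that n^3 (scaled by the constant C) bounds f(n) from above.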
Hence we can say that g(n) is an upper bound for f(n): C·g(n) grows at least as fast as f(n).
Time complexity using Big Oh (O)
As we know, Big Oh asymptotic notation is used to report the time complexity of an algorithm. Now you’ll learn the terms for the different time complexities expressed in Big Oh notation.
In general, we can divide time complexities in Big Oh notation into three categories.
- Polynomial Time Complexity
- Exponential Time Complexity
- Logarithmic Time Complexity
So let’s understand each generic category of time complexity reported by algorithms.
Categories of Time Complexity
- Logarithmic Time Complexity
An algorithm in this category has time complexity O(log n).
- Polynomial Time Complexity
Time complexity of the form O(n^k) is called polynomial time complexity. O(n^2) and O(n^3) are examples.
- Exponential Time Complexity
Time complexity of the form O(k^n) is called exponential time complexity. O(2^n) and O(3^n) are examples.
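Each category can be illustrated with a classic algorithm (these examples are ours, not the article's): binary search halves the search range each step, so it is O(log n); comparing all pairs of elements is O(n^2), a polynomial; and enumerating all subsets of a set doubles the work per element, so it is O(2^n), an exponential.

```python
def binary_search(a, target):
    # O(log n): the search range is halved on each iteration.
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def count_equal_pairs(a):
    # O(n^2): every pair (i, j) with i < j is examined once.
    return sum(1 for i in range(len(a))
                 for j in range(i + 1, len(a))
                 if a[i] == a[j])

def all_subsets(a):
    # O(2^n): the number of subsets doubles with each element.
    subsets = [[]]
    for x in a:
        subsets += [s + [x] for s in subsets]
    return subsets
```

Note that binary search assumes its input is already sorted, which ties back to the dictionary example earlier: organized data enables efficient algorithms.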
Comparison between different Time Complexities
In this section, you’ll see the comparison between all the described time complexities.
You can see in the comparison above that algorithms with constant time complexity are more efficient than all the others. Also, an algorithm with O(log n) time complexity is much faster (when n is large) than an algorithm with O(n) time complexity. Similarly, algorithms with O(n log n) time complexity are better than algorithms with O(n^2). Hence we can write,
O(1) ≤ O(log n) ≤ O(n) ≤ O(n log n) ≤ O(n^2) ≤ O(n^3) ≤ O(2^n)
Also, you can see that polynomial time complexity [O(n^k)] is far more efficient than exponential time complexity [O(k^n)].
Dependence of Time Complexity
Time complexity depends on several parameters. Often we analyze algorithms with respect to the parameter “size of input instances”, but a larger input size does not always mean a larger running time.
The time complexity of an algorithm may also depend on the nature of the input, you may say the arrangement of the input data. These parameters give three cases for analyzing algorithms.
- Average case
- Best case
- Worst case
You’ll learn all these concepts with practical examples in my coming articles.
Summary
In this article, you have learned what a data structure is, its purpose, and its implementation, including ADTs and the classification of data structures. You also learned ways to analyze an algorithm's efficiency in terms of time complexity using Big Oh asymptotic notation. Then you learned the different time complexities reported by different algorithms.
This article has helped you understand the basic concepts of data structures and the ways of analyzing the efficiency of algorithms. Stay tuned for my next articles, because this article is the first one in the series “Learn about Data Structures and Algorithm (DSA)” and you’ll learn more about data structures in the coming articles. If you have any queries, please feel free to contact me in the comments section. Also, do provide feedback, whether positive or negative; it will help me make my articles better and increase my enthusiasm for sharing my knowledge.