Introduction
In this article, we will learn about SQL query optimization techniques. There are many ways to write a query to get the same result in structured query language (sql). But when the performance comes into the picture, a developer should know which query is best to get the result in a minimum execution time. The process of increasing performance of a Sql query is called Sql query Tuning or Sql query Optimization. It is not a single tool or technique. It is a set of practices that makes use of a wide array of techniques, tools, and processes.
Most essential components related to performance are as displayed in the below pie chart,
We will discuss all these components one by one.
Let’s begin.
EXECUTION PLAN
Execution plan is one of the essential parts of the query optimization process. An execution plan is a kind of helping tool provided by Sql Server, which is used by developers to identify the weaker areas. Weaker area is a part of a query which is taking more time and can be improved. This part helps to take further action. We can say, it is a decision making step. Here you can decide whether your query needs any further improvement or not. There are three kinds of execution plans as follows:
- Estimated execution Plan
- Actual Execution Plan
- Cached execution plan
If you are not familiar with these execution plans, please go through
my earlier article. This article will help to understand execution plan details. Once you have good knowledge of execution plans, you can think in the direction of the query optimization process.
INDEXING TECHNIQUES
Proper Indexes are one of the best ways to improve query performance. Searching something in a sql table without an index is like searching a keyword in a book by reading a complete book. Indexes are like a clue to find something. We have two types of indexes in sql i.e. Clustered index and non-clustered index.
Clustered Index
|
Non-Clustered Index
|
1. There could be only one clustered index per table.
2. The size of the clustered index is quite large.
3. Faster
4. A clustered index can improve the performance of data retrieval.
|
1. We can have upto 999 non-clustered index in a table with the latest sql server version (limit depends on sql server version).
2. The size of the non-clustered index is small compared to the clustered index.
3. Slower as compared to clustered.
4. It should be created on columns which are used in joins.
|
You can find a detailed description on indexes at the following
link.
Let’s begin.
QUERY DESIGNING
Things we should keep in our mind while writing a query are,
Avoid Select *
We should only use select * statement when we need all the fields data of table. For example, if we need information of only two fields of a table and we are writing select * statement to get those results then it is not a good idea, because database engine will go into the system tables and read the all columns' meta data to get the results. The resultant performance is impacted when reading the system tables. Retrieving unnecessary data just increases the network traffic. You may find the difference between both query’s execution plan.
Check the DISTINCT and GROUP BY clause
Both clauses are used to get unique records but GROUP BY is mostly used with Aggregate functions. Depending on the query and situation one could give a better result than other. So while writing query check query execution plan for both to get the best result.. But this never means DISTINCT can replace GROUP BY or vice versa.
HAVING and WHERE Clause
Don’t use HAVING clause to just filter data. HAVING clause is only good with aggregate functions. This means when you need to filter records based on aggregate function’s result.
Exists Vs Joins Vs In
Most of the time IN and EXISTS gives us the same performance result. But there might be some cases where IN and Exists could give different performance results. When there are many tables in a query then it is good to use JOINS to link them and get data.
Important points
- The EXISTS clause is faster than IN clause when the subquery result is very large.
- Whereas, the IN clause is faster than EXISTS when the subquery results is very small.
- When we write a subquery with IN clause then sql database process the subquery first and then process the query as a whole.
- Avoid subquery in case of large data. Use Joins in this case
Let’s check the difference between these three clauses with a simple example,
Write a query to get the same results using IN, EXISTS and JOINS clause and execute along with statistics time on and differentiate the difference,
In the above example you will see that the Exists gives better performance from IN and JOINS. We should keep all three options in our mind while working on a requirement as we can get the same results from any one. So next time choose the best option for better performance.
Now let’s talk about a tool which recommends query improvements.
Database Engine Tuning Advisor
Database Engine Tuning Advisor is a feature which came with Sql Server Management Studio. Any developer, either fresher or experienced, can use this tool to get recommendations to improve the performance of sql server queries by doing structural changes. You can get this tool from Tools option in toolbar:
When you click on this option, the server will ask you for your login credentials. Once you provide the credentials, the following window will open:
You can choose the whole database or also choose a single table. I will choose a whole database in this example along with Plan Cache workload. Click on start analysis.
Progress tab will appear now. Here you can see the progress. Once the analysis has been done, twoother tabs will display Recommendations and Reports respectively.
In the recommendation window, you can find the improvement recommendations.
In the Reports tab you can find the reports generated by the tool.
Similarly you can also optimize a single script file by selecting a script file workload with database name for workload analysis. Here my intention is to let developers know that we have this kind of feature already in sql server which we can use.
Summary
We discussed about a tool and techniques which we can use to improve query performance in sql server.