Introduction
This article is about the different types of operations of an execution plan. We will not talk about which operation should be used or not as this article is only about the definition of the operations. If you are a beginner with an execution plan then please go through my first article (link below) about SQL execution plan.
Now when we know types of execution plans and how we can get statistics of execution plans, it is really important that we should understood the meaning of every single output of execution plan and statistics so that we can make our queries better and increase the performance. So let’s understand the execution plan’s operations,
Following are the most common operations of SQL execution plan.
Understand the difference between Scan and Seek Operation.
Seek Operation
|
Scan Operation
|
1. Seek fetches selective rows from the table.
2. Seek touches only the required or matching row.
3. Less CPU consumption.
4. Requires less I/O component usage.
5. Executes with where clause
|
1. Scan fetches all the rows from the table
2. Scan touches every single row of the table either required or not.
3. More CPU consumption.
4. Requires more I/O component usage.
5. Executes with Select statement
|
INDEX AND TABLE ACCESS
TABLE SCAN
|
Table Scan operation only performs when table doesn't have any index on it and we need to get all the data from the table (without any filter). You can validate this by doing the following steps,
- Create a table without any index.
- Write any Select statement and Press Ctrl + L to get execution plan.
- You can see a Table scan operation has been executed.
|
INDEX SCAN
|
Index Scan operation only performs when table has a column with non-clustered index and we need to get all rows of that column. You can validate this by doing following steps,
- Create a table with Non-Clustered index.
- Write query select column_name from table . (Column where non-clustered index has been created.) Press Ctrl + L to get execution plan.
- You can see an Index scan operation operator will display in execution plan.
|
CLUSTERED INDEX SCAN
|
Clustered Index Scan operation performs when a query is written to read table data based on clustered index. Let’s check this by doing the following steps,
- Create a table with a primary key
- Write any Select statement and press Ctrl+L
- You will be able to see a Clustered Index Scan operator in the execution plan.
|
INDEX SEEK
|
Index Seek operator performs when a query is written to filter records based on column with non-clustered index, which doesn't have to read all tables and then filter. The system directly goes to index and finds required records based on filter value. Let’s do an exercise,
- Create a table with Non-Clustered index.
- Write a query select column from table where column=’value’. (Column should be the same on which we created a non-clustered index)
- You will see in Index Seek operator in the execution plan.
|
CLUSTERED INDEX SEEK
|
A Clustered Index Seek operation performs when the requirement is to get filtered data. If a table has a primary key (clustered index) and we write a query to filter some records with the help of clustered index then clustered index seek operation will perform. Let’s check this,
- Create a table with a primary key
- Write a select statement with where clause on primary key column and press Ctrl+L.
- Clustered Index Seek operation will execute and you will be able to see the same in the execution plan.
|
Key Lookup (Clustered)
|
This operation performs when we use a non-clustered indexed column to filter data, but additionally other column data is needed to satisfy the requirement. Note that non-clustered index column should be unique. In this operation, system first filters records based on non-clustered indexed column then based on these record’s primary key/clustered index fetches other related details using key lookup operation. It is really bad when key lookup percentage is greater than 50%. Let’s understand this from an example,
- Create a table Employee with some columns.
- Create a primary key on a column ( e.g. ID).
- Create a non-clustered unique index on another column.
- Now write a query with where clause is using non-clustered index column, select anycolumn from Table where Non-ClusteredIndexColumn=Value'and press Ctrl+L
- You can check this with an execution plan.
If I explain this execution plan, then in this query the first query will search a primary key on a non-clustered index basis and then search the other column data on a clustered index basis.
|
RID Lookup (Heap)
|
RID stands for row identifier. This operation performs when we use a non-clustered indexed column to filter data. Note that non-clustered index column should be unique as well for a key lookup operation. In this operation, system filters records based on non-clustered indexed column. It is really bad when key lookup percentage is greater than 50%. Let’s understand this from an example,
- Create a table without clustered index.
- Create a non-clustered index on any column which is unique as well.
- Now write a search query with where clause using non-clustered index column and press Ctrl+L
|
JOINS OPERATIONS
Joins are used to get data from more than one table. If a query has many joins then joins executes sequentially. There are three types of joins as below,
NESTED LOOPS
It is the most basic join algorithm. We normally use this join when we need related data from two tables. It has two inputs and one output. The below figure is an example which executes a nested loop join. For each row in the outer(top) input, scan the inner (bottom) input and gives output of matching rows.
In the above execution plan, a clustered index seek operation is performed on the table Customer where CustomerID is 11003, and for the mentioned CustomerID, an index seek is performed on the inner table SalesOrderHeader.
HASH MATCH
When Hash Match operation performs, it uses each row from the top input to build a hash table and each row from the bottom input to probe into the hash table, outputting all matching rows. This operation is executed when we want to get matching data from two different tables:
MERGE JOIN
Merge join is performed when matching rows from two sorted input tables exploit their sorting order. In simple words when both tables have clustered index and we use them in join to get data then a Sql optimizer chooses the merge operator to execute a query. You can have a look at the below screenshot.
SORTING AND GROUPING
SORT
Sort operator sorts the data according to the order by clause. It requires large amounts of memory to materialize the intermediate result. Sort operator in execution plan tells us that the query optimizer is sorting the data. If I run a simple select statement with order by clause then you will see a sort operation will execute. See the below screenshot,
STREAM AGGREGATE
Aggregates a pre-sorted set according the group by clause. This operator does not require buffering the intermediate result. It is physical operator to aggregate the data. This operator expects input data in sorted form in order to grouping columns. If input is in sorted form then it work in very efficient manner. Let’s see an example:
HASH MATCH (AGGREGATE)
The hash aggregate operator is also a physical operator which aggregates the data and it works by creating a hash table. It works like hash join to aggregate the data. This operator is very effective and efficient when the input data row set is not in the sorted format. It groups the results. It organizes the groups in a hash table based on some internally chosen hash function.
SUMMARY
Thus we have learned about the operations of an execution plan. The above mentioned operations are the most common operations of an execution plan. I hope now every reader knows the basic operations of execution plan and understands it very well. In the upcoming article we will learn about the query optimization and how operators can help in that. So stay tuned.