Introduction
Query optimization is a common task performed by database administrators and application designers in order to tune the overall performance of the database system. Even if you have a powerful infrastructure, the performance can be significantly degraded by inefficient queries.
When the application queries the database system, it is first parsed by the SQL Parser sub-component which verifies the syntax and converts the query into a relational algebraic expression (the internal representation). Afterwards, another sub-component named Query Optimizer constructs a query execution plan based on several rules and the cost of executing the query. The cost is calculated based upon the DBMS-maintained statistics for each joined table in the query. Because the plan is typically large, the Query Optimizer often evaluates only a subset and assumes that the cost of the subset is representative of the whole. Unfortunately, there are times that the Query Optimizer does not choose the best execution plan because the subset does not accurately represent the whole or because the statistics might be missing or outdated. Thus, it is beneficial to have a little insight into query design in order to improve optimization of the queries.
Analyzing an Execution Plan
In order to analyze an execution plan in SQL Server, we can take advantage of the Display Execution Plan and Display Estimated Execution Plan features in Management Studio by using some SET options such as SHOWPLAN_XML and SHOWPLAN_ALL. Below I will lead you through this concept using the AdventureWorks database.
Let us define the query as shown in Listing 1 to report the total list price by carrier tracking number and sales order.
- select od.CarrierTrackingNumber, od.SalesOrderID,sum(p.ListPrice)
- from Sales.SalesOrderDetail od join Production.Product p on od.ProductID=p.ProductID
- where od.CarrierTrackingNumber is not null
- group by od.CarrierTrackingNumber, od.SalesOrderID
- order by od.CarrierTrackingNumber;
Listing 1
Now we will execute this query with the Include Actual Execution Plan option enabled. Figure 1 depicts the Execution Plan.
Figure 1
Each icon represents a logical and physical operation and the arrows represent the flow of data between operations. The diagram is read from left to right and from top to bottom.
Following this logic, let us analyze the proposed execution plan. The database engine performs two Cluster Index Scan operations: the first on Production.Product table and its PK_Product_ProductID clustered index (you can see details about this operation by clicking on the operation icon as shown on the Figure 2) and the other on Sales.SalesOrderDetail table and its PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID clustered index. It performs a Hash Match operation to both outputs in order to join them. Then, it sorts the output of the join operation and applies a Stream Aggregate operation to group by CarrierTrackingNumber and SalesOrderID fields on the table Sales.SalesOrderDetail. Finally, the Select operation is applied, which is not a physical or logical operation but rather a placeholder operation that represents the overall query result and cost.
Figure 2
In this execution plan, the most costly operation is the Sort. In this case, you cannot do anything to further refine this query.
Sometimes we find inefficient queries executing full table scans that can be optimized by creating an index associated with the predicate of the query, although such index creation is not always possible.
SQL Server Performance Tuning
There are several strategies for tuning the performance in SQL Server:
Problem: Low Memory Condition detected and Low Hit Cache Rate results in page faults.
Solution: Increment the total RAM memory to improve the cache hit rate, that is, the number of data pages in memory.
Problem: Full Table Scan found that is used only to find specifics rows.
Solution: Create and maintain indexes.
Problem: Excess Paging detected.
Solution: Add a RAID I/O subsystem or faster disk drives to your server.
Problem: Low Query Execution detected because data tables are very large.
Solution: Partition large data sets and create indexes. This reduces I/O contention and improves parallel operations.
Problem: Low Query Execution detected.
Solution: Tune the SQL queries and programs written in PL/SQL using the techniques explained.
Conclusion
In this article I have covered the main concepts of query optimization by explaining how to analyze the execution plan associated with SQL Server queries and and then how to tune the performance of your SQL Server database using some provided strategies.