Introduction
Every customer/user always wants a fast response to their data retrieval process. So, we need to design a good database that provides the best performance during data manipulation which results in the best performance of an application.
As a developer, we know any SQL query can be written in multiple ways but we should always follow the best practices and techniques to achieve better query performance. Here are some of the tips for that.
- Owner/Schema Name
Always prefix object names (i.e. table name, stored procedure name, etc.) with its owner/schema name.
Reason: If the owner/schema name is not provided, SQL Server’s engine tries to find it in all schemas until the object is found. SQL Server engine will not search for the tables outside of its owner/schema if the owner/schema name is provided.
- The * Operator
Do not use the * operator in your SELECT statements. Instead, use column names.
Reason: SQL Server scans for all column names and replaces the * with all the column names of the table(s) in the SQL SELECT statement. Providing column names avoids this search-and-replace, and enhances the performance.
- Nullable Columns
Do not use NOT IN when comparing with nullable columns. Use NOT EXISTS instead.
Reason: When NOT IN is used in the query (even if the query doesn’t return rows with null values), SQL Server will check each result to see if it is null or not. Using NOT EXISTS will not do the comparison with nulls.
- Table Variables and Joins
Do not use table variables in joins. Use temporary tables, CTEs (Common Table Expressions), or derived tables in joins instead.
Reason: Even though the table variables are very fast and efficient in a lot of situations, the SQL Server engine sees it as a single row. Due to this, they perform horribly when used in joins. CTEs and derived tables perform better with joins compared to table variables.
- Stored Procedure Names
Do not begin your stored procedure’s name with sp_.
Reason: When the stored procedure is named sp_ or SP_, SQL Server always checks in the system/master database even if the Owner/Schema name is provided. Providing a name withoutSP_ to a stored procedure avoids this unnecessary check in the system/master database in SQL Server.
- Use SET NOCOUNT ON
Use SET NOCOUNT ON with DML operations.
Reason: When performing DML operations (i.e. INSERT, DELETE, SELECT
, and UPDATE), SQL Server always returns the number of rows affected. In complex queries with a lot of joins, this becomes a huge performance issue. Using SET NOCOUNT ON will improve performance because it will not count the number of rows affected.
- Avoid Using GROUP BY, ORDER BY, and DISTINCT
Avoid using GROUP BY, ORDER BY, and DISTINCT as much as possible
Reason: When using GROUP BY, ORDER BY, or DISTINCT, SQL Server engine creates a work table and puts the data on the work table. After that, it organizes this data in the work table as requested by the query, and then it returns the final result.
Use GROUP BY, ORDER BY, or DISTINCT in your query only when absolutely necessary.
I’m highlighting some of them below,
- Use EXISTS instead of IN to check the existence of data.
- Avoid * in a SELECT statement. Give the name of the columns which you require.
- Choose appropriate Data Type. E.g. To store strings use varchar in place of the text data type. Use text data type, whenever you need to store large data (more than 8000 characters).
- Avoid nchar and nvarchar if possible since both the data types takes just double memory as char and varchar.
- Avoid NULL in a fixed-length field. In case of requirement of NULL, use variable-length (varchar) field that takes less space for NULL.
- Avoid Having Clause. Having clause is required if you further wish to filter the result of aggregation.
- Create Clustered and Non-Clustered Indexes.
- Keep clustered index small since the fields used in a clustered index may also be used in a non-clustered index.
- Most selective columns should be placed leftmost in the key of a non-clustered index.
- Drop unused Indexes.
- Better to create indexes on columns that have integer values instead of characters. Integer values use less overhead than character values.
- Use joins instead of sub-queries.
- Use WHERE expressions to limit the size of result tables that are created with joins.
- Use TABLOCKX while inserting into a table and TABLOCK while merging.
- Use WITH (NOLOCK) while querying the data from any table.
- Use SET NOCOUNT ON and use TRY- CATCH to avoid deadlock condition.
- Avoid Cursors since cursors are very slow in performance.
- Use Table variable in place of Temp table. Use of Temp tables required interaction with the TempDb database which is a time-taking task.
- Use UNION ALL in place of UNION if possible.
- Use Schema name before SQL objects name.
- Use Stored Procedure for frequently used data and more complex queries.
- Keep transaction as small as possible since transaction lock the processing tables data and may result into deadlocks.
- Avoid prefix “sp_” with user-defined stored procedure name because SQL server first searches the user-defined procedure in the master database and after that in the current session database.
- Avoid use of Non-correlated Scalar Sub Query. Use this query as a separate query instead of part of the main query and store the output in a variable, which can be referred to in the main query or later part of the batch.
- Avoid Multi-statement Table-Valued Functions (TVFs). Multi-statement TVFs are more costly than the inline TVFs.
Conclusion
Complex and large applications usually create complex and complicated requirements. That leads us to write complex SQL queries. These simple changes to your SQL Server queries will make a huge difference in the response times. Thank you for reading my blog. I hope it was helpful.