Statistics are the magic ingredient that helps the query optimizer make its best guess when generating an execution plan. Keeping statistics as accurate as possible gives the optimizer the information it needs to calculate estimated costs and cardinality, which in turn lets it allocate resources, such as memory grants, properly. In this write-up, I am not going to go into what statistics are and how they are used; there are many great resources out there to read. I am, however, going to tell you about a trace flag that can help you keep statistics on your large tables up to date, and I will explain why your statistics may not be as current as you would like.
This information applies to SQL Server 2008 R2 SP1 up to and including SQL Server 2014. For those who are lucky enough to be running SQL Server 2016 or higher (with database compatibility level 130 or higher), this behavior is now enabled by default. In recent discussions, I have found that many database administrators are unaware of its existence.
If you are using auto-update statistics on your database (which, by the way, is a best practice), then your statistics are already being updated on a regular basis as your data changes. By that, I mean SQL Server automatically triggers a process to update statistics once a threshold percentage of rows has changed. The auto-update process runs when a query uses statistics that may be out of date.
According to MSDN, the optimizer determines whether statistics are out of date by counting the modifications since the last statistics update, that is, the inserts, updates, deletes, and merges that have occurred on the table. As you can imagine, for a larger table it can take a long time to reach that percentage of change, which can lead to subpar estimates in the meantime. Imagine a table with 5 million rows: roughly 1 million rows would have to change before auto-update statistics is triggered. This is especially painful in environments that do not routinely run an UPDATE STATISTICS WITH FULLSCAN maintenance job.
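You can see how close a table is to that threshold yourself. The sketch below (assuming SQL Server 2008 R2 SP2, 2012 SP1, or later, where the `sys.dm_db_stats_properties` DMV is available, and using a placeholder table name) compares the modification counter against the classic 20% + 500 rows threshold:

```sql
-- Sketch: how close are a table's statistics to the classic
-- 20% of rows + 500 auto-update threshold?
-- 'YourLargeTable' is a placeholder name.
SELECT  OBJECT_NAME(s.object_id)   AS table_name,
        s.name                     AS stats_name,
        sp.last_updated,
        sp.rows,
        sp.modification_counter,            -- changes since last update
        (sp.rows * 0.20) + 500     AS classic_threshold
FROM    sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE   s.object_id = OBJECT_ID('dbo.YourLargeTable');
```

When `modification_counter` sits well below `classic_threshold` for weeks on end, that is a sign the table is too large for the default algorithm to keep its statistics fresh.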
There is a trace flag available that will assist in keeping the stats up to date on large tables. Currently, the algorithm used to automatically update statistics kicks in at 20% of the table's rows plus 500 rows. Trace flag 2371 changes this to a sliding scale, as shown below. Using this trace flag dramatically increases the frequency with which statistics updates occur on your larger tables, which in turn gives the optimizer much better estimates to work with.
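To get a feel for the difference, the sliding scale under trace flag 2371 is commonly approximated as SQRT(1000 * rows) once a table passes roughly 25,000 rows (an approximation, not the documented internals). This illustration compares the two thresholds for a few table sizes:

```sql
-- Illustration only: classic threshold vs. the commonly cited
-- SQRT(1000 * rows) approximation of the trace flag 2371 sliding scale.
SELECT  n.table_rows,
        (n.table_rows * 0.20) + 500   AS classic_threshold,
        SQRT(1000.0 * n.table_rows)   AS tf2371_threshold_approx
FROM    (VALUES (100000),
                (1000000),
                (5000000),
                (100000000)) AS n(table_rows);
```

For the 5-million-row table from the earlier example, the approximation works out to roughly 70,000 changed rows instead of 1,000,500, so the statistics refresh far sooner.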
Figure 1. Sliding-scale auto-update thresholds under trace flag 2371. (Graph source)
Along with this trace flag, it is recommended to turn on the Asynchronous Auto Update Statistics setting on databases that have large tables. I have a blog post that explains the difference between asynchronous and synchronous statistics updates. Because the update-statistics process will be triggered more frequently, users may experience slowness or even timeouts while waiting for it to complete. With asynchronous statistics updates, users' queries continue to run as normal: a query uses the old statistics while SQL Server updates them in the background, and once the update completes, SQL Server automatically makes the new statistical values available to queries.
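Turning this on is a per-database setting (database name is a placeholder here). Note that the asynchronous option only has an effect when auto-update statistics itself is enabled:

```sql
-- Enable asynchronous auto-update of statistics for one database.
-- 'YourDatabase' is a placeholder; AUTO_UPDATE_STATISTICS must be ON
-- for the ASYNC option to do anything.
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```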
If you are in an environment that is not yet on SQL Server 2016 or higher and you have large tables, consider enabling this trace flag. It’s a very easy fix, although making it permanent via a startup parameter does require a service restart.
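For reference, a sketch of how it is typically enabled. You can turn the flag on globally for the running instance right away; to make it survive restarts, add -T2371 as a startup parameter (for example via SQL Server Configuration Manager), which is where the service restart comes in:

```sql
-- Enable trace flag 2371 globally for the running instance.
-- Takes effect immediately, but does not persist across a restart;
-- add -T2371 as a startup parameter to make it permanent.
DBCC TRACEON (2371, -1);

-- Confirm the flag is active:
DBCC TRACESTATUS (2371);
```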