In SQL Server, statistics are used by the query optimizer to determine the most efficient way to execute a query. Statistics are created on columns in a table or indexed view and include information about the distribution of the data in the column, such as the minimum, maximum, and average values. The query optimizer uses this information to estimate the number of rows that will be returned by a query and to select the most efficient execution plan.
To check if the statistics are current, you can use the DBCC SHOW_STATISTICS command. This command displays the header, histogram, and density information for the statistics of a table or indexed view. The header information includes the name of the statistics, the date and time when the statistics were last updated, and the number of rows in the table or indexed view. The histogram information includes a list of steps, or ranges, that represent the distribution of the data in the column. The density information is a measure of the uniqueness of the data in the column.
To ensure that the statistics are current, you can use the UPDATE STATISTICS command to update the statistics on a table or indexed view. You can also set the AUTO_UPDATE_STATISTICS option to ON to have SQL Server automatically update the statistics when it determines that they are out of date. It's generally a good idea to keep the statistics up to date in order to ensure that the query optimizer has accurate information about the data and can generate the most efficient execution plans.
One benefit of having AUTO_UPDATE_STATISTICS set to ON is that it can help to improve query performance by ensuring that the optimizer has accurate statistics. Another benefit is that it can help to prevent the use of outdated statistics, which can lead to poor query performance.
However, there are also some potential drawbacks to having AUTO_UPDATE_STATISTICS set to ON. One potential drawback is that updating statistics can be resource-intensive, particularly on large tables. This can lead to increased CPU and I/O usage, which can impact the performance of other queries that are running on the server. In addition, updating statistics too frequently can also be resource-intensive, so it's important to consider the trade-off between the benefits of having up-to-date statistics and the potential performance impact of updating them.
The UPDATE STATISTICS command is used to update the statistics on a table or indexed view in SQL Server. The basic syntax of the command has not changed significantly across different versions of SQL Server, but there have been some changes to the options that are available and the default behavior of the command.
In SQL Server 2005 and later, the UPDATE STATISTICS command includes an option to specify the SAMPLE size, which determines the percentage of rows used to update the statistics. The default SAMPLE size is 10%, but you can specify a different percentage if needed.
In SQL Server 2008 and later, the UPDATE STATISTICS command includes an option to specify the RESAMPLE option, which causes the statistics to be updated using a new sample of the data. This can be useful if the data distribution has changed significantly since the last time the statistics were updated.
In SQL Server 2012 and later, the UPDATE STATISTICS command includes an option to specify the FULLSCAN option, which causes the statistics to be updated using a full scan of the data. This can be useful if the data distribution is not representative of the entire table and a full scan will provide more accurate statistics.
In SQL Server 2016 and later, the UPDATE STATISTICS command includes an option to specify the INCREMENTAL option, which causes the statistics to be updated using only the data that has been added or modified since the last time the statistics were updated. This can be more efficient than updating the statistics using a full scan or sample of the data, particularly for large tables.
In summary, the UPDATE STATISTICS command has remained relatively consistent across different versions of SQL Server, but there have been some changes to the options that are available and the default behavior of the command.
How often should I update statistics? There is no one-size-fits-all answer to this question, as the frequency with which statistics should be updated depends on various factors such as the size of the table, the volume of data changes, and the performance requirements of the system.
In general, it's a good idea to update statistics periodically to ensure that the query optimizer has accurate information about the data distribution and can generate efficient execution plans. However, updating statistics can be resource-intensive, particularly on large tables, so it's important to balance the need for up-to-date statistics with the impact on performance.
There are a few approaches you can take to determine the optimal frequency for updating statistics:
Monitor query performance: If you notice that query performance is degrading over time, it may be a good indication that the statistics are out of date and need to be updated. You can use the DBCC SHOW_STATISTICS command to check the header information and see when the statistics were last updated.
Monitor data changes: If the data in the table is frequently changing, it may be a good idea to update the statistics more frequently to ensure that the optimizer has accurate information. You can use the sys.dm_db_stats_properties DMV to check the modification_counter column, which indicates the number of data changes since the last time the statistics were updated.
Test different update frequencies: You can experiment with different update frequencies to determine the optimal frequency for your specific environment. You can use the sys.dm_db_stats_properties DMV to check the rows column, which indicates the number of rows in the table, and the modification_counter column, which indicates the number of data changes since the last time the statistics were updated. This can help you determine the rate at which the statistics become outdated and how frequently you should update them to maintain good performance.
It's important to note that the optimal frequency for updating statistics may vary depending on your specific environment and workload, so it's a good idea to monitor query performance and data changes to determine the best frequency for your system.
Here is the T-SQL code to update stats on every table in every user database. using the UPDATE STATISTICS command. Summary And Recomendations From Mike
Do Not set AUTO_UPDATE_STATISTICS option to ON -UNLESS- you have a working SQL job that is managing your statistics.
Update statistics daily, unless you have a good reason not to
Cheers Mike B
Comments