Introduction
Most developers encounter the problem of how to effectively maintain a production database, and the problem feels even bigger if you are not a professional DBA but have been handed responsibility for the database anyway. A poorly maintained database can lead to poor application performance and data loss. So here are a few tips.
Data and log file management
Data and log files should be kept separate and isolated from any other application that creates or updates files, since sharing the disks leads to file fragmentation. Excessive file fragmentation in data files slows down query execution; excessive file fragmentation in log files slows down backups and rollbacks.
Internally, transaction logs are broken down into smaller, more granular chunks called VLFs (Virtual Log Files). To see the number of VLFs in a transaction log file, look at the number of rows returned by DBCC LOGINFO. If it is more than 50, I would recommend fixing it and adjusting auto-grow so that this does not happen as frequently. To get rid of all of the excessive VLFs, shrink the fragmented log file and then add a single new, clean chunk back to the transaction log.
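A minimal sketch of that fix, assuming a database named YourDatabase whose log file has the logical name YourDatabase_log (both names hypothetical; the target size and growth increment are illustrative only):

USE YourDatabase;
DBCC LOGINFO;  -- one row per VLF, so the row count is the VLF count
-- Shrink the log to release the fragmented chunk...
DBCC SHRINKFILE (YourDatabase_log, 1);
-- ...then grow it back in one step, with a fixed growth increment for future auto-grows
ALTER DATABASE YourDatabase
    MODIFY FILE (NAME = YourDatabase_log, SIZE = 8GB, FILEGROWTH = 512MB);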
Create only ONE transaction log file. Isolating the transaction log can allow the disk head to be ready for the next write by not having other files contend for the physical disk resource. Defrag the disk(s) on which transaction logs reside to eliminate external fragmentation of the transaction log. Place the log on mirrored disks to increase the chances of recovering data and minimize data loss.
Data files should be of reasonable size depending on the potential for additional data being added to the database in the short term.
For example, if the initial size of the data is 20 GB and you know that over the next 6 months it will grow to nearly 45 GB, it is worth creating the data file with an initial size of 65-70 GB instead of growing it again and again to meet the requirements. Similarly, log files should be sized according to transaction processing volume, the frequency of changes, the recovery model, the frequency of backups, and whether replication is used.
Monitor the file sizes at various times, and leave auto-grow enabled only as just-in-case protection so files can still grow if they need to. Auto-grow leads to file fragmentation and is a time-consuming process that can kick in at unpredictable times. Set auto-grow to a specific value rather than a percentage, to bound the time and space needed to perform each auto-grow.
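For instance, a new database might be created with explicit initial sizes and fixed growth increments instead of the default percentage growth (all names, paths, and sizes below are hypothetical):

CREATE DATABASE YourDatabase
ON PRIMARY
    (NAME = YourDatabase_data, FILENAME = N'D:\Data\YourDatabase.mdf',
     SIZE = 70GB, FILEGROWTH = 1GB)
LOG ON
    (NAME = YourDatabase_log, FILENAME = N'E:\Logs\YourDatabase_log.ldf',
     SIZE = 8GB, FILEGROWTH = 512MB);

For an existing database, the same growth settings can be applied with ALTER DATABASE ... MODIFY FILE.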
Do not shrink the database or use shrink in a regular maintenance plan. Shrink is used to reduce the size of a data or log file, but it is a resource-heavy process that causes a massive amount of logical scan fragmentation in data files and leads to poor performance.
Do not use auto-shrink: it runs every 30 minutes in the background and tries to shrink every database where the auto-shrink database option is set to true. It is an unpredictable process in that it only shrinks databases with more than 25% free space. It uses many resources and causes performance-dropping fragmentation. Switch off auto-shrink with:
ALTER DATABASE databasename SET AUTO_SHRINK OFF;
Index fragmentation
Types of fragmentation in the data file
- Internal Fragmentation: Fragmentation within individual data and index pages, where there is substantial free space on a page. It generally results from data modifications (inserts, updates, deletes) and a mismanaged fill factor. Depending on the table/index schema and application characteristics, this space can never be reused once it is created and can lead to ever-increasing amounts of unusable space in the database. Wasted space on data/index pages means more pages are needed to hold the same amount of data, so the database takes more disk space and queries need more I/Os to read the same amount of data. All these extra pages also occupy more space in the data cache, taking more server memory.
- Logical Scan Fragmentation: Fragmentation within index or table structures consisting of pages. It is caused by an operation called a page split, which occurs when a record has to be inserted on a specific index page (according to the index key definition) but there is not enough space on the page for the data being inserted. The page is split in half and about 50% of the records are moved to a newly allocated page. This new page is usually not physically contiguous with the old one and is therefore called fragmented.
- Extent Scan Fragmentation: In a well-ordered table, the next page will either be on the same extent or the next extent. As extent fragmentation increases, a "next page" may be found on an extent much further down the extent chain instead of the next extent in the sequence. This means that more switching of extents has to happen to read each page in turn.
Fragmentation can sometimes be prevented by changing the table/index schema; where that is not possible, it is removed by rebuilding or reorganizing an index.
Rebuilding an index involves creating a new copy of the index and then dropping the old one. Reorganizing uses an in-place algorithm to compact and defragment the index. It requires only 8KB of additional space to run and always runs online.
Figure: Newly created index pages with no fragmentation.
Figure: Index pages showing internal and logical scan fragmentation after random inserts, updates, and deletes.
SQL Server 2000 commands
- DBCC SHOWCONTIG: Displays fragmentation information for the data and indexes of the specified table.
- DBCC DBREINDEX: Rebuilds one or more indexes for a table in the specified database.
- DBCC INDEXDEFRAG: Defragments clustered and secondary indexes of the specified table or view (see the combined sketch after this list).
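A minimal sketch of using these commands, assuming a table named dbo.Orders with an index IX_Orders_CustomerID (both names hypothetical):

DBCC SHOWCONTIG ('dbo.Orders');                              -- report fragmentation for the table
DBCC INDEXDEFRAG (0, 'dbo.Orders', 'IX_Orders_CustomerID');  -- online, in-place defragmentation (0 = current database)
DBCC DBREINDEX ('dbo.Orders', 'IX_Orders_CustomerID', 90);   -- full rebuild with a fill factor of 90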
SQL Server 2005 commands
- sys.dm_db_index_physical_stats: Dynamic management function that returns fragmentation information for the data and indexes of the specified table or view (replaces DBCC SHOWCONTIG).
- ALTER INDEX ... REBUILD: Rebuilds one or more indexes for a table (replaces DBCC DBREINDEX).
- ALTER INDEX ... REORGANIZE: Reorganizes, i.e. defragments, an index in place and always runs online (replaces DBCC INDEXDEFRAG).
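A minimal sketch of the SQL Server 2005 approach, again using the hypothetical dbo.Orders table and IX_Orders_CustomerID index; the 30% threshold below is a commonly used rule of thumb, not a hard rule:

-- Check average fragmentation for every index on the table
SELECT index_id, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N'dbo.Orders'), NULL, NULL, 'LIMITED');

-- Light fragmentation: reorganize in place (always online)
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REORGANIZE;

-- Heavy fragmentation (e.g. above ~30%): rebuild the index
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;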
Statistics
A query plan describes how a query should be executed: which tables and indexes to use and which operations to perform on them to obtain the results. It is produced by the Query Processor, which is part of SQL Server. One of the important inputs into this is statistics, which describe the distribution of data values for columns within a table or index. Statistics are generated by reading the table/index data and determining the data distribution for the relevant columns. Statistics need to be accurate and up-to-date to be useful to the Query Processor.
Full-scan statistics are built by scanning all the data values for a particular column; sampled-scan statistics are based on a user-specified percentage of the data. If the distribution of values in a column is nearly even, then a sampled scan should be good enough, and it makes creating and updating the statistics faster than a full scan. By default the AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS options are true and statistics are automatically created and maintained by the database.
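These options can be confirmed or switched back on per database (YourDatabase is a hypothetical name):

ALTER DATABASE YourDatabase SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE YourDatabase SET AUTO_UPDATE_STATISTICS ON;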
To update statistics manually, use UPDATE STATISTICS, which operates on specific sets of statistics, or the sp_updatestats stored procedure, which updates all statistics that are out of date (in SQL Server 2000, it updates all statistics, regardless of age).
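A minimal sketch of both manual approaches, assuming the hypothetical dbo.Orders table:

UPDATE STATISTICS dbo.Orders WITH FULLSCAN;  -- refresh all statistics on one table from a full scan
EXEC sp_updatestats;                         -- update all out-of-date statistics in the current database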
Index rebuilds automatically update statistics with a full scan. If statistics are updated manually after an index rebuild, it's possible to end up with less accurate statistics. This can happen if a sampled scan from the manual update overwrites the full scan generated by the index rebuild. On the other hand, reorganizing an index does not update statistics at all. If a more complex route is chosen for fragmentation removal then statistics should be maintained as below:
- Analyze indexes and determine which indexes to operate on and how to do the fragmentation removal.
- For all indexes that were not rebuilt, update the statistics.
- Update statistics for all of the non-indexed columns (a sketch follows this list).
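A minimal sketch of that routine for a single index, assuming the hypothetical dbo.Orders table and an index IX_Orders_CustomerID that was reorganized rather than rebuilt:

-- Reorganize does not touch statistics, so refresh them manually afterwards
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REORGANIZE;
UPDATE STATISTICS dbo.Orders IX_Orders_CustomerID WITH FULLSCAN;

-- Then refresh the column (non-index) statistics on the table
UPDATE STATISTICS dbo.Orders WITH FULLSCAN, COLUMNS;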
Corruption detection
The majority of corruption is caused by hardware and power failure. SQL Server has a mechanism to detect corrupt pages. It involves storing a couple of bits from every sector of the page and writing a specific pattern in their place (this happens just before the page is written to disk). If the pattern is not the same when the page is read back in, SQL Server knows the page was "torn" and raises an error.
In SQL Server 2005 and later, a more comprehensive mechanism called page checksums is available that can detect any corruption on a page. This involves writing a whole-page checksum on the page just before it is written out and then testing it when the page is read back in, just as for torn-page detection. After enabling page checksums, a page needs to be read into the buffer pool, changed in some way, and then written out to disk again before it is protected by a page checksum.
To enable torn-page detection for SQL Server 2000:
ALTER DATABASE databasename SET TORN_PAGE_DETECTION ON;
To enable page checksums:
ALTER DATABASE databasename SET PAGE_VERIFY CHECKSUM;
Use the DBCC CHECKDB command to read all pages in the database, thus causing any page checksums or torn-page detection to be verified. Set up alerts to know when users encounter corruption problems when running queries.
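A minimal sketch, using a hypothetical database name; the alert uses the msdb.dbo.sp_add_alert procedure to notify on severity-24 errors, the severity level raised for corruption detected at run time:

DBCC CHECKDB (YourDatabase) WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- SQL Server Agent alert for severity-24 (fatal hardware/corruption) errors
EXEC msdb.dbo.sp_add_alert
    @name = N'Severity 24 errors',
    @severity = 24;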
Backups
When corruption or another disaster occurs, the most effective way to recover is to restore the database from backups. Choose a backup frequency that reflects how much data or work your business is comfortable losing. A full database backup can be taken using the BACKUP DATABASE command. For added protection, use the WITH CHECKSUM option, which verifies the page checksums of pages being read and calculates a checksum over the entire backup. To verify the integrity of backups, use the RESTORE VERIFYONLY command. If the WITH CHECKSUM option was used when the backup was created, running the verification command will check that the backup checksum is still valid, as well as re-check all the page checksums of pages within the backup.
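A minimal sketch, with a hypothetical database name and backup path:

BACKUP DATABASE YourDatabase
    TO DISK = N'D:\Backups\YourDatabase_full.bak'
    WITH CHECKSUM, INIT;

RESTORE VERIFYONLY
    FROM DISK = N'D:\Backups\YourDatabase_full.bak'
    WITH CHECKSUM;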
The ultimate in recoverability comes from log backups, which are only available in the FULL or BULK_LOGGED recovery models and contain all the log records generated since the previous log backup. Maintaining a set of log backups alongside periodic full database backups provides an unlimited number of points in time to recover to, including up-to-the-minute recovery. The trade-off is that the transaction log will continue to grow unless it is freed by taking a log backup.
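A minimal sketch of switching to the FULL recovery model and taking a log backup (names and paths hypothetical; a full backup must exist before the first log backup can be taken):

ALTER DATABASE YourDatabase SET RECOVERY FULL;

BACKUP LOG YourDatabase
    TO DISK = N'D:\Backups\YourDatabase_log.trn'
    WITH CHECKSUM;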
Memory management
By default, SQL Server will try to grab all the memory available to its process, which can result in sqlserver.exe using too much memory and slowing the system down. To overcome this problem, set "Maximum server memory" to an appropriate value based on the system configuration and the other applications running.
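A minimal sketch; the limit is given in MB and the value below is purely illustrative and should be tuned to the machine:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 8192;  -- illustrative value
RECONFIGURE;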
Processor management
In many cases, the Windows Server hosting the database also runs backup/restore or other processes, which can lead to slow query processing and I/O bottlenecks. To help SQL Server compete for CPU time, its server priority can be increased.
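Assuming this refers to SQL Server's own 'priority boost' configuration option (rather than Windows process priority set through Task Manager), a minimal sketch:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'priority boost', 1;  -- takes effect after the SQL Server service restarts
RECONFIGURE;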