We all know that SQL Server is the heart of the SharePoint product. Two synchronization objects that affect SQL Server's overall performance are locks and latches. In simple words we can say that locks control the state of data to ensure transactional integrity whereas latches control the location of data, in memory or on disk. These are quite complex constructs and need an in-depth study to understand them. This article is restricted to the understanding of the two latches.
- Page Latch: This type of wait occurs on a page when it is already available in memory and the worker thread needs to wait for a page to become available.
- Page IO Latch: This type of wait occurs due to physical I/O, such as when a page needs to be made available in the buffer pool for reading or writing and SQL Server needs to retrieve it from disk.
To investigate performance problems in SharePoint, there are a few columns in "sys.dm_db_index_operational_stats" as follows that are important:
- page_latch_wait_count: Cumulative number of times the Database Engine waited, because of latch contention
- page_latch_wait_in_ms: Cumulative number of milliseconds the Database Engine waited, because of latch contention
- page_io_latch_wait_count : Cumulative number of times the Database Engine waited on an I/O page latch
- page_io_latch_wait_in_ms: Cumulative number of milliseconds the Database Engine waited on a page I/O latch
Let's run the following query on SharePoint Administrator and the Content Database. First I am generating some page IO latch that waits by uploading a 50Mb file.
SELECT OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name,
i.name as index_name, page_io_latch_wait_count, page_io_latch_wait_in_ms
,CAST(1. * page_io_latch_wait_in_ms / NULLIF(page_io_latch_wait_count ,0) AS decimal(12,2)) AS page_io_avg_lock_wait_ms
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
WHERE OBJECTPROPERTY(ios.object_id,'IsUserTable') = 1
ORDER BY page_io_avg_lock_wait_ms DESC
In the Content Database, I get the following results:
dbo.AllLists | AllLists_PK | 4 | 1506 | 376.5 |
dbo.AllWebParts | PageUrlID_FK | 3 | 946 | 315.33 |
dbo.EventBatches | EventBatches_PK | 1 | 261 | 261 |
dbo.DocStreams | DocStreams_CI | 1 | 249 | 249 |
dbo.SolutionResourceUsageWindowed | SolutionResourceUsageWindowed_Id | 1 | 243 | 243 |
dbo.Features | Features_PK | 2 | 402 | 201 |
dbo.EventCache | EventCache_Id | 3 | 594 | 198 |
dbo.SolutionResourceUsageLog | SolutionResourceUsageLog_Id | 1 | 195 | 195 |
dbo.AllWebParts | ListIdUserId_NCI | 2 | 370 | 185 |
dbo.AllSites | Sites_Id | 1 | 179 | 179 |
dbo.AllLists | Lists_FullText | 1 | 156 | 156 |
dbo.AllDocs | AllDocs_ParentId | 9 | 1019 | 113.22 |
And in the Admin Database, I get the following results:
dbo.UserInfo | UserInfo_PK | 6 | 4474 | 745.67 |
dbo.AllLists | AllLists_PK | 18 | 7133 | 396.28 |
dbo.AllWebParts | ListIdUserId_NCI | 17 | 6015 | 353.82 |
dbo.AllDocs | Docs_IdLevelUnique | 14 | 4285 | 306.07 |
dbo.AllDocs | AllDocs_Url | 23 | 5696 | 247.65 |
dbo.DatabaseInformation | DatabaseInformation_Name | 1 | 244 | 244 |
dbo.AllUserData | AllUserData_ParentId | 53 | 10577 | 199.57 |
dbo.AllLinks | Links_Forward | 8 | 1534 | 191.75 |
dbo.AllWebs | Webs_PK | 6 | 982 | 163.67 |
dbo.AllDocs | AllDocs_ParentId | 20 | 2992 | 149.6 |
dbo.UserInfo | UserInfo_SID | 6 | 839 | 139.83 |
dbo.AllUserData | AllUserData_PK | 10 | 1079 | 107.9 |
dbo.Features | Features_PK | 6 | 641 | 106.83 |
dbo.EventCache | EventCache_Id | 59 | 6025 | 102.12 |
dbo.WebMembers | WebMembers_PK | 6 | 606 | 101 |
Similarly, you can run the following Page Latch query and check the results.
SELECT OBJECT_SCHEMA_NAME(ios.object_id) + '.' + OBJECT_NAME(ios.object_id) as table_name
,i.name as index_name
,page_latch_wait_count
,page_latch_wait_in_ms
,CAST(1. * page_latch_wait_in_ms / NULLIF(page_latch_wait_count ,0) AS decimal(12,2)) AS page_avg_lock_wait_ms
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL) ios
INNER JOIN sys.indexes i ON i.object_id = ios.object_id AND i.index_id = ios.index_id
WHERE OBJECTPROPERTY(ios.object_id,'IsUserTable') = 1
ORDER BY 5 DESC
So, we get these results. What's next? How does it help in investigating performance problems. Looking at the time required to wait for latches, you can decide if you need to increase the memory or change the buffer pool settings. Are the SQL files saved on a faster drive or if on SANs? Are these SANs fast enough? Probably with SQL Server 2014, in-memory procedures and column store indexes, in the future there will be no need to look at these constructs again, but for that we need to wait for the next SharePoint version.