Environment
- SharePoint 2013 Enterprise with all the services enabled on the same server.
- SQL Server 2008 R2 Standard, also on the same standalone SharePoint server.
Problem
We are using a custom BCS connector to crawl data from SQL Server for SharePoint 2013 search. There are 2.1 million records in the source SQL table. The data is read in batches and supplied to the crawler (we implemented our own logic to feed records to the search crawler batch by batch).
The issue is the crawl rate. The out-of-the-box (OOTB) external content type retrieves data far faster than our custom process: the OOTB crawl completes 2.1 million items in about 8.5 hours, while crawling the same external content type through the custom BCS connector takes 10 hours. (Under heavy load, or when multiple users are logged on to the server, it sometimes takes 19 hours to crawl 1 million items.)
At this point, we could not identify the cause of the issue or locate the bottleneck.
Solution
When we compared the OOTB and custom BDC model files (.bdcm), we found one additional property, UseClientCachingForSearch, in the OOTB model. On investigating it, we found that this single property accounted for the OOTB connector's much better crawl rate.
The UseClientCachingForSearch property in a BDC model improves full-crawl speed by caching each item during enumeration. The property is also recommended when implementing incremental crawls based on change logs, because it improves incremental crawl speed as well.
When a model is generated out of the box, the UseClientCachingForSearch property is added automatically; in a custom BDC model, we have to add it ourselves, as shown in the sketch below.
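As an illustration, here is a minimal sketch of how the property can be declared on the Finder method instance in a .bdcm file. The method and parameter names (ReadList, Entities, Item List) are placeholders, not the actual names from our model:

```xml
<MethodInstance Type="Finder" ReturnParameterName="Entities"
                Default="true" Name="ReadList" DefaultDisplayName="Item List">
  <Properties>
    <!-- Marks this finder as the method the crawler uses for enumeration. -->
    <Property Name="RootFinder" Type="System.String"></Property>
    <!-- Caches each item during enumeration so the crawler does not have to
         re-fetch it individually during a full crawl. -->
    <Property Name="UseClientCachingForSearch" Type="System.String"></Property>
  </Properties>
</MethodInstance>
```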
Important Note
If BCS crawl items are larger than 30 kilobytes on average, this property should not be set, as it will lead to a significant number of cache misses and negate the performance gains. In our case, the records are well under this limit.
Output
After adding this property, our custom BCS crawl reached a crawl rate comparable to the OOTB BCS crawl. These numbers are from the development environment; we expect equally good performance in production.
Why Custom BCS Connector?
We chose a custom BCS connector for the following reasons:
- The OOTB BCS database connector has a response limit: at most 1,000,000 rows can be read through the database connector. The limit can be changed with the Set-SPBusinessDataCatalogThrottleConfig PowerShell cmdlet (see the sketch after this list), but raising it is not recommended. To deal with this threshold on the search crawler, we wrote a custom BCS connector with batch processing.
- Our external system (Teamcenter) currently holds 2.1 million records, and this may grow to 6 million over the next 4 to 5 years. To cope with this data growth, we implemented batch processing in the custom BCS connector.
- An OOTB SharePoint incremental crawl does not remove items deleted in the external system from the search index until a full crawl runs. To handle deletions during incremental crawls, we chose a custom BCS connector.
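For reference, here is a hedged sketch of how the database throttle mentioned above can be inspected with PowerShell, and how it could be raised if you accept the risk; the limits in the commented-out line are purely illustrative values, not recommendations:

```powershell
# Get the BDC service application proxy.
$bdcProxy = Get-SPServiceApplicationProxy | Where-Object {
    $_.GetType().FullName -eq "Microsoft.SharePoint.BusinessData.SharedService.BdcServiceApplicationProxy"
}

# Inspect the current row limit for database connections (1,000,000 maximum by default).
$throttle = Get-SPBusinessDataCatalogThrottleConfig -Scope Database -ThrottleType Items -ServiceApplicationProxy $bdcProxy
$throttle

# Raising the limit is possible but, as noted above, not recommended.
# Set-SPBusinessDataCatalogThrottleConfig -Identity $throttle -Maximum 3000000 -Default 2000000
```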
The research, findings, and lessons learned from this project should serve as a useful knowledge base for everyone.
Feel free to get in touch with me if you have any questions about custom BCS connectors or SharePoint 2013 search.