Introduction
Hi guys, if you didn't get the chance to go through my previous blog, please click
here. Today, let's look at the high-level points, and the different approaches to handle them, in the archival process for Lists/Libraries that have grown beyond the List View Threshold and for the entire site collections that hold them.
If the whole site collection needs to be archived, simply migrate it with a suitable tool: for example ShareGate for Online to Online, SPMT for On-Prem to Online, or Metalogix for On-Prem + SQL DB to Online. Once all site collections and their related objects have been migrated, use PowerShell to lock the source site collections as read-only. The related objects include:
- Classic and Modern Lists
- Classic and Modern Libraries
- Related InfoPath Customizations
- Related PowerApps Customizations
- Related Workflows or Flows
- Related Permissions
- Related Navigations
- Related Large Lists
- Related Sub sites
Now let's talk about specific Lists/Libraries within a Site Collection. Here is the overall summary:
Business Requirement:
To provide the most suitable archival solutions for ABCM Corp's SharePoint Online Lists/Libraries, in both Modern and Classic environments, that hold records/items/documents beyond the threshold limit in each Site Collection.
SharePoint Online Site Collection List/Library Inventory Report Generation
Use a PowerShell ForEach loop to go through every Site Collection and report all Lists/Libraries whose Item Count exceeds the List View Threshold of 5000.
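The inventory step boils down to a filter over (site, list, item count) rows. Here is a minimal Python sketch of that filter, assuming the rows have already been exported from the tenant (for instance by the PowerShell loop). `ListInfo`, `lists_over_threshold`, and the sample URLs are hypothetical names made up for illustration, not part of any SharePoint API.

```python
from dataclasses import dataclass

LIST_VIEW_THRESHOLD = 5000  # SharePoint List View Threshold mentioned above


@dataclass(frozen=True)
class ListInfo:
    """One row of the inventory report (hypothetical shape)."""
    site_url: str
    title: str
    item_count: int


def lists_over_threshold(inventory, threshold=LIST_VIEW_THRESHOLD):
    """Return the lists whose item count exceeds the threshold, largest first."""
    return sorted(
        (entry for entry in inventory if entry.item_count > threshold),
        key=lambda entry: entry.item_count,
        reverse=True,
    )


if __name__ == "__main__":
    # Sample data only; real rows would come from the inventory export.
    sample = [
        ListInfo("https://example.invalid/sites/finance", "AR Reports", 12500),
        ListInfo("https://example.invalid/sites/finance", "Invoices", 4200),
        ListInfo("https://example.invalid/sites/hr", "Timesheets", 5000),
        ListInfo("https://example.invalid/sites/hr", "Leave Requests", 5001),
    ]
    for entry in lists_over_threshold(sample):
        print(entry.site_url, entry.title, entry.item_count)
```

A list sitting exactly at 5000 items is not reported, because the condition is a strict "greater than".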
Challenges to be Addressed:
- To Archive list items/documents/pictures based on user-defined data filters.
- To Extract only the desired content based on various filter criteria (e.g., modified date, created by, stale libraries/lists, etc.), as SharePoint content is voluminous and structured.
- To Export SharePoint content that has been recently modified since the previous export/archive session in order to avoid exporting the full content repeatedly.
- The Exported content and metadata must be ready for automation using other tools/applications without the need for manual intervention, such as modifying or cleansing the metadata and/or the content.
- To Maintain the same source folder structure and version history when exporting data from the SharePoint list/document library to manage and maintain older versions.
- To Backup SharePoint content in file shares or file servers and migrate contents back to new SharePoint libraries/lists in a new server after de-duplication and metadata cleansing to avoid redundancy.
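The filter-based and incremental export requirements above can be sketched in a few lines of Python. This is only an illustration under assumptions: items are plain dictionaries with a timezone-aware `Modified` value, and `select_for_export` and `extra_filter` are hypothetical names, not calls from a real export tool.

```python
from datetime import datetime, timezone


def select_for_export(items, last_export, extra_filter=None):
    """Pick items modified after the previous export session.

    items        -- iterable of dicts with a "Modified" datetime (assumed shape)
    last_export  -- datetime of the previous export/archive session
    extra_filter -- optional callable for user-defined criteria (e.g. created by)
    """
    selected = []
    for item in items:
        if item["Modified"] <= last_export:
            continue  # already covered by the previous export
        if extra_filter is not None and not extra_filter(item):
            continue  # does not match the user-defined filter
        selected.append(item)
    return selected


if __name__ == "__main__":
    last_run = datetime(2023, 1, 1, tzinfo=timezone.utc)
    items = [
        {"ID": 1, "Modified": datetime(2022, 12, 1, tzinfo=timezone.utc), "Author": "ann"},
        {"ID": 2, "Modified": datetime(2023, 2, 1, tzinfo=timezone.utc), "Author": "bob"},
        {"ID": 3, "Modified": datetime(2023, 3, 1, tzinfo=timezone.utc), "Author": "ann"},
    ]
    changed = select_for_export(items, last_run, lambda i: i["Author"] == "ann")
    print([i["ID"] for i in changed])
```

Storing the timestamp of each export session and passing it in as `last_export` is what keeps the next run from exporting the full content again.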
List/Libraries Archival Process High-level Points:
- We have the below scenarios:
- PowerApps Customized Modern Lists
- Non-PowerApps Classic Customized Lists
- Libraries with InfoPath Customizations
- Modern Libraries
- For the PowerApps Customized Modern Lists, the effective limit is 2000 because of the Power Apps delegation limit [the data row limit for non-delegable queries in Power Apps defaults to 500 and can be raised to a maximum of 2000], while the other Lists/Libraries have a List View Threshold limit of 5000.
- For both threshold cases, we must come up with Archival Solutions. Here are the major high-level steps:
- Create an Archival Site Collection [for example: https://abcdefgcorp.sharepoint.com/sites/archive].
- Create an exact mirror image/replica of the required Source Lists in the Target/Archive Site Collection, with the below attributes:
  - List Columns
  - Content Types
  - Lookups
  - InfoPath Forms
  - PowerApps functionalities, if available [migrating the PowerApp form]
  - MS Flows
  - SharePoint Designer workflows, etc.
- Set the Threshold Limit to the same as the source lists above.
- Migrate using the third-party ShareGate tool, or any other suitable tool, depending on the complexity and the business requirements.
- Notify the Stakeholders once the Migration has completed.
- Have the Stakeholders verify all old Source List records in the Target Lists.
- Get approval from the Client to clean up old records on the Source List.
- Clean up the migrated Source List records programmatically with scheduled jobs, through MS Flows or PowerShell scripting.
- Set the created Target SP List to READ ONLY once the above process is finished.
- Configure Search in the Archive Site Collection, with indexing on the archived content.
Note
- List Items with active workflows should be carefully excluded from the archival process.
- The above Archival Process can be repeated if the Archival List/Library itself reaches the Threshold limit. For example: AR Reports Archive, Archive2, Archive3, etc.
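The rollover from Archive to Archive2, Archive3, and so on can be expressed as a small naming rule. The Python sketch below assumes you already know the current item count of each archive list. `pick_archive_list` and the `counts` dictionary are hypothetical and only illustrate the rule.

```python
def pick_archive_list(base_name, counts, incoming, threshold=5000):
    """Return the first archive list name that can take `incoming` more items.

    base_name -- name of the first archive list, e.g. "AR Reports Archive"
    counts    -- dict of existing archive list name -> current item count
    incoming  -- number of items about to be archived
    threshold -- maximum items allowed per archive list
    """
    if incoming > threshold:
        raise ValueError("a single run must not exceed the threshold")
    suffix = 1
    while True:
        # First list has no suffix; later ones are Archive2, Archive3, ...
        name = base_name if suffix == 1 else f"{base_name}{suffix}"
        if counts.get(name, 0) + incoming <= threshold:
            return name
        suffix += 1


if __name__ == "__main__":
    existing = {"AR Reports Archive": 5000, "AR Reports Archive2": 4900}
    print(pick_archive_list("AR Reports Archive", existing, incoming=200))
```

A name that does not appear in `counts` is treated as an empty list that still has to be created.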
Workflow History List Handling
We have two scenarios: a list with actively running workflows, and a list of items whose workflows have the status Completed.
For an Active working Workflow List
One good idea is to create an MS Flow that gets each item to be archived from the Source List (because of the threshold), creates it on the Target Archival List, and deletes the Source List item once it has been archived in the Target. This gives a seamless, continuous scheduled Microsoft Flow that could run once a week.
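The order of operations in that flow (copy, confirm, then delete) is the important part. The Python sketch below models it with in-memory dictionaries standing in for the two lists. `archive_items` and `is_active_workflow` are hypothetical names, and a real flow would call the SharePoint connector instead.

```python
def archive_items(source, target, is_active_workflow):
    """Move items from `source` to `target`, both dicts keyed by item ID.

    An item is deleted from the source only after its copy is confirmed
    in the target. Items with an active workflow are left untouched.
    Returns the IDs that were moved.
    """
    moved = []
    for item_id in sorted(source):  # sorted() copies the keys, so deleting is safe
        item = source[item_id]
        if is_active_workflow(item):
            continue  # never archive items with a running workflow
        target[item_id] = dict(item)      # 1. create the item in the archive
        if target.get(item_id) == item:   # 2. confirm the copy is there
            del source[item_id]           # 3. only then delete the source item
            moved.append(item_id)
    return moved


if __name__ == "__main__":
    src = {
        1: {"Title": "Old request", "WorkflowStatus": "Completed"},
        2: {"Title": "Open request", "WorkflowStatus": "In Progress"},
    }
    dst = {}
    print(archive_items(src, dst, lambda i: i["WorkflowStatus"] == "In Progress"))
    print(sorted(src), sorted(dst))
```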
For Status Completed Workflow Items List
If it is a SharePoint Designer workflow, the Workflow History List items are already read-only and can't be deleted through PowerShell scripting.
However, there is a concept called purging. Here are a few important points on it:
- When you delete List Items, they go to the first-stage recycle bin and then to the second-stage one, so each item is effectively handled three times: in the list, in the first recycle bin, and in the second recycle bin. Think about that overhead, since it can make deleting items impractical. You may want to turn the recycle bin off while deleting.
- Delete items in batches. Instead of deleting the whole list in one go, delete it in chunks (for example, start with 2000 items).
The program below does this on an on-premises farm with the server object model. It reads the lowest List Item ID directly from the content database instead of through the Object Model. This is not recommended (Microsoft does not support direct access to the content databases), but it can be tried in a one-off case.
==============================
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Text;
using Microsoft.SharePoint;

namespace PurgeList
{
    class Program
    {
        // Site URL: truncated to "http:" in the original post; replace it with your own site URL.
        private const string SiteUrl = "http:";
        private static int iteration = 5;   // number of batches to run
        private static int Count = 2000;    // items deleted per batch

        static void Main(string[] args)
        {
            if (args.Length != 1)
            {
                Console.WriteLine("First argument: number of iterations");
                Console.WriteLine("YourProgramName.exe 5");
                Console.WriteLine("*** Program terminated: argument missing! Please give one argument ***");
                return;
            }
            try
            {
                iteration = Convert.ToInt32(args[0]);
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error: Failed to convert arguments: {0}", ex.Message);
                return;
            }
            Console.WriteLine("Current System Time (Start): " + DateTime.Now);
            try
            {
                for (int iterate = 0; iterate < iteration; iterate++)
                {
                    // Lowest item ID still in the list, read straight from the content database.
                    int start = GetMinID();
                    Console.WriteLine("Start Index: " + start);
                    using (SPSite site = new SPSite(SiteUrl))
                    using (SPWeb web = site.OpenWeb())
                    {
                        SPList wflist = web.Lists["Workflow History"];
                        string wflistID = wflist.ID.ToString();
                        Console.WriteLine("No of items before deletion: " + wflist.ItemCount);
                        Console.WriteLine("Building query...");
                        // One <Method> element per item ID, wrapped in a single <Batch>.
                        StringBuilder batchString = new StringBuilder();
                        batchString.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Batch>");
                        int end = start + Count - 1;
                        for (int i = start; i <= end; i++)
                        {
                            batchString.Append("<Method>");
                            batchString.Append("<SetList Scope=\"Request\">" + wflistID + "</SetList>");
                            batchString.Append("<SetVar Name=\"ID\">" + Convert.ToString(i) + "</SetVar>");
                            batchString.Append("<SetVar Name=\"Cmd\">Delete</SetVar>");
                            batchString.Append("</Method>");
                        }
                        batchString.Append("</Batch>");

                        try
                        {
                            web.AllowUnsafeUpdates = true;
                            Console.WriteLine("Executing query...");
                            Console.WriteLine("Batch Execution (Start): " + DateTime.Now);
                            string result = web.ProcessBatchData(batchString.ToString());
                            web.AllowUnsafeUpdates = false;
                            Console.WriteLine("Batch Execution (End): " + DateTime.Now);
                        }
                        catch (Exception ex)
                        {
                            Console.WriteLine("Process batch error: " + ex.Message);
                        }
                    }
                    // Reopen the site so the item count is read fresh, not from the cached SPList.
                    using (SPSite site1 = new SPSite(SiteUrl))
                    using (SPWeb web1 = site1.OpenWeb())
                    {
                        Console.WriteLine("No of items after deletion: " + web1.Lists["Workflow History"].ItemCount);
                    }
                    Console.WriteLine("--------------------------------------------------");
                }
                Console.WriteLine("Current System Time (End): " + DateTime.Now);
                Console.WriteLine("-----------Program Completed------------");
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error: " + ex.Message);
                Console.WriteLine("*** Program terminated due to error ***");
            }
        }

        // Reads the smallest item ID of the list directly from the content database.
        public static int GetMinID()
        {
            try
            {
                string connString = @"Data Source=.\SHAREPOINT;Initial Catalog=WSS_Content_c20feb22657e4e2ab82f7db433f3e4c7;Integrated Security=SSPI";
                using (SqlConnection objConn = new SqlConnection(connString))
                {
                    objConn.Open();
                    string sqlString = "Select min(tp_ID) as col1 from dbo.alluserdata where tp_listid = '6463BECE-3560-4D15-B965-B245F3203BEE'";
                    SqlCommand cmd = new SqlCommand(sqlString, objConn);
                    return Convert.ToInt32(cmd.ExecuteScalar());
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("SQL ERROR: " + ex.Message);
                throw; // rethrow without resetting the stack trace
            }
        }
    }
}
==============================
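To make the batching in the program above easier to follow, here is a small Python sketch of the same two ideas: splitting item IDs into chunks, and building one batch XML string per chunk with a Delete `Method` for each ID. It only builds strings and does not talk to SharePoint. `chunked`, `build_delete_batch`, and the all-zero list GUID are hypothetical and used for illustration only.

```python
from xml.sax.saxutils import escape


def chunked(ids, size):
    """Yield consecutive chunks of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_delete_batch(list_id, item_ids):
    """Build a batch XML string with one Delete method per item ID."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?><Batch>']
    for item_id in item_ids:
        parts.append(
            "<Method>"
            f'<SetList Scope="Request">{escape(list_id)}</SetList>'
            f'<SetVar Name="ID">{int(item_id)}</SetVar>'
            '<SetVar Name="Cmd">Delete</SetVar>'
            "</Method>"
        )
    parts.append("</Batch>")
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder GUID; a real run would use the Workflow History list ID.
    list_guid = "00000000-0000-0000-0000-000000000000"
    for chunk in chunked(range(1, 6), size=2):
        print(build_delete_batch(list_guid, chunk))
```

Keeping the chunk size as a parameter makes it easy to start with 2000 items, as suggested above, and tune it from there.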
Microsoft Strategy for Archival
- https://techcommunity.microsoft.com/t5/sharepoint/sharepoint-online-backup-strategies-for-a-cloudy-day/m-p/225418
- https://www.sharepointeurope.com/archiving-in-sharepoint/
3rd Party Tools available for Archival Solutions
AvePoint
AvePoint offers a robust platform for managing SharePoint Online. The backup and restore option offers a comprehensive suite of capabilities for SharePoint Online.
http://www.avepoint.com/products/office-365-online-services/management
A quick overview of the technical capabilities can be found here:
http://www.avepoint.com/assets/pdf/technical_overview/DocAve_Online_Technical_Overview.pdf
Metalogix
Metalogix offers a solution for managing Office 365, including comprehensive backups of content across Office 365, such as SharePoint Online and OneDrive for Business.
http://www.metalogix.com/Products/essentials-for-office-365/manage
Notes on feature capabilities can be found here:
http://www.metalogix.com/docs/default-source/product-collateral/metalogix-essentials-for-office-365-data-sheet.pdf
CloudAlly
CloudAlly is a solution Focal Point does not have experience with. Based on the information presented on its site, it can meet the backup/archive needs.
http://www.cloudally.com/sharepoint-backup/
LEAP
LEAP, like CloudAlly, is a solution Focal Point does not have experience with. Backups are stored in Windows Azure Storage where only Motoman has access.
https://www.leaphq.com/
LISTMAN
A modern .NET app that uses CSOM and lets you archive or export data from large lists without running into the 5000-item List View Threshold.
www.listman.io
Happy archiving with the above knowledge!!