Issue
When we deployed a solution in a multi-server SharePoint farm, it did not reach all servers in the farm. The deployment skipped one or more servers, and the solution status remained "Not Deployed".
Note
In our environment, we have four SharePoint servers with the Custom MinRole: two application servers and two web front ends.
- If you browse to Manage Farm Solutions in Central Administration, the Status of the solution is Not Deployed.
- If you click the solution to open its Solution Properties page, you will see that it deployed to 3 of the 4 servers; kfca1 is missing.
- Even when we try to retract an existing solution, it retracts from 3 of the 4 servers, again missing kfca1.
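The same symptom can be read from PowerShell, which makes the missing server easier to spot than clicking through Central Administration. The following is a minimal sketch, assuming it runs in the SharePoint Management Shell on a farm server. The function names `Get-MissingServer` and `Get-SolutionDeploymentReport` are made up for illustration, and treating every server with a valid role as a deployment target is an assumption that matches the farm described here but may not hold for every solution.

```powershell
# Hypothetical helper: returns the expected server names that do not
# appear in the list of servers a solution was deployed to.
function Get-MissingServer {
    param(
        [string[]]$Expected,
        [string[]]$Deployed
    )
    $Expected | Where-Object { $Deployed -notcontains $_ }
}

# Hypothetical helper: summarizes deployment state for one farm solution.
# Requires the SharePoint snap-in (Get-SPSolution, Get-SPFarm).
function Get-SolutionDeploymentReport {
    param([string]$SolutionName)

    $solution = Get-SPSolution -Identity $SolutionName

    # Assumption: every server with a valid role should receive the solution.
    $expected = (Get-SPFarm).Servers |
        Where-Object { $_.Role -ne "Invalid" } |
        ForEach-Object { $_.Name }
    $deployed = $solution.DeployedServers | ForEach-Object { $_.Name }

    [pscustomobject]@{
        Name           = $solution.Name
        Deployed       = $solution.Deployed
        LastResult     = $solution.LastOperationResult
        MissingServers = @(Get-MissingServer -Expected $expected -Deployed $deployed)
    }
}
```

For example, `Get-SolutionDeploymentReport -SolutionName contoso.wsp` (a placeholder solution name) would be expected to list kfca1 under MissingServers in the situation described above.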
Troubleshooting
We checked the following:
- Checked that the SharePoint Timer Service is running on all servers (from the Services console)
- Checked that the SharePoint Administration service is running on all servers (from the Services console)
- Checked in Central Administration whether any timer job was stuck or paused (Central Administration > Monitoring > Check job status)
- Cleared the config cache on all servers in the farm (see the wiki article on clearing the config cache)
Note
Clearing the config cache in production requires extra precautions; otherwise it can cause an outage.
- Redeployed the solution, both with PowerShell and from Central Administration
- Rebooted the faulty server
- Enabled verbose logging and checked the ULS logs for clues
- Checked the Event Log for clues
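The service and timer job checks above can also be scripted so that every server is covered in one pass. This is a sketch under stated assumptions, not a definitive procedure: it assumes Windows PowerShell with the SharePoint snap-in loaded and rights to query services remotely, and it assumes solution deployment jobs carry a "solution-deployment" name prefix. All three function names are hypothetical.

```powershell
# Pipeline filter: passes through only services whose Status is not "Running".
function Select-NotRunning {
    param([Parameter(ValueFromPipeline = $true)]$Service)
    process {
        if ("$($Service.Status)" -ne "Running") { $Service }
    }
}

# Queries the two SharePoint Windows services on every farm server.
# SPTimerV4 is the SharePoint Timer Service; SPAdminV4 is the
# SharePoint Administration service.
function Get-FarmSharePointService {
    $servers = (Get-SPFarm).Servers | Where-Object { $_.Role -ne "Invalid" }
    foreach ($server in $servers) {
        Get-Service -ComputerName $server.Name -Name SPTimerV4, SPAdminV4 |
            Select-Object MachineName, Name, Status
    }
}

# Lists solution deployment timer jobs, if any exist.
# Assumption: these jobs are named with a "solution-deployment" prefix.
function Get-SolutionDeploymentJob {
    Get-SPTimerJob |
        Where-Object { $_.Name -like "solution-deployment*" } |
        Select-Object Name, Server, Status, LastRunTime
}
```

Running `Get-FarmSharePointService | Select-NotRunning` returns nothing when both services are running everywhere. In our case these checks all came back clean, which is why the Windows services alone were not enough to explain the failure.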
Root Cause
After exhausting our own troubleshooting, we opened a support ticket with Microsoft. During that investigation, we found that the internal SharePoint Foundation Timer service instance was disabled on the affected server.
Note
The farm-level SharePoint Foundation Timer service instance is only visible from PowerShell.
- We ran the following script to get the status of the internal SharePoint Foundation Timer service instance on each server:
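
    # Load the SharePoint snap-in (only needed outside the SharePoint Management Shell).
    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    $farm = Get-SPFarm
    # Skip servers with an invalid role, such as database servers.
    $ss = $farm.Servers | Where-Object { $_.Role -notlike "Invalid" }
    foreach ($s in $ss) {
        $s.Name
        Write-Host "........................."
        $is = $s.ServiceInstances
        foreach ($i in $is) {
            # Print the status of the Administration and Timer service instances.
            if ($i.TypeName -eq "Microsoft SharePoint Foundation Administration") {
                $i.TypeName
                $i.Status
            }
            if ($i.TypeName -eq "Microsoft SharePoint Foundation Timer") {
                $i.TypeName
                $i.Status
            }
        }
    }
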
- The output clearly shows that the SharePoint Foundation Timer service instance is disabled on kfca1, which is the problem.
Resolution
Now that we know the internal SharePoint Foundation Timer service instance is disabled on kfca1, we have to bring that service instance back online.
- The status can only be changed via PowerShell. Run the following script to bring all timer service instances online:

    # Load the SharePoint snap-in (only needed outside the SharePoint Management Shell).
    Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

    $farm = Get-SPFarm
    # Collect every timer service instance that is not Online.
    $disabledTimers = $farm.TimerService.Instances | Where-Object { $_.Status -ne "Online" }
    if ($null -ne $disabledTimers) {
        foreach ($timer in $disabledTimers) {
            Write-Host "Timer service instance on server" $timer.Server.Name "is not Online. Current status:" $timer.Status
            Write-Host "Attempting to set the status of the service instance to Online"
            $timer.Status = [Microsoft.SharePoint.Administration.SPObjectStatus]::Online
            $timer.Update()
        }
    } else {
        Write-Host "All timer service instances in the farm are Online. No problems found."
    }

- The output confirms that the service instance is now online.
- If you rerun the status script, all servers now report a status of Online.
- Next, clear the config cache on all servers in the farm. Follow the wiki article on clearing the config cache.
- Finally, redeploy the solution, either with PowerShell or via Central Administration.
- The solution now deploys successfully to all servers.
- If you check the solution properties, they show that it has deployed to all servers as expected.
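The redeploy and verification steps can be sketched in PowerShell as follows. This is an illustration under assumptions rather than the exact commands we ran: the function names are hypothetical, the choice of `-GACDeployment` and `-AllWebApplications` is derived from the solution's own properties, and the five-minute timeout is arbitrary. It assumes the SharePoint Management Shell.

```powershell
# Hypothetical helper: builds the argument set for Install-SPSolution
# from two facts about the solution package.
function Get-InstallArgument {
    param(
        [string]$SolutionName,
        [bool]$HasGlobalAssembly,
        [bool]$HasWebAppResource
    )
    $arguments = @{ Identity = $SolutionName; Force = $true; Confirm = $false }
    if ($HasGlobalAssembly) { $arguments.GACDeployment = $true }
    if ($HasWebAppResource) { $arguments.AllWebApplications = $true }
    $arguments
}

# Hypothetical helper: redeploys a farm solution and reports where it landed.
function Invoke-SolutionRedeploy {
    param(
        [string]$SolutionName,
        [int]$TimeoutSeconds = 300   # arbitrary limit for this sketch
    )

    $solution = Get-SPSolution -Identity $SolutionName
    $builderInput = @{
        SolutionName      = $SolutionName
        HasGlobalAssembly = $solution.ContainsGlobalAssembly
        HasWebAppResource = $solution.ContainsWebApplicationResource
    }
    $arguments = Get-InstallArgument @builderInput
    Install-SPSolution @arguments

    # Deployment runs as a timer job; poll until it finishes or time runs out.
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-SPSolution -Identity $SolutionName).JobExists -and
           (Get-Date) -lt $deadline) {
        Start-Sleep -Seconds 5
    }

    # Re-read the solution and show its state and the servers it reached.
    $solution = Get-SPSolution -Identity $SolutionName
    $solution | Select-Object Name, Deployed, LastOperationResult
    $solution.DeployedServers | ForEach-Object { $_.Name }
}
```

A call such as `Invoke-SolutionRedeploy -SolutionName contoso.wsp` (placeholder name) should list all four servers once the timer service instance is back online. The polling loop has a deadline on purpose: with a disabled timer service instance the job may never complete, so an unbounded wait would hang.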
Conclusion
In the end, the SharePoint timer service instance, which is only accessible through PowerShell, was disabled on one server, and that made it impossible to deploy the solution across the farm. A deployment with the -Local parameter succeeds on the local server, but a farm-wide (global) deployment does not reach the affected server. A big clue from the troubleshooting was that the solution deployment timer job was never created for the faulty server. Another day in a SharePoint admin's life.
This applies to all on-premises SharePoint Server versions, i.e. SharePoint 2010, 2013, 2016, and 2019.