I was at Microsoft’s TechEd North America 2014 last week learning all kinds of new and cool technology. On Monday, I got pinged by my client’s helpdesk support that no ConfigMgr operations could take place. No one could connect with the SCCM console, no OS deployments would work nor would any software deployments. WTF?
Nothing like this had ever happened before; absolutely nothing would work. “Great”, I thought. “What better time for this to happen than when the two other guys who help admin SCCM and I are all gone. I’m going to have to miss a session to look into this.”
I grabbed a comfy chair there at the Houston Convention Center and thought this shouldn’t be too difficult. I mean, what is the worst that could have happened? All 3 of us admins were gone and no one else makes any changes to the core site server.
I verified they weren’t crazy and found that the directory on the site server’s file system where the installation files lived was gone. I thought I was nuts and had just forgotten where it was installed. Then I opened up SQL Server Management Studio and it showed no SCCM database at all! OK, now I really thought I was off my rocker. How could the entire install and database just vanish?
After some frantic minutes I finally saw that the site server itself had been UNINSTALLED. This ended up being due to another helpful admin who decided to push an untested antivirus upgrade to it. For anyone who has Kaspersky antivirus and Configuration Manager 2012, be forewarned!
Needless to say, I was forced to “test” out my site backup last Monday. I can’t complain about that. I was just using the integrated site backup maintenance task every night, plus the afterbackup.bat batch file to back up the certificates and SSRS database, and it all restored without a hitch: files, database and all. Fast forward through TechEd and me coming back to work.
I open up the SCCM console and notice all of my clients are showing up as Inactive. That’s strange; however, it appears everything else is working.
I did a couple software deployments, no one’s complaining so I get busy doing something else. The next day, I’m trying a new deployment and notice that the clients aren’t reporting to the management point and haven’t been since the restore.
Looking...looking...looking. Aha! ...a problem with the management point. No problem, it’s probably just some fallout and needs to be reinstalled. After removing the MP role from the site server and reinstalling it, I notice in the MPSetup.log file that it’s failing on an MSXML6 prereq because it’s still looking in the wrong file path (the old installation file path).
What does a good IT pro do whenever a piece of software is looking at a wrong file path? The registry! Time to start replacing all instances of the folder path in the registry. But wait...there’s more! Turns out there are also file path references in the database. Never fear. ApexSQL’s SQL Search tool made that a cinch by searching all tables at once.
The next management point role install went without a hitch, and the clients were now showing up as Active in the console and successfully getting their new policies from the server. Success!
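If you’d rather get a list of every stale reference before you start editing, something like the following could help. It’s only a rough sketch, not what I actually ran: it assumes you’ve already exported the relevant hive to a text .reg file (for example with `reg export`) and read it into a string, and the function name, key name and paths in the sample are made up for illustration.

```python
def find_stale_paths(reg_text: str, old_path: str) -> list[tuple[str, str]]:
    """Return (key, line) pairs from a .reg export whose data mentions old_path."""
    # String values in a .reg export write each backslash as two backslashes,
    # so look for both the plain and the escaped form of the path.
    needles = {old_path.lower(), old_path.replace("\\", "\\\\").lower()}
    hits = []
    current_key = ""
    for raw in reg_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current_key = line[1:-1]  # remember which key the next values belong to
        elif any(needle in line.lower() for needle in needles):
            hits.append((current_key, line))
    # Note: values exported as hex (e.g. REG_EXPAND_SZ) are not decoded here.
    return hits


# Hypothetical sample export; the key and paths are placeholders.
sample = r'''Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Example\Setup]
"Installation Directory"="D:\\OldInstall\\ConfigMgr"
"Unrelated"="C:\\Windows"
'''

for key, line in find_stale_paths(sample, r"D:\OldInstall\ConfigMgr"):
    print(key, "->", line)
```

This only reports matches; I’d still make the actual changes by hand (and export the keys first) so nothing gets rewritten that shouldn’t be.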
Now, after all this I have some rhetorical questions as to why it had to be this hard.
- Why didn’t any component under the monitoring tab show a problem? According to the component statuses, all was good. No problems to see here.
- Why didn’t the new Support Center tool see that a problem was afoot? I ran the troubleshooting diagnostics on a client and all green checks abounded. In hindsight, it appears that the tool just checks to ensure the client was able to process its policy correctly, which it did.
In summary, this was unfortunate but allowed me to see the holes in my restore procedure. Here’s what I should have done and will do from now on.
- Back up the \\SITESERVER\SMS_SITECODE\Client directory. This isn’t backed up by default and when the site system is uninstalled, it’s gone. The client itself DOES get restored, but any hotfixes you had been installing alongside it do not. I used the little (unsupported) trick of putting a ClientPatch folder under the i386 and x64 client folders so the MSP installs immediately after every client install.
- Back up your afterbackup.bat file. It is not backed up by default. This file is located in CONFIGMGRINSTALLPATH\inboxes\smsbkup.box in case you forgot.
- Pay attention to the Post-Recovery tasks instead of just thinking it’s done after the database is back. I missed a couple of minor things there as well.
- Look into forgoing the built-in backup task altogether in favor of a ConfigMgr SQL Server maintenance plan.
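The first two items on that list are easy to script. Here’s a minimal sketch of the idea, assuming Python is available on whatever box runs it; the function name and the dated `sccm-extras-` folder name are my own inventions, and the paths you pass in are the placeholders from the list above, so swap in your real server, site code and install path.

```python
import shutil
from datetime import date
from pathlib import Path


def back_up_extras(client_dir: Path, afterbackup_bat: Path, dest_root: Path) -> Path:
    """Copy the Client folder and afterbackup.bat into a dated folder under dest_root."""
    target = dest_root / f"sccm-extras-{date.today():%Y%m%d}"
    target.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok (Python 3.8+) lets a second run on the same day refresh the copy.
    shutil.copytree(client_dir, target / "Client", dirs_exist_ok=True)
    # copy2 keeps the file's timestamps along with its contents.
    shutil.copy2(afterbackup_bat, target / afterbackup_bat.name)
    return target
```

You’d call it with something like `back_up_extras(Path(r"\\SITESERVER\SMS_SITECODE\Client"), Path(r"CONFIGMGRINSTALLPATH\inboxes\smsbkup.box\afterbackup.bat"), Path(r"E:\Backups"))`, where the `E:\Backups` destination is just an example, and schedule it to run alongside the nightly site backup.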