On the importance of backups

This story at Security Awareness for Ma, Pa and the Corporate Clueless offers insight into the value of good backups and the importance of testing everything that affects the backup routine.

A Toronto advertising firm had a really good systems administrator who was religious about backup. For years, they had been in good shape. He even tested the restore/recovery process from time to time as part of their disaster planning. Smart.
As part of their growth, the ad firm moved into new, larger facilities a few blocks away. The architects coordinated with the techs to make sure the wiring was put in the right place: phones, VoIP, 1Gig backbone… all the stuff modern companies have when they do things right.

Then, the company moved. All the typical stuff that happens during a move happened. Testing was done on everything that was moved. All was good.

One Monday morning, some of the servers had problems. To get back up and running quickly, they chose to restore from the weekend’s backup, which is made automatically. They performed the restore as per procedure.

It took less than an hour. Users at the ad firm were screaming at the Help Desk and Network Admin. “Where are my files?” “We have customers waiting.” You get the idea. From the admin’s standpoint, they had done their job. From the user’s perspective, critical files were gone.

Hey! Something’s wrong here. They’ve tested everything. Shouldn’t these files be available after the restore? Well, it turns out, sometimes things aren’t quite what you expect.

No one had bothered, during the office construction, to test the electrical circuits. An electrician had miswired a switch in the hallway near the NOC (Network Operations Center). When the cleaning crew turned that switch off on Friday evening, it cut the power to the rack of backup servers. The backups were never made that weekend and the ad agency had to recreate a week’s worth of creative projects… and their customers were not happy.

Ouch. Having been hit by a similar problem at a job once, I can empathize. We didn’t lose much though. The ad agency here did. So the lesson:

Lesson Learned: Test everything that can affect your backups. Everything, and do not forget the power switches, the battery supplies in UPSs, lights, air conditioning, etc.

In fairness to the company though, I’m not even sure how one would go about testing an external power switch that isn’t even supposed to be part of the server electrical configuration.
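One partial answer, offered as a sketch rather than anything the ad firm actually did: instead of trying to test every possible cause, check the outcome. A small script can confirm that a recent backup file actually exists before anyone relies on it. The function names, the directory layout (one folder of backup files), and the 26-hour threshold below are all assumptions made for illustration.

```python
import os
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None):
    """Age in hours of the most recently modified file in backup_dir.

    Returns None if the directory contains no files at all.
    """
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_fresh(backup_dir, max_age_hours=26, now=None):
    """True only if some backup file exists and is newer than max_age_hours.

    The 26-hour default is an assumed threshold for a nightly job
    (24 hours plus some slack); adjust it to the real schedule.
    """
    age = newest_backup_age_hours(backup_dir, now)
    return age is not None and age <= max_age_hours
```

A check like this would need to run from a machine other than the backup servers and raise an alarm both when the newest backup is stale and when the backup location cannot be reached at all. In the story above, the servers had no power, so a check running on that same rack would have stayed silent too.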

[tags]Security awareness, Backups[/tags]