This morning I was reading about some T-mobile Sidekick customers could recover data. I think Microsoft has the wrong definition of the term “backup”.
Microsoft announced that, contrary to rumors, they DID have backups of the T-mobile Sidekick data, however, the failure disrupted the backup database as well as the primary database. Now, I’m making the assumption here that the backup database was not stored on the same physical hardware as the primary (that would be phenomenally stupid). So, what aspects of the term “backup” did Microsoft miss here?
Any “backup”, whether it’s server functionality, data, data center infrastructure (power, redundant Internet connections, etc), all shares the same characteristic. A “backup” MUST be isolated from failure of the primary. The function of the backup is to provide disaster recovery and/or business continuity. If you allow the same failure to take out both primary and backup facilities, the backup isn’t only worthless, it’s actually damaging.
A poorly thought out backup facility can be damaging because it provides a false sense of security. Anything else that is built on top of that flawed foundation is now at risk. Suddenly, what would have been limited to a localized hit if it failed becomes a cascading catastrophe. Not the kind of scenario that should let any data center operator or system admin sleep well at night.
Fortunately, the answer is simple. Insure that your primary and backup facilities are not only logically separated but also physically separated as well. Make sure that there’s no way that a failure of one can damage the other.
Then get some good sleep.
Vern, SwiftWater Telecom
server co-location and data center services
data center facilities engineering