
The data center in review, top 10 bozos of the year, 2009!

My coined term from my data center analysis and commentary this year: bozosity. Bozosity is a condition brought on by an imbalance of invisible particles known as bozons. This condition causes otherwise competent and sensible people to do incomprehensibly boneheaded things.

The Winners of the 2009 Data Center Bozo Awards are:

1. Microsoft and Danger for the T-Mobile Sidekick data loss debacle. MicroDanger did not win for operating critical storage systems without backups, but for their handling of the aftermath. MicroDanger immediately announced that all the data was lost; by the time they did recover most of it, significant damage had already been done to T-Mobile and the Sidekick, leaving everyone involved with a reputation for incompetence.

2. Fisher Plaza for knocking out major Internet services by blowing up an antiquated, obsolete, and improperly maintained electrical system in their data center building. Aluminum bus bars are evil, k?

3. IBM for blowing Air New Zealand out of the water by performing power work during the peak load period of one of Air New Zealand’s busiest travel weekends, unnecessarily running ANZ’s mainframe from a single fallible generator, and taking an inordinate amount of time to restore service.

4. IBM for allowing a State of Texas elections commission storage system in their care to fail because backing it up wasn’t in the contract.

5. Google for their brilliant example of cascading failure by sequentially overloading every router feeding their Gmail service.

6. Research in Motion for seeing how many BlackBerry back end failures they could squeeze in before the end of the year.

7. Amazon, Rackspace, Google, and a number of other providers who managed to blacken the term “cloud computing” through multiple reliability problems, most of which were self-inflicted. Thanks a heap, guys.

8. DreamHost for giving us a shining example of how NOT to do a major data center migration.

9. The people who operate Sweden’s top level DNS domain for turning loose an untested script and taking the entire thing down. Who knew a few missing dots could be so much trouble?

10. The Perth iX data center in Western Australia for allowing a smoldering mulch bed outside the building to shut down the entire data center because they couldn’t locate the minuscule amount of smoke that was infiltrating the building and setting off an overly sensitive detection system.

Finally, I’d like to add a “dishonorable mention” award to FedEx for turning overnight delivery of a critical part into 3 days and nearly sticking me with working in the data center overnight on Christmas Eve.

Looks like we survived the year but it sure wasn’t pretty.

Vern, SwiftWater Telecom


Data Center Follies, Microsoft and T-Mobile keep on floundering

This morning I was reading that some T-Mobile Sidekick customers might be able to recover their data. I think Microsoft has the wrong definition of the term “backup”.

Microsoft announced that, contrary to rumors, they DID have backups of the T-Mobile Sidekick data; however, the failure disrupted the backup database as well as the primary database. Now, I’m making the assumption here that the backup database was not stored on the same physical hardware as the primary (that would be phenomenally stupid). So, what aspects of the term “backup” did Microsoft miss here?

Any “backup”, whether it’s server functionality, data, or data center infrastructure (power, redundant Internet connections, etc.), shares the same characteristic: a “backup” MUST be isolated from failure of the primary. The function of the backup is to provide disaster recovery and/or business continuity. If you allow the same failure to take out both primary and backup facilities, the backup isn’t just worthless, it’s actually damaging.

A poorly thought out backup facility can be damaging because it provides a false sense of security. Anything else that is built on top of that flawed foundation is now at risk. Suddenly, what would have been limited to a localized hit if it failed becomes a cascading catastrophe. Not the kind of scenario that should let any data center operator or system admin sleep well at night.

Fortunately, the answer is simple. Ensure that your primary and backup facilities are not only logically separated but also physically separated. Make sure that there’s no way a failure of one can damage the other.
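The separation rule above can even be enforced mechanically. Here’s a minimal sketch of a pre-flight check a backup job could run before copying anything; the `StorageTarget` fields and all the names here are hypothetical illustrations of the idea, not anything Microsoft or Danger actually used:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StorageTarget:
    """One copy of the data, described by its failure domains."""
    host: str    # physical server
    device: str  # storage array or disk group
    site: str    # physical location

def isolated(primary: StorageTarget, backup: StorageTarget) -> bool:
    """A backup only counts if no single failure can take out both copies."""
    return (primary.host != backup.host
            and primary.device != backup.device
            and primary.site != backup.site)

primary = StorageTarget("db01", "array-a", "portland")
same_array = StorageTarget("db02", "array-a", "portland")  # shares the storage array
offsite = StorageTarget("bk01", "array-b", "seattle")      # fully separate

print(isolated(primary, same_array))  # False: one array failure kills both copies
print(isolated(primary, offsite))     # True: physically and logically separated
```

If the check fails, the job should refuse to run rather than manufacture the false sense of security described above.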

Then get some good sleep.

Vern, SwiftWater Telecom

server co-location and data center services
data center facilities engineering

Spectacular data center flameout: T-Mobile, Sidekick, Danger, Microsoft

Talk about your phenomenal data center disasters. All T-Mobile Sidekick data lost, and it has Microsoft’s name written all over it.

All of this big hype for super advanced stuff like “cloud computing”, and yet the largest software company in the world, after a $500 million acquisition, commits the most rookie mistake possible and fails to back up critical customer data. So now the cleanup of the mess is costing T-Mobile, and people are dumping a formerly popular gadget like it was toxic waste.

So, where does that leave “Danger” (ironic name) now? And does anyone else think they should have spent a teeny bit of that $500 mil on a second server?

Brand names get damaged from time to time but I don’t believe I ever saw one so thoroughly trashed in such a short time.

Vern, SwiftWater Telecom

data center, web hosting, Internet engineering