Tag Archives: navisite

The great data center power debate.


After commenting on the Navisite San Jose data center outage (original here), I heard a number of different viewpoints on data center power reliability. I thought I’d take a few minutes to expand on where I believe data centers are really going wrong.

It used to be a tenet of data center operations that if the facility went down, someone’s head was going to roll. It was rare for a data center to go down, and when it did, it was usually because of a large-scale disaster. However, in 2008, 2009 and so far in 2010, there have been rampant power failures in major data centers that have taken all or large parts of facilities down.

It seems that data center operators have developed a much greater tolerance for the pain of having their data center go down uncontrolled, all in the name of saving money up front. Short-run battery systems and incredibly short-run options such as flywheels depend on the most fallible piece of equipment in the data center, the generator. I hope the hit team from CAT and Generac doesn’t come knocking on my door, but starting a generator is like rolling the dice. Stake your entire facility on it working perfectly within the 15-second run time of a flywheel and it’s just a matter of when it’s going to let you down.

While it’s certainly true that batteries, such as the ones I use for green DC power plants, can fail you, there’s almost nothing that can happen with batteries that will let you down totally, and most battery issues are very easy to spot with minor maintenance. On the other hand, a failure to transfer, a generator failing to start, or a generator failing while operating is catastrophic. You’re not going to be operating degraded, you’re going down hard.

This issue is becoming even more important with the use of other green data center technologies, such as cloud computing. Now instead of just one angry customer, you have 10 times (or more) the number of angry customers for the same failure. Couple that with data loss, fixing screwed up hard drives, the time and money to straighten the mess out, and monetary loss to the customers, and skimping on backup power doesn’t make any sense to me at all.

Keep your reputation intact, keep your customers happy, and avoid becoming a trending topic on Twitter (in the bad way). Stop skimping on the power protection.

If you’re looking for highly reliable cloud computing services or data center green DC power plant engineering, installation, and maintenance, drop me an email, give me a call, or visit our site at SwiftWater Telecom today!

Vern


Thursday data center tidbits.


This morning we have more details on the Navisite data center outage in San Jose. The root cause was an engineering failure in which the automatic transfer switch failed to start the generators. This is a great example of why ultra short run time backup systems like flywheels are such a bad idea: with enough backup run time, the power can be transferred and the generators started manually (you’re not going to get a chance to do that in the 15-second run time of a flywheel!). On the bad side, they obviously didn’t have enough battery capacity to hold up either (or didn’t have the monitoring to allow them to react to the problem promptly). Is it REALLY worth the savings to take this kind of gamble?
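To put rough numbers on that gamble, here is a minimal Python sketch of a toy model. Everything in it is an assumption for illustration: the function name `outage_probability`, the failure probabilities, and the five-minute manual response time are hypothetical and are not taken from the Navisite incident or any vendor data. The only point is that a manual start can rescue you only if the stored energy outlasts the time it takes someone to do it.

```python
def outage_probability(p_auto_start_fail, ride_through_s,
                       manual_response_s, p_manual_fail):
    """Toy estimate of the chance a utility outage drops the facility.

    All inputs are hypothetical; this is not a real reliability model.
    """
    if ride_through_s >= manual_response_s:
        # Stored energy lasts long enough for staff to transfer power and
        # start the generators by hand, so both attempts must fail.
        return p_auto_start_fail * p_manual_fail
    # Stored energy is gone before anyone can intervene, so a failed
    # automatic start takes the facility down.
    return p_auto_start_fail


# Hypothetical numbers, chosen only to illustrate the comparison.
P_AUTO_FAIL = 0.01        # assumed chance the automatic start/transfer fails
P_MANUAL_FAIL = 0.1       # assumed chance a manual start also fails
MANUAL_RESPONSE_S = 300   # assumed time for staff to react and start manually

flywheel = outage_probability(P_AUTO_FAIL, 15, MANUAL_RESPONSE_S, P_MANUAL_FAIL)
battery = outage_probability(P_AUTO_FAIL, 900, MANUAL_RESPONSE_S, P_MANUAL_FAIL)

print(f"15 s flywheel:  {flywheel:.4f}")
print(f"15 min battery: {battery:.4f}")
```

Under these made-up inputs, the longer run time cuts the chance of going down by the odds of a successful manual start. Plug in your own numbers; the shape of the result is what matters.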

Vern


Tuesday data center tidbits.


In the news today is the crash of the NaviSite data center in San Jose as the result of a backup power failure. This is the best example I’ve seen of why the move to extremely short run backup power, such as flywheels, is a totally boneheaded idea. If everything goes perfectly, a short carryover may be just fine, but counting on perfection from generators is a bad bet.

Vern