Data center power savings and crazy infrastructure design.

I’ve just been reading about how saving power can cause more problems for your data center. It’s not saving power that’s the problem, it’s making boneheaded infrastructure choices.

First, let’s talk about the data center power saving techniques mentioned here. Many modern servers can reduce CPU clock speeds or shut down CPU cores when demand is low. Server fans can slow down and hard drives can spin down, all of which reduces the demand for power.

Now, the article claims that increased load from increased demand on the energy saving servers can overload breakers and create temperature hotspots. That’s true, but only if you underdesign the facility infrastructure based on minimum loads.

Underdesigning power and cooling capacity is almost always a bad gamble. If you design your power assuming you’ll only ever have 50% of the servers running at full power consumption, don’t look shocked when that suddenly spikes to 75% and the overloaded infrastructure dumps it all in a heap. If there’s only a 5% chance of this happening, are the cost savings worth the consequences? How about 4%? 3%? It’s a dangerous game and it’s not a question of if, it’s a question of when it’s going to bite you.
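To put rough numbers on the "not if, but when" point, here is a minimal Python sketch. It assumes the 5%, 4% and 3% figures are the chance of an overload in any one year and that each year is independent of the others. Neither assumption comes from the article, and the function name is made up for illustration.

```python
def chance_of_at_least_one_overload(p_per_year: float, years: int) -> float:
    """Chance of one or more overloads over a number of years.

    Assumes (hypothetically) the same overload probability every year
    and that years are independent of each other.
    """
    if not 0.0 <= p_per_year <= 1.0:
        raise ValueError("p_per_year must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    # Chance of getting through every year cleanly is (1 - p) ** years;
    # the chance of at least one overload is whatever is left over.
    return 1.0 - (1.0 - p_per_year) ** years


if __name__ == "__main__":
    for p in (0.05, 0.04, 0.03):
        for years in (1, 5, 10):
            risk = chance_of_at_least_one_overload(p, years)
            print(f"{p:.0%} per year over {years:2d} years: {risk:.1%}")
```

Under those assumptions, a 5% yearly chance works out to roughly a 40% chance of at least one overload over ten years. Shaving the yearly figure to 3% still leaves you at about one in four.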

I can’t imagine anyone trying to design the capacity of their power or cooling infrastructure based on assumptions about what the dynamic power saving features of the servers are doing. Any significant saving in infrastructure from this is going to make the data center highly vulnerable to spike overloads, with the ensuing chaos.

This is a perfect storm scenario for cascading failures (I’ve written about these before). One failure triggers another and another until everything ends up down hard.

The only safe way to size power and cooling capacity is for maximum load (with some elbow room for safety and expansion). You’re still getting the power saving benefits of the servers and you don’t have to worry about whether some unexpected spike in demand is going to come along and cause you to fail catastrophically.
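As a rough illustration of sizing for maximum load, here is a small Python sketch. The server wattages, the 20% headroom figure and all of the names are hypothetical, picked only to make the arithmetic easy to follow. Real sizing would use measured or nameplate figures and whatever margins your electrical and cooling engineers call for.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    max_watts: float   # draw with every power saving feature off (made-up figure)
    idle_watts: float  # draw when the power saving features kick in (made-up figure)


def peak_watts(servers: list[Server]) -> float:
    """Total draw if every server runs flat out at the same time."""
    return sum(s.max_watts for s in servers)


def required_capacity_watts(servers: list[Server], headroom: float = 0.2) -> float:
    """Capacity sized for maximum load plus elbow room (20% is an assumption)."""
    return peak_watts(servers) * (1.0 + headroom)


def is_overloaded(load_watts: float, capacity_watts: float) -> bool:
    """True when demand is more than the infrastructure can carry."""
    return load_watts > capacity_watts


if __name__ == "__main__":
    rack = [Server(f"srv{i}", max_watts=500.0, idle_watts=200.0) for i in range(10)]
    peak = peak_watts(rack)

    gamble = 0.50 * peak                  # designed for 50% of full load
    safe = required_capacity_watts(rack)  # designed for full load plus headroom
    spike = 0.75 * peak                   # demand spikes to 75% of full load

    print("50% design overloaded by spike:", is_overloaded(spike, gamble))
    print("max load design overloaded by spike:", is_overloaded(spike, safe))
    print("max load design overloaded at full load:", is_overloaded(peak, safe))
```

The design sized for 50% falls over at the 75% spike, while the one sized for maximum load rides out even a full load event. The idle wattage never enters the capacity calculation. The power saving features cut your bill, not your breaker size.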

Don’t roll the dice with your data center power and cooling capacity. It’s a bad bet every time.

