The reliable data center: leave the lights on

Tonight I was reading about lights-out data centers. Reducing overhead in the data center isn’t a bad idea, but it can be taken to extremes.

A lights-out data center is one where every function normally performed on site can be performed remotely. Management, control, and monitoring facilities can then be shared among multiple data centers, saving money.

Hardware such as the HP/Compaq RILO card gives full remote console access to the servers. Many modern servers also have remote console capability via serial ports, which we use for emergency out-of-band control. Both can certainly reduce the time it takes to respond to trouble.

The downside of lights out is that many of the things that can go wrong are best resolved by a fast staff response on site. As my previous posts reveal, I’m a great proponent of dealing with things long before they get bad enough to turn into a disaster. Cascading failures are another good example, where a quick response can head off and contain problems.

A good example of this is the trend of running with the absolute minimum UPS runtime (15 seconds for a flywheel) before the backup generator starts. Short-run UPS power, combined with the chance that the generator fails to start and a long tech response time, adds up to great potential for disaster.
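To put rough numbers on that, here is a minimal Python sketch of the arithmetic. Only the 15 second flywheel runtime comes from the example above. The generator start time, the tech response times, the manual fix time, and the names `PowerEvent` and `outage_seconds` are all my own assumptions for illustration, not measurements from any real site.

```python
from dataclasses import dataclass


@dataclass
class PowerEvent:
    """One utility power failure. All values except the 15 s flywheel
    runtime used below are hypothetical, for illustration only."""
    ups_runtime_s: float      # how long the UPS can carry the load
    generator_start_s: float  # time for an automatic start to pick up the load
    generator_starts: bool    # did the automatic start succeed?
    tech_response_s: float    # time for a person to reach the generator
    manual_fix_s: float       # time for that person to get it running


def outage_seconds(e: PowerEvent) -> float:
    """Return how long the load sits unpowered (0.0 means no outage)."""
    if e.generator_starts:
        time_to_power = e.generator_start_s
    else:
        # Automatic start failed: nothing happens until someone shows up.
        time_to_power = e.tech_response_s + e.manual_fix_s
    return max(0.0, time_to_power - e.ups_runtime_s)


if __name__ == "__main__":
    scenarios = {
        "flywheel, generator starts": PowerEvent(15, 10, True, 3600, 120),
        "flywheel, start fails, staff on site": PowerEvent(15, 10, False, 60, 120),
        "flywheel, start fails, lights out": PowerEvent(15, 10, False, 3600, 120),
    }
    for name, event in scenarios.items():
        print(f"{name}: {outage_seconds(event):.0f} s without power")
```

With these assumed numbers, a failed generator start means an outage either way on a 15 second flywheel, but the staffed site is down for minutes while the lights-out site waits an hour for someone to arrive.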

I think there are far more productive ways to reduce data center costs (virtualization, removing zombie servers, etc.) than removing the skilled hands that can resolve problems before they impact customers.

Vern, SwiftWater Telecom


