If you’re in the market for rack space from a colocation facility, you probably already know a thing or two about the importance of data centre cooling. You probably have a general idea of the right temperature range for your servers and infrastructure, and understand how hot or cold aisle containment can help your provider to deliver that. But can you really pinpoint the correlation between data centre cooling and the risk of unplanned downtime?
There’s a lot of conflicting information out there, and with the likes of Google running their data centres at a smouldering 27 degrees Celsius – warmer than most other facilities – it can be difficult to know for sure how a change in data centre climate translates into a higher (or lower) overall level of resilience. Here’s what you need to know.
What’s the optimal data centre temperature?
Whilst some academic studies suggest that running a data centre at a higher temperature is perfectly acceptable, this can only truly be considered if you have full understanding, control and monitoring of almost every piece of hosted equipment in the data centre.
In a multi-tenanted, multi-vendor colocated environment, the cooling requirements of hosted equipment vary hugely, so making sweeping temperature decisions could have hard-hitting consequences for some customers.
It’s therefore recommended that you look for a provider who adheres to the official guidelines set by ASHRAE, which state that 21 degrees Celsius is optimal, and who stands by the consensus that lower – but not necessarily minimal – temperatures are better for system availability and performance.
SLAs should also be in place to guarantee cooling temperatures, much as you would expect with power SLAs – and extensive monitoring systems should be evident to ensure that alerts are generated when thresholds are crossed.
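The kind of threshold monitoring described above can be sketched very simply. The example below is a minimal illustration, not a real monitoring system: the sensor names are hypothetical, and the 18–27 degree envelope is the ASHRAE recommended range rather than a figure from this article.

```python
# Minimal sketch of threshold-based temperature alerting.
# Assumes the ASHRAE recommended envelope of 18-27 degrees Celsius;
# sensor names are hypothetical.

LOW_C, HIGH_C = 18.0, 27.0

def check_readings(readings):
    """Return (sensor, temperature) pairs that breach the envelope."""
    alerts = []
    for sensor, temp_c in readings.items():
        if temp_c < LOW_C or temp_c > HIGH_C:
            alerts.append((sensor, temp_c))
    return alerts

readings = {"cold-aisle-1": 21.5, "cold-aisle-2": 28.2, "cold-aisle-3": 20.9}
print(check_readings(readings))  # -> [('cold-aisle-2', 28.2)]
```

In practice a provider would feed readings from many sensors into a system that raises tickets or pages engineers, but the principle – compare each reading against an agreed envelope and alert on breaches – is the same.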
Do small temperature changes matter?
In practice, does it matter if you gain a few degrees here and there? Well, there’s an argument that it wouldn’t have made much difference five or six years ago. Today, however, cooling is more important than ever, because data centre customers are squeezing more and more performance out of a single rack.
Half a decade ago, the average data centre’s power consumption per rack hovered around 2 kW. Today, with the rise of cloud, virtualisation and more demanding hardware utilisation rates, consumption is generally at least twice that in a standard-density facility, and may even reach double figures of kW per rack.
As such, if data centres aren’t making the most of their cooling systems and, for example, are allowing hot and cold air to mix through a lack of control, then both capacity and efficiency are easily lost – and the risk of downtime and equipment failure is easily introduced.
Question your data centre provider on their maximum cooling capability per rack – you may not intend to use a large amount of power today, but your cooling demands may increase as your server usage and power load grows.
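To see why that question matters, it helps to project your rack’s power draw forward against a provider’s stated per-rack cooling ceiling. The sketch below is purely illustrative – the growth rate and capacity figures are assumptions, not measurements.

```python
# Hedged sketch: how many years of growth a rack has before its power
# draw exceeds a provider's stated per-rack cooling capacity.
# The 4 kW starting load, 15% annual growth and 10 kW ceiling are
# illustrative assumptions.

def years_of_headroom(current_kw, annual_growth, cooling_capacity_kw):
    """Years until projected rack power would exceed cooling capacity."""
    years, kw = 0, current_kw
    while kw * (1 + annual_growth) <= cooling_capacity_kw:
        kw *= 1 + annual_growth
        years += 1
    return years

# A 4 kW rack growing at 15% a year against a 10 kW cooling ceiling
print(years_of_headroom(4.0, 0.15, 10.0))  # -> 6
```

Even modest compound growth erodes headroom quickly, which is why a cooling ceiling that looks generous today should still be checked against where your load will be in a few years.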
It’s not just about temperature
Moreover, it’s worth noting that the relationship between cooling and resilience isn’t just about the ambient temperature. An air conditioning unit can fail, just like any other component, so you should look for significant redundancy here too.
At TeleData’s Manchester data centre, for example, as well as operating in an N+1 configuration, there are failover components within the units themselves, minimising the risk of downtime on any individual unit. If a unit is lost, the N+1 resilience level ensures cooling capacity is maintained as required.
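The N+1 idea above can be expressed as a simple check: with enough units to meet the cooling load plus one spare, any single unit can fail (or be isolated for maintenance) without losing required capacity. The unit counts and capacities below are hypothetical, for illustration only.

```python
# Illustrative sketch of N+1 cooling redundancy: the load must still be
# met with any one unit out of service. Figures are hypothetical.

def n_plus_one_ok(unit_capacity_kw, unit_count, required_kw):
    """True if the cooling load is still met with one unit down."""
    return (unit_count - 1) * unit_capacity_kw >= required_kw

# Four 100 kW units serving a 300 kW load: one can fail safely
print(n_plus_one_ok(100, 4, 300))  # -> True
# Three units serving the same load leaves no spare
print(n_plus_one_ok(100, 3, 300))  # -> False
```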
Furthermore, we split all units alternately between our dual power feeds, which can operate independently from each other, through different switchgear and on diverse paths.
It’s therefore important to ask your provider about how they maintain their air conditioning units, what their capacity is, how it’s all monitored, and what degree of redundancy they offer to protect against failure or unit isolation for routine maintenance. After all, no degree of UPS redundancy can compensate for a power failure if it leaves you without air conditioning and your servers are fried. (See our blog 5 questions to ask your colocation provider about resilience for more.)