Posted by Matt Edgley on 09-Jun-2017 11:43:37

Data centre cooling and downtime risks: Your questions answered

How data centre cooling can affect the risk of downtime

If you’re in the market for rack space from a colocation facility, you probably already know a thing or two about the importance of data centre cooling. You probably have a general idea of the right temperature range for your servers and infrastructure, and understand how hot or cold aisle containment can help your provider to deliver that. But can you really pinpoint the correlation between data centre cooling and the risk of unplanned downtime?

There’s a lot of conflicting information out there, and with the likes of Google running their data centres at a smouldering 27 degrees Celsius, higher than most other facilities, it can be difficult to know for sure how a change in data centre climate translates into a higher (or lower) overall level of resilience. Here’s what you need to know.

(Get the ultimate guide to choosing a colocation data centre)

What’s the optimal data centre temperature?

Whilst some academic studies show that running a data centre at a higher temperature is perfectly acceptable, this is only a realistic option if you fully understand, control and monitor almost every piece of hosted equipment in the data centre.

In a multi-tenanted, multi-vendor colocation environment, the cooling requirements of hosted equipment will vary hugely, so sweeping temperature decisions could have hard-hitting consequences for some customers.

It’s therefore recommended that you look for a provider who adheres to the official guidelines set by ASHRAE, which states that 21 degrees Celsius is optimal and stands by the consensus that lower, but not necessarily minimal, temperatures are better for system availability and performance.

SLAs should also be in place to guarantee cooling temperatures, much as you would expect with power SLAs – and extensive monitoring systems should be evident to ensure that alerts are generated where thresholds are crossed.
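As a rough illustration of the threshold alerting described above, here is a minimal Python sketch. The sensor names, the readings and the 18 to 25 degree band are all assumptions made up for the example; they are not values from ASHRAE or from any provider's SLA.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str      # hypothetical sensor label, e.g. a rack inlet probe
    celsius: float   # measured air temperature in degrees Celsius


def find_breaches(readings, low_c, high_c):
    """Return the readings that fall outside the agreed [low_c, high_c] band."""
    return [r for r in readings if r.celsius < low_c or r.celsius > high_c]


# Illustrative readings only; a real system would poll live sensors.
readings = [
    Reading("rack-a1-inlet", 21.4),
    Reading("rack-a2-inlet", 26.8),
    Reading("rack-b1-inlet", 17.2),
]

# The band below is an assumed SLA range, chosen purely for the example.
for r in find_breaches(readings, low_c=18.0, high_c=25.0):
    print(f"ALERT {r.sensor}: {r.celsius:.1f} C is outside the agreed band")
```

In practice the band would come from the cooling SLA itself, and the alert would go to whatever monitoring or paging system the provider uses.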

Do small temperature changes matter?

In practice, does it matter if you gain a few degrees here and there? Well, there’s an argument that it may not have made much difference five or six years ago. Today, however, cooling is more important than ever because data centre customers are squeezing more and more performance out of a single rack.

Half a decade ago, the average data centre’s power consumption per rack may have hovered around 2 kW. Today, with the rise of cloud, virtualisation and more demanding hardware utilisation rates, it’s generally at least twice that level in a standard-density facility and may even reach double-figure kW usage per rack.

As such, if data centres aren’t making the most of their cooling systems and, for example, are allowing hot and cold air to mix through lack of control, then both capacity and efficiency are easily lost – and the risk of downtime and equipment failure is easily introduced.

Question your data centre provider on their maximum cooling capability per rack. You may not intend to use a large amount of power today, but your cooling demands may increase as your server usage and power load grow.
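To make the growth question concrete, the sketch below projects a rack's power load forward and reports the first year it would exceed a per-rack cooling limit. It assumes that the heat to be removed roughly equals the power drawn, and the 20% annual growth rate and 4 kW limit are hypothetical figures chosen only for illustration.

```python
def years_until_cooling_limit(current_kw, annual_growth, max_cooling_kw,
                              horizon_years=10):
    """Return the first year in which the projected rack load exceeds
    the cooling limit, or None if it stays within it over the horizon."""
    load_kw = current_kw
    for year in range(1, horizon_years + 1):
        load_kw *= 1 + annual_growth  # compound growth, year on year
        if load_kw > max_cooling_kw:
            return year
    return None


# A 2 kW rack growing 20% a year against an assumed 4 kW cooling limit.
print(years_until_cooling_limit(current_kw=2.0, annual_growth=0.20,
                                max_cooling_kw=4.0))
```

Running the same projection against the limit your provider quotes gives a quick sense of how much headroom you really have.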

It’s not just about temperature

Moreover, it’s worth noting that the relationship between cooling and resilience isn’t just about the ambient temperature. An air conditioning unit can fail, just like any other component, so you should look for significant redundancy here too.

At TeleData’s Manchester data centre, for example, the air conditioning units work in an N+1 configuration and also have failover components within the units themselves, so that the risk of downtime to any individual unit is avoided as far as possible. If a unit is lost, the N+1 resilience level ensures cooling capacity is maintained as required.

Furthermore, we split all units alternately between our dual power feeds, which can operate independently from each other, through different switchgear and on diverse paths.
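The two ideas above, N+1 redundancy and alternating units across two power feeds, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit names, the 50 kW capacities and the 150 kW heat load are hypothetical and do not describe any real facility's equipment.

```python
def survives_unit_loss(unit_capacities_kw, heat_load_kw):
    """N+1 check: can the remaining units carry the load
    after the largest single unit fails?"""
    if not unit_capacities_kw:
        return False
    remaining_kw = sum(unit_capacities_kw) - max(unit_capacities_kw)
    return remaining_kw >= heat_load_kw


def assign_feeds(unit_names):
    """Alternate units between two power feeds, labelled A and B."""
    return {name: ("A" if i % 2 == 0 else "B")
            for i, name in enumerate(unit_names)}


def capacity_after_feed_loss(capacities_kw, feeds, lost_feed):
    """Total cooling capacity still powered if one feed goes down."""
    return sum(kw for name, kw in capacities_kw.items()
               if feeds[name] != lost_feed)


# Hypothetical plant: four 50 kW units serving a 150 kW heat load.
capacities_kw = {"ac-1": 50.0, "ac-2": 50.0, "ac-3": 50.0, "ac-4": 50.0}
feeds = assign_feeds(list(capacities_kw))

print(survives_unit_loss(list(capacities_kw.values()), heat_load_kw=150.0))
print(capacity_after_feed_loss(capacities_kw, feeds, lost_feed="A"))
```

Whether the capacity left after losing a whole feed is enough depends on how the provider has sized the plant, which is exactly the kind of detail worth asking about.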

It’s therefore important to ask your provider about how they maintain their air conditioning units, what their capacity is, how it’s all monitored, and what degree of redundancy they offer to protect against failure or unit isolation for routine maintenance. After all, no degree of UPS redundancy can compensate for a power failure if it leaves you without air conditioning and your servers are fried. (See our blog 5 questions to ask your colocation provider about resilience for more.)

Choosing a new colocation provider can feel overwhelming. Download our buyer's guide below – it’ll outline the key things to look out for, and help you keep your cool along the way.

FREE download: The data centre services buyer’s guide >


Topics: colocation, resilience, data centre