Power to self-managed servers in the Informatics Forum

If you don’t use any servers in either of the Forum self-managed server rooms, then please ignore this blog post – and have a nice day!

However, if you do use any such machines, then it is important that you read the following, and fully understand the situation concerning power to those machines.


We are continuing to receive log messages reporting ‘power overload conditions’ from the various rack power bars which provide power to the servers in the two Forum self-managed server rooms, IF-B.Z14 and IF-B.01. Under normal circumstances, we would be able to move power connections (or entire servers) around in order to balance the power load better and resolve the problem. As things stand under the Covid-19 lockdown, however, we can’t easily arrange to do that.

So, if a circuit breaker does trip as a result of an overload then, under the present circumstances, it is likely to be quite a while before we can get anybody into the Forum to reset it. And, meanwhile, your machines will be without power – with other users of the room almost certainly being affected as well.

Please therefore think carefully about power usage and, in particular, think carefully about whether it’s a really good idea to be running your power-hungry machine (or machines) flat out at the moment!

Thanks.


Note: So that you can see for yourselves which rack power bars (if any) have recently been reporting ‘power overload’ conditions, we have provided the web page http://netmon.inf.ed.ac.uk/barTrapLogs. It gives a processed view of the logging information we’ve recorded from all of our power bars over the previous few days, as well as a count of the total number of ‘overload’ conditions we have recently recorded.

(By the way, that web page is restricted to access from within our network for privacy reasons – so you’ll need to use the Informatics VPN service or similar if you want to view it from home.)

At the time of writing (Friday 24th April, 2020), we are still seeing constant power overloads being reported by rack power bar ssm012, which is one of the two power bars supplying power to Rack 6 in self-managed server room IF-B.Z14. That means that power to Rack 6 in IF-B.Z14 should currently be considered ‘at risk’ until users of machines in that rack take steps to reduce the overall load. You have been warned!

Power-hungry machines in IF-B.Z14 Rack 6 include sigyn, cgserver, cc1 and cc2. It would be very helpful if users of those particular machines would moderate their power use now, in order to reduce the real risk of a complete loss of power.
