module load gcc/7.1.0
Alex
The chillers in HPC room EC3-21 failed once again this Sunday. The Broadwell nodes and half of the Nehalem nodes (zeus[20-91,15]) have been switched off until Estates finds the cause of the faults.
Available compute nodes: zeus[100-171, 200-217] (queues: all, long, GPU)
Regards,
Alex
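For readers unfamiliar with the bracketed node notation used in these notices (for example zeus[100-171, 200-217]), here is a minimal Python sketch that expands such an expression into individual hostnames. It assumes the notation follows the common hostlist convention of a prefix followed by comma-separated numbers and inclusive ranges; the function name `expand_hostlist` is hypothetical and is not part of any Zeus tooling.

```python
import re
from typing import List


def expand_hostlist(expr: str) -> List[str]:
    """Expand 'zeus[100-102,15]' into ['zeus100', 'zeus101', 'zeus102', 'zeus15'].

    Assumption: a single prefix followed by one bracketed group of
    comma-separated numbers and inclusive ranges.
    """
    match = re.fullmatch(r"(\w+)\[([\d,\s-]+)\]", expr.strip())
    if match is None:
        # Plain hostname with no bracketed range: return it unchanged.
        return [expr.strip()]
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            hosts.extend(f"{prefix}{n}" for n in range(start, end + 1))
        else:
            hosts.append(f"{prefix}{int(part)}")
    return hosts


# Check whether a given node is in the set announced as available.
available = set(expand_hostlist("zeus[100-171, 200-217]"))
print(len(available))
print("zeus150" in available)
print("zeus20" in available)
```

This can help when checking whether a node named in a job log belongs to the set that is currently switched on.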
Zeus HPC is operational. The DataLake machine is still experiencing some problems.
Alex
Update on the HPC issue:
The temperature in the main HPC room has stabilised, and I have brought the login nodes, main server and file servers back up. Until Estates provides a further update on the stability of the cooling system in that room, most of the compute nodes there will remain offline (this includes the new Broadwell nodes).
I have brought up some Nehalem (half of the 8-CPU nodes) and Sandybridge (12-CPU “GPU” queue) compute nodes in the room unaffected by the cooling failure (zeus[100-171], zeus[200-217]); they can be used now that the file servers are operational.
Regards,
Alex
Hi,
Zeus HPC is down due to a cooling failure in room EC3-23.
Regards,
Alex
P.S. I will update you when it can come back up.