
HPC on MS Teams

Hi All,

I have created an HPC Team on MS Teams, in case support needs to be provided. If you want, you can add yourself to the HPC Team: open MS Teams -> Join a team -> Join with a code -> glxx7vy (this is the code).


Regards

Alex

HPC Help

If you require help with anything HPC-related, you can find me on MS Teams, in the HPC Help channel of the EEC HPC Team.

Alex Pedcenko

Launching OpenFOAM on the Windows Subsystem for Linux (Windows 10)

Check the OpenFOAM-for-Windows screencasts at https://livecoventryac.sharepoint.com/portals/hub/_layouts/15/PointPublishing.aspx?app=video&p=c&chid=0aedf551-986d-4d5a-bd3a-43007bda3f64&s=0&t=av
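As a rough sketch of a first session, assuming OpenFOAM was installed inside the Linux subsystem from the openfoam.org Ubuntu packages (the version and install path below are assumptions), you would source the OpenFOAM environment and run a tutorial case:

source /opt/openfoam6/etc/bashrc                               # load the OpenFOAM environment (path is an assumption)
mkdir -p $FOAM_RUN && cd $FOAM_RUN                             # create and enter your run directory
cp -r $FOAM_TUTORIALS/incompressible/icoFoam/cavity/cavity .   # copy the lid-driven cavity tutorial
cd cavity
blockMesh                                                      # generate the mesh
icoFoam                                                        # run the solver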


Alex Pedcenko

Connecting to HPC terminal with Google Chrome

Instead of using PuTTY, you can also connect to the HPC terminal in Google Chrome using an SSH extension; see the example below.
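For example, with the “Secure Shell” extension installed from the Chrome Web Store, you connect by entering the usual ssh-style connection string in its dialog (the username is a placeholder; zeus.coventry.ac.uk is assumed to be the HPC login host):

yourusername@zeus.coventry.ac.uk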

MATLAB 2019a

MATLAB 2019a is now available on the Zeus HPC. Use module load matlab/2019a and launch it with just matlab.
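As a minimal sketch, a non-interactive MATLAB run through SLURM could look like this (the script name myscript.m, the core count and the walltime are placeholders, not a prescribed template):

#!/bin/bash
#SBATCH -n 1                # single CPU core
#SBATCH -t 1:00:00          # 1 hour walltime
module load matlab/2019a
matlab -nodisplay -nosplash -r "myscript; exit"

Inside an interactive MATLAB session, the built-in ver command prints the list of installed toolboxes.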

Toolboxes installed:

MATLAB                                                Version 9.6         (R2019a)
Simulink                                              Version 9.3         (R2019a)
Computer Vision Toolbox                               Version 9.0         (R2019a)
Control System Toolbox                                Version 10.6        (R2019a)
Curve Fitting Toolbox                                 Version 3.5.9       (R2019a)
DSP System Toolbox                                    Version 9.8         (R2019a)
Deep Learning Toolbox                                 Version 12.1        (R2019a)
Fixed-Point Designer                                  Version 6.3         (R2019a)
GPU Coder                                             Version 1.3         (R2019a)
HDL Coder                                             Version 3.14        (R2019a)
Image Acquisition Toolbox                             Version 6.0         (R2019a)
Image Processing Toolbox                              Version 10.4        (R2019a)
Instrument Control Toolbox                            Version 4.0         (R2019a)
MATLAB Coder                                          Version 4.2         (R2019a)
MATLAB Compiler                                       Version 7.0.1       (R2019a)
MATLAB Compiler SDK                                   Version 6.6.1       (R2019a)
Model Predictive Control Toolbox                      Version 6.3         (R2019a)
Optimization Toolbox                                  Version 8.3         (R2019a)
Parallel Computing Toolbox                            Version 7.0         (R2019a)
Partial Differential Equation Toolbox                 Version 3.2         (R2019a)
Sensor Fusion and Tracking Toolbox                    Version 1.1         (R2019a)
Signal Processing Toolbox                             Version 8.2         (R2019a)
Simscape                                              Version 4.6         (R2019a)
Simscape Driveline                                    Version 2.16        (R2019a)
Simscape Electrical                                   Version 7.1         (R2019a)
Simscape Fluids                                       Version 2.6         (R2019a)
Simscape Multibody                                    Version 6.1         (R2019a)
Simulink 3D Animation                                 Version 8.2         (R2019a)
Simulink Coder                                        Version 9.1         (R2019a)
Simulink Control Design                               Version 5.3         (R2019a)
Statistics and Machine Learning Toolbox               Version 11.5        (R2019a)
Symbolic Math Toolbox                                 Version 8.3         (R2019a)
Text Analytics Toolbox                                Version 1.3         (R2019a)
Vehicle Dynamics Blockset                             Version 1.2         (R2019a)
Vehicle Network Toolbox                               Version 4.2         (R2019a)

Let me know if any extra toolboxes are necessary.

Regards,

Alex

Update on queues of Zeus HPC

Queuing model UPDATE June 2018

To simplify the usage of different queues, we have combined all nodes into a single default queue (SLURM partition) “all”. The usage limits are now solely user-based: each user has a default (for now) number of CPU*minutes that they can use at any moment (subject to available resources). If this CPU*minute limit is reached, new jobs from that user are queued until their running jobs free up resources. This is independent of the type of compute node. During this initial stage we will try to adapt the default CPU*minute allowance towards better and more effective HPC usage. The simple principle behind this is that a user can use more CPU cores for less time, or fewer CPU cores for a longer time. The run time of a submitted job is determined by the value you set in the --time (or -t) parameter when submitting it (e.g. -t 24:00:00).

If you require a particular type of compute node (CPU/GPU/Phi, etc.), you can request it in the submission script or on the sbatch command line by specifying the additional “constraint” parameter:

  • for the 56 Intel Broadwell-based nodes (32 CPU cores and 128 GB RAM each), specify --constraint=broadwell
  • for the 144 Intel Nehalem-based nodes (8 CPU cores and 48 GB RAM each), specify --constraint=nehalem
  • for the 18 Intel SandyBridge-based nodes (12 CPU cores and 48 GB RAM each), specify --constraint=sandy
  • for the 1 x 32-CPU, 512 GB RAM SMP node, specify --constraint=smp
  • for the 10 nodes with 2 NVidia Kepler K20 GPUs each, ask for --gres=gpu:K20:N (where N is the number of GPUs needed; max 2 GPUs/node)
  • for the 18 nodes with 2 NVidia Kepler K80 GPUs each, ask for --gres=gpu:K80:N (where N is the number of GPUs needed; max 2 GPUs/node)
  • for N Intel Phi coprocessors, ask for --gres=mic:N or --constraint=phi.

For more details on Zeus’s CPUs and nodes, see this post: http://zeus.coventry.ac.uk/wordpress/?p=336

If you have no particular preference for the type of CPU or compute node and are running a parallel job, please specify ONLY the TOTAL number of CPUs required, NOT the number of nodes: SLURM will assign the nodes automatically.

E.g. if I need 64 CPUs in total for 24 hours on whatever nodes are available, I submit my SLURM script with:

sbatch -n 64 -t 24:00:00 myslurmscriptname.slurm

If I need 64 CPUs in total for 48 hours on Broadwell-based nodes (32 CPUs/node), I submit my SLURM script with:

sbatch -n 64 -t 48:00:00 --constraint=broadwell myslurmscriptname.slurm

Finally, if I want 2 GPU nodes, each with one NVidia Kepler K80 GPU (1 GPU/node) and 2 CPUs, for 36 hours, I do something like:

sbatch -N2 --ntasks-per-node=1 --cpus-per-task=2 --gres=gpu:K80:1 -t 36:00:00 mygpuscript.slurm

Some variations of these sbatch commands are certainly possible, and these flags can also be specified inside the SLURM submission script itself, as shown in the sketch below. For the full list of possible sbatch options see the SLURM docs: https://slurm.schedmd.com/sbatch.html
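For example, a minimal submission script with the flags from the Broadwell example above embedded as #SBATCH directives might look like this (the program on the last line is a hypothetical placeholder for your own executable):

#!/bin/bash
#SBATCH -n 64                   # total number of CPU cores
#SBATCH -t 48:00:00             # walltime limit
#SBATCH --constraint=broadwell  # run on Broadwell nodes only
srun ./myprogram                # placeholder: your actual parallel program

It is then submitted with just sbatch myslurmscriptname.slurm.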


Alex Pedcenko

Zeus down due to room overheating

Hi,

Zeus HPC is down due to a cooling failure in room EC3-23.

Regards,
Alex

P.S. I will update you when it can come back online…

Current (experimental) limits of queues

The total number of nodes (or CPUs) you can use depends on how long your job has to run (i.e. which queue/partition it is submitted to):

[queues are listed from highest to lowest priority, i.e. shorter queues have higher priority in the waiting list!]

For short jobs of up to 4 hours:

  • The “short4” queue allows an unlimited number of CPUs for up to 4 hours and can use nodes from any of the queues! (Hint: do not specify how many nodes you need, just specify how many CPU cores you need for your job, e.g. for 1000 CPU cores for 4 hours: “sbatch -p short4 -n 1000 -t 4:00:00 submitscript.slurm”). The default time (if you do not specify a walltime) is 1 hr.

For jobs of up to 12 hours:

  • You can use up to 144 nodes x 8 CPUs in the “all12” queue for up to 12 hours.

For jobs of up to 24 hours:

  • You can use up to 80 nodes x 8 CPUs in the “all” queue (640 CPUs) for up to 24 hours.
  • You can also use 10 nodes x 32 CPUs of the Broadwell queue (another 320 CPUs; use fewer nodes per job, these are “fat” nodes!) for up to 36 hours.
  • You can also use 20 nodes x 8 CPUs of the “all48” queue (another 160 CPUs) for up to 48 hours (lower priority than “all”).

For jobs of up to 36 hours:

  • You can use up to 10 nodes x 32 CPUs in the Broadwell queue.
  • You can use up to 5 nodes x 32 CPUs (160 CPUs and 10 K80 GPUs) in the NGPU queue.
  • You can use up to 18 nodes x 12 CPUs (+ 36 K20 GPUs) in the GPU queue.
  • You can use 1 SMP node (32 CPUs and 512 GB RAM) in the SMP queue.

(the specialized queues SMP, GPU and NGPU have higher priority in the waiting list, i.e. if you need to use the GPUs on these nodes, your jobs have a higher “weight”)

For jobs of up to 48 hours:

  • You can use 20 nodes x 8 CPUs (160 CPUs in total) in the “all48” queue for up to 48 hours.

For jobs longer than 48 hours:

  • You can use up to 20 nodes (160 CPUs) in the “long” queue with unlimited job time; see the example below.
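For example, to put a 160-CPU job with no time limit in the “long” queue (the script name is a placeholder):

sbatch -p long -n 160 myslurmscriptname.slurm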


“RequiredNodeNotAvailable” status of the Job

There is a standing reservation of all nodes in the “all” queue, plus the “Broadwell+NGPU+Phi” nodes, for this Sunday 13/11/16 from 0:00 to 12:00; it is needed to conduct more performance tests before commissioning of the HPC. So if your submitted job’s walltime spans this period, you will get this message as the reason for “queueing”.
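You can see the reason SLURM gives for a pending job in the standard queue listing; for example (replace the username with your own):

squeue -u yourusername

The reason is shown in parentheses in the NODELIST(REASON) column.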

Alex Pedcenko
