Resource Allocation & Job Submission

LoadLeveler

Basic Commands:

  • llsubmit script.ll – submit a job to the queue (a typical session is shown after this list)

  • llq – check the status of all jobs in the queue

  • llstatus – check available resources

  • llclass – show available job classes and their parameters

  • llcancel JOBID – cancel the job with the id "JOBID"
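
For example, a typical session might look like this (the script name my_job.ll and the job id shown are only illustrative):

llsubmit my_job.ll     # submit the job described in my_job.ll
llq -u $USER           # list your jobs and their current state
llstatus               # check which resources are available
llcancel node1.123.0   # cancel the job with this id, if it is no longer needed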

The job description and the required resources must be defined in a special LoadLeveler script (a plain text file). Example job scripts can be found in the directory /gpfs/home/freeware/LINUX/EXAMPLE_JOBS.

Job Script Syntax: The script consists of LoadLeveler keyword lines (lines starting with #@) followed by the commands to be executed. The job's resources are specified at the beginning of the script, and the block of keyword lines should not be interrupted by lines that do not contain keywords. The keyword section is followed by the commands for job execution; you will usually use mpiexec or poe to launch your parallel program, and you can also use ordinary shell commands inside the script. You can find your name and account number (account_no) with the "showaccount" command.

Script Example (IBM PE):

#!/bin/bash
#@ job_type = parallel
#@ job_name = My_job
#@ account_no = name-number
#@ class = My_class
#@ error = job.err
#@ output = job.out
#@ network.MPI = sn_all,not_shared,US
#@ node = 2
#@ rset = RSET_MCM_AFFINITY
#@ mcm_affinity_options = mcm_mem_req mcm_distribute mcm_sni_none
#@ task_affinity = core(1)
#@ tasks_per_node = 32
#@ queue
mpiexec /path/to/your/app -flags
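
With IBM PE the program can also be launched with poe instead of mpiexec, as mentioned above; a minimal sketch (the application path and flags are placeholders, exactly as in the example):

poe /path/to/your/app -flags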

Script Example (MPICH):

#!/bin/bash
#@ job_type = parallel
#@ account_no = name-number
#@ class = My_class
#@ output = job.out
#@ error = job.err
#@ network.MPI = sn_all,not_shared,US
#@ node = 1
#@ tasks_per_node = 32
#@ queue

export LD_LIBRARY_PATH=/gpfs/home/utils/mpi/mpich2-1.5/lib:$LD_LIBRARY_PATH
export PATH=/gpfs/home/utils/mpi/mpich2-1.5/bin:$PATH
$(which mpiexec) ./soft.x
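
After the job finishes, its standard output and standard error are written to the files named by the #@ output and #@ error keywords (job.out and job.err in the examples above) and can be inspected with ordinary shell commands, for example:

cat job.out   # standard output of the job
cat job.err   # standard error of the job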

The most important keywords are #@ total_tasks, which specifies the total number of MPI processes, and #@ node, which specifies the number of nodes; the examples above achieve the same effect with #@ tasks_per_node, so the IBM PE script runs a total of 64 tasks on 2 nodes (2 × 32). A sketch using total_tasks is shown below the table. It is also important to choose the right job class. The following table shows the available classes.

class         | max_node per job | maxjobs per user | max_total_tasks per user | max. walltime (HH:MM) | priority
cluster_short | 4                | 288              | 288                      | 24:00                 | 80
smp_short     | 1                | 288              | 64                       | 24:00                 | 80
cluster_long  | 2                | 288              | 192                      | 480:00                | 50
smp_long      | 16               | 288              | 64                       | 72:00                 | 50

Complete information about the classes can be obtained with the llclass -l command.
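
As referenced above, a minimal sketch of the keyword section using #@ total_tasks instead of #@ tasks_per_node (the class name and the task count are only illustrative; LoadLeveler distributes the requested total number of tasks over the allocated nodes):

#@ job_type = parallel
#@ class = cluster_short
#@ node = 2
#@ total_tasks = 64
#@ queue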

Accessing Resources

IP addresses of the login node:

Login node | IP address    | Port
Žilina     | 147.213.242.7 | 22
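
Access is via SSH, for example (replace username with your own user name):

ssh -p 22 username@147.213.242.7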