Intel® DevCloud for oneAPI

About the Job Queue

When you access the Intel DevCloud for oneAPI Projects through SSH, you will be connected to a login node with hostname login-2. On this node you can edit code and compile applications. However, to run computational applications, you will submit jobs to a queue for execution on compute nodes. In this manner, you share computing resources with your peers; however, other users cannot see your data or applications.

If you access through JupyterHub, you will be connected to a compute node. Compared to the login node, you will have more local resources at your disposal. However, some features, such as longer walltime and multi-node computation, are only available through the job queue. You can use all of the commands listed on this page directly from a JupyterHub terminal or notebook cell.

Basic Job Submission

Submit a Shell Command

This will execute the command lscpu on the first available node:

[<user>@login-2 ~]$ echo lscpu | qsub
520.login-2

Once the job enters the queue, qsub will return the job ID, in this example 520.login-2. Make a note of the job number, which in this example is 520.

The job will be scheduled for execution, and, once it completes, you will find files STDIN.o520 and STDIN.e520. Here

  • STDIN is the job name (by default, equal to STDIN for jobs submitted via the pipe |)
  • 520 is the job number
  • STDIN.o520 is the standard output stream of the job
  • STDIN.e520 is the standard error stream of the job

[<user>@login-2 ~]$ cat STDIN.o520 | grep 'Model name'
Model name:            Intel® Xeon® Gold 6128 CPU @ 3.40GHz
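
The naming rule above can be sketched as a pair of small shell helpers. This is only an illustration: the function names job_number and job_logs are hypothetical and not part of the queue tools; they simply split the job ID that qsub prints and assemble the two log file names.

```shell
# Hypothetical helpers (not part of the queue tools).

# Print the job number, i.e. everything before the first dot of the job ID.
job_number() {
  printf '%s\n' "${1%%.*}"
}

# Print the stdout and stderr log names for a job name ($1) and a job ID ($2).
job_logs() {
  printf '%s.o%s %s.e%s\n' "$1" "$(job_number "$2")" "$1" "$(job_number "$2")"
}

# Possible usage on the login node (needs qsub, so it is left commented out):
#   jobid=$(echo lscpu | qsub)
#   job_logs STDIN "$jobid"
```

Capturing the job ID in a variable at submission time avoids having to copy the job number by hand.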

Now let's learn to submit more complex jobs, including your own applications.

Direct Submission (Pipe Syntax)

In principle, you can submit your own computational application to the queue in the same way as we have just submitted a command:

[<user>@login-2 ~]$ echo ~/my_first_project/myexecutable | qsub

However, if you would like to

  • change working directory,
  • specify environment variables,
  • execute multiple commands,
  • request a specific CPU model or operation mode,
  • compute on more than 1 node,
  • rename the job from STDIN to something more meaningful, etc.,

then you may find it more convenient to launch the job via a submission script.

Command File (Job Script)

A command file (sometimes also called "job script") is an executable script that will be executed on the compute node to which your job is assigned. This script is processed by a Linux shell (by default, bash) and it may also contain PBS directives in lines beginning with #PBS. Here is an example:

[<user>@login-2 my_first_project]$ cat launch
#PBS -N my_project_1
cd ~/my_first_project
echo Starting calculation
./myexecutable
echo End of calculation

In this script we rename the job to my_project_1 using a PBS directive, change directory to ~/my_first_project, print some diagnostic output ("Starting calculation"), launch the executable ./myexecutable, and print more diagnostic output.

IMPORTANT: you must have an empty line at the end of the command file, otherwise the last command will not be executed!
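
A quick check before submission can catch a missing final line break. The sketch below assumes that the requirement amounts to the file ending with a newline character; the helper names ends_with_newline and ensure_trailing_newline are hypothetical.

```shell
# Hypothetical helper: succeed if the file is non-empty and its last byte is a newline.
# Command substitution strips a trailing newline, so the result is empty in that case.
ends_with_newline() {
  [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Hypothetical helper: append a newline only when one is missing.
ensure_trailing_newline() {
  ends_with_newline "$1" || printf '\n' >> "$1"
}

# Possible usage before submitting the command file:
#   ensure_trailing_newline launch
```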

To schedule a job with this command file, use this command:

[<user>@login-2 my_first_project]$ qsub launch
523.login-2

When the job finishes, you will get files:

  • my_project_1.o523 - this is the standard output stream
  • my_project_1.e523 - this is the standard error stream

Managing Submitted Jobs

Querying Job Status

To get a list of running and queued jobs, use qstat:

[<user>@login-2 ~]$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
551.login-2                   my1stjob         <user>             00:00:00 R batch          
552.login-2                   my2ndjob         <user>             00:00:00 R batch          
557.login-2                   mylargejob       <user>                    0 Q batch          

Status R means the job is running, status Q means it is waiting in the queue.
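
When many jobs are listed, a per-status count can be easier to read than the full table. The following sketch assumes the qstat layout shown above (two header lines, with the status in the second-to-last column); the function name qstat_states is hypothetical.

```shell
# Hypothetical helper: read qstat output on stdin and print "<status> <count>" lines.
qstat_states() {
  # Skip the two header lines; the status letter is the second-to-last field.
  awk 'NR > 2 && NF >= 2 { count[$(NF-1)]++ }
       END { for (s in count) print s, count[s] }' | sort
}

# Possible usage on the login node:
#   qstat | qstat_states
```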

To get full information about the job, use qstat -f:

[<user>@login-2 ~]$ qstat -f 557
Job Id: 557.login-2
    Job_Name = mylargejob
    Job_Owner = <user>@login-2
    job_state = Q
    ...
    Resource_List.nodect = 36
    Resource_List.nodes = 36
    Resource_List.walltime = 00:10:00
    ... 

Detailed information may help you identify why a job is not running, for example, because it requests too many resources.
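
To pull a single value out of the detailed listing, a small filter may be convenient. This sketch assumes the "key = value" layout shown above with each value on one line (long values may be wrapped in real output, which this does not handle); the function name qstat_field is hypothetical.

```shell
# Hypothetical helper: read "qstat -f" output on stdin and print the value of one key ($1).
qstat_field() {
  awk -v key="$1" -F ' = ' '
    { k = $1; gsub(/^[ \t]+/, "", k) }   # strip the indentation before the key
    k == key { print $2; exit }'
}

# Possible usage on the login node:
#   qstat -f 557 | qstat_field Resource_List.walltime
```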

Deleting Jobs from Queue

You can delete your job using its number with the command qdel. To get the job number, first run qstat:

[<user>@login-2 ~]$ qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
549.login-2                   badrun           <user>                    0 R batch          
550.login-2                   goodrun          <user>                    0 R batch

If we want to cancel the job with the name badrun, we will execute

[<user>@login-2 ~]$ qdel 549

To cancel all running and queued jobs, we can use

[<user>@login-2 ~]$ qdel all

Once a job is deleted, the output and error streams will be terminated where they were, and the processes launched by your user on compute nodes belonging to the job will be terminated.

You can only delete your own jobs, and other users cannot delete your jobs.
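
If you prefer to look up the job number by job name instead of reading it from the table, a filter along these lines may help. It assumes the qstat layout shown above (two header lines, job ID in the first column, name in the second); the function name jobs_named is hypothetical.

```shell
# Hypothetical helper: read qstat output on stdin and print the job numbers
# of all jobs whose name equals $1.
jobs_named() {
  awk -v name="$1" 'NR > 2 && $2 == name { split($1, id, "."); print id[1] }'
}

# Possible usage on the login node (review the numbers before deleting anything):
#   qstat | jobs_named badrun
#   qdel 549
```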

Job Parameters

When you submit a job into the queue, you can specify job parameters to request a specific number of nodes or compute node architecture, to request a specific job duration, rename the job, redirect output, etc.

You can set job parameters either as arguments of qsub, or in the command file. Here we discuss the syntax; for specific arguments, see the section "List of Parameters".

Syntax: as Arguments of qsub

To set job parameters or request resources without creating a command file, you can pass parameters as command line arguments to the tool qsub.

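[<user>@login-2 ~]$ echo "nosuchcommand 2>&1" | qsub -o ./custom_log.txt

In this example, the quotes make the redirection 2>&1 part of the job rather than part of the echo command on the login node. The job therefore merges its standard error stream into its standard output stream, and the option -o writes the resulting job log to the file ./custom_log.txt.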

Syntax: Job Parameters as PBS Directives in Command File

If you launch your jobs using a command file, you can pass parameters as PBS directives, i.e., lines at the top of the command file that begin with the characters #PBS. For example, it can work like this:

[<user>@login-2 ~]$ cat my_command_file
#PBS -o ./custom_log.txt
nosuchcommand 2>&1

Then you just submit the command file as usual:

[<user>@login-2 ~]$ qsub my_command_file
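
Before submitting, it can help to review which parameters a command file sets. The sketch below simply lists every line that starts with #PBS, with the prefix removed; the function name pbs_directives is hypothetical, and the filter does not check whether the scheduler would actually honor each line.

```shell
# Hypothetical helper: print the PBS directives found in a command file ($1),
# without the leading "#PBS" marker.
pbs_directives() {
  sed -n 's/^#PBS[[:space:]]\{1,\}//p' "$1"
}

# Possible usage:
#   pbs_directives my_command_file
```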

List of Parameters

Whether you use qsub arguments or PBS directives, the same parameters can be specified.

Redirecting output

Syntax:

-o <path_to_stdout_file>
-e <path_to_stderr_file>

Defines the path to the file for the standard output stream (-o) or the standard error stream (-e) of the batch job.

Example:

[<user>@login-2 ~]$ echo date | qsub -o ./my_log_output.txt -e ./my_log_errors.txt

Job name

Syntax:

-N <job_name>

Declares a name for the job. This name is used by qstat and it serves as the default base for the file names of output logs. Job name must begin with an alphabetic character and may be up to 15 characters in length. If not specified, the job name is set to either the name of the command file used in qsub, or to STDIN if the job was submitted directly (with the pipe syntax).

Example:

[<user>@login-2 ~]$ echo date | qsub -N my_test_job
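
The naming rule stated above (first character alphabetic, at most 15 characters) can be checked before submission. The following is a minimal sketch of only those two conditions; the function name valid_job_name is hypothetical, and the scheduler may apply further restrictions that are not covered here.

```shell
# Hypothetical helper: succeed if $1 starts with a letter and is at most 15 characters long.
valid_job_name() {
  [ "${#1}" -le 15 ] || return 1
  case $1 in
    [A-Za-z]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage:
#   valid_job_name my_test_job && echo date | qsub -N my_test_job
```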

Working directory

Syntax:

-d <path_to_working_directory>

Defines a working directory for the job. If omitted, the home directory is used. This option sets the environment variable PBS_O_INITDIR.

Example:

[<user>@login-2 ~]$ echo pwd | qsub -d ~/my_first_project

Execution time

Syntax:

-l walltime=<time>

Requests the maximum wall clock time the job may run. Format of time is either seconds, or [[hh:]mm:]ss. By default, the maximum wall clock time is set by the queue parameter resources_default.walltime. When you request less wall clock time than default, it increases the likelihood that your job will start earlier due to the scheduler's backfill policy. When you request more wall clock time than default, you still cannot get more than what is specified by the queue parameter resources_max.walltime. To query the queue parameters, run the following command on the head node:

[<user>@login-2 ~]$ qmgr -c "list queue batch"

Example:

[<user>@login-2 ~]$ echo sleep 1000 | qsub -l walltime=00:30:00
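
To compare a requested wall clock time against a run time measured in seconds, the [[hh:]mm:]ss format can be converted to seconds. This is a small sketch of that arithmetic; the function name walltime_seconds is hypothetical.

```shell
# Hypothetical helper: convert a walltime in seconds or [[hh:]mm:]ss format to seconds.
walltime_seconds() {
  # Each colon-separated field is worth 60 times the field to its right.
  printf '%s\n' "$1" | awk -F: '{ t = 0; for (i = 1; i <= NF; i++) t = t * 60 + $i; print t }'
}

# Possible usage:
#   walltime_seconds 00:30:00
```

For the example above, the requested 00:30:00 corresponds to 1800 seconds, which is enough for the 1000-second sleep.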

Number of nodes

Syntax:

-l nodes=<count>:ppn=2

To run distributed-memory jobs, you can request more than one node assigned to your job. Note that your command file will only execute on one node. Your command file and application must invoke MPI or other distributed-memory frameworks to launch processing on the other nodes specified in the file referred to by the environment variable PBS_NODEFILE. You can combine a request for multiple nodes with a request for their specific features.

Example:

[<user>@login-2 ~]$ echo cat \$PBS_NODEFILE | qsub -l nodes=4:ppn=2

Command line arguments

Syntax:

-F "arg1 arg2 ..."

If you want to start multiple jobs with different parameters, you can use the same command file with different arguments.

Example:

[<user>@login-2 ~]$ cat myjob
cd $PBS_O_WORKDIR
./myexecutable $1
[<user>@login-2 ~]$ qsub myjob -F "13.2"
[<user>@login-2 ~]$ qsub myjob -F "86.0"

In this example one job will run the executable myexecutable with command line argument "13.2", and the other job with argument "86.0". The value after -F must be in quotes.

Additional information

You can read more about job submission in the documentation for the Torque resource manager, built by Adaptive Computing.

If you want to run a job on more than one compute node (i.e., a distributed-memory calculation), you can do so as described in section "Distributed-Memory Architecture".

Compute Node Features

You can find out the architectures and features of the compute nodes available to you by running the following command on the login node: pbsnodes. Here is an example of output:

[<user>@login-2 ~]$ pbsnodes
s001-n001
     state = free
     power_state = Running
     np = 2
     properties = xeon,skl,gold6128,ram192gb,net1gbe
     ntype = cluster
     jobs = 1/55615.c009,0/55628.c009
     status = rectime=1522956026,macaddr=a4:bf:01:38:e0:68,cpuclock=Fixed,varattr=,jobs=55615.c009(cput=9396,energy_used=0,mem=22596140kb,vmem=35640804kb,walltime=12167,session_id=27508) 55628.c009(cput=9,energy_used=0,mem=156504kb,vmem=2042236kb,walltime=2277,session_id=58586),state=free,netload=2027633141647,gres=,loadave=0.15,ncpus=24,physmem=196704400kb,availmem=209370396kb,totmem=213478540kb,idletime=24232,nusers=2,nsessions=5,sessions=27508 55449 58586 58722 62273,uname=Linux c009-n001 4.15.2-1.el7.elrepo.x86_64 #1 SMP Wed Feb 7 17:26:44 EST 2018 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003

s001-n002
     state = free
     power_state = Running
     np = 2
     properties = xeon,skl,gold6128,ram192gb,net1gbe
     ntype = cluster
     jobs = 0/55630.c009,1/55636.c009
     status = rectime=1522956027,macaddr=a4:bf:01:38:e4:82,cpuclock=Fixed,varattr=,jobs=55630.c009(cput=3,energy_used=0,mem=114048kb,vmem=1490160kb,walltime=1761,session_id=211833) 55636.c009(cput=3,energy_used=0,mem=109644kb,vmem=1353112kb,walltime=955,session_id=213279),state=free,netload=2890645291930,gres=,loadave=0.03,ncpus=24,physmem=196704400kb,availmem=212713372kb,totmem=213478540kb,idletime=345043,nusers=2,nsessions=3,sessions=211833 212874 213279,uname=Linux c009-n002 4.15.2-1.el7.elrepo.x86_64 #1 SMP Wed Feb 7 17:26:44 EST 2018 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003

The actual output of pbsnodes may be very long if your share of the Intel® DevCloud includes a lot of compute nodes. So you may need to pipe the output through less, dump it into a file, or scroll up. Let's analyze this example.

  • The hostnames of compute nodes are the first line in a block; in this example we have compute nodes s001-n001 and s001-n002. You can request specific nodes when you submit a job, for example, by using a job parameter such as -l nodes=s001-n001:ppn=2 as described in Selecting Nodes for Jobs.
  • The line np=2 means that there are two processing slots on this server. In other words, two jobs could run simultaneously on this node. This is the default for all nodes on the cluster. If a job is submitted without the "nodes" argument, the job will use both slots by default.
  • The line beginning with properties indicates the architectural features and properties of this node. You can use them for your information, and you can also use these properties to request specific nodes for your job. Currently, Intel DevCloud for oneAPI Projects only supports one type of compute node.

The table below describes the properties that we assign to Intel® DevCloud.

Group         Feature   Explanation
Family        xeon      The system is based on an Intel® Xeon® processor.
Architecture  skl       The system has an Intel processor of the Skylake architecture (Intel Xeon Scalable Processors family).
Model         gold6128  The system has an Intel® Xeon® Gold 6128 CPU.
Memory Size   ram192gb  The amount of on-platform memory is 192 GiB.
Networking    net1gbe   The node has a Gigabit Ethernet interconnect.
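
To see how many nodes carry each individual property, the comma-separated properties line of every node can be split and counted. The sketch below assumes the pbsnodes layout shown earlier ("properties = a,b,c"); the function name property_counts is hypothetical.

```shell
# Hypothetical helper: read pbsnodes output on stdin and print "<property> <node count>" lines.
property_counts() {
  awk '$1 == "properties" && $2 == "=" {
         n = split($3, p, ",")
         for (i = 1; i <= n; i++) c[p[i]]++
       }
       END { for (k in c) print k, c[k] }' | sort
}

# Possible usage on the login node:
#   pbsnodes | property_counts
```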

Selecting Nodes for Jobs

When you submit a job to the queue, the scheduler will pick the first available compute node for your job. However, if you are interested in specific hardware or features, you may wish to request a node of a certain kind for your job.

Requesting Nodes by Property

You can get the list of all the different properties and the number of nodes associated with the property by running the following command:

 [<user>@login-2 ~]$ pbsnodes | grep "properties =" | awk '{print $3}' | sort | uniq -c

These properties can be requested by using the -l flag of qsub. For example, to request a node with the property "foo":

 [<user>@login-2 ~]$ qsub my-job-script -l nodes=1:foo

Note: some clusters may require :ppn=2 appended to the end of the node request.

It is possible to request multiple properties by using a colon-separated list. For example, to request a node with the properties "foo" and "bar":

 [<user>@login-2 ~]$ qsub my-job-script -l nodes=1:foo:bar
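
When the node count and properties come from variables, the colon-separated request can be assembled by a small helper. This is only a sketch of the string format described above; the function name nodes_spec is hypothetical, and "foo" and "bar" remain placeholder property names.

```shell
# Hypothetical helper: build a "nodes=<count>[:<property>...]" string.
# The body runs in a subshell so its variables do not leak into the caller.
nodes_spec() (
  spec="nodes=$1"
  shift
  for prop in "$@"; do
    spec="$spec:$prop"
  done
  printf '%s\n' "$spec"
)

# Possible usage on the login node:
#   qsub my-job-script -l "$(nodes_spec 1 foo bar)"
```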

Requesting Specific Nodes

While it is rarely needed, you can request a specific node for your calculation. For example, requesting node s001-n003 looks like this:

[<user>@login-2 ~]$ cat my-job-script
#PBS -l nodes=s001-n003:ppn=2
cd $PBS_O_WORKDIR
./my_application
[<user>@login-2 ~]$ qsub my-job-script 

Naturally, if you request a specific node rather than a class of nodes, your job may have to wait longer in the queue.

Distributed-Memory Architecture

In distributed-memory calculations, you will run one (or several) processes on each compute node. These processes cannot access each other's memory, but they can exchange messages over the network on which they reside.

In the Intel DevCloud for oneAPI Projects, compute nodes are interconnected with a Gigabit Ethernet network; connection to the NFS-shared storage has higher-speed uplinks. The easiest and most portable way to handle communication in the Intel DevCloud for oneAPI Projects is to use a Message Passing Interface (MPI) library. Intel MPI is installed on all nodes.

Requesting multiple nodes

When you launch a distributed-memory job, you have to explicitly request multiple compute nodes as described in this section. For example, to request 4 nodes, use the following arguments in the command file:

#PBS -l nodes=4:ppn=2

When the calculation starts, your command file will execute on only one node. This node is called "mother superior". Other nodes are called "sister nodes". The list of all nodes is placed in the "machine file" created for this job. The path to the machine file is stored in the environment variable PBS_NODEFILE.

Example command file:

[<user>@login-2 ~]$ cat distrJob1
#PBS -l nodes=4:ppn=2
echo "
Preparing for a distributed job.
Mother superior is `hostname`.
Machine file is $PBS_NODEFILE
Contents of the machine file:
`cat $PBS_NODEFILE`
"

Example of a parallel job (no computation yet):

[<user>@login-2 ~]$ qsub distrJob1
578.login-2
[<user>@login-2 ~]$ cat distrJob1.o578
...
Preparing for a distributed job.
Mother superior is s001-n036.
Machine file is /var/spool/torque/aux//578.login-2
Contents of the machine file:
s001-n036
s001-n035
s001-n034
s001-n033
s001-n032
s001-n031
s001-n030
s001-n029
...
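
Inside a job, it can be useful to summarize the machine file instead of printing it in full. The sketch below counts how many lines each host occupies; it assumes that a host may be listed more than once (for example, once per processing slot), which the abridged listing above does not confirm. The function name nodefile_summary is hypothetical.

```shell
# Hypothetical helper: print "<host> <number of lines>" for each host in a machine file ($1).
nodefile_summary() {
  sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Possible usage inside a command file:
#   nodefile_summary "$PBS_NODEFILE"
```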

Basic MPI Application

In the previous example, the command file was running only on one node. All other nodes were waiting. To start a parallel job, we will need an MPI program. Let's call it hello_mpi.cc and give it contents as below:

Basic MPI program:

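#include <mpi.h>
#include <cstdio>

// Length of the host name and rank of this process, filled in by MPI
int length, rank;
char hostname[MPI_MAX_PROCESSOR_NAME];

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);                     // Start the MPI runtime
  MPI_Get_processor_name(hostname, &length);  // Name of the host running this process
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);       // Rank of this process among all processes
  printf("Hello world from host %s (rank %d)!\n", hostname, rank);
  MPI_Finalize();                             // Shut down the MPI runtime
  return 0;
}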

It must be compiled with a special MPI wrapper over the Intel C++ compiler:

[<user>@login-2 ~]$ mpiicpc -o hello_mpi hello_mpi.cc

Launching a Distributed-Memory Job

Now that we have the executable hello_mpi, we need to launch it on all nodes in the machine file. To make this happen, we include a call to mpirun in our command file. The contents of the command file (named distrJob2) that we created are shown below:

#PBS -l nodes=8:ppn=2
cd $PBS_O_WORKDIR
echo Launching the parallel job from mother superior `hostname`...
mpirun -machinefile $PBS_NODEFILE ./hello_mpi

If your cluster has fewer than 8 nodes, request fewer nodes, for example, "nodes=4:ppn=2".

Now we are ready to launch a distributed calculation:

[<user>@login-2 ~]$ qsub distrJob2
584.login-2

And a few moments later our result comes out like this:

[<user>@login-2 ~]$ cat distrJob2.o584 
...
Launching the parallel job from mother superior s001-n036...
Hello world from host s001-n036 (rank 0)!
Hello world from host s001-n033 (rank 3)!
Hello world from host s001-n035 (rank 1)!
Hello world from host s001-n032 (rank 4)!
Hello world from host s001-n031 (rank 5)!
Hello world from host s001-n030 (rank 6)!
Hello world from host s001-n029 (rank 7)!
Hello world from host s001-n034 (rank 2)!
...

Hybrid OpenMP+MPI Runs

If you plan to run multiple MPI processes per compute node, you can use the argument "-n" in mpirun to specify the total number of MPI processes that you require:

#PBS -l nodes=8:ppn=2
cd $PBS_O_WORKDIR
# Run a total of 16 processes on 8 nodes.
# Ranks 0 and 8 will share host, 1 and 9, 2 and 10, etc.
mpirun -machinefile $PBS_NODEFILE -n 16 ./my_application

To place ranks with nearby numbers on the same node, you will need to tweak the machine file:

#PBS -l nodes=8:ppn=2
cd $PBS_O_WORKDIR
cat $PBS_NODEFILE | sed 's/$/:2/' > hosts.txt
# Ranks 0 and 1 will share host, 2 and 3, 4 and 5, etc.
mpirun -machinefile hosts.txt ./my_application
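
If the machine file happens to list a host more than once, appending :2 to every line would repeat that host in hosts.txt. The variant below is a sketch under that assumption: it keeps the first occurrence of each host, in the original order, and appends the slot count. The function name hosts_with_slots is hypothetical.

```shell
# Hypothetical helper: print each distinct host of a machine file ($1) once,
# in order of first appearance, followed by ":<slots>" ($2).
hosts_with_slots() {
  awk -v n="$2" '!seen[$0]++ { print $0 ":" n }' "$1"
}

# Possible usage inside a command file:
#   hosts_with_slots "$PBS_NODEFILE" 2 > hosts.txt
#   mpirun -machinefile hosts.txt ./my_application
```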