Intel® Tiber™ Developer Cloud for oneAPI


Launch and manage jobs

How to create a job script

If your environment lacks an editor you are familiar with, you can use the cat utility to open a file for writing:

cat > job.sh
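Type the contents of the script, for example (the echo line is only a placeholder for your own commands):

#!/bin/bash
echo "Running on $(hostname)"

Then press Ctrl+D on an empty line to write the file and return to the prompt.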

Before running OpenCL workloads on FPGA nodes, your job script must source the device initialization scripts. Which scripts to source depends on the target device:

Arria 10:

source /glob/development-tools/versions/fpgasupportstack/a10/1.2.1/intelFPGA_pro/hld/init_opencl.sh
source /glob/development-tools/versions/fpgasupportstack/a10/1.2.1/inteldevstack/init_env.sh
export FPGA_BBB_CCI_SRC=/usr/local/intel-fpga-bbb
export PATH=/glob/intel-python/python2/bin:${PATH}

Stratix 10:

source /glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/init_env.sh
source /glob/development-tools/versions/fpgasupportstack/d5005/2.0.1/inteldevstack/hld/init_opencl.sh
export FPGA_BBB_CCI_SRC=/usr/local/intel-fpga-bbb
export PATH=/glob/intel-python/python2/bin:${PATH}

How to submit a batch job

qsub -l nodes=1:gpu:ppn=2 -d . job.sh

Note: -l nodes=1:gpu:ppn=2 (lower case L) is used to assign one full GPU node to the job.
Note: The -d . is used to configure the current folder as the working directory for the task.
Note: job.sh is the script that gets executed on the compute node.
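qsub prints the ID of the submitted job. When the job finishes, its stdout and stderr are written to the working directory as <script>.o<job_id> and <script>.e<job_id>, following the standard TORQUE naming convention (the job ID and server name below are only illustrative):

qsub -l nodes=1:gpu:ppn=2 -d . job.sh
12345.v-qsvr-1

ls
job.sh  job.sh.e12345  job.sh.o12345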

How to request interactive mode

qsub -I -l nodes=1:gpu:ppn=2 -d .

Note: -I (upper case i) is the argument used to request an interactive session.

How to validate a job script

To test your job script, first request access to a compute node in interactive mode (see above). At the new prompt:
bash job.sh
exit
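Putting it together, a validation session might look like this (reusing the interactive request shown earlier):

qsub -I -l nodes=1:gpu:ppn=2 -d .
bash job.sh
exit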

How to measure job execution time

One option is to add timestamps to the job stdout. This is what a job script could look like after adding start & stop timestamps:
#!/bin/bash

echo
echo start: $(date "+%y%m%d.%H%M%S.%3N")
echo

# TODO: insert your workload commands here

echo
echo stop:  $(date "+%y%m%d.%H%M%S.%3N")
echo
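Alternatively, the script can report the elapsed time directly, for example with bash's built-in SECONDS counter (a minimal sketch):

#!/bin/bash

SECONDS=0

# TODO: insert your workload commands here

echo "elapsed: ${SECONDS}s"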

How to run a job after a dependent job completed successfully

A job can be configured to start automatically after another job completes successfully. For example, for FPGA hardware, execution of a workload can be launched automatically once the compilation job finishes successfully:

  1. Submit the build FPGA HW job and take note of the job ID XXXX that is returned:
    qsub -l nodes=1:fpga_compile:ppn=2 -d . build_fpga_hw.sh
  2. Use the -W depend=afterok:[job_id] argument when submitting the FPGA HW execution job (a complete sketch follows this list):
    qsub -l nodes=1:fpga_runtime:<fpga type>:ppn=2 -d . -W depend=afterok:XXXX run_fpga_hw.sh
    Note: <fpga type> must match the device type you targeted when compiling your executable
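For example, the job ID printed by the first qsub can be captured in a shell variable and passed to the second submission (a sketch using build_fpga_hw.sh and run_fpga_hw.sh from the steps above, with arria10 as the FPGA type):

BUILD_JOB=$(qsub -l nodes=1:fpga_compile:ppn=2 -d . build_fpga_hw.sh)
qsub -l nodes=1:fpga_runtime:arria10:ppn=2 -d . -W depend=afterok:$BUILD_JOB run_fpga_hw.sh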

How to change job timeout (max = 24h)

By default, jobs are terminated automatically after 6 hours. Use the following syntax if your job requires more than 6 hours to complete:

qsub […] -l walltime=hh:mm:ss
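For example, to request a 12-hour limit on a GPU node:

qsub -l nodes=1:gpu:ppn=2 -l walltime=12:00:00 -d . job.sh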

How to monitor jobs

watch -n 1 qstat -n -1

How to terminate a job

qdel <job_id>

Accessing compute nodes

How to request an interactive shell (upper-case ‘i’)

qsub -I […]

How to request a node by node property. (lower-case ‘L’)

qsub […] -l nodes=1:[property]:ppn=2

How to request a node by node name. (lower-case ‘L’)

qsub […] -l nodes=[node_name]:ppn=2
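Node names appear at the top of each entry in the pbsnodes output described below. For example, assuming a node named s001-n054 shows up in your listing (a hypothetical name; substitute one from your own pbsnodes output):

qsub -l nodes=s001-n054:ppn=2 -d . job.sh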

How to list all compute nodes and their properties

pbsnodes

How to list the free compute nodes (lower-case ‘L’)

pbsnodes -l free

Listing compute node properties

pbsnodes | sort | grep properties

Example output:

properties = core,cfl,i9-10920x,ram32gb,net1gbe,iris_xe_max,dual_gpu
properties = core,cfl,i9-10920x,ram32gb,net1gbe,iris_xe_max,quad_gpu
properties = xeon,cfl,e-2176g,ram64gb,net1gbe,gpu,gen9
properties = xeon,clx,ram192gb,net1gbe
properties = xeon,skl,gold6128,ram192gb,net1gbe,fpga,arria10,fpga_runtime
properties = xeon,skl,gold6128,ram192gb,net1gbe,jupyter,batch
properties = xeon,skl,gold6128,ram192gb,net1gbe,jupyter,batch,fpga_compile
properties = xeon,skl,plat8153,ram384gb,net1gbe,renderkit

The properties describe the capabilities of each compute node: CPU family and model, accelerator type and model, amount of DRAM, network interconnect, number of accelerator devices, and the node's intended or recommended use.

Some of the properties describe classes of devices:

  • core
  • fpga
  • gpu
  • xeon

Other properties describe the devices by name (includes nda):

  • arria10
  • stratix10
  • e-2176g
  • gen9
  • gold6128
  • i9-10920x
  • iris_xe_max
  • plat8153

Number of devices:

  • dual_gpu
  • quad_gpu

Intended use:

  • batch
  • fpga_compile
  • fpga_runtime
  • jupyter
  • renderkit
  • fpga_opencl_compile
  • fpga_opencl_runtime

Targeting specific compute nodes

For a full reference of PBS utilities please check this resource: TORQUE PBS - Commands Overview

As an example of how to target specific compute nodes on the DevCloud let’s look at the compute nodes equipped with Intel® Graphics cards. At the time of this writing, by running the pbsnodes utility, we observe the following list of compute node properties:

pbsnodes | sort | grep properties | grep gpu

properties = xeon,cfl,e-2176g,ram64gb,net1gbe,gpu,gen9
properties = core,cfl,i9-10920x,ram32gb,net1gbe,gpu,iris_xe_max,dual_gpu
properties = core,cfl,i9-10920x,ram32gb,net1gbe,gpu,iris_xe_max,quad_gpu

How to target specific GPUs

The command for submitting a job to a compute node hosting a GPU is: 

qsub -l nodes=1:gpu:ppn=2 job_script.sh 

This will submit the job script to the first available compute node that hosts a GPU. That could be either the Intel® UHD Graphics P630 or the Intel® Iris® Xe MAX Graphics.

We can get more specific. To submit a job to a node with Intel® UHD Graphics P630, use the gen9 property:

qsub -l nodes=1:gen9:ppn=2 job_script.sh 

To submit a job to a compute node hosting Intel® Iris® Xe MAX Graphics cards, use the iris_xe_max property:

qsub -l nodes=1:iris_xe_max:ppn=2 job_script.sh 

This command may land on either a dual-GPU or a quad-GPU Intel® Iris® Xe MAX Graphics compute node. To request a dual-GPU Intel® Iris® Xe MAX Graphics compute node specifically, use both the iris_xe_max and dual_gpu properties at the same time:

qsub -l nodes=1:iris_xe_max:dual_gpu:ppn=2 job_script.sh 

Similarly, to request a quad-GPU Intel® Iris® Xe MAX Graphics compute node:

qsub -l nodes=1:iris_xe_max:quad_gpu:ppn=2 job_script.sh 
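To confirm which node a submitted job was assigned to, you can pass the job ID printed by qsub to qstat:

qstat -n -1 <job_id>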

How to target a specific FPGA on the FPGA server

qsub -q batch@v-qsvr-fpga -I -l nodes=darby:ppn=2
qsub -q batch@v-qsvr-fpga -I -l nodes=arria10:ppn=2

Transferring Files

You can transfer files from the DevCloud to your local system, or transfer from your local system to the DevCloud.

Upload to DevCloud

Open a terminal on your local system that is not connected to the DevCloud.

To upload a file use the syntax:

scp <FILE_NAME> devcloud:<PATH_TO_DESTINATION>

First navigate to the folder where the target file is located. In the example below, the target file is in a folder titled My-Project and the file name is my-application.py. The command below will transfer my-application.py to your DevCloud home folder (~/).
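Assuming that folder layout, the upload might look like this:

cd My-Project
scp my-application.py devcloud:~/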

If the transfer is successful, you will see output indicating that the transfer is 100% complete.

In a separate terminal, log in to DevCloud to verify the file transfer.

Download from DevCloud

Open a local terminal that is not connected to the DevCloud.

To download a file, use the syntax:

scp devcloud:<PATH_TO_FILE>/<FILE_NAME> .

First navigate to the folder where you want the file downloaded. In the example below, the target folder is titled My-Reports.

In this example, we will download the file my-report.txt from the DevCloud home folder (~/) into the current folder:

scp devcloud:~/my-report.txt .

If the transfer is successful, you will see output indicating that the transfer is 100% complete and the file will be in your folder.