HPC Toolkit Getting Started Guide | Intel® Tiber™ Developer Cloud

Before you begin, consider reading the Intel® oneAPI Base Toolkit guide if you have not already done so. This guide continues from it and adds details about the HPC samples.

The following steps help you get started with the Intel® oneAPI HPC Toolkit on the Intel® DevCloud through a basic sample, matrix_mul. This sample takes advantage of CPU and GPU accelerators using SYCL*, OpenMP* and Intel® Math Kernel Library (Intel® MKL).

Matrix Multiplication sample walkthrough

Connect to the DevCloud.
```
ssh devcloud
```

Download the samples.

```
git clone https://github.com/oneapi-src/oneAPI-samples.git
```

Go to the matrix_mul sample.

```
cd ~/oneAPI-samples/DirectProgramming/C++SYCL/DenseLinearAlgebra/matrix_mul
```

Build and run the sample in batch mode

The following describes the process of submitting build and run jobs to PBS.

A job is a script that is submitted to PBS through the qsub utility. By default, the qsub utility does not inherit the current environment variables or your current working directory. For this reason, it is necessary to submit jobs as scripts that handle the setup of the environment variables. In order to address the working directory issue, you can either use absolute paths or pass the -d <dir> option to qsub to set the working directory.
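As an alternative to passing -d, a job script can change into the submission directory itself. The following is a minimal sketch, assuming the scheduler exports PBS_O_WORKDIR (the directory from which qsub was invoked, as PBS normally does); job_workdir is a hypothetical helper name and is not part of the sample.

```shell
#!/bin/bash
# Hypothetical helper (not part of the sample): print the directory the job should run in.
# Assumption: PBS exports PBS_O_WORKDIR, the directory where qsub was invoked.
# Outside a PBS job the variable is unset, so fall back to the current directory.
job_workdir() {
    printf '%s\n' "${PBS_O_WORKDIR:-$PWD}"
}

# Change to that directory before running any build or run commands.
cd "$(job_workdir)" || exit 1
```

With these lines at the top of a job script, relative paths and make targets resolve against the folder the job was submitted from, even if -d is omitted.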

Create the job scripts

Create a build.sh script with the following contents.

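```
#!/bin/bash
# Load the oneAPI environment (compilers and libraries); output is discarded.
source /opt/intel/inteloneapi/setvars.sh > /dev/null 2>&1
# Build the SYCL and OpenMP variants of the sample.
make build_dpcpp
make build_omp
```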

Create a run.sh script with the following contents for executing the sample.

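```
#!/bin/bash
# Load the oneAPI environment (compilers and libraries); output is discarded.
source /opt/intel/inteloneapi/setvars.sh > /dev/null 2>&1
# Run the SYCL and OpenMP variants of the sample.
make run_dpcpp
make run_omp
```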

Build and run

Jobs submitted in batch mode are placed in a queue until the necessary resources (compute nodes) become available. They are executed on a first-come, first-served basis on the first available node(s) that have the requested property or label.

Build the sample on a GPU node.
```
qsub -l nodes=1:gpu:ppn=2 -d . build.sh
```
Note: -l nodes=1:gpu:ppn=2 (lowercase L) assigns one full GPU node to the job.
Note: -d . sets the current folder as the working directory for the job.
To inspect the job's progress, use the qstat utility.
```
watch -n 1 qstat -n -1
```
Note: watch -n 1 reruns qstat -n -1 every second and displays the latest output.
After the build job completes successfully, run the sample on a GPU node.
```
qsub -l nodes=1:gpu:ppn=2 -d . run.sh
```
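The wait between the build job and the run job can also be scripted instead of watching qstat by hand. The sketch below assumes that the scheduler places the build job's stdout file (build.sh.oXXXX) in the working directory only when the job ends; wait_for_file and its 600-second default timeout are hypothetical choices. The presence of the file shows that the job has finished, not that the build succeeded, so the file contents still need to be checked.

```shell
#!/bin/bash
# Hypothetical helper: wait until a file exists, checking once per second.
# $1 = path to wait for, $2 = timeout in seconds (default 600, an arbitrary choice).
# Returns 0 once the file exists, 1 if the timeout elapses first.
wait_for_file() {
    local path="$1"
    local timeout="${2:-600}"
    local waited=0
    until [ -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use (replace XXXX with the job ID printed by qsub for build.sh):
#   wait_for_file build.sh.oXXXX && qsub -l nodes=1:gpu:ppn=2 -d . run.sh
```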
Inspect the output of the sample.
```
cat run.sh.eXXXX
cat run.sh.oXXXX
```
Here XXXX is the job ID, which gets printed to the screen after each qsub command.
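Because the job ID appears in the output file names, the names can be derived from what qsub prints. This is a minimal sketch, assuming qsub prints an identifier of the form <number>.<server> and that the numeric part is what replaces XXXX; job_number and job_output_files are hypothetical helper names, and 12345.example-server is a made-up identifier.

```shell
#!/bin/bash
# Hypothetical helper: keep only the part of a job identifier before the first dot.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: print the stderr and stdout file names for a script and job ID.
# $1 = script name (for example run.sh), $2 = job identifier as printed by qsub.
job_output_files() {
    local num
    num=$(job_number "$2")
    printf '%s.e%s\n%s.o%s\n' "$1" "$num" "$1" "$num"
}

# Example with a made-up identifier; a real one would come from: job_id=$(qsub ... run.sh)
job_output_files run.sh 12345.example-server
```

For the made-up identifier above, the function prints run.sh.e12345 followed by run.sh.o12345, which can then be passed to cat.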
Remove the stdout and stderr files and clean up the project files.
```
rm build.sh.*; rm run.sh.*; make clean
```
Disconnect from the Intel DevCloud.
```
exit
```

Build and run additional samples

The Intel® oneAPI HPC Toolkit includes several sample programs, many of which can be compiled and run in the same way as matrix_mul. Try running the samples on different kinds of compute nodes, or adjust their source code to experiment with different workloads. The next sample we recommend is Nbody.
