# Connect to DevCloud

Sign in to connect to DevCloud using an SSH client.

# Hello World!

Get started by running a simple sample on DevCloud.

# Run Base Toolkit Samples on DevCloud

Explore the samples already installed in Step 2.

# Direct Programming/C++

#### MandelbrotOMP Sample

This sample demonstrates how to accelerate program performance with SIMD and parallelization using OpenMP*, in the context of calculating the Mandelbrot set.

View code on GitHub*

#### OpenMP Reduction Sample

The openmp_reduction code sample is a simple program that calculates pi. It is implemented in C++ with OpenMP for Intel CPUs and accelerators.
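To make the reduction idea concrete, here is a minimal sketch, not the sample's actual code. It assumes pi is computed by midpoint-rule integration of 4 / (1 + x^2) over [0, 1], which is a common way to do it but is not confirmed by the description above. The name `estimate_pi` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: estimates pi by integrating 4 / (1 + x^2) over [0, 1]
// with the midpoint rule (an assumed method, not taken from the sample).
// The reduction clause gives each thread a private partial sum and adds the
// partial sums together when the loop ends. Without OpenMP enabled at compile
// time the pragma is ignored and the loop runs serially with the same result.
double estimate_pi(long num_steps) {
    assert(num_steps > 0);
    const double step = 1.0 / static_cast<double>(num_steps);
    double sum = 0.0;
#pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < num_steps; ++i) {
        const double x = (static_cast<double>(i) + 0.5) * step;  // interval midpoint
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * step;
}
```

Calling `estimate_pi(1000000)` should give a value very close to pi. Offloading the same loop to an accelerator would need additional OpenMP target directives, which this sketch leaves out.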

View code on GitHub*

#### ISO3DFD OpenMP Offload Sample

The ISO3DFD sample refers to Three-Dimensional Finite-Difference Wave Propagation in Isotropic Media. It is a three-dimensional stencil that simulates a wave propagating in a 3D isotropic medium. It shows some of the more common challenges and techniques for achieving good performance when targeting OpenMP Offload devices (GPUs) in more complex applications.

View code on GitHub*

# Direct Programming/DPC++

#### Vector-Add

This simple vector-add program in Data Parallel C++ (DPC++) supports FPGAs, GPUs, and CPUs.

View code on GitHub*

#### Mandelbrot Sample

The Mandelbrot set is an infinitely complex fractal pattern derived from a simple formula. This sample demonstrates using DPC++ to offload computations to a GPU (or other device) and shows how parallelism can reduce processing time.

View code on GitHub*

#### Complex Multiplication Sample

Complex multiplication is a program that multiplies two large vectors of complex numbers in parallel and verifies the results. It also implements a custom device selector to target a specific vendor device. The program is implemented in C++ and DPC++ for Intel CPUs and accelerators. Complex is a custom class, and the program shows how custom class types can be used in a DPC++ program.

View code on GitHub*

#### Matrix Mul Sample

Matrix_mul is a simple program that multiplies two large matrices and verifies the results. The program is implemented in two ways: with Data Parallel C++ (DPC++) and with OpenMP (omp).

View code on GitHub*

#### Simple Add DPC++ Sample

This sample provides the simplest example of DPC++ and shows how to use both buffers and Unified Shared Memory.

View code on GitHub*

#### All Pairs Shortest Paths Sample

This sample uses the Floyd-Warshall algorithm to find the shortest paths between all pairs of vertices in a graph. It uses a parallel blocked algorithm that enables the application to offload compute-intensive work to the GPU efficiently.
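For reference, the core Floyd-Warshall recurrence can be sketched in plain C++ as below. This is the sequential, unblocked form of the algorithm, not the parallel blocked variant the sample uses. The names `Matrix`, `kInf`, and `floyd_warshall` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// "No edge" marker, kept at half of INT_MAX so kInf + kInf cannot overflow.
constexpr int kInf = std::numeric_limits<int>::max() / 2;

// Takes an n x n matrix of edge weights (kInf where there is no edge, 0 on
// the diagonal) and returns the matrix of shortest-path distances.
Matrix floyd_warshall(Matrix dist) {
    const std::size_t n = dist.size();
    for (const auto& row : dist) assert(row.size() == n);
    // After iteration k, dist[i][j] is the shortest path from i to j that
    // only passes through intermediate vertices 0..k.
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                dist[i][j] = std::min(dist[i][j], dist[i][k] + dist[k][j]);
    return dist;
}
```

For a fixed `k`, every `(i, j)` update is independent of the others, which is what makes the algorithm a candidate for GPU offload. A blocked version additionally tiles the matrix to improve data reuse.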

View code on GitHub*

#### Bitonic Sort Sample

This code sample demonstrates the implementation of bitonic sort using Intel Data Parallel C++ to offload the computation to a GPU. The input is a random sequence of 2^n elements (where n is a positive integer), and the algorithm sorts the sequence in parallel. The result sequence is in ascending order.
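The structure of a bitonic sorting network can be sketched in plain C++ as follows. This is a serial illustration under the same power-of-two input assumption, not the sample's DPC++ kernel. The name `bitonic_sort` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `data` in ascending order. data.size() must be a power of two.
void bitonic_sort(std::vector<int>& data) {
    const std::size_t n = data.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    // k is the length of the bitonic sequences being merged in this stage.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j is the distance between the two elements of each compare-exchange.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            // Each i touches a disjoint pair, so on a device this inner loop
            // could run as one parallel step.
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner <= i) continue;  // handle each pair once
                const bool ascending = (i & k) == 0;
                const bool out_of_order = ascending ? data[i] > data[partner]
                                                    : data[i] < data[partner];
                if (out_of_order) std::swap(data[i], data[partner]);
            }
        }
    }
}
```

The two outer loops are sequential stages, while the innermost loop contains only independent compare-exchange operations, which is the part a GPU implementation would typically parallelize.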

View code on GitHub*

#### DPC++ Hidden Markov Model Sample

The HMM (Hidden Markov Model) sample presents a statistical model that uses a Markov process to represent graphable nodes that are otherwise in an unobservable, or “hidden”, state. This technique helps with pattern recognition tasks such as speech, handwriting, and gesture recognition, part-of-speech tagging, partial discharges, and bioinformatics. The sample offloads the complexity of the Markov process to the GPU.

View code on GitHub*

#### Monte Carlo Pi Sample

Monte Carlo simulation is a broad category of computation that uses statistical analysis to reach a result. This sample uses the Monte Carlo procedure to estimate the value of pi. If a circle of radius 1 is inscribed in a 2x2 square and a large number of random coordinates are sampled uniformly within the square, the fraction of samples that fall inside the circle approximates pi/4 (the ratio of the two areas), so pi can be estimated as four times that fraction.
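The estimate described above can be sketched in a few lines of plain C++. This is a serial illustration rather than the sample's DPC++ code. The function name `monte_carlo_pi` and the choice of `std::mt19937` as the random number generator are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Samples points uniformly in the square [-1, 1] x [-1, 1] and counts how
// many land inside the unit circle. The circle covers pi/4 of the square,
// so four times the hit fraction estimates pi.
double monte_carlo_pi(std::uint64_t num_samples, std::uint32_t seed) {
    assert(num_samples > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> coord(-1.0, 1.0);
    std::uint64_t inside = 0;
    for (std::uint64_t i = 0; i < num_samples; ++i) {
        const double x = coord(rng);
        const double y = coord(rng);
        if (x * x + y * y <= 1.0) ++inside;
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(num_samples);
}
```

The error shrinks roughly with the square root of the sample count, so one million samples typically give two to three correct decimal digits. Because the samples are independent, the counting loop is a natural fit for a parallel reduction.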

View code on GitHub*

#### Nbody Sample

An N-body simulation is a simulation of a dynamical system of particles, usually under the influence of physical forces, such as gravity. This Nbody sample is implemented in C++ and DPC++ for Intel CPUs and GPUs.

View code on GitHub*

#### DPC++ OpenCL™ Interoperability Example

This example demonstrates how DPC++ can interact with OpenCL™. It shows programmers how to migrate incrementally from OpenCL to DPC++. Two usage scenarios are shown. The first is a DPC++ program that compiles and runs an OpenCL kernel. The second is a program that converts OpenCL objects to DPC++.

View code on GitHub*

#### Prefix Sum Sample

This code sample demonstrates the implementation of parallel prefix sum using Intel® oneAPI Data Parallel C++ (DPC++) to offload the computation to a GPU. The input is a random sequence of 2^n elements (where n is a positive integer), and the algorithm computes the prefix sum in parallel. The result sequence is in ascending order.
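One common way to structure a parallel prefix sum is the Hillis-Steele scan, sketched below in plain C++. Whether the sample uses this scheme or a different one (for example, an up-sweep/down-sweep scan) is not stated above, so treat this as an illustration of the general idea only. The name `inclusive_scan_steps` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Inclusive prefix sum in about log2(n) passes. In each pass, element i adds
// the value that sits `offset` positions to its left. All elements of a pass
// are independent, so a device could compute each pass in parallel.
std::vector<int> inclusive_scan_steps(std::vector<int> in) {
    const std::size_t n = in.size();
    // Mirrors the 2^n input size described for the sample. This particular
    // scan would also work for other sizes.
    assert(n != 0 && (n & (n - 1)) == 0);
    std::vector<int> out(n);
    for (std::size_t offset = 1; offset < n; offset <<= 1) {
        for (std::size_t i = 0; i < n; ++i)
            out[i] = (i >= offset) ? in[i] + in[i - offset] : in[i];
        std::swap(in, out);  // double buffering: this pass's output feeds the next
    }
    return in;
}
```

For the input `{3, 1, 4, 1, 5, 9, 2, 6}` this returns `{3, 4, 8, 9, 14, 23, 25, 31}`. The two buffers keep each pass from reading values that the same pass has already overwritten.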

View code on GitHub*

#### DPC Reduce Sample

The dpc_reduce sample is a simple program that calculates pi. It is implemented in C++ and Intel® oneAPI Data Parallel C++ (DPC++) for Intel CPUs and accelerators. This code sample also demonstrates how to incorporate DPC++ into an MPI program.

View code on GitHub*

#### Histogram Sample

This sample demonstrates a histogram that groups numbers together and provides the count of each particular number in the input. The sample uses dpstd APIs to offload the computation to the selected device.

View code on GitHub*

#### Unrolling Loops Sample

The Loop Unroll sample shows a simple example of unrolling loops to improve the throughput of a DPC++ program for GPU offload.

View code on GitHub*

#### Sparse Matrix Vector Sample

The Sparse Matrix Vector sample provides a parallel implementation of a merge-based sparse matrix-vector multiplication algorithm using DPC++.

View code on GitHub*

#### DPC++ Discrete Cosine Transform Sample

Discrete Cosine Transform (DCT) and quantization are the first two steps in the JPEG compression standard. This sample demonstrates how the DCT and quantization stages can be made to run faster using Data Parallel C++ (DPC++) by offloading image processing work to a GPU or other device.

View code on GitHub*

#### 1D Heat Transfer Sample

This code sample demonstrates the simulation of a one-dimensional heat transfer process using Intel Data Parallel C++. Kernels in this example are implemented as a discretized differential equation with the second derivative in space and the first derivative in time.
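A discretization of this kind can be sketched in plain C++ as below. It assumes the standard explicit scheme (forward difference in time, central second difference in space) and fixed boundary values. Both are assumptions for illustration, since the description does not give the exact scheme or boundary conditions. The name `heat_steps` and the parameter `r` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advances the temperature profile `u` by `num_steps` explicit time steps:
//   u_new[i] = u[i] + r * (u[i+1] - 2*u[i] + u[i-1])
// where r = alpha * dt / (dx * dx). The two end points are held fixed.
// The explicit scheme is stable only for r <= 0.5.
std::vector<double> heat_steps(std::vector<double> u, double r, int num_steps) {
    assert(r > 0.0 && r <= 0.5);
    assert(u.size() >= 3);
    std::vector<double> next(u);  // copy so both buffers hold the boundary values
    for (int step = 0; step < num_steps; ++step) {
        // Every interior point depends only on the previous time level,
        // so this loop could run as one parallel kernel per time step.
        for (std::size_t i = 1; i + 1 < u.size(); ++i)
            next[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1]);
        u.swap(next);
    }
    return u;
}
```

Starting from `{0, 0, 1, 0, 0}` with `r = 0.25`, one step spreads the peak to `{0, 0.25, 0.5, 0.25, 0}`. A linear profile is a steady state and stays unchanged.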

View code on GitHub*

#### ISO2DFD Sample

The ISO2DFD sample refers to Two-Dimensional Finite-Difference Wave Propagation in Isotropic Media. It is a two-dimensional stencil that simulates a wave propagating in a 2D isotropic medium and illustrates the basics of the DPC++ programming language using direct programming.

View code on GitHub*

#### Water Molecule Diffusion Sample

This code sample implements a simple Monte Carlo simulation of the diffusion of water molecules in tissue. This kind of computational experiment can be used to simulate the acquisition of a diffusion signal for diffusion MRI (dMRI).

View code on GitHub*

#### ISO3DFD Sample

The ISO3DFD sample refers to Three-Dimensional Finite-Difference Wave Propagation in Isotropic Media. It is a three-dimensional stencil to simulate a wave propagating in a 3D isotropic medium. It shows some of the more common challenges when targeting SYCL devices (GPU/CPU) in more complex applications.

View code on GitHub*

# Direct Programming/DPC++ FPGA

#### CRR Binomial Tree Model for Option Pricing

This sample implements the Cox-Ross-Rubinstein (CRR) binomial tree model, which is used in finance to price American exercise options, together with five Greeks (delta, gamma, theta, vega, and rho). The basic idea is to model all possible asset price paths using a binomial tree.
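The pricing part of the CRR model can be sketched in plain C++ as follows. This is a serial illustration for a non-dividend-paying asset and does not compute the Greeks. It is not the FPGA reference design. The function name `crr_american` and its parameter names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Prices an American call or put with a CRR binomial tree of `steps` levels.
double crr_american(double spot, double strike, double rate, double sigma,
                    double maturity, int steps, bool is_call) {
    assert(steps > 0);
    const double dt = maturity / steps;
    const double up = std::exp(sigma * std::sqrt(dt));  // up factor; down = 1 / up
    const double growth = std::exp(rate * dt);
    const double p = (growth - 1.0 / up) / (up - 1.0 / up);  // risk-neutral up probability
    const double discount = 1.0 / growth;

    // Immediate exercise value at level i after j up-moves (and i - j down-moves).
    auto exercise = [&](int i, int j) {
        const double price = spot * std::pow(up, 2 * j - i);
        return std::max(is_call ? price - strike : strike - price, 0.0);
    };

    // Option values at maturity.
    std::vector<double> value(static_cast<std::size_t>(steps) + 1);
    for (int j = 0; j <= steps; ++j) value[j] = exercise(steps, j);

    // Backward induction: at each node take the better of holding
    // (discounted expected value) and exercising immediately.
    for (int i = steps - 1; i >= 0; --i)
        for (int j = 0; j <= i; ++j) {
            const double hold = discount * (p * value[j + 1] + (1.0 - p) * value[j]);
            value[j] = std::max(hold, exercise(i, j));
        }
    return value[0];
}
```

For example, `crr_american(100, 100, 0.05, 0.2, 1.0, 500, true)` should land close to the Black-Scholes value of about 10.45, because early exercise of a call on a non-dividend-paying asset is never optimal. Greeks could be approximated by repricing with slightly perturbed inputs, which this sketch leaves out.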

View code on GitHub*

#### GZIP Compression

This DPC++ reference design implements a compression algorithm. The implementation is optimized for the FPGA device.

View code on GitHub*

#### MVDR Beamforming

This reference design demonstrates IO streaming in DPC++ on an FPGA for a large system.

View code on GitHub*

#### QR Decomposition of Matrices

This DPC++ reference design demonstrates high-performance QR decomposition of complex matrices on an FPGA.

View code on GitHub*

# Tools

#### Matrix Multiply Advisor

This sample contains multiple implementations of matrix multiplication, written in DPC++ for CPUs and GPUs.

View code on GitHub*