Home Directory
Your files in the Intel Tiber Developer Cloud for oneAPI are stored in your home directory. The path to the home directory is /home/<user>. This directory is shared via the NFS protocol between the login node, the compute nodes, and the JupyterLab* servers. In other words, the files that you put in your home directory do not live only on the host where you created them: they are propagated to a centralized storage server and are accessible from every other host available to you in the Intel Tiber Developer Cloud for oneAPI. As a consequence,
- The paths to the files in your home directory are the same on all hosts, including:
- the login node,
- compute nodes where your jobs are executed, and
- hosts where you run a JupyterLab* server;
- Any changes made to files in your home directory are automatically propagated to all other hosts;
- The data written by your jobs to the home directory persists after the job has ended;
- File input/output operations on files in your home directory go through the 10 GbE network used by the NFS protocol. This may limit the rate at which you can access the files compared to local drive access, which is especially noticeable when your applications write to your files concurrently from multiple hosts or when other users are consuming the network bandwidth.
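You can confirm that your home directory is indeed NFS-mounted by checking its filesystem type; the exact device names and sizes in the output will vary:

df -Th ${HOME}  # the "Type" column should report nfs or nfs4 for the home directory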
In situations where you need faster file I/O than NFS allows, or more consistent file I/O performance, you can use local file storage as described in the next section.
Local Storage for Jobs
When you submit a job to a compute node, the resource manager will create an empty scratch directory for you. The path to this directory is stored in the environment variable PBS_SCRATCHDIR.
- Unlike your home directory, the scratch directory is not NFS-shared with all compute nodes and exists only on the compute node that executes the job.
- Any data you put in the scratch directory will be deleted when the job is finished.
- The scratch directory is stored on the compute node's local drive (usually an SSD) and is backed by the operating system's disk cache. This means that file I/O on the scratch directory is usually faster and more reproducible than file I/O on your home directory, as illustrated below.
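As a rough illustration of the difference, a job script can time a large write to each location with dd. This is only a sketch: the 1 GiB size and 1 MiB block size are arbitrary choices, and the measured rates will depend on the node and on the current network load. dd prints the achieved throughput when it finishes.

cd ${PBS_O_WORKDIR}
# Write 1 GiB to the home directory; this traffic goes over NFS
dd if=/dev/zero of=./nfs_write_test.bin bs=1M count=1024 conv=fsync
rm -f ./nfs_write_test.bin
# Write 1 GiB to the scratch directory on the node's local drive
dd if=/dev/zero of=${PBS_SCRATCHDIR}/local_write_test.bin bs=1M count=1024 conv=fsync
# No cleanup is needed here: the scratch directory is deleted when the job ends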
The code below is a snippet of a job script that illustrates how you could use the scratch directory for an application that is sensitive to file read performance.
cd ${PBS_O_WORKDIR} # the directory inside of /home/<user> from which the job was submitted
cp -R ./my_dataset ${PBS_SCRATCHDIR}/ # copy the dataset from the home directory to the local scratch directory
# Launching the application. In this example, it will read files from the local directory
# but write the results to the home directory. This is useful, e.g., in a DNN training application
# that needs to read the same files quickly multiple times but write the results only
# at the end of the job.
./my_app --path-to-input-dataset=${PBS_SCRATCHDIR}/my_dataset --output-path=${PBS_O_WORKDIR}
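Assuming the snippet above is saved as a job script (the file name my_job.sh below is just a placeholder), it can be submitted from the login node with the standard PBS commands:

qsub my_job.sh  # submit the job script to the resource manager
qstat           # check the status of your queued and running jobs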
The code below illustrates the usage of the scratch directory for an application that is sensitive to file write performance.
cd ${PBS_O_WORKDIR} # the directory inside of /home/<user> from which the job was submitted
# Preparing an output directory
mkdir ${PBS_SCRATCHDIR}/output
# Launching the application. In this example, it will read files from the home directory
# but write the results to the local directory. This is useful, e.g., in an HPC application
# that needs to read only a small amount of data in configuration files but output
# a lot of data for every iteration of the calculation. To prevent file I/O from impacting
# the performance measurement, we will write to the fast local storage.
./my_app --path-to-input-dataset=./my_dataset --output-path=${PBS_SCRATCHDIR}/output
# Saving the output data to the home directory so that it persists after the job
cp -R ${PBS_SCRATCHDIR}/output ./ # copying the output from the local directory to the home directory
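One caveat with this pattern: if my_app fails partway through, the final cp never runs, and the partial output in the scratch directory is lost when the job ends. If partial results are worth keeping, a defensive variant (a sketch, not something the system requires) copies the output back on exit with a shell trap:

cd ${PBS_O_WORKDIR}
mkdir ${PBS_SCRATCHDIR}/output
# Copy whatever output exists back to the home directory when the script exits,
# whether the application succeeded or failed
trap 'cp -R ${PBS_SCRATCHDIR}/output ./' EXIT
./my_app --path-to-input-dataset=./my_dataset --output-path=${PBS_SCRATCHDIR}/output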