MATLAB
MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numeric computation.
License
MATLAB is proprietary software.
Available
Puhti
Puhti has MATLAB installations for interactive use and batch jobs. The interactive MATLAB is intended for temporary, light pre- and postprocessing of data. It is available as follows:
- License: Academic
- Versions: R2023b
- Toolboxes: MATLAB Compiler, MATLAB Compiler SDK, Parallel Computing Toolbox. There are 2 licenses for each toolbox.
MATLAB Parallel Server (MPS) allows sending work as a batch job from a local MATLAB installation to Puhti. It is available as follows:
- License: Academic
- Versions: R2023b, R2023a
- Toolboxes: MATLAB Parallel Server. The license allows the use of up to 500 computing cores simultaneously. Furthermore, toolboxes that are licensed in your local MATLAB installation can also be used with MATLAB Parallel Server.
The academic license allows use only by affiliates, that is, staff and students, of Finnish higher education institutions. If you are a user from a commercial company or a Finnish research institute, please contact the CSC Service Desk for further instructions.
LUMI
LUMI has a MATLAB installation for interactive use. It is available as follows:
- License: Academic
- Versions: R2023b
- Toolboxes: Simulink, Control System Toolbox, Curve Fitting Toolbox, Deep Learning Toolbox, Global Optimization Toolbox, Image Processing Toolbox, Optimization Toolbox, Parallel Computing Toolbox, Signal Processing Toolbox, Statistics and Machine Learning Toolbox, Wavelet Toolbox. There are 25 licenses of each toolbox.
The academic license allows use only for teaching and academic research at a degree-granting institute.
Using interactive MATLAB on Puhti and LUMI
Command-line interface
We can run an interactive MATLAB session on the command line. We first need to make a reservation using Slurm:
srun --account=project_id --partition=small --time=0:15:00 --cpus-per-task=1 --mem-per-cpu=4g --pty bash
Then, we need to load the MATLAB module:
module load matlab
On LUMI, we must add the module files under CSC's local directory to the module path before loading the module:
module use /appl/local/csc/modulefiles
module load matlab
Now the matlab, mbuild, mex, and mcc commands are available.
For example, we can open the MATLAB command line interface as follows:
matlab -nodisplay
We can also run MATLAB scripts using the batch mode as follows:
matlab -batch <script>
Web interface
We can also use the web interface for interactive MATLAB sessions. First, we need to log in to www.puhti.csc.fi or www.lumi.csc.fi. Then, we have two options:
- We can use the MATLAB web application, which opens a web version of the MATLAB graphical user interface.
- We can use the Desktop application and click the MATLAB icon to open the desktop version of the MATLAB graphical user interface.
In the LUMI Desktop application, MATLAB can be found via the menu button in the bottom left corner. Search for matlab and click the icon, or drag it to the desktop to find it easily later.
We need to reserve at least 4 GB of memory before launching the MATLAB application.
Parallel computing on MATLAB
In MATLAB, we can parallelize code using the high-level constructs from the Parallel Computing Toolbox.
Consider the following serial code, written in the funcSerial.m file, that pauses for one second n times and measures the execution time:
function t = funcSerial(n)
    t0 = tic;
    for idx = 1:n
        pause(1);
    end
    t = toc(t0);
end
The following serial execution should run for around two seconds:
funcSerial(2)
We can parallelize the function using the parallel for-loop construct, parfor, written into the funcParallel.m file as follows:
function t = funcParallel(n)
    t0 = tic;
    parfor idx = 1:n
        pause(1);
    end
    t = toc(t0);
end
To run parallel code, we first need to create a parallel pool using processes or threads. For example, we can create a pool of two processes and run the parallel function with the same argument as the serial one. This time it should take only around one second:
pool = parpool('Processes', 2);
funcParallel(2)
delete(pool);
The same works with a thread-based parallel pool:
pool = parpool('Threads', 2);
funcParallel(2)
delete(pool);
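Besides timing, parfor loops usually need to return results. The following is a minimal sketch, not part of the original examples, that shows the two common ways of getting data out of a parfor loop: a sliced output variable and a reduction variable. The variable names (squares, total) and the loop length are illustrative assumptions. If no pool is open, MATLAB may start one automatically depending on your preferences, and without the Parallel Computing Toolbox the loop runs serially.

```matlab
n = 8;                     % illustrative loop length
squares = zeros(1, n);     % sliced output: each iteration writes its own element
total = 0;                 % reduction variable: accumulated across iterations

parfor idx = 1:n
    squares(idx) = idx^2;      % independent write, safe in parfor
    total = total + idx^2;     % reduction, combined by MATLAB across workers
end

disp(squares);
disp(total);
```

Iterations of a parfor loop must be independent of each other, which is why only sliced writes and reductions of this kind are allowed for variables that outlive the loop.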
With MATLAB Parallel Server, we can also create parallel pools on Puhti and run parallel code there.
Using MATLAB Parallel Server on Puhti
Configuring MPS on local MATLAB
Puhti's MATLAB Parallel Server (MPS) allows users to send batch jobs from a local MATLAB session to the Puhti cluster. Using Puhti MPS requires access to the Puhti cluster and a local MATLAB installation with a supported MATLAB version and the Parallel Computing Toolbox. We can configure MPS on a local computer using the following instructions.
- Log in to Puhti with an SSH client and log out again to ensure that you have a home directory.
- Download the MPS configuration scripts for Puhti.
- Unzip the downloaded archive into a chosen directory. On Linux and macOS, MATLAB stores local configurations in the ~/.matlab directory. We can place the files there as follows:
  mkdir -p ~/.matlab
  unzip ~/Downloads/mps_puhti.zip -d ~/.matlab
  On Windows, we can use the %AppData%\Mathworks\MATLAB directory to store the configurations.
- Add the directory to the MATLAB path using the addpath and savepath functions in MATLAB as follows:
  addpath("~/.matlab/mps_puhti")
  savepath
- Configure your MATLAB to submit jobs to Puhti by calling configCluster in MATLAB and supplying your username at the prompt as follows:
  configCluster
  % Username on Puhti (e.g. jdoe): >>username
Submitting serial jobs
Before submitting the batch job, we have to specify the resource reservation using parcluster.
Because parcluster is stateful, it is safest to explicitly unset the properties we don't use by setting them to the empty string ''.
Furthermore, CPUsPerNode is set automatically by the batch command, so we unset it.
For example, a simple CPU reservation looks as follows:
c = parcluster;
c.AdditionalProperties.ComputingProject = 'project_<id>';
c.AdditionalProperties.Partition = 'small';
c.AdditionalProperties.WallTime = '00:15:00';
c.AdditionalProperties.CPUsPerNode = '';
c.AdditionalProperties.MemPerCPU = '4g';
c.AdditionalProperties.GpuCard = '';
c.AdditionalProperties.GPUsPerNode = '';
c.AdditionalProperties.EmailAddress = '';
Now, we can use the batch function to submit a job to Puhti.
It returns a job object which we can use to access the output of the submitted job.
The first time you submit a job, MATLAB will ask whether to use a password or an SSH key for authentication.
- If you choose to use a password, MATLAB will ask for your Puhti password.
- If you choose to use an SSH key, MATLAB will ask for the path to your private key and whether the key requires a password. MATLAB stores the path to your key and will not ask for it later.
We can submit a simple test job that returns the current working directory as follows:
j = batch(c, @pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false)
In the example, we set the working directory to the home directory by setting 'CurrentFolder' to '.'.
Also, we should prevent MATLAB from adding the local MATLAB search path to the remote workers by setting 'AutoAddClientPath' to false.
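Because batch returns immediately, a script usually has to wait for the job before reading its result. The following is a minimal sketch of that pattern, assuming that the default cluster profile is the Puhti profile created by configCluster and that the AdditionalProperties have already been set as shown above. The error handling is an illustrative choice, not part of the original instructions.

```matlab
% Assumption: the default profile points to Puhti and the reservation
% properties (project, partition, wall time, memory) are already set.
c = parcluster;
j = batch(c, @pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false);

wait(j);   % block until the job has finished or failed

if strcmp(j.State, 'finished') && isempty(j.Tasks(1).ErrorMessage)
    out = fetchOutputs(j);   % cell array with one cell per requested output
    disp(out{1});            % the working directory reported by the job
else
    % Show whatever error message the task recorded, if any
    disp(j.Tasks(1).ErrorMessage);
end
```

For longer jobs, it is often more practical to skip wait, close MATLAB, and fetch the outputs later from a new session.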
Submitting parallel jobs
Let's create a reservation:
c = parcluster;
c.AdditionalProperties.ComputingProject = 'project_<id>';
c.AdditionalProperties.Partition = 'small';
c.AdditionalProperties.WallTime = '00:15:00';
c.AdditionalProperties.CPUsPerNode = '';
c.AdditionalProperties.MemPerCPU = '4g';
c.AdditionalProperties.GpuCard = '';
c.AdditionalProperties.GPUsPerNode = '';
c.AdditionalProperties.EmailAddress = '';
Now, we can use the batch command to create a parallel pool of workers by setting the 'Pool' argument to the number of cores we want to reserve.
For example, we can submit a parallel job to eight cores as follows:
j = batch(c, @funcParallel, 1, {8}, 'Pool', 8, 'CurrentFolder', '.', 'AutoAddClientPath', false)
Note that the parallel pool will always request one additional CPU core to manage the batch job and pool of cores. For example, a job that needs eight cores will consume nine CPU cores.
Submitting serial GPU jobs
We can create a GPU reservation by setting the appropriate values for the Partition, GpuCard, and GPUsPerNode properties.
For example, a single GPU reservation looks as follows:
c = parcluster;
c.AdditionalProperties.ComputingProject = 'project_<id>';
c.AdditionalProperties.Partition = 'gpu';
c.AdditionalProperties.WallTime = '00:15:00';
c.AdditionalProperties.CPUsPerNode = 1;
c.AdditionalProperties.MemPerCPU = '4g';
c.AdditionalProperties.GpuCard = 'v100';
c.AdditionalProperties.GPUsPerNode = 1;
c.AdditionalProperties.EmailAddress = '';
Now, we can submit a simple GPU job that queries the available GPU device as follows:
j = batch(c, @gpuDevice, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false)
Querying jobs and output
To retrieve a list of currently running or completed jobs, use:
c = parcluster;
c.Jobs
To get a handle to the job with sequence number 1, use:
j = c.Jobs(1);
With a handle to the cluster, we can also call the findJob method to search for a job by its job ID. The example below uses ID = 11:
j = findJob(c, 'ID', 11);
Once the job has been completed, we can fetch the function outputs as follows:
fetchOutputs(j)
Data that has been written to files on the cluster needs to be retrieved directly from the file system.
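When many jobs have accumulated, it can help to group them by state before fetching outputs. The following is a minimal sketch that uses the four-output form of findJob, assuming that the default cluster profile is the one the jobs were submitted with. The printed format is an illustrative choice.

```matlab
% Assumption: the default profile is the one used for submitting the jobs.
c = parcluster;

% findJob can return the jobs grouped by their current state
[pending, queued, running, completed] = findJob(c);
fprintf('%d pending, %d queued, %d running, %d completed\n', ...
    numel(pending), numel(queued), numel(running), numel(completed));

% List the ID and state of each completed job
for k = 1:numel(completed)
    fprintf('Job %d: %s\n', completed(k).ID, completed(k).State);
end
```

The IDs printed here can be passed to findJob(c, 'ID', ...) to get a handle for fetchOutputs.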