Getting started with Mahti
This is a quick start guide for Mahti users. It is assumed that you have previously used CSC supercomputing resources such as Puhti, Sisu or Taito. If not, you can start by looking at the overview of CSC supercomputers.
Go to MyCSC to apply for access to Mahti, or to view your projects and their project numbers if you already have access. You can also use the command `csc-projects`.
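If you are already logged in on a CSC supercomputer, a minimal way to use `csc-projects` is shown below (the output, which lists your projects and their numbers, is omitted here):

```bash
# Print your CSC projects and their project numbers
csc-projects
```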
Connecting to Mahti
Connect using a normal SSH client:

ssh yourcscusername@mahti.csc.fi

where `yourcscusername` is the username you get from CSC.
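If you connect often, you can optionally define a host alias in your local SSH configuration; the sketch below uses standard OpenSSH configuration and is not a CSC-specific requirement:

```bash
# Run on your own machine: append a host alias to your SSH configuration
cat >> ~/.ssh/config << 'EOF'
Host mahti
    HostName mahti.csc.fi
    User yourcscusername
EOF

# After this, the shorter command works:
ssh mahti
```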
Module system
Modules are set up in a hierarchical fashion, meaning you need to load a compiler before MPI and other libraries appear. CSC uses the Lmod module system. See more information about modules.
The default modules, which are loaded automatically, are `gcc/11.2.0`, `openmpi/4.1.2` and `openblas/0.3.18-omp`.
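For example, the basic Lmod commands below can be used to inspect and change the loaded modules; the module names are the defaults listed above:

```bash
module list                            # show the currently loaded modules
module avail                           # list modules available with the current compiler
module spider openmpi                  # search the whole module hierarchy
module load gcc/11.2.0 openmpi/4.1.2   # load a compiler and the matching MPI
```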
Compilers and MPI
Currently, Mahti has GNU compiler suites (versions 11.2.0, 9.4.0 and 8.5.0), as well as AMD compiler suites. All compiler suites can be used with the `mpicc` (C), `mpicxx` (C++) or `mpif90` (Fortran) wrappers. We recommend starting with the GNU compiler suite, but for some applications the other suites may provide better performance.
In Mahti, many applications benefit from hybrid MPI/OpenMP parallelization, so it is recommended to build a hybrid version if it is supported by your application.
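As a sketch, a hybrid MPI/OpenMP C code could be built with the GNU suite roughly as follows; the source file name is hypothetical:

```bash
# Compiler and MPI (the default modules on Mahti)
module load gcc/11.2.0 openmpi/4.1.2

# -fopenmp enables OpenMP with the GNU compilers
mpicc -O2 -fopenmp hybrid_example.c -o hybrid_example
```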
See more information about compilers.
Note
You need to have the MPI module loaded when submitting your jobs.
High performance libraries
Mahti has several high performance libraries installed, see more information about libraries.
Applications
More information about specific applications can be found here. Note that the pre-installed selection is not as large as on Puhti.
Running jobs
Like Puhti, Mahti uses the Slurm batch job system. A description of the different Slurm partitions can be found here.
Instructions on how to submit jobs can be found here and example batch job scripts are found here.
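As a minimal sketch, a pure MPI batch script on Mahti could look like the one below; the job name, partition and program name are illustrative, and the project number is the same example number used later on this page, so check the linked partition and batch job pages for the values that apply to you:

```bash
#!/bin/bash
#SBATCH --job-name=mpi_example     # illustrative job name
#SBATCH --account=project_2002291  # your CSC project number
#SBATCH --partition=medium         # illustrative partition, see the partitions page
#SBATCH --time=00:30:00
#SBATCH --nodes=2                  # Mahti allocates full nodes
#SBATCH --ntasks-per-node=128      # 128 cores per Mahti node

# Load the same compiler and MPI modules used to build the program
module load gcc/11.2.0 openmpi/4.1.2

srun ./mpi_example
```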
Performance considerations
In Mahti, many applications benefit from hybrid MPI/OpenMP parallelization, but the optimal ratio of MPI tasks to OpenMP threads depends strongly on the particular application as well as on the particular input dataset. Mahti also supports simultaneous multithreading (SMT), i.e. two threads can be run on the same physical CPU core. The benefits of SMT likewise depend on the application: in some cases it improves performance, while in others performance becomes worse. Binding of threads to CPU cores can also have an impact on performance.
More information about controlling hybrid applications can be found here.
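For instance, the 128 cores of a Mahti node could be split between MPI tasks and OpenMP threads as in the sketch below; the 32 x 4 split is only an example, since the optimal ratio depends on the application and input:

```bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32   # MPI tasks per node
#SBATCH --cpus-per-task=4      # OpenMP threads per task (32 x 4 = 128 cores)

# Pass the thread count to OpenMP and bind threads to cores
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export OMP_PLACES=cores

srun ./hybrid_example
```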
Storage
The project-based shared storage can be found under `/scratch/<project>`. Note that this folder is shared by all users in a project. It is not meant for long-term data storage, and files that have not been used for 180 days will be automatically removed. Note that this policy is currently implemented only on Puhti.
The default quota for this folder is 1 TB. There is also a persistent project-based storage area with a default quota of 50 GB, located under `/projappl/<project>`. Each user can store up to 10 GB of data in their home directory (`$HOME`).
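In practice the three areas look like this on the command line; the project number is the same illustrative one used in the `rsync` example below:

```bash
cd /scratch/project_2002291    # shared scratch, 1 TB default quota, not for long-term storage
cd /projappl/project_2002291   # persistent project storage, 50 GB default quota
cd $HOME                       # personal home directory, 10 GB per user
```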
The disk areas for different supercomputers are separate, i.e. home, projappl and scratch in Puhti cannot be directly accessed from Mahti.
You can check your current disk usage with `csc-workspaces`. More detailed information about storage can be found here, and a guideline on managing data on the `/scratch` disk here. See also the LUE tool for efficiently querying how much data and how many files you have in a directory.
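For example, on a login node you could check the usage as follows; the LUE invocation is an assumption, so check its documentation for the exact usage:

```bash
csc-workspaces                 # summary of your disk areas and quota usage
lue /scratch/project_2002291   # assumed usage: report data volume and file counts
```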
Moving data between Mahti and Puhti
Data can be moved between supercomputers via Allas by first uploading the data from one supercomputer and then downloading it to the other. This is the recommended approach if the data should also be preserved for a longer time.
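A rough sketch of this workflow with the Allas command-line tools is shown below; the exact object name created by `a-put` can differ, so list the objects with `a-list` before downloading, and note that the project number is the illustrative one used elsewhere on this page:

```bash
# On Puhti: connect to Allas and upload the data
module load allas
allas-conf project_2002291
a-put my_results

# On Mahti: connect to the same project, check the object name and download
module load allas
allas-conf project_2002291
a-list
a-get <object_name_from_a-list>
```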
Data can also be moved directly between the supercomputers with the `rsync` command. For example, to copy `my_results` (which can be either a file or a directory) from Puhti to the directory `/scratch/project_2002291` on Mahti, issue the following command on Puhti:

rsync -azP my_results <username>@mahti.csc.fi:/scratch/project_2002291

See Using rsync for more detailed instructions.
How do Mahti and Puhti differ?
If you are new to supercomputers, or the details below are unfamiliar, you should likely start with Puhti and some introductory tutorials first. In a nutshell, Mahti is meant for large parallel jobs, while Puhti serves a wide variety of small to medium-sized jobs, including those needing special resources.
| Resource | Mahti | Puhti |
|---|---|---|
| Resources are granted | By full nodes | By finer detail (cores/memory/...) |
| Minimum job size | 128 cores (1 node) | 1 core (1/40 node) |
| Maximum job size (cores) | 200 nodes* (25600) | 26 nodes (1040) |
| Memory per node (average per core) | 245 GB (2 GB) | 192 - 1500 GB (4 - 37 GB) |
| GPUs | NVIDIA A100 | NVIDIA V100 |
| Fast local disk | Only on GPU nodes | Yes (NVMe) |
| Preinstalled applications | ~30 | ~120 |
*and even more via Grand Challenge calls.