Technical details about Puhti

Compute

Puhti has a total of 682 CPU nodes, with a theoretical peak performance of 1,8 petaflops. Each node is equipped with two Intel Xeon processors, code name Cascade Lake, with 20 cores each running at 2,1 GHz. The cores support AVX-512 vector instructions and VNNI instructions for AI inference workloads. The interconnect is based on Mellanox HDR InfiniBand. The nodes are connected with a 100 Gbps HDR100 link, and the topology is a fat tree with a blocking factor of approximately 2:1.

The Puhti AI artificial intelligence partition has a total of 80 GPU nodes with a total peak performance of 2,7 petaflops. Each node has two latest generation Intel Xeon processors, code name Cascade Lake, with 20 cores each running at 2,1 GHz. They also have four Nvidia Volta V100 GPUs with 32 GB of memory each. The nodes are equipped with 384 GB of main memory and 3,6 TB of fast local storage. This partition is engineered to allow GPU-intensive workloads to scale well across multiple nodes. The interconnect is based on a dual-rail HDR100 interconnect network connectivity providing 200 Gbps of aggregate bandwidth in a non-blocking fat-tree topology.

Nodes

Name	Number of nodes	Compute	Cores	Memory	Local disk
M	484	Xeon Gold 6230	2 x 20 cores @ 2,1 GHz	192 GiB
M-IO	48	Xeon Gold 6230	2 x 20 cores @ 2,1 GHz	192 GiB	1490 GiB
L	92	Xeon Gold 6230	2 x 20 cores @ 2,1 GHz	384 GiB
L-IO	40	Xeon Gold 6230	2 x 20 cores @ 2,1 GHz	384 GiB	3600 GiB
XL	12	Xeon Gold 6230	2 x 20 cores @ 2,1 GHz	768 GiB	1490 GiB
BM-IO	6	Xeon Gold 6230	2 x 20 cores @ 2,1 GHz	1,5 TiB	5960 GiB
GPU	80	Xeon Gold 6230 Nvidia V100	2 x 20 cores @ 2,1 GHz 4 GPUs connected with NVLink	384 GiB 4 x 32 GB	3600 GiB

In addition to the compute nodes above, Puhti has two login nodes with 40 cores and 2900 GiB local disk each.

Storage

Puhti has a 4.8 PB Lustre parallel storage system providing space for home, project and scratch storages.

Current Lustre configuration for Puhti is:

Storage area	# OSTs	# MDTs
home	24	4
projappl	24	4
scratch	24	4

Please see Lustre documentation for the terminology.

All the OSTs and MDTs are shared across the storage areas, thus the performance should be similar between them. The peak I/O performance in Puhti is around ~50 GB/s with 64 compute nodes under dedicated access to the system.