ICM computer systems

The table below lists the essential characteristics of the ICM computer systems. After logging in to the ICM access node (hpc.icm.edu.pl), you can ssh further to the other supercomputers and clusters (e.g. Okeanos or Rysy).

Info

Jobs for the Topola cluster can be submitted directly from the access node, hpc.icm.edu.pl. To submit jobs to other systems, an additional ssh login step is required (e.g. ssh okeanos or ssh rysy).
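
The two-step login described above can be sketched as follows. The username `jdoe` is a placeholder and the helper name `icm_login_cmd` is hypothetical; it only prints the command so it can be checked before being run. The `-t` flag asks ssh to allocate a terminal for the second hop.

```shell
#!/bin/sh
# Print (do not run) the command for logging in through the ICM access node.
# Usage: icm_login_cmd USER [SYSTEM]
# Omit SYSTEM to stay on the access node (enough for Topola jobs).
icm_login_cmd() {
    user=$1
    system=${2:-}
    if [ -z "$system" ]; then
        printf 'ssh %s@hpc.icm.edu.pl\n' "$user"
    else
        printf 'ssh -t %s@hpc.icm.edu.pl ssh %s\n' "$user" "$system"
    fi
}

icm_login_cmd jdoe okeanos   # prints: ssh -t jdoe@hpc.icm.edu.pl ssh okeanos
```

Running the two ssh commands one after another by hand achieves the same result.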

Name	Type	Architecture	No of compute nodes	Node parameters
Okeanos	Supercomputer	Intel Haswell Cray XC40	1084	24 cores, 128 GB RAM
Topola	HPC cluster, PL-Grid cluster	Intel Haswell Huawei E9000	223	28 cores, 64/128 GB RAM
Rysy/GPU	GPU cluster	Intel Skylake, NVIDIA Volta	6	36 cores, 380 GB RAM, 4x GPU V100 32GB
Rysy/GPU	GPU cluster	Intel Skylake, NVIDIA Volta	1	48 cores, 1500 GB RAM, 8x GPU V100 16GB
Rysy/GPU	GPU cluster	Intel Haswell, NVIDIA Titan X (Pascal)	1	24 cores, 760 GB RAM, 8x GPU Titan X (Pascal) 12GB
Rysy/PBaran	Vector computer, NEC Aurora A300-8	Intel Skylake, NEC SX-Aurora Tsubasa	1	24 cores, 192 GB RAM / 8 x 8 cores, 8 x 48 GB RAM

Okeanos supercomputer

Since July 2016, ICM UW has provided the Okeanos supercomputer, a Cray XC40 large-scale processing system. Okeanos has more than 1000 compute nodes, each with two 12-core Intel Xeon Haswell CPUs and 128 GB of RAM. All compute nodes are interconnected by a Cray Aries network with Dragonfly topology.

To meet ICM's technological requirements, Cray Inc. built an HPC system that is unique in the country. High computing power, scalability, and highly efficient graph data processing are among its features. Okeanos is dedicated to all kinds of large-scale computational tasks that require many thousands of CPU cores and dozens of terabytes of RAM in a single run.

Together with the analytics systems and data storage solutions, Okeanos is the main component of the OCEAN Competence Centre established in the new ICM Data Centre in Warsaw (Białołęka).

Okeanos supercomputer details

Installation:                 Cray XC40
Name:                         Okeanos
CPU:                          Intel Xeon E5-2690 v3
Architecture:                 x86_64
Data representation:          little-endian
CPU frequency:                2.6 GHz
No of CPUs per node:          2 x 12 cores (Hyperthreading x2)
Sockets - Cores - Threads:    2-12-2
RAM per node:                 128 GB
Filesystem:                   Lustre (distributed filesystem)
Operating system:             SUSE Linux Enterprise Server 15
Scheduling system:            slurm 19.05.4

Topola cluster

Topola cluster details

Installation:                 Cluster
Name:                         Topola
CPU:                          Intel(R) Xeon(R) CPU E5-2650 v3
Architecture:                 x86_64
Data representation:          little-endian
CPU frequency:                2.0 - 3.1GHz
No of CPUs per node:          28 cores
Sockets - Cores - Threads:    2-14-1
RAM per node:                 64/128 GB
Filesystem:                   NFS/lustre/ext4
Operating system:             CentOS 7
Scheduling system:            slurm 20.11.8

Topola cluster nodes

CPU model	CPU frequency	Sockets: Cores: Threads	RAM	No of nodes	Name
Intel(R) Xeon(R) CPU E5-2697 v3	2.1GHz - 3.0GHz	2:14:1	128 GB	60	t1-[1-12], t[13-15]-[1-16]
Intel(R) Xeon(R) CPU E5-2697 v3	2.1GHz - 3.0GHz	2:14:1	64 GB	163	t1-[13-16], t[2-12]-[1-16]

Topola cluster nodes differ only in the amount of RAM available. The scheduling system assigns the node type according to the memory requirement submitted by the user.
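
As an illustration of how a memory requirement is passed to the scheduler, the sketch below writes a minimal Slurm job script that asks for more memory than a 64 GB node has. The file name, job name, and the 100G figure are illustrative assumptions, and any account or partition options your grant may require are not shown.

```shell
#!/bin/sh
# Write a minimal job script; --mem is the standard sbatch option
# for requesting memory per node.
cat > topola_highmem.sh <<'EOF'
#!/bin/bash -l
#SBATCH --job-name=highmem-test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=100G
#SBATCH --time=00:10:00

# Report which node the job landed on.
srun hostname
EOF

# On the access node, submit with:
#   sbatch topola_highmem.sh
```

With a request above 64 GB, the job should only fit on the 128 GB nodes listed in the table above.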

Rysy cluster

Rysy cluster details

Installation:                 Cluster
Name:                         Rysy
CPU:                          Intel(R) Xeon(R) Gold 6154/6252 CPU
Architecture:                 x86_64
Data representation:          little-endian
CPU frequency:                2.1/3.0 - 3.7GHz
No of CPUs per node:          36 cores
Sockets - Cores - Threads:    2-18-1
RAM per node:                 380/1500 GB
Filesystem:                   lustre/NVMe-oF
GPU:                          NVIDIA Tesla V100 16/32GB
Operating system:             CentOS 7
Scheduling system:            slurm 23.02.2

Rysy cluster nodes

CPU model	CPU frequency	Sockets: Cores: Threads	RAM	GPU	No of nodes	Name
Intel(R) Xeon(R) Gold 6154	3.0GHz - 3.7GHz	2:18:1	380 GB	4x NVIDIA Tesla V100 32GB	6	rysy-n[1-6]
Intel(R) Xeon(R) Gold 6252	2.1GHz - 3.7GHz	2:24:1	1500 GB	8x NVIDIA Tesla V100 16GB	1	rysy-n7
Intel(R) Xeon(R) E5-2670 v3	2.3GHz - 3.1GHz	2:12:1	760 GB	8x NVIDIA TITAN X (Pascal) 12GB	1	rysy-n9
Intel(R) Xeon(R) Gold 6126	2.6GHz - 3.7GHz	2:12:1	192 GB	8x NEC Vector Engine Type 10B 48GB	1	pbaran

deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla V100-PCIE-32GB"
  CUDA Driver Version / Runtime Version          11.5 / 11.4
  CUDA Capability Major/Minor version number:    7.0
  Total amount of global memory:                 32510 MBytes (34089730048 bytes)
  (080) Multiprocessors, (064) CUDA Cores/MP:    5120 CUDA Cores
  GPU Max Clock rate:                            1380 MHz (1.38 GHz)
  Memory Clock rate:                             877 Mhz
  Memory Bus Width:                              4096-bit
  L2 Cache Size:                                 6291456 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        98304 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 134 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.5, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS

NVMe-oF Disks

On the Rysy cluster, NVMe-oF disks are available on the GPU nodes. To use them, add the --gres=nvme:SIZE option to the job parameters. Slurm will then create a temporary directory /scratch/${SLURM_JOBID} with a quota of SIZE GB. The directory is available on the compute node and is deleted when the job completes. Up to 3500 GB can be allocated (jointly) on the nodes rysy-n[1-6] and up to 6100 GB on the node rysy-n7.
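
A job that uses the NVMe-oF scratch space might look like the sketch below. The file name and the 200 GB size are illustrative assumptions; GPU, partition, and account options are site-specific and deliberately left out. Only the --gres=nvme:SIZE option and the /scratch/${SLURM_JOBID} path come from the description above.

```shell
#!/bin/sh
# Write a minimal job script that requests 200 GB of NVMe-oF scratch.
cat > nvme_job.sh <<'EOF'
#!/bin/bash -l
#SBATCH --job-name=nvme-test
#SBATCH --nodes=1
#SBATCH --gres=nvme:200
#SBATCH --time=00:30:00

# Slurm creates this directory for the job and removes it afterwards.
WORKDIR="/scratch/${SLURM_JOBID}"
cd "$WORKDIR" || exit 1

# Show the space available in the scratch directory.
df -h .

# Copy any results to permanent storage here, before the job ends.
EOF

# After logging in to Rysy, submit with:
#   sbatch nvme_job.sh
```

Because the directory is deleted on completion, results must be copied out within the job itself.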

System information (hardware)

The following commands display system information:

scontrol show partition <partition-name>        # partition characteristics
scontrol show node <node-name>                # node characteristics

cat /etc/os-release     # operating system version
df -Th                  # filesystem information

lscpu                   # CPU architecture (note: the compute node architecture may differ from that of the access node)
sinfo -l -N             # number of nodes
sinfo -l -N | awk '{printf ("%1s %15s %15s %15s %10s %10s \n", $1, $2, $3, $5, $6, $7)}' # columns formatting
smap -i 2               # semi-graphical node-usage information
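
The sinfo output can also be condensed into totals. The sketch below is one way to do it, assuming the standard sinfo format fields %N (node name), %c (CPUs per node), and %m (memory in MB); the helper name `summarize_nodes` is hypothetical. It reads "name cpus memory" lines on standard input, so it can be tried without cluster access.

```shell
#!/bin/sh
# Sum node count, CPUs, and memory from lines of "name cpus mem_mb".
summarize_nodes() {
    awk '{ n++; cpus += $2; mem += $3 }
         END { printf "nodes=%d cpus=%d mem_gb=%.0f\n", n, cpus, mem / 1024 }'
}

# Sample input, so the helper can be checked off the cluster.
printf 't1-1 28 131072\nt1-2 28 65536\n' | summarize_nodes
# prints: nodes=2 cpus=56 mem_gb=192

# On the cluster (sort -u drops nodes listed once per partition):
#   sinfo -N -h -o "%N %c %m" | sort -u | summarize_nodes
```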