ICM computer systems
The table below contains the essential characteristics of the ICM computer systems. After logging into the ICM access node (hpc.icm.edu.pl), it is possible to ssh further to the other supercomputers and clusters (e.g. Okeanos or Rysy).
Info
Submitting jobs to the Topola cluster is possible directly from the access node, hpc.icm.edu.pl. To submit jobs to the other systems, an additional ssh login step is required (e.g. ssh okeanos or ssh rysy).
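For example, assuming a batch script named job.sl (a placeholder name, not taken from this page), the two submission paths look like this:

```bash
# On hpc.icm.edu.pl -- jobs for the Topola cluster are submitted directly:
sbatch job.sl

# For the other systems (e.g. Okeanos) -- log in to the target system first, then submit:
ssh okeanos
sbatch job.sl
```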
Name | Type | Architecture | No of compute nodes | Node parameters |
---|---|---|---|---|
Okeanos | Supercomputer | Intel Haswell Cray XC40 | 1084 | 24 cores, 128 GB RAM |
Topola | HPC cluster, PL-Grid cluster | Intel Haswell Huawei E9000 | 223 | 28 cores, 64/128 GB RAM |
Rysy/GPU | GPU cluster | Intel Skylake, NVIDIA Volta | 6 | 36 cores, 380 GB RAM, 4x GPU V100 32GB |
Rysy/GPU | GPU cluster | Intel Skylake, NVIDIA Volta | 1 | 48 cores, 1500 GB RAM, 8x GPU V100 16GB |
Rysy/GPU | GPU cluster | Intel Haswell, NVIDIA Titan X (Pascal) | 1 | 24 cores, 760 GB RAM, 8x GPU Titan X (Pascal) 12GB |
Rysy/PBaran | Vector computer, NEC Aurora A300-8 | Intel Skylake, NEC SX-Aurora Tsubasa | 1 | 24 cores, 192 GB RAM / 8 x 8 cores, 8 x 48 GB RAM |
Okeanos supercomputer
Since July 2016, ICM UW has operated the Okeanos supercomputer, a Cray XC40 large-scale processing system. Okeanos has more than 1000 compute nodes, each with two 12-core Intel Xeon Haswell CPUs and 128 GB of RAM. All the compute nodes are interconnected by the Cray Aries network with a Dragonfly topology.
In response to ICM's technological requirements, Cray Inc. built an HPC system unique in the country, offering high performance, scalability, and highly efficient graph data processing. Okeanos is dedicated to large-scale computational tasks that require many thousands of CPU cores and dozens of terabytes of RAM in a single run.
Together with the analytics systems and data storage solutions, Okeanos is the main component of the OCEAN Competence Centre established in the new ICM Data Centre in Warsaw (Białołęka).
Okeanos supercomputer details
Installation: Cray XC40
Name: Okeanos
CPU: Intel Xeon E5-2690 v3
Architecture: x86_64
Data representation: little-endian
CPU frequency: 2.6 GHz
No of CPUs per node: 2 x 12 cores (Hyper-Threading x2)
Sockets - Cores - Threads: 2-12-2
RAM per node: 128 GB
Filesystem: Lustre (distributed filesystem)
Operating system: SUSE Linux Enterprise Server 15
Scheduling system: slurm 19.05.4
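A minimal Okeanos batch script might look like the sketch below; the partition name, grant identifier, and application name are placeholders (assumptions, not taken from this page) and should be adjusted to your computational grant and environment:

```bash
#!/bin/bash -l
#SBATCH --job-name=okeanos-example
#SBATCH --nodes=2                  # two Cray XC40 compute nodes
#SBATCH --ntasks-per-node=24       # 24 physical cores per node
#SBATCH --time=01:00:00
#SBATCH --partition=okeanos        # assumed partition name
#SBATCH --account=<grant-id>       # replace with your computational grant ID

srun ./my_mpi_app                  # srun starts the MPI application on the allocated nodes
```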
Topola cluster
Topola cluster details
Installation: Cluster
Name: Topola
CPU: Intel(R) Xeon(R) CPU E5-2697 v3
Architecture: x86_64
Data representation: little-endian
CPU frequency: 2.1 - 3.0 GHz
No of CPUs per node: 28 cores
Sockets - Cores - Threads: 2-14-1
RAM per node: 64/128 GB
Filesystem: NFS/Lustre/ext4
Operating system: CentOS 7
Scheduling system: slurm 20.11.8
Topola cluster nodes
CPU model | CPU frequency | Sockets: Cores: Threads | RAM | No of nodes | Name |
---|---|---|---|---|---|
Intel(R) Xeon(R) CPU E5-2697 v3 | 2.1GHz - 3.0GHz | 2:14:1 | 128 GB | 60 | t1-[1-12], t[13-15]-[1-16] |
Intel(R) Xeon(R) CPU E5-2697 v3 | 2.1GHz - 3.0GHz | 2:14:1 | 64 GB | 163 | t1-[13-16], t[2-12]-[1-16] |
Topola cluster nodes differ only in the amount of available RAM. The scheduling system assigns the node type according to the memory requirement specified by the user.
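For example, a job that requests more than 64 GB of memory per node can only be scheduled on the 128 GB nodes. A minimal sketch (the application name is a placeholder):

```bash
#!/bin/bash -l
#SBATCH --job-name=topola-bigmem
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=28       # all 28 cores of a Topola node
#SBATCH --mem=120G                 # more than 64 GB, so only the 128 GB nodes are eligible
#SBATCH --time=00:30:00

srun ./my_app
```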
Rysy cluster
Rysy cluster details
Installation: Cluster
Name: Rysy
CPU: Intel(R) Xeon(R) Gold 6154/6252 CPU
Architecture: x86_64
Data representation: little-endian
CPU frequency: 2.1/3.0 - 3.7GHz
No of CPUs per node: 36 cores
Sockets - Cores - Threads: 2-18-1
RAM per node: 380/1500 GB
Filesystem: Lustre/NVMe-oF
GPU: NVIDIA Tesla V100 16/32GB
Operating system: CentOS 7
Scheduling system: slurm 23.02.2
Rysy cluster nodes
CPU model | CPU frequency | Sockets: Cores: Threads | RAM | GPU | No of nodes | Name |
---|---|---|---|---|---|---|
Intel(R) Xeon(R) Gold 6154 | 3.0GHz - 3.7GHz | 2:18:1 | 380 GB | 4x NVIDIA Tesla V100 32GB | 6 | rysy-n[1-6]
Intel(R) Xeon(R) Gold 6252 | 2.1GHz - 3.7GHz | 2:24:1 | 1500 GB | 8x NVIDIA Tesla V100 16GB | 1 | rysy-n7
Intel(R) Xeon(R) E5-2670 v3 | 2.3GHz - 3.1GHz | 2:12:1 | 760 GB | 8x NVIDIA TITAN X (Pascal) 12GB | 1 | rysy-n9 |
Intel(R) Xeon(R) Gold 6126 | 2.6GHz - 3.7GHz | 2:12:1 | 192 GB | 8x NEC Vector Engine Type 10B 48GB | 1 | pbaran |
Example output of the CUDA deviceQuery utility on a Rysy GPU node equipped with NVIDIA Tesla V100 32GB:
deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Tesla V100-PCIE-32GB"
CUDA Driver Version / Runtime Version 11.5 / 11.4
CUDA Capability Major/Minor version number: 7.0
Total amount of global memory: 32510 MBytes (34089730048 bytes)
(080) Multiprocessors, (064) CUDA Cores/MP: 5120 CUDA Cores
GPU Max Clock rate: 1380 MHz (1.38 GHz)
Memory Clock rate: 877 Mhz
Memory Bus Width: 4096-bit
L2 Cache Size: 6291456 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 98304 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 7 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 134 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.5, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS
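To obtain a GPU for such a run on Rysy, request it via Slurm generic resources. A minimal sketch is shown below; the partition name and the executable are placeholders (assumptions), not taken from this page:

```bash
#!/bin/bash -l
#SBATCH --job-name=rysy-gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1               # one NVIDIA V100 on a rysy-n[1-7] node
#SBATCH --time=00:15:00
#SBATCH --partition=gpu            # assumed partition name

srun ./deviceQuery                 # e.g. the CUDA sample whose output is shown above
```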
NVMe-oF Disks
On the Rysy cluster, NVMe-oF disks are available on the nodes with GPUs.
To use them, add the --gres=nvme:SIZE option to the job parameters.
Slurm will then create a temporary directory /scratch/${SLURM_JOBID} with a quota of SIZE GB.
The directory will be available on the compute node and will be deleted when the job completes.
In total, up to 3500 GB can be allocated on the nodes rysy-n[1-6] and up to 6100 GB on the node rysy-n7.
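A sketch of a job script using the NVMe-oF scratch space (the input/output file names and the application are placeholders):

```bash
#!/bin/bash -l
#SBATCH --job-name=nvme-example
#SBATCH --nodes=1
#SBATCH --gres=nvme:500                  # 500 GB quota in /scratch/${SLURM_JOBID}
#SBATCH --time=01:00:00

cd /scratch/${SLURM_JOBID}               # temporary directory created by Slurm for this job
cp ${SLURM_SUBMIT_DIR}/input.dat .       # stage input data onto the fast scratch space
srun ${SLURM_SUBMIT_DIR}/my_app input.dat
cp results.dat ${SLURM_SUBMIT_DIR}/      # copy results back before the directory is deleted
```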
System information (hardware)
The following commands can be used to display system information:
scontrol show partition <partition-name> # partition characteristics
scontrol show node <node-name> # node characteristics
cat /etc/os-release # operating system version
df -Th # filesystem information
lscpu # CPU architecture (note: the compute node architecture may differ from that of the access node)
sinfo -l -N # list of nodes
sinfo -l -N | awk '{printf ("%1s %15s %15s %15s %10s %10s \n", $1, $2, $3, $5, $6, $7)}' # columns formatting
smap -i 2 # semi-graphical node-usage information