(EN) Basic usage
Please note that a PDF version of the materials contained herein (including SOL) is also available.
This document aims to provide basic information on how to use the NEC SX-Aurora Tsubasa system available at ICM UW computational facility. The contents herein are based on a number of documents, as referenced in the text, to provide a concise quick start guide and suggest further reading material for the ICM users.
To use the Tsubasa installation, users must first access the login node
(hpc.icm.edu.pl) through SSH and then establish a further
connection to the Rysy cluster:
ssh firstname.lastname@example.org
ssh rysy
Alternatively, the -J command line option can be passed to the OpenSSH
client to specify a jump host (here the
hpc login node) through which
the connection to Rysy will be established (issue the
man ssh command for details).
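The jump-host variant described above can be sketched as a single command. The login name `username` is a hypothetical placeholder, and the sketch assumes the same account name is used on both hosts; the command is printed rather than executed so it can be inspected first.

```shell
# Hypothetical login name; replace it with your own ICM account.
icm_user="username"

# -J makes the hpc login node the jump host for the connection to Rysy.
jump_cmd="ssh -J ${icm_user}@hpc.icm.edu.pl ${icm_user}@rysy"

# Printed rather than executed here; run the printed command to connect.
echo "$jump_cmd"
```

With a jump host, the name `rysy` is resolved on the login node, so it does not need to be reachable directly from the user's own machine.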
The system runs Slurm Workload Manager for job scheduling and Environment Modules to manage software. The single compute node (PBaran) of the ve partition can be used interactively – as shown below – or as a batch job (see further in the text).
srun -A GRANT_ID -p ve --gres=ve:1 --pty bash -l
Once the interactive shell session has started, the environmental variable
$VE_NODE_NUMBER is automatically set to control which
VE card is to be used by the user programs. This variable can be read
and set manually with the echo and export commands,
respectively. The software used to operate the VEs – including
binaries, libraries, header files, etc. – is installed in
directory. Its effective use requires modification of the
environmental variables, such as
$LD_LIBRARY_PATH and
others, which can be done conveniently with the source command:
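As a rough illustration of the two steps described above (selecting a VE card and sourcing an environment script), the sketch below may help. The actual location of the site's environment script is not reproduced here, so a temporary stand-in file with a hypothetical library path is used instead; card number 1 is likewise only an example value.

```shell
# Show which VE card is currently selected ("unset" outside a VE session).
echo "VE_NODE_NUMBER=${VE_NODE_NUMBER:-unset}"

# Select VE card 1 for programs started from this shell (example value).
export VE_NODE_NUMBER=1

# Stand-in for the site's environment script: the real path is not given
# here, so a temporary file with a hypothetical library path is used.
env_script="$(mktemp)"
cat > "$env_script" <<'EOF'
export LD_LIBRARY_PATH="/hypothetical/ve/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
EOF

# '.' is the portable spelling of bash's 'source': the script runs in the
# current shell, so the variables it exports remain set afterwards.
. "$env_script"
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
rm -f "$env_script"
```

Running the script as a separate process instead of sourcing it would leave the calling shell's variables unchanged, which is why sourcing is used.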
Sourcing the variables makes various VE tools accessible within the
user environment. This includes the NEC compilers for C, C++, and
Fortran languages that can be invoked by
respectively, or by their respective MPI wrappers:
mpinfort. Please note that several compiler versions
are currently installed and it might be necessary to include a version
number in your command, e.g.
ncc-2.5.1. The general usage is
consistent with the GNU GCC:
<compiler> <options> <source file>. The
table below lists several standard options for the NEC compilers – see
documentation for details.
||create object file|
||output file name|
||include header files|
||enable syntax warnings|
||treat warnings as errors|
||use the profiler|
||enable execution analysis|
||provides traceback information|
||level of details for vector diagnostics|
The last four of them are used for performance analysis and allow for efficient software development. Some of these, apart from being used as command line options at compile time, also rely on dedicated environmental variables that need to be set at runtime. For a full list of performance-related options, variables, as well as their output description, see PROGINF/FTRACE User’s Guide and the compiler-specific documentation.
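To make the `<compiler> <options> <source file>` pattern concrete, here is a minimal sketch. It assumes the GCC-style flags `-c` (compile only) and `-o` (name the output), and `hello.c` is a hypothetical source file; when the compiler or the file is not available, the command is only printed.

```shell
compiler="ncc"            # a versioned name such as ncc-2.5.1 may be needed
options="-c -o hello.o"   # assumed GCC-style: compile only, name the output
source_file="hello.c"     # hypothetical source file

# Assemble the command in the <compiler> <options> <source file> order.
compile_cmd="$compiler $options $source_file"

if command -v "$compiler" >/dev/null 2>&1 && [ -f "$source_file" ]; then
    $compile_cmd
else
    echo "would run: $compile_cmd"
fi
```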
The binaries can be run directly by specifying the path or by using
the VE loader program (
ve_exec) – a few examples including parallel
execution are listed below:
mpirun -v -np 2 -ve 0-1 ./program # enables the use of VE cards 0 and 1
For a full list of options available for
mpirun, see the corresponding
manual page or issue the
mpirun -h command.
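The `mpirun` line shown above can be generalised with a small helper. This is only a sketch: the function name `ve_mpirun_cmd` is hypothetical, and it assumes one MPI process per VE card with cards numbered from 0. It prints the command instead of running it.

```shell
# Print an mpirun command that uses the first N VE cards, one process each.
ve_mpirun_cmd() {
    vm_ncards="$1"
    vm_program="$2"
    if [ "$vm_ncards" -lt 1 ]; then
        echo "ve_mpirun_cmd: need at least one VE card" >&2
        return 1
    elif [ "$vm_ncards" -eq 1 ]; then
        vm_range="0"
    else
        vm_range="0-$((vm_ncards - 1))"
    fi
    echo "mpirun -v -np ${vm_ncards} -ve ${vm_range} ${vm_program}"
}

ve_mpirun_cmd 2 ./program   # prints: mpirun -v -np 2 -ve 0-1 ./program
```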
Another, non-interactive, mode of operation is a batch mode which requires a script to be submitted to Slurm. An example job script is shown below.
#!/bin/bash -l
#SBATCH -J name
#SBATCH -N 1
#SBATCH --ntasks-per-node 1
#SBATCH --mem 1000
#SBATCH --time=1:00:00
#SBATCH -A <Grant ID>
#SBATCH -p ve
#SBATCH --gres=ve:1
#SBATCH --output=out

./program
It specifies the name of the job (-J), the requested number of nodes (-N), the number of tasks per node (--ntasks-per-node), memory (--mem; here in megabytes), wall time (--time), grant ID (-A), partition (-p), generic resources (--gres), output file (--output), and the actual commands to be executed once the resources are granted. See the Slurm documentation for an extensive list of available options.
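One possible way to create such a job script from the shell, with the grant ID taken from a variable, is sketched below. `GRANT_ID` is a placeholder, and the temporary directory is used only to keep the sketch free of side effects; on Rysy the file would normally be created under $HOME.

```shell
grant_id="GRANT_ID"            # placeholder; substitute your own grant ID
job_dir="$(mktemp -d)"         # on Rysy, use a directory under $HOME instead
job_file="$job_dir/job.sl"

# Unquoted EOF: ${grant_id} is expanded while the file is written.
cat > "$job_file" <<EOF
#!/bin/bash -l
#SBATCH -J name
#SBATCH -N 1
#SBATCH --ntasks-per-node 1
#SBATCH --mem 1000
#SBATCH --time=1:00:00
#SBATCH -A ${grant_id}
#SBATCH -p ve
#SBATCH --gres=ve:1
#SBATCH --output=out

./program
EOF

# Parse the script without executing it; #SBATCH lines are plain comments
# to the shell, so this only catches shell syntax errors.
bash -n "$job_file" && echo "syntax OK: $job_file"
```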
Below are a few basic example commands used to work with job scripts: submitting the job (sbatch), which returns the ID number assigned to it by the queuing system, listing the user’s jobs along with their status (squeue), listing the details of the specified job (scontrol), and cancelling execution of the job (scancel). Consult the documentation for more.
sbatch job.sl            # submits the job
squeue -u $USER          # lists the user’s current jobs
scontrol show job <ID>   # lists the details of the job specified by given <ID>
scancel <ID>             # cancels the job with given <ID>
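Because `scontrol` and `scancel` need the job ID, it can be convenient to capture it at submission time. The sketch below assumes the usual `Submitted batch job <ID>` message printed by `sbatch`; the helper name `job_id_from_sbatch` is hypothetical, and the ID 12345 is a made-up sample, not real output from Rysy.

```shell
# Read sbatch output on stdin and print only the job ID (the last field).
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $NF }'
}

# Demonstration with a canned line instead of a real submission:
sample_id="$(echo "Submitted batch job 12345" | job_id_from_sbatch)"
echo "job ID: $sample_id"

# On the cluster this could be used as:
#   job_id="$(sbatch job.sl | job_id_from_sbatch)"
#   scontrol show job "$job_id"
```

Slurm's `sbatch` also offers a `--parsable` option that prints the job ID without the surrounding message, which may be simpler where it suits the workflow.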
Since there’s no dedicated filesystem to be used for calculations on the Rysy cluster, in contrast to other ICM systems, the jobs should be run from within the $HOME directory. The ve partition (PBaran compute node) is intended for jobs utilizing VE cards, and as such it should not be used for intensive CPU-consuming tasks.
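A job script could guard against being started outside $HOME with a check along the following lines. This is a sketch only; the helper name `is_under` is hypothetical and not part of Slurm or the ICM setup.

```shell
# Succeed when directory $1 is $2 itself or lies below it (symlinks resolved).
is_under() {
    iu_dir="$(cd "$1" >/dev/null 2>&1 && pwd -P)" || return 1
    iu_base="$(cd "$2" >/dev/null 2>&1 && pwd -P)" || return 1
    # Trailing slashes prevent /home/ab from matching /home/a.
    case "${iu_dir}/" in
        "${iu_base}/"*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_under "$PWD" "$HOME"; then
    echo "OK: running from within \$HOME"
else
    echo "warning: $PWD is outside \$HOME" >&2
fi
```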