Best Practices for Working with HPC Systems at ICM UW
Archiving and Compressing Data
- Regularly archive and compress files to free up disk space and reduce the number of files (important for file quota and disk quota limits).
- ⚠️ ICM UW does not create backups of user data – you are responsible for maintaining your own backups.
- Useful tools: tar, gzip, bzip2.
- Compressing binary data is usually pointless (time-consuming and inefficient).
- A tar archive (without the -z option) is the fastest solution for handling a large number of small files.
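As a minimal sketch of the uncompressed tar workflow described above, the following packs a directory of small files into one archive, lists it, and restores it. The directory name results and the file names are hypothetical, and everything happens in a scratch directory created with mktemp.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a hypothetical results/ directory with many small files.
mkdir results
for i in $(seq 1 100); do
    echo "sample $i" > "results/out_$i.txt"
done

# Pack the directory into one archive, without -z (no compression).
tar -cf results.tar results

# List the archive to verify its contents before deleting any originals.
tar -tf results.tar | head -n 3

# Restore into a separate directory to confirm the archive is usable.
mkdir restored
tar -xf results.tar -C restored
```

Once the listing and a test extraction look right, removing the original directory is what actually lowers the file count against the quota.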
📖 Instructions:
- Storage quota limits
- File compression – tar and gzip
Backup and Synchronization
- Regularly back up your data locally or in the cloud.
- Tools: rsync (directory synchronization) and rclone (working with Google Drive and other cloud services).
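A short sketch of directory synchronization with rsync may help here. It uses two scratch directories as stand-ins for a project directory and a backup location; both paths and the file names are hypothetical.

```shell
# Scratch directories stand in for a project directory and a backup location.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "data" > "$src/results.txt"

# -a preserves permissions and timestamps and recurses into subdirectories.
# The trailing slash on the source copies its contents rather than the directory itself.
rsync -a "$src/" "$dest/"

# A second run transfers only what changed since the previous one.
echo "more data" > "$src/notes.txt"
rsync -a "$src/" "$dest/"
ls "$dest"
```

For a copy from an HPC system to a local machine, the source would instead take the form user@host:path, where user, host, and path are placeholders for your own account details. rclone follows a similar source-then-destination pattern with `rclone copy`, after a remote has been set up with `rclone config`.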
📖 Instructions:
- Archiving and compression
- Rclone documentation
Data Transfer
- Files can be uploaded to and downloaded from the HPC systems via Open OnDemand.
- Keep in mind that transfer speed is mostly limited by your local disk performance.
📖 Instruction:
- Open OnDemand – file transfer
Scalability Tests
- Run scalability tests before starting large production runs.
- Check if your program efficiently uses multiple cores and multiple nodes.
- This will help you avoid wasting both resources and queue time.
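One common way to judge the outcome of a scalability test is to compute speedup and parallel efficiency from measured run times. The sketch below assumes the same problem was timed on 1, 2, 4, and 8 cores; the timing values are invented for illustration and are not measurements from any ICM system.

```shell
# Hypothetical wall-clock times in seconds: "<cores> <seconds>" per line.
timings="1 1200
2 640
4 350
8 230"

# Speedup S(n) = T(1) / T(n); parallel efficiency E(n) = S(n) / n.
report=$(printf '%s\n' "$timings" | awk '
    NR == 1 { t1 = $2 }
    { printf "%d cores: speedup %.2f, efficiency %.0f%%\n", $1, t1 / $2, 100 * t1 / ($2 * $1) }
')
printf '%s\n' "$report"
```

With these made-up numbers, efficiency falls as cores are added, which is the kind of trend that suggests requesting fewer cores for production runs.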
📖 Instruction:
- SLURM basics
Secure Login
- Try to configure SSH keys instead of passwords – this is both safer and faster.
- Configure SSH multiplexing (the ControlMaster option in OpenSSH) to avoid entering the OTP repeatedly.
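In OpenSSH, multiplexing is controlled by the ControlMaster, ControlPath, and ControlPersist options. The sketch below writes an example configuration to a scratch file for review; the host alias hpc, the host name hpc.example.org, the user name myuser, and the 10-minute timeout are all placeholders, not values taken from the ICM documentation.

```shell
# Write the snippet to a scratch file for review, rather than editing ~/.ssh/config directly.
snippet=$(mktemp)
cat > "$snippet" <<'EOF'
# "hpc", hpc.example.org and myuser are placeholders.
Host hpc
    HostName hpc.example.org
    User myuser
    # Reuse one authenticated connection for later ssh, scp, and rsync sessions.
    ControlMaster auto
    # Socket file for the shared connection; %r, %h, %p expand to user, host, port.
    ControlPath ~/.ssh/cm-%r@%h:%p
    # Keep the master connection open for 10 minutes after the last session closes.
    ControlPersist 10m
EOF
cat "$snippet"
```

After the snippet is adapted and appended to ~/.ssh/config, the first `ssh hpc` authenticates as usual and later sessions reuse that connection while it stays open. `ssh -O exit hpc` closes the shared connection explicitly.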
📖 Instructions:
- Creating and using SSH keys
- SSH multiplexing – overview
Resource Usage
- Remember that large computations affect other users.
- Do not attempt to bypass queue system limits – such actions may result in account suspension.
- Unnecessary overuse of allocated resources decreases the efficiency of the entire system.
- Although ICM does not charge for unused computing hours, you should carefully plan your requests for resources.
- If your project requirements change, you may request additional resources within the same year.
📖 Instruction:
- Resource limits
Walltime and Job Queue
- Accurately estimate the required walltime – it directly affects queue wait time and optimal use of resources.
- The job scheduling system allows:
  - setting the number of jobs,
  - defining dependencies between jobs,
  - controlling the execution sequence.
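The scheduling features listed above can be sketched with standard sbatch options. This example assumes two hypothetical job scripts, preprocess.sh and analyze.sh, and reads "setting the number of jobs" as a job array with a cap on simultaneously running tasks, which is one possible interpretation. The walltime values and the array range are illustrative only.

```shell
# Submit a two-stage chain: analysis starts only if preprocessing finishes successfully.
submit_chain() {
    local first_id
    # --time sets the requested walltime; --parsable prints the job ID for scripting.
    first_id=$(sbatch --parsable --time=00:30:00 preprocess.sh) || return 1
    # --parsable may print "jobid;cluster"; keep only the job ID.
    first_id=${first_id%%;*}
    # --array=1-10%2 creates 10 tasks with at most 2 running at once;
    # --dependency=afterok holds them until the first job exits with code 0.
    sbatch --dependency=afterok:"$first_id" --array=1-10%2 --time=02:00:00 analyze.sh
}

# Only submit where SLURM is available, for example on a login node.
if command -v sbatch >/dev/null 2>&1; then
    submit_chain
else
    echo "sbatch not found; run this on a system with SLURM" >&2
fi
```

Inside analyze.sh, each array task can read its index from the SLURM_ARRAY_TASK_ID environment variable to select its own input.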
📖 Instruction:
- SLURM basics