Archive data
Archiving data not currently in use is mandatory to avoid compromising the cluster's functionality. The procedure to archive data folders is described below and it is the responsibility of each user to ensure the smooth operation of the system.
Which data
Archiving concerns data folders not currently in use, i.e. data that you want to keep but do not plan to use 'soon' (where 'soon' means within about a month, not within a year). These folders must be compressed and stored in the /scratch partition.
Why mandatory
Data folders must be compressed into .tgz archives because the current cluster architecture handles large numbers of files with great difficulty, even when they occupy only a few MB of space. It is therefore preferable to have a single compressed archive instead of a normal folder with 1,000 small files in it.
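To judge whether a folder is a good candidate for archiving, it can help to look at the number of files it contains rather than only its size. The following is a minimal sketch using standard find, wc and du; count_files and folder_size are hypothetical helper names, not cluster utilities.

```shell
# count_files and folder_size are hypothetical helpers, not cluster utilities.

# Print the number of regular files below a folder (recursively).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Print the disk space used by a folder in human-readable form.
folder_size() {
    du -sh "$1" | cut -f1
}

# Example: count_files ./myData; folder_size ./myData
```

A folder that reports thousands of files but only a few MB is exactly the case where a single .tgz archive is easier on the cluster.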
How to do it
The archiving procedure comprises two steps: compressing the folder into .tgz format and storing the archive in its own folder on the /scratch partition. To simplify the operation, the archiver utility was created for this task. The source code for the utility is located in /home/software/utils/archiver in read-only mode.
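For orientation, the two steps can be sketched by hand with standard tar. This is not the archiver source (the code in /home/software/utils/archiver is the reference), and archive_folder is a hypothetical name. The sketch assumes the original folder is left in place and only checks that the archive can be listed.

```shell
# Sketch only: NOT the archiver implementation. archive_folder is a
# hypothetical helper that mirrors archiver's argument order:
#   $1 = path of the .tgz archive to create, $2 = folder to compress.
archive_folder() {
    dest_archive=$1
    src_folder=$2

    # Make sure the destination folder exists.
    mkdir -p "$(dirname "$dest_archive")" || return 1

    # Compress the folder; -C keeps paths relative to the folder's parent,
    # so the archive unpacks into a single top-level folder.
    tar -czf "$dest_archive" \
        -C "$(dirname "$src_folder")" "$(basename "$src_folder")" || return 1

    # Read the whole archive back to detect a truncated or corrupt file.
    # Assumption: the original folder is not deleted by this sketch.
    tar -tzf "$dest_archive" > /dev/null
}
```

In normal use the archiver utility should be preferred, since it also writes a log and runs in the background.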
Practical guide to using archiver
- Create a folder in /scratch/yourGroup/name.surnameNumber (if it does not already exist), where yourGroup is your research group and name.surnameNumber is the beginning of your institutional e-mail address.
mkdir /scratch/yourGroup/name.surnameNumber
- Load the cluster utilities by adding module load utils to the .bashrc file in your home folder (do this just once).
cd
echo "module load utils" >> .bashrc
- Use archiver to archive and move individual folders in one shot. The first argument is the path where the compressed archive will be stored; the second argument is the path to the original uncompressed folder.
archiver /scratch/yourGroup/name.surnameNumber/foldername.tgz /path-to-folder-to-be-archived/folderName
Important notes:
archiver writes a log file in the folder from which it was launched. The log reports the progress of the compression, and at the end archiver checks that the folder was compressed without errors. The utility runs on bladeRunner in the background. It is advisable to archive folders that are not too large, so that the archives remain manageable (easy to move and decompress if necessary).
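If an archived folder is needed again, a .tgz file can normally be unpacked with standard tar. The sketch below assumes the archives produced by archiver are ordinary gzip-compressed tar files, as the .tgz extension suggests; restore_archive is a hypothetical helper, not part of archiver.

```shell
# restore_archive is a hypothetical helper, not part of archiver.
#   $1 = path of the .tgz archive, $2 = folder to unpack into.
restore_archive() {
    archive=$1
    dest=$2

    # List the contents first; this fails if the archive is missing,
    # truncated, or not a gzip-compressed tar file.
    tar -tzf "$archive" > /dev/null || return 1

    # Create the destination and unpack the archive into it.
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest"
}

# Example (paths are placeholders):
# restore_archive /scratch/yourGroup/name.surnameNumber/foldername.tgz ./restored
```

Listing before extracting costs one extra pass over the archive, but it avoids leaving a half-unpacked folder behind when the file is damaged.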