Archiving data that is not currently in use is mandatory to avoid compromising the cluster's functionality. The procedure for archiving data folders is described below; it is the responsibility of each user to follow it so that the system keeps running smoothly.
Archiving concerns data folders not currently in use, i.e. data that you want to keep but do not plan to use 'soon' (meaning within about a month, not a year). These must be compressed and stored on the /scratch partition.
Data folders must be compressed into .tgz archives because the current cluster architecture handles large numbers of files with great difficulty, even when they occupy only a few MB of space. It is therefore preferable to have a single compressed archive instead of a normal folder with 1,000 small files in it.
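To judge whether a folder is a good candidate for archiving, it can help to look at its file count next to its size. The sketch below builds a throwaway folder of 1,000 tiny files purely for illustration (demo_dir and the file names are made up for this example) and then counts the files and measures the space they use; on real data you would point find and du at your own folder instead.

```shell
#!/bin/sh
# Illustration only: build a throwaway folder with many tiny files.
# demo_dir and the file names are hypothetical, not part of the cluster setup.
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 1000 ]; do
    echo "sample" > "$demo_dir/file_$i.txt"
    i=$((i + 1))
done

# Count regular files (tr strips the padding some wc versions add).
file_count=$(find "$demo_dir" -type f | wc -l | tr -d ' ')

# Total size on disk, in human-readable units.
total_size=$(du -sh "$demo_dir" | cut -f1)

echo "$demo_dir: $file_count files, $total_size"

# Remove the throwaway folder.
rm -rf "$demo_dir"
```

A folder that shows a large file count for a small total size is exactly the case described above.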
The archiving procedure comprises two steps: compressing the folder into .tgz format and storing the archive in your own folder on the /scratch partition. To simplify the operation, the archiver utility was created for this task. The source code for the utility is available read-only in /home/software/utils/archiver.
Create your personal folder /scratch/yourGroup/name.surnameNumber (if it is not already there!), where yourGroup is your research group and name.surnameNumber is the first part of your institutional e-mail address:
$ mkdir /scratch/yourGroup/name.surnameNumber
Add the line module load utils to the .bashrc file in your home folder (do this just once):
$ cd
$ echo "module load utils" >> .bashrc
Run archiver, giving the destination archive first and the folder to be archived second:
$ archiver /scratch/yourGroup/name.surnameNumber/folderName.tgz /path-to-folder-to-be-archived/folderName
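The one-time setup is easy to repeat by accident, which would leave duplicate lines in .bashrc. One possible guard is sketched below: add_line_once is a hypothetical helper written for this example (it is not part of the utils module), and it is demonstrated on a temporary file so that nothing real is modified.

```shell
#!/bin/sh
# add_line_once FILE LINE
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
add_line_once() {
    target_file=$1
    line=$2
    touch "$target_file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "$line" "$target_file" || printf '%s\n' "$line" >> "$target_file"
}

# Demonstration on a temporary file standing in for .bashrc.
demo_rc=$(mktemp)
add_line_once "$demo_rc" "module load utils"
add_line_once "$demo_rc" "module load utils"   # second call changes nothing

# Count how many times the exact line appears.
occurrences=$(grep -cxF "module load utils" "$demo_rc")
echo "occurrences: $occurrences"
rm -f "$demo_rc"

# On the cluster the equivalent calls would be:
#   add_line_once "$HOME/.bashrc" "module load utils"
#   mkdir -p /scratch/yourGroup/name.surnameNumber   # -p: no error if it already exists
```

The same idea applies to the folder creation step: mkdir -p succeeds silently when the folder is already there.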
archiver writes a log file in the folder from which it was launched. The log reports the progress of the compression, and a check at the end verifies that the folder was compressed without errors. The utility runs in the background on bladeRunner. It is advisable to archive folders that are not too large, so that the archives remain manageable (easy to move and decompress if necessary).
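For orientation, the compress, check and decompress cycle can be reproduced by hand with plain tar. This is only a sketch of the general technique, not the archiver source code, and it should not replace archiver on the cluster; every path below lives in a temporary directory created for the example.

```shell
#!/bin/sh
# Throwaway working area; all names below are placeholders for this example.
work=$(mktemp -d)
mkdir -p "$work/data/folderName" "$work/scratch" "$work/restore"
echo "hello" > "$work/data/folderName/a.txt"
echo "world" > "$work/data/folderName/b.txt"

archive="$work/scratch/folderName.tgz"

# Compress: -c create, -z gzip, -f archive file.
# -C changes directory first, so paths inside the archive start at folderName/.
tar -czf "$archive" -C "$work/data" folderName

# Check: listing the archive reads it through and fails if the gzip stream is damaged.
if tar -tzf "$archive" > /dev/null; then
    archive_ok=yes
else
    archive_ok=no
fi

# Decompress into a separate location.
tar -xzf "$archive" -C "$work/restore"

# Compare the restored copy with the original folder.
if diff -r "$work/data/folderName" "$work/restore/folderName" > /dev/null; then
    contents_match=yes
else
    contents_match=no
fi

echo "archive readable: $archive_ok, contents match: $contents_match"
rm -rf "$work"
```

The tar -xzf line is also what you would use to unpack an archive retrieved from /scratch, with -C pointing at the folder where the data should be restored.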