This is an old version of the document!
Archive data
Archiving data that is not in current use is mandatory, so as not to compromise the functionality of the entire cluster. The procedure for archiving data folders is described below; each cluster user is responsible for ensuring the smooth operation of the system.
Which data
Archiving concerns data folders not in current use, i.e. data that you want to keep but do not plan to use within the next month. These must be compressed and stored in the storage partition /scratch.
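One way to spot candidate folders is to look for those in which no file has been modified for about a month. The sketch below is only an illustration under assumptions: stale_dirs is a hypothetical helper name (it is not part of the cluster utilities), the 30-day threshold approximates the one-month horizon, and modification time is only a proxy for "in use", so data that is read but never changed will also be listed.

```shell
# stale_dirs ROOT: print each first-level folder under ROOT that contains
# no regular file modified within the last 30 days.
# Hypothetical helper for illustration; not part of the cluster utilities.
stale_dirs() {
    root=$1
    for dir in "$root"/*/; do
        # Skip the unexpanded pattern when ROOT has no subfolders.
        [ -d "$dir" ] || continue
        # Look for at least one file changed in the last 30 days.
        recent=$(find "$dir" -type f -mtime -30 | head -n 1)
        if [ -z "$recent" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
    return 0
}
```

Calling stale_dirs with the path of the folder that holds your data lists the subfolders worth reviewing. Hidden folders are not matched by the pattern, and empty folders are reported as stale.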
Why mandatory
Data folders must be compressed into .tgz archives because the current cluster architecture handles large numbers of files with great difficulty, even when they occupy only a few MB of storage. It is therefore preferable to keep a single compressed archive rather than a normal folder containing 1,000 small files.
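The effect on the file count can be seen with a small throwaway experiment. The sketch below works only inside a temporary directory created with mktemp (the names demo, data and data.tgz are arbitrary choices for the illustration) and uses plain tar, not the archiver utility.

```shell
# Throwaway demonstration in a temporary directory; nothing under /scratch is touched.
demo=$(mktemp -d)
mkdir "$demo/data"

# Create 1,000 small files.
i=1
while [ "$i" -le 1000 ]; do
    printf 'sample %s\n' "$i" > "$demo/data/file_$i.txt"
    i=$((i + 1))
done

# Number of files the file system has to track before archiving.
before=$(find "$demo/data" -type f | wc -l | tr -d ' ')

# Pack the whole folder into a single compressed archive.
tar -czf "$demo/data.tgz" -C "$demo" data

# Number of archive files that now represent the same data.
after=$(find "$demo" -type f -name 'data.tgz' | wc -l | tr -d ' ')

echo "files in folder: $before, files after archiving: $after"
```

The 1,000 files end up represented by a single .tgz file, which is the situation the cluster handles best.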
How to do it
The archiving procedure comprises two steps: compressing the folder into a .tgz archive and storing the archive in your own folder on the /scratch partition. To simplify the operation, the archiver utility was created for this task. The source code of the utility is available in read-only mode in /home/software/utils/archiver.
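To make the two steps concrete, here is a minimal manual sketch using standard tar. It is an assumption about what the steps amount to, not the source of the archiver utility, which may work differently; manual_archive is a hypothetical name, and the argument order (destination first, source second) simply mirrors the one described for archiver.

```shell
# manual_archive DEST.tgz SRC_FOLDER
# Hypothetical helper showing the two steps by hand; this is NOT the archiver source.
manual_archive() {
    dest=$1
    src=$2
    [ -d "$src" ] || { echo "not a folder: $src" >&2; return 1; }

    # Make sure the folder that will hold the archive exists.
    mkdir -p "$(dirname "$dest")" || return 1

    # Compress the folder, storing paths relative to its parent folder.
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")" || return 1

    # Sanity check: the archive must be listable from start to end.
    tar -tzf "$dest" > /dev/null
}
```

The function returns a non-zero status if the compression or the final listing fails, so the original folder should be kept until the call has succeeded.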
Practical guide to using ''archiver''
- Create a folder in /scratch/yourGroup/name.surnameNumber (if it does not already exist!), where yourGroup is the research group you belong to and name.surnameNumber is the first part of your institutional e-mail address.
$ mkdir /scratch/yourGroup/name.surnameNumber
- Load the cluster utilities by adding module load utils to the .bashrc file in your home folder (do this only once).
cd; echo "module load utils" >> .bashrc
- Use archiver to archive and move individual folders in one shot. You need to specify the path where the compressed archive will be stored (first argument) and the path to the original uncompressed folder (second argument).
archiver /scratch/yourGroup/name.surnameNumber/foldername.tgz /path-to-folder-to-be-archived/folderName
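When several folders have to be archived, it can help to generate the command lines first and review them before running anything. The sketch below only prints commands in the two-argument form shown above and does not call archiver itself. The helper name archiver_commands, the example paths and the choice of naming each archive after its folder are assumptions made for the illustration; paths containing spaces would need quoting.

```shell
# archiver_commands DEST_DIR FOLDER...
# Prints (does not run) one archiver command per folder.
# Hypothetical helper; the archive is named after the folder by assumption.
archiver_commands() {
    dest_dir=$1
    shift
    for folder in "$@"; do
        name=$(basename "$folder")
        printf 'archiver %s/%s.tgz %s\n' "$dest_dir" "$name" "$folder"
    done
}

# Example with placeholder paths:
archiver_commands /scratch/yourGroup/name.surnameNumber /path-to-data/run1 /path-to-data/run2
```

Each printed line can then be checked and launched one at a time.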
Important notes:
archiver writes a log file in the folder from which it was launched; the log reports the progress of the compression, and a check at the end verifies that the folder was compressed without errors. The utility runs in the background on bladeRunner. It is advisable to archive folders that are not too large, so that the archives remain manageable (easy to move and decompress if necessary).
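If an archived folder is needed again, a .tgz file can be unpacked with standard tar. The sketch below assumes the archive is an ordinary gzip-compressed tar file; restore_archive is a hypothetical helper name and is not provided by the cluster utilities.

```shell
# restore_archive ARCHIVE.tgz TARGET_DIR
# Hypothetical helper: checks that the archive is readable, then extracts it.
restore_archive() {
    archive=$1
    target=$2

    # Refuse to extract an archive that cannot be read to the end.
    tar -tzf "$archive" > /dev/null || return 1

    # Create the target folder if needed and unpack into it.
    mkdir -p "$target" || return 1
    tar -xzf "$archive" -C "$target"
}
```

Before archiving, du -sh on the folder gives a quick idea of its size, which helps in deciding whether to split it into smaller archives.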