====== /home/ ======
The /home is the area where your home folders are stored, as well as other shared areas such as ''/home/work'' and ''/home/web''. It is available from every node and should be **used for source files, compiling code and jobs that do not need a lot of space**.
  * **/home/work**: work area for jobs that do not need very big datasets, or that need to have lots of files in a single directory (**not recommended**).
  * **/home/web**: area for publishing static content on the web.
    * This space is web-accessible; web access is read-only and it is not possible to create dynamic pages.
    * **Per-sector quota** of 1TB (soft) / 2TB (hard), except astro with 4.4TB (soft).
    * Requires either an ''index.html'' file or a ''.htaccess'' file with the appropriate directive (see the sketch below).
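A minimal sketch of publishing a static page, assuming a per-user folder under ''/home/web''; the directory layout and the public URL are not specified here, so treat the paths as placeholders:

<code bash>
# Hypothetical layout: adjust the sector/user path to your actual area under /home/web.
WEBDIR=/home/web/mysector/$USER/results

mkdir -p "$WEBDIR"
cp plots/*.png "$WEBDIR"/

# The folder needs an index.html (or a .htaccess with the appropriate
# directive) to be reachable via the web server; only static pages are served.
cat > "$WEBDIR"/index.html <<'EOF'
<html><body><h1>Results</h1><p>Static plots published from the cluster.</p></body></html>
EOF
</code>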
==== Technical characteristics ====
  * NFS-mount via Ethernet (1Gbps, which is not very fast but quite responsive);
  * Quota limit on your home folder: 50GB (soft) / 100GB (hard) -- see below for a quick way to check your usage.
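A quick, generic way to check how much space you are using (plain ''du''; the cluster may also provide a dedicated quota command, which is not shown here):

<code bash>
# Total size of your home folder (may take a while if it holds many small files).
du -sh "$HOME"

# Largest first-level subdirectories, to spot what is filling the quota.
du -sh "$HOME"/* 2>/dev/null | sort -h | tail
</code>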
====== /scratch ======
This is the fast Input/Output area of the cluster, meant for the data you are actively working on.
:!: Folders inside sector areas **must** be named after your account, otherwise you won't get important notification mails => possible data loss (see the example below).
==== Technical characteristics: ====
  * Parallel filesystem, for quick (SSD-backed) access to the data you are working on;
  * No quota, but **files older than TBD((Decision to be finalized by the board)) FIXME days** will be automatically deleted without further notice;
  * No cross-server redundancy, just local RAID: if (when) a server (or two disks in the same RAID) fails, all data becomes unavailable -- always keep a copy of your important data archived elsewhere (maybe /archive, but for very important data **offsite is better**).
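A minimal sketch of setting up a personal work folder on ''/scratch'', assuming the sector areas sit directly under ''/scratch''; the sector directory name is a placeholder, the only documented rule is that your folder must be named after your account:

<code bash>
# Placeholder sector name: replace with your sector's directory.
SECTOR=mysector

# The folder name MUST match your account name, or you will miss the
# notification mails about this area (=> possible data loss).
mkdir -p /scratch/"$SECTOR"/"$USER"
cd /scratch/"$SECTOR"/"$USER"
</code>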
====== /archive ======
This is the main archive area to be used for large files, big datasets or archives; it is designed to be a distributed storage area for long-term data preservation. Data in this area should be stored in the form of compressed folders, because the presence of a large number of small files will compromise its functionality. Every sector or project has a dedicated area with an associated quota on ''/archive''.
:!: Folders inside sector areas **must** be named after your account, otherwise you won't get important notification mails => possible data loss.
==== Technical characteristics: ====
  * **Not to be used to store a large number of small files**: this will compromise the functionality of the storage space, eventually blocking all reading/writing operations;
  * Read-only access from compute nodes, read/write only from frontends and filetransfer nodes;
  * Quota imposed (both on file size and number of files) per sector, with extra space allocated for specific projects;
  * Max size for a single file is **8TB**: when archiving a big dataset use splitting, preferably in chunks smaller than 1TB (see the sketch below);
  * If you need some big files (say a dataset) and your code does not allow specifying a different path, just **use symlinks from** the path your code expects to the data stored in ''/archive'';
  * Currently ACLs (''setfacl'') are not supported (cephfs exported via NFS-Ganesha does not allow setting/getting ACLs).
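A sketch of how a dataset could be packed for ''/archive'' following the rules above (compressed, split into chunks below 1TB). All paths are placeholders, and the copy is assumed to run on a frontend or filetransfer node, since compute nodes only have read-only access:

<code bash>
# Placeholder paths: adapt to your sector/project areas.
SRC=/scratch/mysector/$USER/my_dataset
DEST=/archive/mysector/$USER

# Pack the whole folder as one compressed stream, split into <1TB chunks.
mkdir -p "$DEST"
tar -C "$(dirname "$SRC")" -czf - "$(basename "$SRC")" \
  | split -b 900G - "$DEST"/my_dataset.tar.gz.part-

# Restore it later into an area you can work on (e.g. /scratch).
mkdir -p /scratch/mysector/$USER/restore
cat "$DEST"/my_dataset.tar.gz.part-* \
  | tar -xzf - -C /scratch/mysector/$USER/restore

# If a tool insists on a fixed input path, symlink the archived file instead of copying it.
ln -s "$DEST"/big_input.dat /scratch/mysector/$USER/big_input.dat
</code>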
==== Monitoring system of space usage ====
To allow users/sectors to keep their space usage under control, a monitoring system is in place.
In particular:
  * every sector/project …
  * every sector/project …
  * individual users will receive an email only if their sector/project …
====== Advanced: local node space ======
Every node does have some available space on local storage, available to your job as ''$TMPDIR''. It is useful to store temporary files that do not need to be shared between nodes: being local, latency is very low, but local disks aren't big ($TMPDIR can usually store around 200GB of data).

==== Technical characteristics: ====

  * local space: not shared between multiple nodes, not even for a single multi-node job;
  * quite fast (very low latency);
  * automatically cleaned when the job ends, so copy any results you need elsewhere before the job terminates (see the sketch below).
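A sketch of a batch job that stages data through ''$TMPDIR''; the input/output paths and the program name are placeholders, and the scheduler directives are omitted:

<code bash>
#!/bin/bash
# Placeholder paths on the shared areas.
INPUT=/scratch/mysector/$USER/input.dat
RESULTS=/scratch/mysector/$USER/results

# Stage the input onto the fast node-local disk (not shared between nodes).
cp "$INPUT" "$TMPDIR"/

# Run the computation against the local copy (placeholder program).
./my_program "$TMPDIR"/input.dat -o "$TMPDIR"/output.dat

# $TMPDIR is wiped automatically when the job ends: copy results back first.
mkdir -p "$RESULTS"
cp "$TMPDIR"/output.dat "$RESULTS"/
</code>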