oph:cluster:resources
Differences
These are the differences between the selected revision and the current version of the page.
oph:cluster:resources [2023/07/25 06:44] – Reorganized page, added explanation about resources use diego.zuccato@unibo.it | oph:cluster:resources [2025/02/04 05:56] (current version) – [General informations] diego.zuccato@unibo.it | ||
---|---|---|---|
Line 3: | Line 3: | ||
The hardware structure of the DIFA-OPH computing cluster is summarised in the table below, listing all the compute nodes currently available with their individual hostnames, their specific resources (number of cores and available RAM), and their access policies. | The hardware structure of the DIFA-OPH computing cluster is summarised in the table below, listing all the compute nodes currently available with their individual hostnames, their specific resources (number of cores and available RAM), and their access policies. | ||
- | In particular, | + | In particular, bldNN nodes are part of the **BladeRunner island**, |
- | **The nodes associated with the OPH project are open to all users** while nodes associated with other individual projects may be subject to **access restrictions**. Such restrictions are indicated | + | **The nodes associated with the OPH project are open to all users** while nodes associated with other individual projects may be subject to **access restrictions**. Such restrictions are indicated in the last column of the table. |
- | + | ||
- | *** GREEN: Shared nodes**, the access is available for all users | + | |
- | *** ORANGE: Shared nodes**, with regular weekly reservations for teaching purposes during which the nodes can be accessed only by students of specific DIFA courses for their laboratory activities | + | |
- | *** RED: Reserved nodes**, the access is restricted to users explicitly authorised by the corresponding project PI for a given period of time | + | |
- | + | ||
- | {{ : | + | |
+ | * **Shared** nodes, the access is available for all users | ||
+ | * **Teaching** nodes, reserved for teaching purposes: can be accessed only by students of specific DIFA courses for their laboratory activities | ||
+ | * **Reserved** nodes, the access is restricted to users explicitly authorised by the corresponding project PI till the given **exp**iration date. | ||
+ | <WRAP width=100%> | ||
+ | ^ Nodes ^ vCPUs/ | ||
+ | | bld[01-02] | 24 / 64G | - | OPH | Shared | ||
+ | | bld[03-04] | 32 / 64G | - | OPH | Shared | ||
+ | | bld05 | 32 / 128G | - | OPH | Shared | ||
+ | | bld[15-16] | 16 / 24G | - | OPH | Teaching | ||
+ | | bld[17-18] | 32 / 64G | - | OPH | Shared | ||
+ | | mtx[00-15] | 56 / 256G | - | OPH | Shared | ||
+ | | mtx[16-19] | 112 / 512G | - | ERC-Astero (Miglio) | ||
+ | | mtx20 | 112 / 1T | - | OPH (Di Sabatino) | ||
+ | | mtx[21-22] | 192 / 1T | - | SLIDE (Righi) | ||
+ | | mtx[23-25] | 112 / 512G | - | OPH (Marinacci) | ||
+ | | mtx26 | 112 / 512G | - | CAN (Bellini) | ||
+ | | mtx27 | 112 / 512G | - | FFHiggsTop (Peraro) | ||
+ | | mtx[28-29] | 112 / 1.5T | - | ::: | ::: | | ||
+ | | mtx30 | 64 / 1T | - | Trigger (Di Sabatino) | Reserved (2027-02) | | ||
+ | | mtx[31-32] | 64 / 1T | - | EcoGal (Testi) | ||
+ | | mtx[33-34] | 192 / 1.5T | - | ::: | ::: | | ||
+ | | mtx[35-36] | 192 / 512G | - | RED-CARDINAL (Belli) | ||
+ | | mtx[37-40] | 192 / 512G | - | ELSA (Talia) | ||
+ | | gpu00 | 64 / 1T | 2xA100 | VEO (Remondini) | ||
+ | | gpu[01-02] | 112 / 1T | 4xH100 | EcoGal (Testi) | ||
+ | | gpu03 | 112 / 1T | 4xH100 | ELSA (Talia) | ||
+ | </ | ||
====== Computing Resources ====== | ====== Computing Resources ====== | ||
- | Resources are nodes, CPUs, GPUs((Only on GPU nodes)), RAM and time. You'll have to select the resources you need for the job. Do not overestimate too much or you'll be " | + | Resources are nodes, CPUs, GPUs((Only on GPU nodes)), RAM and time. You'll have to select the resources you need for the job. Do not overestimate too much or you'll be " |
- | Nodes are grouped by partitions. | + | Nodes are grouped by partitions. |
- | To select a (set of) node(s) suitable for your job, use constraints. These include: | + | ===== Selecting nodes ===== |
+ | |||
+ | To select a (set of) node(s) suitable for your job, **use constraints**. These include: | ||
* **blade**: older nodes, usually for smaller/ | * **blade**: older nodes, usually for smaller/ | ||
- | * **matrix**: newer nodes, for bigger parallel jobs; allocated by "half node" units! | + | * **matrix**: newer nodes, for bigger parallel jobs; **allocated by "half node" units!** |
- | * **ib**: require IB-equipped nodes (**all** nodes in matrix are IB-equipped, no need to specify) | + | * **ib**: require IB-equipped nodes (**all** nodes in matrix are IB-equipped |
* **filetransfer**: | * **filetransfer**: | ||
* **intel**: require an Intel CPU | * **intel**: require an Intel CPU | ||
* **amd**: require an AMD CPU | * **amd**: require an AMD CPU | ||
* **avx**: require that the CPU supports AVX instructions | * **avx**: require that the CPU supports AVX instructions | ||
- | * **dev**: require that the node can be used to compile | + | * **dev**: require that the node can be used to compile |
* **dida**: require nodes used for lessons (obsolete) | * **dida**: require nodes used for lessons (obsolete) | ||
* **gpu**: require a GPU-equipped node | * **gpu**: require a GPU-equipped node | ||
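As a rough illustration of combining the constraints listed above, the sketch below builds a job submission command line. It assumes the cluster scheduler is Slurm (this page does not name it), where `sbatch --constraint` accepts features joined with `&` when all of them are required. The helper name `sbatch_command`, the script name `job.sh` and the default values are hypothetical.

```python
import shlex


def sbatch_command(script, constraints, cpus=1, mem="4G", time="01:00:00"):
    """Build an sbatch command line requiring ALL the given node features.

    Assumption: the scheduler is Slurm; the defaults are placeholders,
    not cluster policy.
    """
    if not constraints:
        raise ValueError("at least one constraint is expected")
    args = [
        "sbatch",
        # Slurm joins features that must all be present with '&'
        "--constraint=" + "&".join(constraints),
        f"--cpus-per-task={cpus}",
        f"--mem={mem}",
        f"--time={time}",
        script,
    ]
    # shlex.join quotes the '&' so the shell does not treat it as an operator
    return shlex.join(args)


if __name__ == "__main__":
    print(sbatch_command("job.sh", ["matrix", "avx"], cpus=8, mem="16G"))
```

Quoting matters here: an unquoted `&` would send the command to the background in most shells, so the constraint expression should always be passed as a single quoted argument.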
- | Some nodes are reserved for specific projects (see table above). To be able to use 'em you have to be explicitly allowed by referer | + | ===== Reserved nodes ===== |
+ | |||
+ | Some nodes are reserved for specific projects (see table above). To be able to use them you have to be explicitly allowed by project manager | ||
+ | |||
+ | ^ Project | ||
+ | | CAN | Bellini | ||
+ | | ECOGAL | ||
+ | | ELSA | Talia | Str04109.13664-OPH-ELSA | ||
+ | | FFHiggsTop | ||
+ | | RedCardinal | Belli | Str04109.13664-OPH-RedCardinal | OPH-res-RedCardinal | prj-redcardinal | ||
+ | | SLIDE | Righi | Str04109.13664-OPH-SLIDE | ||
+ | | Trigger | ||
+ | | VEO | Remondini | ||
+ | |||
+ | ===== QualityOfService ===== | ||
+ | |||
+ | What other clusters call " | ||
+ | |||
+ | By default all jobs are queued as " | ||
+ | |||
+ | Each QoS offers different features: | ||
+ | |||
+ | ^ QoS ^ Max runtime ^ Priority ^ Note ^ | ||
+ | | normal | 24h | standard | Default | | ||
+ | | debug | 15' | ||
+ | | long | ||
+ | |||
+ | ====== Storage Resources ====== | ||
+ | |||
+ | Storage is detailed [[oph: | ||
+ | |||
+ | It's important to select the right storage for your intended use. |
oph/cluster/resources.1690267474.txt.gz · Last modified: 2023/07/25 06:44 by diego.zuccato@unibo.it