oph:cluster:messages
Differences
This shows the differences between the selected revision and the current version of the page.
| Both sides previous revision / Previous revision / Next revision | Previous revision | ||
| oph:cluster:messages [2026/03/23 14:23] – [2026-03-23] diego.zuccato@unibo.it | oph:cluster:messages [2026/06/11 09:47] (current version) – [2026-06-11] GPUs must be requested for the job to be able to use them diego.zuccato@unibo.it | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| <WRAP center round info> | <WRAP center round info> | ||
| To report issues, please write **only** to difa.csi@unibo.it including a clear description of the problem ("my job doesn' | To report issues, please write **only** to difa.csi@unibo.it including a clear description of the problem ("my job doesn' | ||
| + | </ | ||
| + | |||
| + | <WRAP center round alert> | ||
| + | Remember that bld[15-16] are reserved for courses during the day. Jobs launched on these nodes outside a lab lesson will be terminated without further notice. If you need to run jobs to prepare for an exam, just add: | ||
| + | #SBATCH --exclude=bld[15-16] | ||
| + | to your job script. | ||
| </ | </ | ||
| ===== 2026 ===== | ===== 2026 ===== | ||
| + | |||
| + | ==== 2026-06-11 ==== | ||
| + | |||
| + | Reconfiguring GPU nodes: when requesting a GPU node, you must *also* specify --gpus=N to have N GPUs assigned to your job. Other restrictions still apply, including allocation by socket (max 2 jobs per node). | ||
| + | |||
| + | ==== 2026-05-19 ==== | ||
| + | |||
| + | AC is now OK; the cluster has already been resumed. | ||
| + | |||
| + | ==== 2026-05-06 ==== | ||
| + | |||
| + | Started resuming some nodes. The biggest AC unit is still broken, but the others have been fixed and are currently working. Hopefully we will not have to shut down again. | ||
| + | |||
| + | ==== 2026-05-05 ==== | ||
| + | |||
| + | The server room is overheating due to a failed AC unit: many (not all) nodes are being drained and will be resumed ASAP. | ||
| + | |||
| + | |||
| + | ==== 2026-03-30 ==== | ||
| + | Possible (hopefully unlikely) service interruption due to the removal of the electrical bypass installed on 25/12. | ||
| + | |||
| + | In case of emergency, the cluster will be shut down without further notice between 08.00 and 09.00 and reopened as soon as possible. | ||
| ==== 2026-03-23 ==== | ==== 2026-03-23 ==== | ||
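
The two directives mentioned above (--gpus=N from the 2026-06-11 message and --exclude=bld[15-16] from the alert box) could be combined in a job script roughly as follows. This is only a minimal sketch: the job name and the gpu_report helper are hypothetical, N=1 is an arbitrary choice, and it assumes that Slurm's GPU support exports CUDA_VISIBLE_DEVICES for the GPUs assigned to the job, which may depend on how this cluster is configured.

```shell
#!/bin/bash
# Sketch of a Slurm job script; adapt the values to your own job.

# Hypothetical job name, used only for illustration.
#SBATCH --job-name=gpu-example

# Request N=1 GPU; per the 2026-06-11 message this must be given explicitly.
#SBATCH --gpus=1

# Optional: keep the job off the nodes reserved for courses during the day.
#SBATCH --exclude=bld[15-16]

# Assumption: Slurm exports CUDA_VISIBLE_DEVICES for the GPUs assigned
# to the job. The helper prints "none" when the variable is unset.
gpu_report() {
    echo "GPUs visible to this job: ${CUDA_VISIBLE_DEVICES:-none}"
}

gpu_report
```

Saved under a name of your choice (for example job.sh, a hypothetical filename), the script would be submitted with sbatch job.sh. If the first line of the job output reports "none", the job most likely ran without a GPU assigned.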
oph/cluster/messages.1774275806.txt.gz · Last modified: by diego.zuccato@unibo.it
