oph:cluster:messages

Differences

These are the differences between the selected revision and the current version of the page.

oph:cluster:messages [2025/04/18 13:15] – [2025] diego.zuccato@unibo.it
oph:cluster:messages [2025/05/09 11:09] (current version) – [2025-05-09] diego.zuccato@unibo.it
Line 9:
  
 ===== 2025 =====
 +
 +==== 2025-05-09 ====
 +  * slowness resolved (backup completed)
 +
 +==== 2025-05-08 ====
 +  * general slowness: because of the ongoing backup, access to /home is very slow. A concurrent check of the underlying RAID volume made it even worse. The check has been paused and the backup is nearing completion, so the system should soon return to normal
 +  * mtx12 is offline again due to RAM issues
 +
 +==== 2025-04-22 ====
 +  * recreated missing reservations -- please check names with ''scontrol show res''
 +  * mtx12 is (temporarily, we hope) down due to RAM issues
  
 ==== 2025-04-18 ====
   * Maintenance (nearly) completed. Two nodes are still down (gpu01 and gpu02), and some jobs might have failed when scheduled on misbehaving nodes (bld17 and bld18, now fixed)
 +  * **15:30 Update** all the nodes are currently working: don't break 'em! Happy Easter!
 ==== 2025-04-10 ====
   * Created a reservation to avoid having running jobs during maintenance (**2025-04-16T10:00** to **2025-04-18T15:00**); we'll do our best to reduce downtime, so the cluster //might// come back online sooner than planned
oph/cluster/messages.1744982148.txt.gz · Last modified: 2025/04/18 13:15 by diego.zuccato@unibo.it
