====== oph:cluster:messages ======

Last modified 2026/01/12 12:05 by diego.zuccato@unibo.it.
===== 2026 =====

==== 2026-01-12 ====

  * /archive returned writable from the frontends and bld18

==== 2026-01-07 ====

  * Recovered (partially) from the emergency shutdown over Xmas. /archive is currently read-only and it'
| + | ===== 2025 ===== | ||
| + | |||
| + | ==== 2025-12-25 ==== | ||
| + | * Emergency shutdown: data center temperature too high (>50°C) could cause long-lasting issues | ||
| + | |||
| + | ==== 2025-12-19 ==== | ||
| + | * Maintenance shutdown. The cluster will be unavailable till 2025-12-23 (if all goes well). | ||
| + | |||
| + | ==== 2025-12-02 ==== | ||
| + | | ||
| + | |||
| + | ==== 2025-10-15 ==== | ||
| + | |||
| + | * **/scratch is full**: please delete unneeded data, and archive | ||
| + | |||
| + | ==== 2025-08-25 ==== | ||
| + | * All nodes except bld17 (that was already down) should | ||
| + | |||
| + | ==== 2025-08-22 ==== | ||
| + | * Started power-on. Some disks are corrupt and require some work to be recovered. | ||
| + | * [22:42 GMT+1] Cluster is *mostly* operational, the downed nodes will be fixed in the next days | ||
| + | ==== 2025-08-21 ==== | ||
| + | | ||
| + | |||
| + | ==== 2025-05-09 ==== | ||
| + | | ||
| + | |||
| + | ==== 2025-05-08 ==== | ||
| + | * generalized slowness: due to ongoing backup, access to /home is really slow. A concurrent check of the underlying RAID volume made it even worse. Check have been paused and backup is nearing completion, so the system should soon return to normality | ||
| + | | ||
| + | |||
| + | ==== 2025-04-22 ==== | ||
| + | | ||
| + | * mtx12 is (temporarily, | ||
| + | |||
| + | ==== 2025-04-18 ==== | ||
| + | | ||
| + | * **15:30 Update** all the nodes are currently working: don't break' | ||
| + | ==== 2025-04-10 ==== | ||
| + | * Created a reservation to avoid having running jobs during maintenance (**2025-04-16T10: | ||
| + | |||
| + | ==== 2025-03-31 ==== | ||
| + | * < | ||
| + | |||
| + | ==== 2025-03-27 ==== | ||
| + | |||
| + | * <del>/scratch and many nodes are currently down due to technical issues. We're working on it.</del> [9.00] Everything appears OK | ||
| + | |||
| + | ==== 2025-01-13 ==== | ||
| + | |||
| + | * /archive is now **read-only** to avoid potential data loss during cluster move | ||
| ===== 2024 ===== | ===== 2024 ===== | ||
| + | |||
| + | ==== 2024-12-18 ==== | ||
| + | |||
| + | * Power sources are redundant again. There shouldn' | ||
| ==== 2024-12-17 ==== | ==== 2024-12-17 ==== | ||
oph/cluster/messages.1734428512.txt.gz · Last modified by diego.zuccato@unibo.it
