General information
The hardware structure of the DIFA-OPH computing cluster is summarised in the table below, listing all the compute nodes currently available with their individual hostnames, their specific resources (number of cores and available RAM), and their access policies.
In particular, blue nodes (with names `bldNN`) are part of the BladeRunner island, purple nodes (with names `mtxNN`) are part of the Matrix island, and orange nodes (with names `gpuNN`) are part of the GPU island.
The nodes associated with the OPH project are open to all users while nodes associated with other individual projects may be subject to access restrictions. Such restrictions are indicated by different colors in the last column of the table:
- GREEN: Shared nodes, open to all users
- ORANGE: Shared nodes with regular weekly reservations for teaching purposes, during which the nodes can be accessed only by students of specific DIFA courses for their laboratory activities
- RED: Reserved nodes, restricted to users explicitly authorised by the corresponding project PI for a given period of time
Computing Resources
Resources are nodes, CPUs, GPUs, RAM and time. You have to select the resources your job needs. Do not overestimate too much, or you will be “billed” for more than you actually use; but do not underestimate either, or your job will not be able to complete. When a job completes, you receive an e-mail with the `seff` output: this should help a lot in tuning future requests.
Nodes are grouped into partitions. Usually there is no need to specify either node names or partitions.
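
As an illustration of a resource request, here is a minimal batch script sketch. It assumes jobs are submitted with Slurm's `sbatch`; the job name and all the values shown are hypothetical placeholders, not site recommendations, and should be adjusted using the `seff` report of earlier runs.

```shell
#!/bin/bash
# Hypothetical job name, used only for illustration.
#SBATCH --job-name=example
# One task (process) with four CPU cores.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
# RAM for the job and wall-clock limit (HH:MM:SS); placeholder values.
#SBATCH --mem=8G
#SBATCH --time=01:00:00

# Slurm sets SLURM_CPUS_PER_TASK inside the job; fall back to 1 elsewhere.
threads="${SLURM_CPUS_PER_TASK:-1}"
export OMP_NUM_THREADS="$threads"
echo "Using $threads thread(s)"
```

Note that neither a partition nor a node name is requested here, in line with the advice above.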
To select a (set of) node(s) suitable for your job, use constraints. These include:
- blade: older nodes, usually for smaller/sequential jobs, quite heterogeneous
- matrix: newer nodes, for bigger parallel jobs; allocated by “half node” units!
- ib: require IB-equipped nodes (all nodes in matrix are IB-equipped, no need to specify)
- filetransfer: ask for a node with fast access to the outside network, to quickly transfer big files
- intel: require an Intel CPU
- amd: require an AMD CPU
- avx: require that the CPU supports AVX instructions
- dev: require that the node can be used to compile
- dida: require nodes used for lessons (obsolete)
- gpu: require a GPU-equipped node
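
A sketch of how such constraints might be passed at submission time, assuming they are exposed as Slurm node features selectable with `--constraint` (where `&` means "all of these"). The helper `join_constraints` and the script name `job.sh` are hypothetical, introduced only for this example.

```shell
# Join feature names with "&", Slurm's AND operator for --constraint.
# The function body runs in a subshell so the IFS change does not leak out.
join_constraints() (
    IFS='&'
    printf '%s\n' "$*"
)

features="$(join_constraints intel avx)"

# Print the command instead of running it. The value is quoted because an
# unquoted "&" would make the shell put the command in the background.
printf 'sbatch --constraint="%s" job.sh\n' "$features"
```

For a single constraint no joining is needed, e.g. `sbatch --constraint=matrix job.sh`.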
Some nodes are reserved for specific projects (see table above). To be able to use them you have to be explicitly allowed by the project manager (i.e. added to the project group via the DSA interface). Once you are in the allowed group (check with `id`) you can submit jobs specifying `--reservation=prj-***`.
Project | Manager | AD group (DSA) | OPH group (`id`) | Reservation to use |
---|---|---|---|---|
CAN | Bellini | Str04109.13664-OPH-CAN | OPH-res-CAN | prj-can |
ERC_astero | Miglio | Str04109.13664-OPH-erc_astero | OPH-res-erc_astero | prj-erc_astero |
FFHiggsTop | Peraro | Str04109.13664-OPH-FFHiggsTop | OPH-res-FFHiggsTop | prj-ffhiggstop |
SLIDE | Righi | Str04109.13664-OPH-SLIDE | OPH-res-SLIDE | prj-slide |
VEO | Remondini | Str04109.13664-OPH-VEO | OPH-res-VEO | prj-veo |
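
The membership check and the reserved submission can be combined in a small wrapper. This is only a sketch: `has_group`, `submit_reserved` and `job.sh` are hypothetical names, and it assumes that the OPH group appears in the output of `id -Gn` exactly as written in the table.

```shell
# Succeeds if group $1 appears in the space-separated list $2
# (default: the groups that `id -Gn` reports for the current user).
has_group() {
    wanted="$1"
    group_list="${2-$(id -Gn)}"
    case " $group_list " in
        *" $wanted "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Submit job script $3 under reservation $2, but only if the current user
# belongs to group $1; otherwise explain what is missing.
submit_reserved() {
    if has_group "$1"; then
        sbatch --reservation="$2" "$3"
    else
        echo "Not a member of $1: ask the project manager to add you via DSA." >&2
        return 1
    fi
}
```

For the CAN project, for instance, the call would be `submit_reserved OPH-res-CAN prj-can job.sh`.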