MBOT Cluster

This is a private cluster.

  • PI: Nils Deppe, Physics
  • Group: nd357_0001

Hardware

  • Head node: mbot.cac.cornell.edu.
  • Access modes: ssh (see the login example after this list)
  • OpenHPC 3 with Rocky Linux 9
  • 8 compute nodes (c0001-c0008), each with dual 96-core AMD EPYC processors and 384 GB of RAM
  • Hyperthreading is enabled on all nodes, i.e., each physical core is considered to consist of two logical CPUs
  • Interconnect is InfiniBand HDR from each node to the HDR switch
  • Submit help requests through the CAC help page or by emailing help@cac.cornell.edu with "MBOT" in the subject line.
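To log in to the head node over ssh (netid here is a placeholder for your CAC username):

ssh netid@mbot.cac.cornell.edu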

File Systems

Home Directories

  • Path: ~

  • Users' home directories are located on an NFS export from the head node. Use your home directory (~) for the data you wish to keep. Data in users' home directories are NOT backed up.

Scratch

  • Path: /tmp

  • Users should run jobs using local /tmp on the compute nodes and copy output back to their home directories at the end of the job.

  • Users are encouraged to copy data files from $HOME to /tmp at the beginning of their job, run the job on the data in /tmp, and then copy the results from /tmp back to $HOME. This staging pattern yields peak I/O performance; a sketch follows this list.
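A minimal Slurm batch script illustrating this stage-in/stage-out pattern is sketched below. The input directory, results directory, and executable names are placeholders to adapt to your own job.

#!/bin/bash
#SBATCH --job-name=stage-tmp
#SBATCH --partition=normal
#SBATCH --nodes=1
#SBATCH --ntasks=1

# Create a job-specific scratch directory in local /tmp
SCRATCH=/tmp/$USER/$SLURM_JOB_ID
mkdir -p "$SCRATCH"

# Stage in: copy input data from $HOME to local scratch
cp -r "$HOME/my_input_data" "$SCRATCH/"

# Run the job against the local copy (placeholder executable)
cd "$SCRATCH"
./my_program my_input_data

# Stage out: copy results back to $HOME, then clean up
cp -r "$SCRATCH/results" "$HOME/"
rm -rf "$SCRATCH"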

Backups

  • Home directories are NOT backed up.
  • Researchers are encouraged to keep copies of important files on GitHub (or in other repository locations).
  • Large data sets should be copied to a second location for safety. Example locations are CAC Archival Storage or Amazon Glacier; a copy command is sketched below.
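One way to copy a large data set to a second location is rsync; the host and destination path below are hypothetical placeholders for your actual archive location.

# Mirror a results directory to a remote archive (placeholder host/path)
rsync -av --progress ~/results/ netid@archive.example.org:/archive/netid/results/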

Scheduler/Queues

  • The cluster scheduler is Slurm. All nodes are configured to be in the "normal" partition with no time limits. See the Slurm documentation page for details; the Slurm Quick Start guide is a great place to start.
  • Hyperthreading is enabled on the cluster, so Slurm considers each physical core to consist of two logical CPUs. A minimal submission script is sketched after this list.
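The following batch script is a minimal sketch, not a site-provided template; the job name, task count, and executable are placeholders. Because hyperthreading is enabled, the --hint=nomultithread option is included to ask Slurm to place one task per physical core, which many HPC codes prefer.

#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=normal
#SBATCH --nodes=1
#SBATCH --ntasks=96
#SBATCH --hint=nomultithread   # one task per physical core

srun ./my_program              # placeholder executable; srun launches one copy per task

Submit the script with sbatch and check its status with squeue:

sbatch job.sh
squeue -u $USER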

Software

Environment Modules

Set up the working environment for each package using the module command.
The module command will activate dependent modules if there are any.

To show currently loaded modules (these are loaded by the default system configuration):

-bash-4.2$ module list

Currently Loaded Modules:
  1) autotools   3) gnu12/12.2.0   5) ucx/1.14.0         7) openmpi4/4.1.5
  2) prun/2.2    4) hwloc/2.9.0    6) libfabric/1.18.0   8) ohpc

To show all available modules:

-bash-4.2$ module avail
------------------- /opt/ohpc/pub/moduledeps/gnu12-openmpi4 --------------------
   adios2/2.8.3     netcdf-cxx/4.3.1        py3-scipy/1.5.4
   boost/1.81.0     netcdf-fortran/4.6.0    scalapack/2.2.0
   dimemas/5.4.2    netcdf/4.9.0            scalasca/2.5
   extrae/3.8.3     omb/6.1                 scorep/7.1
   fftw/3.3.10      opencoarrays/2.10.0     sionlib/1.7.7
   geopm/1.1.0      petsc/3.18.1            slepc/3.18.0
   hypre/2.18.1     phdf5/1.14.0            superlu_dist/6.4.0
   imb/2021.3       pnetcdf/1.12.3          tau/2.31.1
   mfem/4.4         ptscotch/7.0.1          trilinos/13.4.0
   mumps/5.2.1      py3-mpi4py/3.1.4

------------------------ /opt/ohpc/pub/moduledeps/gnu12 ------------------------
   R/4.2.1         mpich/3.4.3-ofi        pdtoolkit/3.25.1
   gsl/2.7.1       mpich/3.4.3-ucx (D)    plasma/21.8.29
   hdf5/1.14.0     mvapich2/2.3.7         py3-numpy/1.19.5
   likwid/5.2.2    openblas/0.3.21        scotch/6.0.6
   metis/5.1.0     openmpi4/4.1.5  (L)    superlu/5.2.1

-------------------------- /opt/ohpc/pub/modulefiles ---------------------------
   EasyBuild/4.7.2          hwloc/2.9.0      (L)    pmix/4.2.6
   autotools         (L)    libfabric/1.18.0 (L)    prun/2.2        (L)
   charliecloud/0.15        ohpc             (L)    ucx/1.14.0      (L)
   cmake/3.24.2             os                      valgrind/3.20.0
   gnu12/12.2.0      (L)    papi/6.0.0

  Where:
   D:  Default Module
   L:  Module is loaded
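To load an additional package from the lists above, for example boost, use module load; any dependent modules are activated automatically. module unload reverses a load, and module purge clears all loaded modules.

-bash-4.2$ module load boost/1.81.0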

Help