HPC / SLURM

Run the LBM-Suite2p pipeline on a SLURM cluster (or locally) from a TOML config. Built on submitit.

Command	Description
`mbo hpc info`	Show partitions: nodes, CPUs, GPU usage, free memory
`mbo hpc init`	Write a commented `hpc.toml`
`mbo hpc check`	Verify the request fits the data and the partition
`mbo hpc run`	Submit the run (single / array / local)
`mbo hpc status`	Job state, an output dir’s timings, or your queue
`mbo hpc watch`	Follow a run’s `.err`/`.out` logs

Typical flow

mbo hpc info                          # size a job: which partition has free GPUs
mbo hpc init /data/raw                # write /data/raw/hpc.toml (edit it)
mbo hpc check hpc.toml --mode array   # does the request fit the data + partition?
mbo hpc run hpc.toml --mode array     # submit array + dependent aggregate
mbo hpc watch hpc.toml                # follow the newest run's logs

Config

`mbo hpc init` writes a commented `hpc.toml` with four tables:

[io]
input  = "/lustre/.../raw"          # directory of ScanImage TIFFs
output = "/lustre/.../results"      # WRITABLE root; prefer scratch
name   = "s2p"                      # label for the dated output subfolder

[slurm]
partition         = "hpc_a100_a"    # see `mbo hpc info`
gres              = "gpu:a100:1"    # GPUs per job
cpus_per_task     = 16              # workers derive from this and F
mem_gb            = 128             # per job; size to the node, not a small default
time              = "24:00:00"
array_parallelism = 0               # max concurrent array tasks (0 = scheduler default)

[pipeline]
planes_per_gpu = 4                  # pack factor F: planes sharing one GPU
node_local     = true              # stage on node-local NVMe, copy results back

[parameters]                        # suite2p ops + pipeline knobs (keep_reg, diameter, ...)
algorithm = "cellpose"
keep_reg  = false

One job holds one GPU. `planes_per_gpu` (F) planes share it, and each plane keeps its movie in RAM, so peak RAM ≈ F × per-plane size. Set `mem_gb` to the node’s capacity: a value that is too low gets the job OOM-killed at the cgroup limit. `mbo hpc check` does this math for you.
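The memory rule above can also be worked through by hand. The sketch below is an illustration only: the helper names `peak_ram_gb` and `fits_mem` and the 24 GB per-plane movie size are assumptions made up for the example, not part of the `mbo` CLI, and the real check may account for overheads that this ignores.

```shell
# Hypothetical helpers, not part of the mbo CLI.

# Estimate peak RAM (GB) for one job: F x per-plane movie size.
peak_ram_gb() {
    local planes_per_gpu=$1   # pack factor F
    local per_plane_gb=$2     # assumed in-RAM size of one plane's movie, in GB
    echo $(( planes_per_gpu * per_plane_gb ))
}

# Succeeds when the requested mem_gb covers the estimate.
fits_mem() {
    local mem_gb=$1 planes_per_gpu=$2 per_plane_gb=$3
    [ "$(peak_ram_gb "$planes_per_gpu" "$per_plane_gb")" -le "$mem_gb" ]
}

peak_ram_gb 4 24              # F = 4, 24 GB per plane; prints 96
if fits_mem 128 4 24; then
    echo "fits in mem_gb=128"
else
    echo "mem_gb=128 is too small"
fi
```

With F = 4 and a 24 GB movie per plane the estimate is 96 GB, which fits under `mem_gb = 128`; the same job with `mem_gb = 64` would not.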

Modes

mbo hpc run hpc.toml                  # single (default)
mbo hpc run hpc.toml --mode array
mbo hpc run hpc.toml --local
mbo hpc run hpc.toml --dry-run        # print the job layout, submit nothing

Mode	What it does	Use when
`single` (default)	One GPU job over all planes (F packed per GPU), volumetric merge inline	The dataset fits one job’s wall-time limit
`array`	One array task per F-plane shard, then a dependent aggregate that merges the volume	Spreading shards across multiple nodes cuts wall time
`local`	Runs the compute inline in this process, off the scheduler	Testing, or a workstation with a GPU

`--mode array` only helps when tasks can spread across multiple nodes. On a single-node partition every task lands on the same node, where `cpus_per_task` × tasks must fit the node’s CPUs and the tasks share its GPUs, so the wall time ends up the same as `single`. Use `mbo hpc info` to see a partition’s NODES count, and `mbo hpc check --mode array` to catch the single-node case before submitting.
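To see how quickly a single node fills up, count the tasks. This sketch assumes the number of array tasks is the plane count divided by F, rounded up (one task per F-plane shard, as in the table above); the 14-plane dataset and the function names are hypothetical.

```shell
# Hypothetical helpers for sizing an array run by hand.

# Number of array tasks: ceil(n_planes / F), using integer arithmetic.
n_shards() {
    local n_planes=$1 planes_per_gpu=$2
    echo $(( (n_planes + planes_per_gpu - 1) / planes_per_gpu ))
}

# CPUs required if every task runs at once on the same node.
cpus_on_one_node() {
    local cpus_per_task=$1 tasks=$2
    echo $(( cpus_per_task * tasks ))
}

tasks=$(n_shards 14 4)                                            # 14 planes, F = 4
echo "array tasks: $tasks"                                        # prints: array tasks: 4
echo "CPUs needed on one node: $(cpus_on_one_node 16 "$tasks")"   # prints: CPUs needed on one node: 64
```

If the node has fewer than 64 CPUs, the four tasks cannot all run at once. `array_parallelism` in `[slurm]` caps how many array tasks run concurrently.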

`mbo hpc run` overrides (set config fields on the command line)

Option	Description
`--input` / `--output` / `--name`	Override the `[io]` fields
`--partition` / `--gres` / `--time`	Override the `[slurm]` fields
`--planes-per-gpu`	Override pack factor F
`--local`	Shortcut for `--mode local`
`--gpu`	Local-run CUDA device index (nvidia-smi order); `-1` = auto. Ignored under SLURM

Monitor

mbo hpc status 5162141                # job state, exit code, failure diagnosis
mbo hpc status /data/results/2025_..  # timings.json summary for an output dir
mbo hpc status                        # the last run you launched
mbo hpc watch                         # follow the last run's logs
mbo hpc watch 5162141                 # follow logs by job id (shows state first)
mbo hpc watch hpc.toml -o             # follow a run's .out instead of .err

While `watch` is following logs in an interactive terminal, `o`/`e` switch between the `.out` and `.err` logs, `n`/`p` switch between task logs, and `q` quits.

Diagnostics

Command	Description
`mbo hpc check`	Memory math + structural fixes for a config vs. the partition

Shared environment (`mbo_server_configs`)

This is separate from the `mbo hpc` CLI above. The shared software under `/lustre/fs8/mbo/scratch/mbo_soft` (CLI tools, neovim, the `mbo` venv, repos) is exposed through `MBO_*` variables and navigation aliases. To load them, source the setup script once from `~/.bashrc`:

source /lustre/fs8/mbo/scratch/mbo_soft/repos/mbo_server_configs/config/hpc/mbo.sh
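If the same `~/.bashrc` is also used on machines where the Lustre filesystem is not mounted, a guard avoids an error at every login. This is a sketch of one way to write it; `source_if_readable` is a hypothetical helper, not something `mbo_server_configs` provides.

```shell
# Hypothetical ~/.bashrc helper: source a file only if it exists and is readable.
source_if_readable() {
    if [ -r "$1" ]; then
        . "$1"
    else
        return 1
    fi
}

# Same path as the source line above; skipped quietly when the file is absent.
source_if_readable /lustre/fs8/mbo/scratch/mbo_soft/repos/mbo_server_configs/config/hpc/mbo.sh || true
```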

Variables (defined in `config/hpc/env.sh`):

Variable	Value	Points to
`MBO_ROOT`	`/lustre/fs8/mbo`	lab root — change this only to move filesystems
`MBO_SCRATCH`	`$MBO_ROOT/scratch`	scratch root
`MBO_STORE`	`$MBO_ROOT/store`	long-term store
`MBO_SOFT`	`$MBO_SCRATCH/mbo_soft`	shared software root
`MBO_BIN`	`$MBO_SOFT/bin`	shared bin (on `PATH`)
`MBO_REPOS`	`$MBO_SOFT/repos`	shared repos
`MBO_NVIM`	`$MBO_SOFT/neovim`	neovim install
`MBO_ENVS`	`$MBO_SOFT/envs`	shared venvs dir
`MBO_ENV`	`$MBO_ENVS/mbo`	default shared venv
`MBO_DATA`	`$MBO_SCRATCH/mbo_data`	data root
`MBO_LBM`	`$MBO_DATA/lbm`	LBM data
`MBO_LSM`	`$MBO_DATA/lsm`	LSM data
`MBO_USER`	`$MBO_SCRATCH/$USER`	your personal scratch (override: set `MBO_USER` first)

The script also sets `UV_LINK_MODE=hardlink`, `UV_CACHE_DIR=$MBO_USER/.uv/cache`, and `UV_PYTHON_INSTALL_DIR=$MBO_USER/.uv/python`.
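The `MBO_USER` override presumably relies on the usual shell idiom of assigning a default only when a variable is not already set. The snippet below illustrates that idiom; it is an assumed pattern, not a copy of `env.sh`, and `resolve_mbo_user`, `alice`, and `shared_proj` are made-up names.

```shell
# Assumed pattern (not copied from env.sh): keep MBO_USER if the caller
# already set it, otherwise fall back to <scratch>/<login name>.
resolve_mbo_user() {
    local scratch=$1 login=$2
    echo "${MBO_USER:-$scratch/$login}"
}

unset MBO_USER
resolve_mbo_user /lustre/fs8/mbo/scratch alice   # prints /lustre/fs8/mbo/scratch/alice

MBO_USER=/lustre/fs8/mbo/scratch/shared_proj     # hypothetical override, set first
resolve_mbo_user /lustre/fs8/mbo/scratch alice   # prints /lustre/fs8/mbo/scratch/shared_proj
```

In practice this means exporting `MBO_USER` above the `source` line in `~/.bashrc`, so the value is already in place when the variables are defined.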

Navigation aliases and helpers (defined in `config/hpc/mbo.sh`):

Command	Action
`cdsoft` / `cdrepos` / `cdscratch` / `cdme`	cd to `$MBO_SOFT` / `$MBO_REPOS` / `$MBO_SCRATCH` / `$MBO_USER`
`cddata` / `cdlbm` / `cdlsm`	cd to `$MBO_DATA` / `$MBO_LBM` / `$MBO_LSM`
`mbo-activate [env]`	source a shared venv (default `mbo`)
`mbo-run <cmd> [args]`	run an executable from the shared `mbo` venv
`mbo-jobs` / `mbo-gpus`	`squeue --me` / list HPC GPU node availability
`mbo-gpu [part] [time] [n]` / `mbo-cpu [part] [time]`	interactive GPU / CPU shell via `srun`
`mbo-stage <path> [dest]` / `mbo-pull` / `mbo-push`	rsync data-transfer helpers
`mbo-nvim-setup` / `mbo-update`	install nvim tools / pull latest configs

Verify the live values after login with `env | grep '^MBO_'`.
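For a stricter check than reading the `grep` output, a loop can flag any variable from the table that is missing. A minimal sketch: `check_mbo_vars` is a hypothetical helper, and like `env` it only sees exported variables.

```shell
# Hypothetical helper: report MBO_* variables that are unset or empty
# in the environment. Returns non-zero if any are missing.
check_mbo_vars() {
    local missing=0 name
    for name in MBO_ROOT MBO_SCRATCH MBO_STORE MBO_SOFT MBO_BIN MBO_REPOS \
                MBO_NVIM MBO_ENVS MBO_ENV MBO_DATA MBO_LBM MBO_LSM MBO_USER; do
        # printenv prints nothing for variables that are not exported
        if [ -z "$(printenv "$name")" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

if check_mbo_vars; then
    echo "all MBO_* variables are set"
fi
```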