File Formats

imread() and the GUI file dialogs are primarily designed to read ScanImage TIFFs and future raw filetypes supported at the Miller Brain Observatory.

When writing to disk, the raw metadata is converted to an OME- and ImageJ/Fiji-compatible form so that downstream tools can interpret volumetric and multi-channel data correctly.

Additional formats are supported as needed for specific tasks, e.g. Suite2p .bin and .h5 for microscope calibrations.

Dimensions

imread() returns 5D TCZYX arrays. .shape is always length 5 and the order is fixed:

Axis  Index  Meaning
T     0      timepoints (frames)
C     1      channels
Z     2      z-planes
Y     3      image rows
X     4      image columns

A typical LBM volumetric scan (1574 frames, 14 z-planes, 550×448) reports shape == (1574, 1, 14, 550, 448) — one channel, fourteen z-planes.

Size-1 axes are kept, not dropped. To drop them for inspection or display, use arr.squeeze() or imread(path, squeeze=True) — a view that still holds the 5D array underneath for writers and the viewer.
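The effect of keeping versus dropping size-1 axes can be illustrated with plain NumPy. This is only a sketch on a stand-in array (the small shape is made up for the demo); it does not call mbo_utilities and says nothing about how the library builds its squeezed view.

```python
import numpy as np

# Stand-in for a small LBM volume: 8 frames, 1 channel, 3 z-planes, 4x5 pixels.
vol = np.zeros((8, 1, 3, 4, 5), dtype=np.int16)

T, C, Z, Y, X = vol.shape          # fixed TCZYX order, always five values
squeezed = vol.squeeze()           # drops every size-1 axis (here only C)

print(vol.shape)       # (8, 1, 3, 4, 5)
print(squeezed.shape)  # (8, 3, 4, 5)

# Positional indexing on the 5D array does not depend on which axes are singleton.
plane0 = vol[:, 0, 0]              # all frames of channel 0, plane 0 -> (T, Y, X)
```

Keeping the singleton axes means code such as `vol[:, 0, 0]` works unchanged for single-channel and multi-channel data, whereas the squeezed array changes rank depending on the acquisition.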

BinArray is the one exception: it reports the rank you pass in (see BinArray).

Quick Reference

Input                 Returns            shape                       Description
.tiff
  ↳ ScanImage raw     LBMArray           (T, C, Z, Y, X)             LBM with z-planes as channels
                      PiezoArray         (T, C, Z, Y, X)             Piezo z-stacks, optional averaging
                      LBMPiezoArray      (T, C, Z, Y, X)             LBM + piezo (pollen calibration)
                      SinglePlaneArray   (T, C, Z, Y, X) (Z=1)       Single-plane time series
  ↳ Standard/ImageJ   TiffArray          (T, C, Z, Y, X)             All TIFFs including ImageJ hyperstacks
.bin                  BinArray           as-passed, e.g. (T, Y, X)   Suite2p binary (requires shape)
.h5                   H5Array            (T, C, Z, Y, X)             HDF5 datasets
.zarr                 ZarrArray          (T, C, Z, Y, X)             Zarr v3 / OME-Zarr
.npy                  NumpyArray         (T, C, Z, Y, X)             Memory-mapped numpy
np.ndarray            NumpyArray         (T, C, Z, Y, X)             In-memory wrapper
Directory
  ops.npy             Suite2pArray       (T, C, Z, Y, X) (Z=1)       Suite2p single-plane
  planeXX/ops.npy     Suite2pArray       (T, C, Z, Y, X)             Suite2p volumetric
  planeXX.tiff        TiffArray          (T, C, Z, Y, X)             Multi-plane TIFF volume

Detection Logic

imread(path)
│
├── np.ndarray ───────────────────────────► NumpyArray (in-memory)
├── .npy ─────────────────────────────────► NumpyArray (mmap)
├── .h5 / .hdf5 ──────────────────────────► H5Array
├── .zarr ────────────────────────────────► ZarrArray
├── .bin (with ops.npy nearby) ───────────► Suite2pArray
├── .bin (no ops.npy) ────────────────────► BinArray (shape required)
│
├── .tif / .tiff
│   ├── ScanImage metadata?
│   │   ├── stack_type == "lbm" ──────────► LBMArray
│   │   ├── stack_type == "piezo" ────────► PiezoArray
│   │   ├── stack_type == "pollen" ───────► LBMPiezoArray
│   │   └── stack_type == "single_plane" ─► SinglePlaneArray
│   └── else ─────────────────────────────► TiffArray
│
└── Directory
    ├── *.zarr files ─────────────────────► ZarrArray
    ├── ops.npy ──────────────────────────► Suite2pArray
    ├── planeXX/ with ops.npy ────────────► Suite2pArray (volumetric)
    ├── planeXX.tiff files ───────────────► TiffArray (volumetric)
    └── ScanImage TIFFs ──────────────────► ScanImageArray subclass
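
The file-extension branches of the tree above can be sketched as a small dispatch function. This is an illustration only, not the library's implementation: `guess_array_type` and its `has_ops` flag are hypothetical names, it returns class names as strings, and it does not model ScanImage metadata inspection or directory inputs.

```python
from pathlib import Path

def guess_array_type(path, has_ops=False):
    """Return the array class name the extension branches of the tree would pick.

    Hypothetical helper: `has_ops` stands in for "an ops.npy sits next to the
    .bin file". ScanImage detection and directory inputs are not modeled.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".npy":
        return "NumpyArray"
    if suffix in (".h5", ".hdf5"):
        return "H5Array"
    if suffix == ".zarr":
        return "ZarrArray"
    if suffix == ".bin":
        return "Suite2pArray" if has_ops else "BinArray"
    if suffix in (".tif", ".tiff"):
        return "TiffArray"  # a ScanImage metadata check would refine this
    raise ValueError(f"unrecognized input: {path}")

print(guess_array_type("session/data.bin", has_ops=True))  # Suite2pArray
print(guess_array_type("session/data.bin"))                # BinArray
```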

Array Types

ScanImage Arrays

Returned when reading raw ScanImage TIFF files. You can pass imread() a folder containing all TIFFs for a single session, a single file, or a list of files; the stack type is auto-detected:

import mbo_utilities as mbo

arr = mbo.imread("/path/to/raw/*.tif")
print(type(arr).__name__)  # LBMArray, PiezoArray, SinglePlaneArray, or LBMPiezoArray
print(arr.stack_type)      # 'lbm', 'piezo', 'single_plane', or 'pollen'

All ScanImage arrays support:

  • ROI handling: arr.roi = None (stitch all), arr.roi = 1 (specific ROI), arr.roi = [1,2] (multiple)

  • Phase correction: arr.fix_phase = True/False

  • Metadata: arr.metadata["si"] contains raw ScanImage headers

  • Axial Registration: phase-correlation z-plane registration (see compute_axial_shifts)
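
For readers unfamiliar with phase correlation, the idea behind the axial registration bullet can be sketched with NumPy FFTs. This is a generic textbook version that recovers an integer, circular shift between two images; the function name is hypothetical, and it is not a copy of what compute_axial_shifts does internally.

```python
import numpy as np

def phase_correlation_shift(ref, moving):
    """Estimate the integer (dy, dx) such that `moving` ~ np.roll(ref, (dy, dx))."""
    cross = np.conj(np.fft.fft2(ref)) * np.fft.fft2(moving)
    cross /= np.abs(cross) + 1e-12            # keep only the phase
    corr = np.fft.ifft2(cross).real           # sharp peak at the shift
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) - n if p > n // 2 else int(p)
                 for p, n in zip(peak, corr.shape))

rng = np.random.default_rng(0)
plane_a = rng.random((32, 32))
plane_b = np.roll(plane_a, shift=(3, -5), axis=(0, 1))   # known offset

print(phase_correlation_shift(plane_a, plane_b))  # (3, -5)
```

Real planes differ in content as well as position, so a practical implementation would typically add windowing, sub-pixel refinement, or averaging over frames.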

LBMArray

Light Beads Microscopy with z-planes interleaved as ScanImage channels.

arr = mbo.imread("/path/to/lbm_data.tif")
print(arr.shape)    # (T, C, Z, Y, X), e.g. (1574, 1, 14, 550, 448)
print(arr.num_planes) # number of z-planes (= shape[2])
# note: dz must be user-supplied (not in ScanImage metadata for LBM)

PiezoArray

Acquisitions using the ScanImage Piezo hStackManager produce z-stacks with optional frame averaging.

arr = mbo.imread("/path/to/piezo_data.tif")
print(arr.shape)          # (T, C, Z, Y, X)
print(arr.frames_per_slice) # frames per z-position
print(arr.can_average)      # True if averaging possible

arr.average_frames = True   # toggle averaging based on `scanimage.logAverageFactor`
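
What averaging several frames per z-position amounts to can be shown on a stand-in NumPy stack. The sketch assumes the frames for each slice are stored consecutively and uses made-up sizes; the actual on-disk ordering and the way PiezoArray applies the averaging are not specified here.

```python
import numpy as np

frames_per_slice = 4   # hypothetical: 4 frames logged at each z-position
n_slices = 3

# Stand-in raw stack laid out slice by slice: (n_slices * frames_per_slice, Y, X).
raw = np.arange(n_slices * frames_per_slice * 2 * 2, dtype=np.float32)
raw = raw.reshape(n_slices * frames_per_slice, 2, 2)

# Group consecutive frames by slice, then average within each group.
grouped = raw.reshape(n_slices, frames_per_slice, *raw.shape[1:])
averaged = grouped.mean(axis=1)

print(averaged.shape)  # (3, 2, 2)
```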

LBMPiezoArray

Combined LBM + piezo, typically for pollen calibration. Each piezo step ends up as a z-plane in the canonical layout, and the LBM beamlets land on the channel axis.

arr = mbo.imread("/path/to/pollen_calibration.tif")
print(arr.stack_type)  # 'pollen'
print(arr.shape)     # (T, C, Z, Y, X) — C = beamlets, Z = piezo positions

SinglePlaneArray

Single-plane time series (no z-stack).

arr = mbo.imread("/path/to/single_plane.tif")
print(arr.shape)  # (T, C, Z=1, Y, X)

TiffArray

Universal TIFF reader for non-ScanImage files. Automatically handles standard TIFF stacks, ImageJ hyperstacks (interleaved TZYX), and multi-plane volumes (planeXX.tiff directories).

Whatever the file’s on-disk rank, .shape is 5D TCZYX with singleton T/C/Z filled in:

# 2D image
arr = mbo.imread("/path/to/single_image.tif")
print(arr.shape)  # (1, 1, 1, Y, X)

# 3D time series
arr = mbo.imread("/path/to/tyx_stack.tif")
print(arr.shape)  # (T, 1, 1, Y, X)

# ImageJ hyperstack (auto-detected)
arr = mbo.imread("/path/to/imagej_hyperstack.tif")
print(arr.shape)          # (T, 1, Z, Y, X)
print(arr.is_volumetric)  # True

# volumetric from directory of planeXX.tif files
vol = mbo.imread("/path/to/tiff_output/")
print(vol.shape)           # (T, 1, Z, Y, X)
print(vol.is_volumetric)   # True

Suite2pArray

Suite2p binary files with full ops.npy context.

arr = mbo.imread("/path/to/suite2p/plane0")
print(arr.shape)     # (T, 1, 1, Y, X) — single plane
print(arr.raw_file)    # path to data_raw.bin
print(arr.reg_file)    # path to data.bin

arr.switch_channel(use_raw=True)  # toggle raw/registered

# volumetric (planeXX/ subdirs each with ops.npy)
vol = mbo.imread("/path/to/suite2p_output/")
print(vol.shape)  # (T, 1, Z, Y, X)

Note: frame count is computed from actual file size, not ops.npy (which may be stale).
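
Deriving the frame count from the file size is simple arithmetic: a headerless binary holds size / (Ly * Lx * bytes per pixel) frames. The sketch below assumes int16 pixels (2 bytes), the usual dtype for Suite2p binaries; `frames_in_binary` is a hypothetical helper, not a function from mbo_utilities.

```python
import os
import tempfile

def frames_in_binary(path, ly, lx, itemsize=2):
    """Number of whole frames in a raw binary, derived from its size on disk."""
    frame_bytes = ly * lx * itemsize      # itemsize=2 assumes int16 pixels
    return os.path.getsize(path) // frame_bytes

# Demo: write 7 blank 4x6 int16 frames and recover the count from the size.
with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "data.bin")
    with open(bin_path, "wb") as f:
        f.write(bytes(7 * 4 * 6 * 2))
    n_frames = frames_in_binary(bin_path, ly=4, lx=6)

print(n_frames)  # 7
```

Reading the size from disk avoids trusting a frame count recorded in ops.npy, which can be out of date if the binary was truncated or extended after the run.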

BinArray

Direct binary file access when no ops.npy context is available. The user supplies the shape explicitly, and the array reports exactly that rank as .shape — it is the one array type whose .shape is not 5D.

import numpy as np
from mbo_utilities.arrays import BinArray

# requires explicit shape (any rank up to 5D)
arr = BinArray("/path/to/data.bin", shape=(1000, 512, 512))
print(arr.shape)  # (1000, 512, 512), exactly what you passed in
print(arr.nz)     # 1; TCZYX sizes are still available

# read/write via memmap
new_frame = np.zeros((512, 512), dtype=arr.dtype)  # placeholder frame to write
arr[0] = new_frame
arr.close()
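
The reason a shape must be supplied is that a raw binary has no header: the same bytes can be interpreted under any shape whose element count fits. A minimal NumPy sketch of memory-mapped read/write on such a file follows; it uses `np.memmap` directly on a temporary file with made-up dimensions and is not BinArray's own code.

```python
import os
import tempfile
import numpy as np

shape = (10, 4, 6)  # (T, Y, X): not stored in the file, so it must be supplied

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    np.zeros(shape, dtype=np.int16).tofile(path)     # headerless raw binary

    mm = np.memmap(path, dtype=np.int16, mode="r+", shape=shape)
    mm[0] = 5          # overwrite the first frame in place
    mm.flush()         # push the change to disk
    del mm             # drop the mapping before the temp file is removed

    first_frame = np.fromfile(path, dtype=np.int16, count=4 * 6)

print(first_frame.tolist() == [5] * 24)  # True
```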

H5Array

HDF5 datasets with auto-detection of common dataset names. Reads from /mov by default (the same dataset name that mbo.imwrite(..., ext=".h5") writes to), falling back to /data or the first available dataset.

arr = mbo.imread("/path/to/data.h5")
print(arr.dataset_name)  # 'mov', 'data', or first available
print(arr.shape)       # (T, C, Z, Y, X)

# specify dataset explicitly
arr = mbo.imread("/path/to/data.h5", dataset="imaging_data")
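
The dataset fallback order described above can be sketched on a plain list of names. `pick_dataset` is a hypothetical helper, and raising on a missing or empty selection is an assumption made for the sketch rather than documented H5Array behavior.

```python
def pick_dataset(names, requested=None):
    """Choose a dataset: explicit request, then 'mov', then 'data', then the first."""
    if requested is not None:
        if requested not in names:
            raise KeyError(f"dataset {requested!r} not found")  # assumed behavior
        return requested
    for candidate in ("mov", "data"):
        if candidate in names:
            return candidate
    if not names:
        raise ValueError("file contains no datasets")           # assumed behavior
    return names[0]

print(pick_dataset(["data", "mov"]))                              # mov
print(pick_dataset(["imaging_data"]))                             # imaging_data
print(pick_dataset(["mov", "imaging_data"], requested="imaging_data"))  # imaging_data
```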

ZarrArray

Zarr v3 stores including OME-Zarr.

arr = mbo.imread("/path/to/data.zarr")
print(arr.shape)   # (T, C, Z, Y, X)
print(arr.metadata)  # OME-NGFF attributes if present

# multiple zarr stores stacked as z-planes
arr = mbo.imread(["/path/plane01.zarr", "/path/plane02.zarr"])

You can also pass a path to the inner zarr.json (e.g. from a file picker) and it will resolve to the parent .zarr store automatically.
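
Resolving an inner zarr.json to its store is a one-step path operation. The sketch below handles only the case where zarr.json sits directly inside the store; `resolve_zarr_store` is a hypothetical name, and how the library treats metadata files nested deeper in a store is not covered.

```python
from pathlib import Path

def resolve_zarr_store(path):
    """Map a path to zarr.json onto its parent directory; other paths pass through."""
    p = Path(path)
    if p.name == "zarr.json":
        return p.parent
    return p

print(resolve_zarr_store("/data/plane01.zarr/zarr.json").name)  # plane01.zarr
```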

NumpyArray

Wraps .npy files (memory-mapped) or in-memory numpy arrays. Input of any rank up to 5D is accepted and presented as a 5D TCZYX array.

import numpy as np
import mbo_utilities as mbo

# from file (memory-mapped)
arr = mbo.imread("/path/to/data.npy")

# from in-memory array
data = np.random.randn(100, 512, 512).astype(np.float32)
arr = mbo.imread(data)
print(arr.shape)  # (100, 1, 1, 512, 512)

mbo.imwrite(arr, "output", ext=".zarr")  # imwrite to any format

Dimension labels

Axes are declared with dims; when omitted, they are guessed from the rank:

Input rank  Inferred dims  .shape
2D          YX             (1, 1, 1, Y, X)
3D          TYX            (T, 1, 1, Y, X)
4D          TZYX           (T, 1, Z, Y, X)
5D          TCZYX          (T, C, Z, Y, X)

imread() never errors on dimensions: it logs the order it picked. If a declared order is unusable (wrong length, duplicate or unknown axis), it warns and falls back to the rank guess above rather than raising. If the guess is wrong (e.g. a 4D two-channel movie read as TZYX), declare the axes:

data = np.random.randn(100, 2, 512, 512).astype(np.float32)

mbo.imread(data).shape                 # (100, 1, 2, 512, 512)  -> the 2 is Z
mbo.imread(data, dims="TCYX").shape    # (100, 2, 1, 512, 512)  -> the 2 is C

dims describes the source axes (length == input ndim, chars from TCZYX). The array is canonicalized to 5D TCZYX, so .dims always reports ('T', 'C', 'Z', 'Y', 'X') and .shape places each source axis accordingly; the declared order is kept on .input_dims. Labels can be set after construction — this is reactive and updates the derived OME axes and voxel scale:

arr.dims = "TCYX"                                  # or
arr.metadata = {"dims": "TCYX"}                    # or
arr.metadata = {"dimension_names": ["t","c","y","x"]}   # NGFF lowercase form

Because the read step only guesses, labels must be correct before imwrite — the writer uses whatever dims resolved to for the OME-Zarr dimension_names. Set dims (or dimension_names) if the rank guess put your channel axis on Z.
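
The shape bookkeeping described in this section (rank guess, fallback on unusable dims, placement into TCZYX) can be sketched in a few lines of plain Python. `canonical_shape` is a hypothetical helper that only computes the resulting 5D shape; it does not move any data, emit the warning, or handle ranks outside 2 to 5.

```python
CANONICAL = "TCZYX"
RANK_GUESS = {2: "YX", 3: "TYX", 4: "TZYX", 5: "TCZYX"}

def canonical_shape(shape, dims=None):
    """Place each source axis into a 5D TCZYX shape, filling missing axes with 1."""
    guess = RANK_GUESS[len(shape)]          # ranks outside 2..5 are not handled here
    dims = guess if dims is None else dims.upper()
    usable = (
        len(dims) == len(shape)             # one label per source axis
        and len(set(dims)) == len(dims)     # no duplicate labels
        and set(dims) <= set(CANONICAL)     # only T, C, Z, Y, X
    )
    if not usable:
        dims = guess                        # fall back to the rank guess, no raise
    sizes = dict(zip(dims, shape))
    return tuple(sizes.get(axis, 1) for axis in CANONICAL)

print(canonical_shape((100, 2, 512, 512)))               # (100, 1, 2, 512, 512)
print(canonical_shape((100, 2, 512, 512), dims="TCYX"))  # (100, 2, 1, 512, 512)
```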

Adding metadata

.metadata is a plain dict you can read or replace. Set the frame rate and voxel size so they flow into OME-Zarr / ImageJ output and downstream tools:

arr = mbo.imread(data, dims="TZYX")
arr.metadata = {**arr.metadata, "fs": 9.6, "dz": 15.0, "dx": 1.0, "dy": 1.0}

Keys use OME-compatible names (fs, dx/dy/dz, PhysicalSizeX, …). A "dims" key in the dict is applied as the axis order.

Running through a pipeline

Declare the axes once, write a canonical file, then hand the path to a pipeline. A TZYX volume becomes a multi-plane OME-Zarr that Suite2p runs per plane:

import numpy as np
import mbo_utilities as mbo
import lbm_suite2p_python as lsp

vol = np.random.randn(120, 2, 256, 256).astype(np.float32)   # T, Z, Y, X

arr = mbo.imread(vol, dims="TZYX")          # (120, 1, 2, 256, 256)
arr.metadata = {**arr.metadata, "fs": 10.0}

zarr_path = mbo.imwrite(arr, "out", ext=".zarr", overwrite=True)
lsp.run_volume(zarr_path, save_path="out/suite2p")   # one Suite2p run per z-plane

Correct labels mean the writer tags the OME-Zarr axes correctly (t, z, y, x) and the pipeline extracts the right number of planes.

Common Properties

All array types provide:

Property             Description
.shape               5D (T, C, Z, Y, X) (BinArray: the rank you passed in)
.dtype               data type
.ndim                number of dims in .shape (5, except BinArray)
.dims                dim labels, e.g. ('T', 'C', 'Z', 'Y', 'X')
.nt .nc .nz .ny .nx  individual TCZYX sizes
.metadata            file/array metadata dict
.num_planes          number of z-planes (= .nz)

The .nt/.nc/.nz/.ny/.nx accessors give individual sizes and are correct for every array type, including BinArray.

Most array types also provide:

Method    Description
.close()  release file handles

ScanImage-specific:

Property     Description
.stack_type  'lbm', 'piezo', 'single_plane', or 'pollen'
.num_rois    number of ROIs
.roi         ROI selection (None, int, or list)
.fix_phase   enable/disable phase correction

PiezoArray-specific:

Property           Description
.frames_per_slice  frames per z-position
.can_average       True if averaging possible
.average_frames    toggle frame averaging

Writing Data

All array types support imwrite():

import mbo_utilities as mbo

arr = mbo.imread("/path/to/data.tif")

# write to different formats
mbo.imwrite(arr, "output", ext=".zarr")   # OME-Zarr v3
mbo.imwrite(arr, "output", ext=".tiff")   # BigTIFF
mbo.imwrite(arr, "output", ext=".h5")     # HDF5
mbo.imwrite(arr, "output", ext=".npy")    # NumPy
mbo.imwrite(arr, "output", ext=".bin")    # Suite2p binary

# subset selection
mbo.imwrite(arr, "output", ext=".zarr", frames=range(100))
mbo.imwrite(arr, "output", ext=".zarr", planes=[0, 2, 4])

# zarr options
mbo.imwrite(arr, "output", ext=".zarr", sharded=True, compression_level=1)

Metadata is automatically adjusted when subsetting (e.g., dz doubles when selecting every 2nd plane).
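
The dz example works out as follows: selecting planes [0, 2, 4] leaves neighbouring planes two original steps apart, so the effective spacing is 2 * dz. The helper below is a hypothetical sketch of that arithmetic for evenly spaced selections; what imwrite does for irregular selections is not stated here, so the sketch simply refuses them.

```python
def adjusted_dz(dz, planes):
    """Effective z-spacing after selecting an evenly spaced subset of planes."""
    if len(planes) < 2:
        return dz                            # spacing is unchanged or undefined
    steps = {b - a for a, b in zip(planes, planes[1:])}
    if len(steps) != 1:
        raise ValueError("planes are not evenly spaced; dz is not well defined")
    return dz * steps.pop()

print(adjusted_dz(15.0, [0, 2, 4]))  # 30.0
print(adjusted_dz(15.0, [0, 1, 2]))  # 15.0
```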

API Reference

  • mbo_utilities.imread() - unified file reader

  • mbo_utilities.imwrite() - unified file writer

  • mbo_utilities.arrays - direct access to array classes