File Formats#
imread() and the GUI file dialogs are designed primarily to read ScanImage TIFFs,
along with any raw file types the Miller Brain Observatory supports in the future.
When data is written to disk, the raw metadata is converted to an OME- and ImageJ/Fiji-compatible form so that downstream tools interpret volumetric and multi-channel data correctly.
Additional formats are added as needed for specific tasks, e.g. Suite2p .bin and .h5 for microscope calibrations.
The 5D contract#
Every array returned by imread() reports a canonical 5D TCZYX layout via .shape5d,
regardless of how many dimensions the underlying file actually has on disk.
The dimension order is fixed:
| Axis | Index | Meaning |
|---|---|---|
| T | 0 | timepoints (frames) |
| C | 1 | channels |
| Z | 2 | z-planes |
| Y | 3 | image rows |
| X | 4 | image columns |
A typical LBM volumetric scan (1574 frames, 14 z-planes, 550×448) looks like
shape5d == (1574, 1, 14, 550, 448) — one channel, fourteen z-planes.
.shape may be the same as .shape5d (the default for ScanImage/Zarr/H5/Suite2p
arrays) or it may be a natural rank view that drops singleton/missing dims —
the only class that does this today is the generic TiffArray, which reports
the rank of whatever’s actually on disk (2D, 3D, 4D, or 5D). If you ever need
to reason about layout in a class-agnostic way, use .shape5d and .dims
(('T', 'C', 'Z', 'Y', 'X')) — they’re stable.
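To make the padding rule concrete, here is a minimal sketch of how a natural-rank shape could be expanded to the canonical layout. The helper `to_shape5d` and its `dims` argument are hypothetical and written for illustration only; they are not part of mbo_utilities, and the library's own inference may work differently.

```python
CANONICAL_DIMS = ("T", "C", "Z", "Y", "X")


def to_shape5d(shape, dims):
    """Pad a natural-rank shape to TCZYX, inserting 1 for each missing axis.

    `dims` labels each entry of `shape`, e.g. "TYX" for a (T, Y, X) stack.
    Hypothetical helper for illustration; not the library's implementation.
    """
    if len(shape) != len(dims):
        raise ValueError("shape and dims must have the same length")
    sizes = dict(zip(dims, shape))
    unknown = set(sizes) - set(CANONICAL_DIMS)
    if unknown:
        raise ValueError(f"unknown axis labels: {sorted(unknown)}")
    # Missing axes become singleton dimensions.
    return tuple(sizes.get(axis, 1) for axis in CANONICAL_DIMS)


print(to_shape5d((550, 448), "YX"))              # (1, 1, 1, 550, 448)
print(to_shape5d((1574, 14, 550, 448), "TZYX"))  # (1574, 1, 14, 550, 448)
```

The second call reproduces the LBM example above: a file with no channel axis still reports one channel in the 5D view.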
Quick Reference#
The shape5d column shows what .shape5d reports for a typical file. The
.shape column shows what plain .shape returns (only differs from .shape5d
for the natural-rank TiffArray).
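| Input | Returns | shape5d | .shape | Description |
|---|---|---|---|---|
| .tif / .tiff | | | | |
| ↳ ScanImage raw | LBMArray | (T, 1, Z, Y, X) | same | LBM with z-planes as channels |
| | PiezoArray | (T, C, Z, Y, X) | same | Piezo z-stacks, optional averaging |
| | LBMPiezoArray | (T, C, Z, Y, X) | same | LBM + piezo (pollen calibration) |
| | SinglePlaneArray | (T, C, 1, Y, X) | same | Single-plane time series |
| ↳ Standard/ImageJ | TiffArray | (T, 1, Z, Y, X) | natural rank (2D–5D) | All TIFFs including ImageJ hyperstacks |
| .bin | BinArray | user shape padded to 5D | user-supplied | Suite2p binary (requires shape) |
| .h5 / .hdf5 | H5Array | (T, C, Z, Y, X) | same | HDF5 datasets |
| .zarr | ZarrArray | (T, C, Z, Y, X) | same | Zarr v3 / OME-Zarr |
| .npy | NumpyArray | input shape padded to 5D | same | Memory-mapped numpy |
| np.ndarray | NumpyArray | input shape padded to 5D | same | In-memory wrapper |
| Directory | | | | |
| ↳ ops.npy | Suite2pArray | (T, 1, 1, Y, X) | same | Suite2p single-plane |
| ↳ planeXX/ | Suite2pArray | (T, 1, Z, Y, X) | same | Suite2p volumetric |
| ↳ planeXX.tiff | TiffArray | (T, 1, Z, Y, X) | natural rank | Multi-plane TIFF volume |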
Detection Logic#
imread(path)
│
├── np.ndarray ───────────────────────────► NumpyArray (in-memory)
├── .npy ─────────────────────────────────► NumpyArray (mmap)
├── .h5 / .hdf5 ──────────────────────────► H5Array
├── .zarr ────────────────────────────────► ZarrArray
├── .bin (with ops.npy nearby) ───────────► Suite2pArray
├── .bin (no ops.npy) ────────────────────► BinArray (shape required)
│
├── .tif / .tiff
│ ├── ScanImage metadata?
│ │ ├── stack_type == "lbm" ──────────► LBMArray
│ │ ├── stack_type == "piezo" ────────► PiezoArray
│ │ ├── stack_type == "pollen" ───────► LBMPiezoArray
│ │ └── stack_type == "single_plane" ─► SinglePlaneArray
│ └── else ─────────────────────────────► TiffArray
│
└── Directory
├── *.zarr files ─────────────────────► ZarrArray
├── ops.npy ──────────────────────────► Suite2pArray
├── planeXX/ with ops.npy ────────────► Suite2pArray (volumetric)
├── planeXX.tiff files ───────────────► TiffArray (volumetric)
└── ScanImage TIFFs ──────────────────► ScanImageArray subclass
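The single-file branches of the tree above can be sketched as a small dispatch function. This is an illustration of the decision order only: `guess_array_type`, its `stack_type` argument (standing in for the ScanImage metadata check) and its `has_ops` flag (standing in for "ops.npy nearby") are hypothetical names, and the real imread() inspects the files themselves. The in-memory and directory branches are omitted.

```python
from pathlib import PurePosixPath

# Extension -> array class name, taken from the tree above.
_BY_SUFFIX = {
    ".npy": "NumpyArray",
    ".h5": "H5Array",
    ".hdf5": "H5Array",
    ".zarr": "ZarrArray",
}

# ScanImage stack_type -> array class name, taken from the tree above.
_SCANIMAGE_BY_STACK_TYPE = {
    "lbm": "LBMArray",
    "piezo": "PiezoArray",
    "pollen": "LBMPiezoArray",
    "single_plane": "SinglePlaneArray",
}


def guess_array_type(path, stack_type=None, has_ops=False):
    """Return the class name the tree would pick for a single file path.

    stack_type=None means "no ScanImage metadata found".
    """
    suffix = PurePosixPath(path).suffix
    if suffix in _BY_SUFFIX:
        return _BY_SUFFIX[suffix]
    if suffix == ".bin":
        return "Suite2pArray" if has_ops else "BinArray"
    if suffix in (".tif", ".tiff"):
        return _SCANIMAGE_BY_STACK_TYPE.get(stack_type, "TiffArray")
    raise ValueError(f"unrecognised extension: {suffix!r}")


print(guess_array_type("scan.tif", stack_type="pollen"))  # LBMPiezoArray
print(guess_array_type("stack.tif"))                      # TiffArray
print(guess_array_type("data.bin", has_ops=True))         # Suite2pArray
```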
Array Types#
ScanImage Arrays#
Returned when reading raw ScanImage TIFF files. imread() auto-detects the stack type.
A folder containing all TIFFs for a single session can be passed to imread(), as can a single file or a list of files.
import mbo_utilities as mbo
arr = mbo.imread("/path/to/raw/*.tif")
print(type(arr).__name__) # LBMArray, PiezoArray, SinglePlaneArray, or LBMPiezoArray
print(arr.stack_type) # 'lbm', 'piezo', 'single_plane', or 'pollen'
All ScanImage arrays support:
- ROI handling: arr.roi = None (stitch all), arr.roi = 1 (specific ROI), arr.roi = [1, 2] (multiple)
- Phase correction: arr.fix_phase = True/False
- Metadata: arr.metadata["si"] contains raw ScanImage headers
- Axial registration: phase-correlation z-plane registration (see compute_axial_shifts)
LBMArray#
Light Beads Microscopy with z-planes interleaved as ScanImage channels.
arr = mbo.imread("/path/to/lbm_data.tif")
print(arr.shape5d) # (T, C, Z, Y, X), e.g. (1574, 1, 14, 550, 448)
print(arr.num_planes) # number of z-planes (= shape5d[2])
# note: dz must be user-supplied (not in ScanImage metadata for LBM)
PiezoArray#
Acquisitions using the ScanImage Piezo hStackManager produce z-stacks with optional frame averaging.
arr = mbo.imread("/path/to/piezo_data.tif")
print(arr.shape5d) # (T, C, Z, Y, X)
print(arr.frames_per_slice) # frames per z-position
print(arr.can_average) # True if averaging possible
arr.average_frames = True # toggle averaging based on `scanimage.logAverageFactor`
LBMPiezoArray#
Combined LBM + piezo, typically for pollen calibration. Each piezo step ends up as a z-plane in the canonical layout, and the LBM beamlets land on the channel axis.
arr = mbo.imread("/path/to/pollen_calibration.tif")
print(arr.stack_type) # 'pollen'
print(arr.shape5d) # (T, C, Z, Y, X) — C = beamlets, Z = piezo positions
SinglePlaneArray#
Single-plane time series (no z-stack).
arr = mbo.imread("/path/to/single_plane.tif")
print(arr.shape5d) # (T, C, Z=1, Y, X)
TiffArray#
Universal TIFF reader for non-ScanImage files. Automatically handles standard TIFF stacks, ImageJ hyperstacks (interleaved TZYX), and multi-plane volumes (planeXX.tiff directories).
TiffArray is the only array class that reports a natural rank .shape —
it returns whatever the file actually has on disk (2D, 3D, 4D, or 5D). Use
.shape5d to always get the canonical TCZYX layout.
# 2D tif → natural rank 2
arr = mbo.imread("/path/to/single_image.tif")
print(arr.shape, arr.shape5d)
# (Y, X) (1, 1, 1, Y, X)
# 3D stack → natural rank 3
arr = mbo.imread("/path/to/tyx_stack.tif")
print(arr.shape, arr.shape5d)
# (T, Y, X) (T, 1, 1, Y, X)
# ImageJ hyperstack → natural rank 4 (auto-detected)
arr = mbo.imread("/path/to/imagej_hyperstack.tif")
print(arr.shape, arr.shape5d)
# (T, Z, Y, X) (T, 1, Z, Y, X)
print(arr.is_volumetric) # True
# volumetric from directory of planeXX.tif files
vol = mbo.imread("/path/to/tiff_output/")
print(vol.shape5d) # (T, 1, Z, Y, X)
print(vol.is_volumetric) # True
Suite2pArray#
Suite2p binary files with full ops.npy context.
arr = mbo.imread("/path/to/suite2p/plane0")
print(arr.shape5d) # (T, 1, 1, Y, X) — single plane
print(arr.raw_file) # path to data_raw.bin
print(arr.reg_file) # path to data.bin
arr.switch_channel(use_raw=True) # toggle raw/registered
# volumetric (planeXX/ subdirs each with ops.npy)
vol = mbo.imread("/path/to/suite2p_output/")
print(vol.shape5d) # (T, 1, Z, Y, X)
Note: frame count is computed from actual file size, not ops.npy (which may be stale).
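As an illustration of the note above, the frame count of a raw binary can be derived from its size on disk. This is a minimal sketch, not the library's code: `count_frames` is a hypothetical helper, and it assumes int16 pixels (the format Suite2p uses for its binaries) with the frame height and width taken from the `Ly` and `Lx` entries of ops.npy.

```python
import os
import tempfile

BYTES_PER_PIXEL = 2  # assumption: int16 pixels, as in Suite2p binaries


def count_frames(bin_path, ly, lx):
    """Number of whole (ly, lx) int16 frames contained in a binary file."""
    frame_bytes = ly * lx * BYTES_PER_PIXEL
    return os.path.getsize(bin_path) // frame_bytes


# Demonstrate on a throwaway file holding three 4x5 frames of zeros.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(bytes(3 * 4 * 5 * BYTES_PER_PIXEL))
    n_frames = count_frames(path, ly=4, lx=5)

print(n_frames)  # 3
```

Counting from the file size stays correct even if the binary was truncated or extended after ops.npy was written.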
BinArray#
Direct binary file access when no ops.npy context is available. The user
supplies the shape explicitly; the array adopts that as .shape and pads
out to 5D for .shape5d.
import numpy as np
from mbo_utilities.arrays import BinArray
# requires explicit shape: any rank up to 5D
arr = BinArray("/path/to/data.bin", shape=(1000, 512, 512))
print(arr.shape) # (1000, 512, 512), what you passed in
print(arr.shape5d) # (1000, 1, 1, 512, 512), canonical TCZYX
# read/write via memmap
new_frame = np.zeros((512, 512), dtype=np.int16) # placeholder frame (int16 assumed, as in Suite2p binaries)
arr[0] = new_frame
arr.close()
H5Array#
HDF5 datasets with auto-detection of common dataset names. Reads from
/mov by default (the same dataset that mbo.imwrite(..., ext=".h5") writes to),
falling back to /data or the first available dataset.
arr = mbo.imread("/path/to/data.h5")
print(arr.dataset_name) # 'mov', 'data', or first available
print(arr.shape5d) # (T, C, Z, Y, X)
# specify dataset explicitly
arr = mbo.imread("/path/to/data.h5", dataset="imaging_data")
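The fallback order described above can be sketched as a plain function over a list of dataset names. `pick_dataset` is a hypothetical helper used only to illustrate the order of preference; it does not open any file and is not the library's implementation.

```python
def pick_dataset(names, preferred=("mov", "data")):
    """Choose a dataset name: first preferred match, else the first available."""
    if not names:
        raise ValueError("file contains no datasets")
    for candidate in preferred:
        if candidate in names:
            return candidate
    return names[0]


print(pick_dataset(["data", "mov"]))    # mov
print(pick_dataset(["stats", "data"]))  # data
print(pick_dataset(["imaging_data"]))   # imaging_data
```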
ZarrArray#
Zarr v3 stores including OME-Zarr.
arr = mbo.imread("/path/to/data.zarr")
print(arr.shape5d) # (T, C, Z, Y, X)
print(arr.metadata) # OME-NGFF attributes if present
# multiple zarr stores stacked as z-planes
arr = mbo.imread(["/path/plane01.zarr", "/path/plane02.zarr"])
You can also pass a path to the inner zarr.json (e.g. from a file picker)
and it will resolve to the parent .zarr store automatically.
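A minimal sketch of that path resolution, using only the standard library. `resolve_zarr_store` is a hypothetical name, and the sketch handles only a zarr.json sitting directly inside a .zarr directory; how the library treats more deeply nested metadata files is not covered here.

```python
from pathlib import PurePosixPath


def resolve_zarr_store(path):
    """Map a path to an inner zarr.json onto its parent .zarr store.

    Any other path is returned unchanged.
    """
    p = PurePosixPath(path)
    if p.name == "zarr.json" and p.parent.suffix == ".zarr":
        return p.parent
    return p


print(resolve_zarr_store("/data/session1.zarr/zarr.json"))  # /data/session1.zarr
print(resolve_zarr_store("/data/session1.zarr"))            # /data/session1.zarr
```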
NumpyArray#
Wraps .npy files (memory-mapped) or in-memory numpy arrays. Numpy input
of any rank up to 5D is accepted; the missing dims are inferred from shape
heuristics and recorded on .shape5d.
# from file
arr = mbo.imread("/path/to/data.npy")
# from in-memory array
import numpy as np
data = np.random.randn(100, 512, 512).astype(np.float32)
arr = mbo.imread(data)
print(arr.shape, arr.shape5d)
# (100, 512, 512) (100, 1, 1, 512, 512)
# enables imwrite to any format
mbo.imwrite(arr, "output", ext=".zarr")
Common Properties#
All array types provide:
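| Property | Description |
|---|---|
| shape | array dimensions (natural rank for TiffArray) |
| shape5d | always 5D (T, C, Z, Y, X) |
| dtype | data type |
| ndim | number of dims in .shape |
| dims | dim labels, e.g. ('T', 'C', 'Z', 'Y', 'X') |
| metadata | file/array metadata dict |
| num_planes | number of z-planes (= shape5d[2]) |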
Most array types also provide:
| Property | Description |
|---|---|
| close() | release file handles |
The convention: if you’re writing code that needs to work across array types,
always reach for .shape5d and .dims rather than .shape and .ndim.
TiffArray is the one class that varies — its .shape reflects whatever
the file actually has on disk, which is convenient for ad-hoc inspection
but breaks code that assumes a fixed rank.
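A short sketch of what class-agnostic code looks like under this convention. The `describe` helper is hypothetical, and the two `SimpleNamespace` objects are stand-ins that mock only `.shape`, `.shape5d` and `.dims`; they are not real mbo_utilities arrays.

```python
from types import SimpleNamespace


def describe(arr):
    """Summarise any array that exposes .shape5d and .dims."""
    sizes = dict(zip(arr.dims, arr.shape5d))
    return f"{sizes['T']} frames, {sizes['Z']} planes, {sizes['Y']}x{sizes['X']} px"


# Stand-ins for real arrays: one natural-rank (3D .shape), one fully 5D.
dims = ("T", "C", "Z", "Y", "X")
tiff_like = SimpleNamespace(
    shape=(100, 512, 512), shape5d=(100, 1, 1, 512, 512), dims=dims
)
lbm_like = SimpleNamespace(
    shape=(1574, 1, 14, 550, 448), shape5d=(1574, 1, 14, 550, 448), dims=dims
)

print(describe(tiff_like))  # 100 frames, 1 planes, 512x512 px
print(describe(lbm_like))   # 1574 frames, 14 planes, 550x448 px
```

Because `describe` never touches `.shape`, it gives the right answer for both objects even though their `.shape` ranks differ.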
ScanImage-specific:
| Property | Description |
|---|---|
| stack_type | 'lbm', 'piezo', 'single_plane', or 'pollen' |
| | number of ROIs |
| roi | ROI selection (None, int, or list) |
| fix_phase | enable/disable phase correction |
PiezoArray-specific:
| Property | Description |
|---|---|
| frames_per_slice | frames per z-position |
| can_average | True if averaging possible |
| average_frames | toggle frame averaging |
Writing Data#
All array types support imwrite():
import mbo_utilities as mbo
arr = mbo.imread("/path/to/data.tif")
# write to different formats
mbo.imwrite(arr, "output", ext=".zarr") # OME-Zarr v3
mbo.imwrite(arr, "output", ext=".tiff") # BigTIFF
mbo.imwrite(arr, "output", ext=".h5") # HDF5
mbo.imwrite(arr, "output", ext=".npy") # NumPy
mbo.imwrite(arr, "output", ext=".bin") # Suite2p binary
# subset selection
mbo.imwrite(arr, "output", ext=".zarr", frames=range(100))
mbo.imwrite(arr, "output", ext=".zarr", planes=[0, 2, 4])
# zarr options
mbo.imwrite(arr, "output", ext=".zarr", sharded=True, compression_level=1)
Metadata is automatically adjusted when subsetting (e.g., dz doubles when selecting every 2nd plane).
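The dz example can be worked through with a small sketch. `adjusted_dz` is a hypothetical helper that shows the arithmetic only; it assumes the selected plane indices are evenly spaced and does not reflect how the library handles irregular selections.

```python
def adjusted_dz(dz, planes):
    """Z-spacing after keeping only `planes` (evenly spaced plane indices)."""
    if len(planes) < 2:
        return dz  # a single plane has no spacing to rescale
    steps = {b - a for a, b in zip(planes, planes[1:])}
    if len(steps) != 1:
        raise ValueError("planes are not evenly spaced; dz is ambiguous")
    return dz * steps.pop()


print(adjusted_dz(5.0, [0, 2, 4]))  # 10.0
print(adjusted_dz(5.0, [3, 4, 5]))  # 5.0
```

Selecting planes 0, 2 and 4 from a stack with 5 µm spacing leaves neighbouring planes 10 µm apart, which is the doubling mentioned above.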
API Reference#
- mbo_utilities.imread() - unified file reader
- mbo_utilities.imwrite() - unified file writer
- mbo_utilities.arrays - direct access to array classes