File Formats#
imread() and the GUI file dialogs are primarily designed to read ScanImage TIFFs
and future raw filetypes supported at the Miller Brain Observatory.
The raw metadata is made OME and ImageJ/Fiji compatible when writing to disk, ensuring downstream tools can interpret volumetric and multi-channel data correctly.
Additional formats are added as needed for specific tasks, e.g. Suite2p .bin and .h5 for microscope calibrations.
Dimensions#
imread() returns 5D TCZYX arrays. .shape is always length 5 and the
order is fixed:
| Axis | Index | Meaning |
|---|---|---|
| T | 0 | timepoints (frames) |
| C | 1 | channels |
| Z | 2 | z-planes |
| Y | 3 | image rows |
| X | 4 | image columns |
A typical LBM volumetric scan (1574 frames, 14 z-planes, 550×448) reports
shape == (1574, 1, 14, 550, 448) — one channel, fourteen z-planes.
Size-1 axes are kept, not dropped. To drop them for inspection or display,
use arr.squeeze() or imread(path, squeeze=True) — a view that still holds
the 5D array underneath for writers and the viewer.
BinArray is the one exception: it reports the rank you pass in
(see BinArray).
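Because the axis order is fixed, a size can be looked up by axis letter rather than by a remembered index. The sketch below works on a plain shape tuple and needs no library; `AXES`, `axis_size`, and `squeezed_shape` are illustrative names, not part of mbo_utilities.

```python
AXES = "TCZYX"  # fixed axis order described above

def axis_size(shape, axis):
    """Return the size of one named axis from a 5D TCZYX shape tuple."""
    if len(shape) != len(AXES):
        raise ValueError(f"expected a 5D shape, got {len(shape)}D")
    return shape[AXES.index(axis)]

def squeezed_shape(shape):
    """Shape with size-1 axes dropped, as a squeeze would report it."""
    return tuple(n for n in shape if n != 1)

lbm_shape = (1574, 1, 14, 550, 448)
print(axis_size(lbm_shape, "Z"))   # 14
print(squeezed_shape(lbm_shape))   # (1574, 14, 550, 448)
```

Note that after squeezing, positional indices no longer map to fixed axes, which is one reason to keep the 5D shape for anything other than inspection or display.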
Quick Reference#
| Input | Returns | Description |
|---|---|---|
| .tif / .tiff | | |
| ↳ ScanImage raw | LBMArray | LBM with z-planes as channels |
| | PiezoArray | Piezo z-stacks, optional averaging |
| | LBMPiezoArray | LBM + piezo (pollen calibration) |
| | SinglePlaneArray | Single-plane time series |
| ↳ Standard/ImageJ | TiffArray | All TIFFs including ImageJ hyperstacks |
| .bin | BinArray | Suite2p binary (requires shape); .shape is as passed |
| .h5 / .hdf5 | H5Array | HDF5 datasets |
| .zarr | ZarrArray | Zarr v3 / OME-Zarr |
| .npy | NumpyArray | Memory-mapped numpy |
| np.ndarray | NumpyArray | In-memory wrapper |
| Directory | | |
| ↳ ops.npy | Suite2pArray | Suite2p single-plane |
| ↳ planeXX/ with ops.npy | Suite2pArray | Suite2p volumetric |
| ↳ planeXX.tiff files | TiffArray | Multi-plane TIFF volume |
Detection Logic#
imread(path)
│
├── np.ndarray ───────────────────────────► NumpyArray (in-memory)
├── .npy ─────────────────────────────────► NumpyArray (mmap)
├── .h5 / .hdf5 ──────────────────────────► H5Array
├── .zarr ────────────────────────────────► ZarrArray
├── .bin (with ops.npy nearby) ───────────► Suite2pArray
├── .bin (no ops.npy) ────────────────────► BinArray (shape required)
│
├── .tif / .tiff
│ ├── ScanImage metadata?
│ │ ├── stack_type == "lbm" ──────────► LBMArray
│ │ ├── stack_type == "piezo" ────────► PiezoArray
│ │ ├── stack_type == "pollen" ───────► LBMPiezoArray
│ │ └── stack_type == "single_plane" ─► SinglePlaneArray
│ └── else ─────────────────────────────► TiffArray
│
└── Directory
├── *.zarr files ─────────────────────► ZarrArray
├── ops.npy ──────────────────────────► Suite2pArray
├── planeXX/ with ops.npy ────────────► Suite2pArray (volumetric)
├── planeXX.tiff files ───────────────► TiffArray (volumetric)
└── ScanImage TIFFs ──────────────────► ScanImageArray subclass
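The file branch of the tree above can be sketched as a small lookup. This is only an illustration of the dispatch order, not the library's implementation: `guess_array_type` and its `stack_type` / `has_ops` arguments are hypothetical, the function returns type names as strings, and the directory branch is left out.

```python
from pathlib import PurePosixPath

# stack_type values and array names are taken from the tree above
SCANIMAGE_TYPES = {
    "lbm": "LBMArray",
    "piezo": "PiezoArray",
    "pollen": "LBMPiezoArray",
    "single_plane": "SinglePlaneArray",
}
BY_SUFFIX = {
    ".npy": "NumpyArray",
    ".h5": "H5Array",
    ".hdf5": "H5Array",
    ".zarr": "ZarrArray",
}

def guess_array_type(path, stack_type=None, has_ops=False):
    """Map a file path to the array type name from the detection tree.

    stack_type: ScanImage stack type, if the TIFF carries ScanImage metadata.
    has_ops: whether an ops.npy sits next to a .bin file.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in BY_SUFFIX:
        return BY_SUFFIX[suffix]
    if suffix == ".bin":
        return "Suite2pArray" if has_ops else "BinArray"
    if suffix in (".tif", ".tiff"):
        # no (or unknown) ScanImage stack type falls through to the generic reader
        return SCANIMAGE_TYPES.get(stack_type, "TiffArray")
    raise ValueError(f"unrecognized input: {path}")

print(guess_array_type("scan.tif", stack_type="lbm"))  # LBMArray
print(guess_array_type("data.bin"))                    # BinArray
```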
Array Types#
ScanImage Arrays#
Returned when reading raw ScanImage TIFF files. A folder containing all TIFFs for a single session can be passed to imread(), as can a single file or a list of files. imread() auto-detects the stack type:
import mbo_utilities as mbo
arr = mbo.imread("/path/to/raw/*.tif")
print(type(arr).__name__) # LBMArray, PiezoArray, SinglePlaneArray, or LBMPiezoArray
print(arr.stack_type) # 'lbm', 'piezo', 'single_plane', or 'pollen'
All ScanImage arrays support:
- ROI handling: arr.roi = None (stitch all), arr.roi = 1 (specific ROI), arr.roi = [1, 2] (multiple)
- Phase correction: arr.fix_phase = True/False
- Metadata: arr.metadata["si"] contains raw ScanImage headers
- Axial registration: phase-correlation z-plane registration (see compute_axial_shifts)
LBMArray#
Light Beads Microscopy with z-planes interleaved as ScanImage channels.
arr = mbo.imread("/path/to/lbm_data.tif")
print(arr.shape) # (T, C, Z, Y, X), e.g. (1574, 1, 14, 550, 448)
print(arr.num_planes) # number of z-planes (= shape[2])
# note: dz must be user-supplied (not in ScanImage metadata for LBM)
PiezoArray#
Acquisitions using the ScanImage Piezo hStackManager produce z-stacks with optional frame averaging.
arr = mbo.imread("/path/to/piezo_data.tif")
print(arr.shape) # (T, C, Z, Y, X)
print(arr.frames_per_slice) # frames per z-position
print(arr.can_average) # True if averaging possible
arr.average_frames = True # toggle averaging based on `scanimage.logAverageFactor`
LBMPiezoArray#
Combined LBM + piezo, typically for pollen calibration. Each piezo step ends up as a z-plane in the canonical layout, and the LBM beamlets land on the channel axis.
arr = mbo.imread("/path/to/pollen_calibration.tif")
print(arr.stack_type) # 'pollen'
print(arr.shape) # (T, C, Z, Y, X) — C = beamlets, Z = piezo positions
SinglePlaneArray#
Single-plane time series (no z-stack).
arr = mbo.imread("/path/to/single_plane.tif")
print(arr.shape) # (T, C, Z=1, Y, X)
TiffArray#
Universal TIFF reader for non-ScanImage files. Automatically handles standard TIFF stacks, ImageJ hyperstacks (interleaved TZYX), and multi-plane volumes (planeXX.tiff directories).
Whatever the file’s on-disk rank, .shape is 5D TCZYX with singleton T/C/Z
filled in:
# 2D image
arr = mbo.imread("/path/to/single_image.tif")
print(arr.shape) # (1, 1, 1, Y, X)
# 3D time series
arr = mbo.imread("/path/to/tyx_stack.tif")
print(arr.shape) # (T, 1, 1, Y, X)
# ImageJ hyperstack (auto-detected)
arr = mbo.imread("/path/to/imagej_hyperstack.tif")
print(arr.shape) # (T, 1, Z, Y, X)
print(arr.is_volumetric) # True
# volumetric from directory of planeXX.tif files
vol = mbo.imread("/path/to/tiff_output/")
print(vol.shape) # (T, 1, Z, Y, X)
print(vol.is_volumetric) # True
Suite2pArray#
Suite2p binary files with full ops.npy context.
arr = mbo.imread("/path/to/suite2p/plane0")
print(arr.shape) # (T, 1, 1, Y, X) — single plane
print(arr.raw_file) # path to data_raw.bin
print(arr.reg_file) # path to data.bin
arr.switch_channel(use_raw=True) # toggle raw/registered
# volumetric (planeXX/ subdirs each with ops.npy)
vol = mbo.imread("/path/to/suite2p_output/")
print(vol.shape) # (T, 1, Z, Y, X)
Note: frame count is computed from actual file size, not ops.npy (which may be stale).
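The idea behind counting frames from the file size can be sketched as follows, assuming the usual Suite2p layout of int16 pixels stored frame after frame. `frames_in_binary` is an illustrative helper, not a function of mbo_utilities or Suite2p.

```python
import os
import tempfile

BYTES_PER_PIXEL = 2  # Suite2p binaries store int16 pixels

def frames_in_binary(path, ly, lx):
    """Number of complete (ly, lx) int16 frames stored in a raw binary file."""
    frame_bytes = ly * lx * BYTES_PER_PIXEL
    return os.path.getsize(path) // frame_bytes

# demo: write 3 empty 4x5 frames to a temporary file and count them back
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as f:
    f.write(bytes(3 * 4 * 5 * BYTES_PER_PIXEL))
    demo_path = f.name
n_frames = frames_in_binary(demo_path, ly=4, lx=5)
os.remove(demo_path)
print(n_frames)  # 3
```

Integer division ignores a trailing partial frame, so a file truncated mid-write still yields a usable count.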
BinArray#
Direct binary file access when no ops.npy context is available. The user
supplies the shape explicitly, and the array reports exactly that rank as
.shape — it is the one array type whose .shape is not 5D.
import numpy as np
from mbo_utilities.arrays import BinArray
# requires explicit shape (any rank up to 5D)
arr = BinArray("/path/to/data.bin", shape=(1000, 512, 512))
print(arr.shape) # (1000, 512, 512), exactly what you passed in
print(arr.nz) # 1, TCZYX sizes are still available
# read/write via memmap
new_frame = np.zeros((512, 512), dtype=np.int16) # placeholder frame; int16 assumed, as in Suite2p binaries
arr[0] = new_frame
arr.close()
H5Array#
HDF5 datasets with auto-detection of common dataset names. Reads from
/mov by default (the same dataset name that mbo.imwrite(..., ext=".h5") writes to),
falling back to /data or the first available dataset.
arr = mbo.imread("/path/to/data.h5")
print(arr.dataset_name) # 'mov', 'data', or first available
print(arr.shape) # (T, C, Z, Y, X)
# specify dataset explicitly
arr = mbo.imread("/path/to/data.h5", dataset="imaging_data")
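The fallback order described above (explicit name, then mov, then data, then the first dataset) can be written as a short selection function. This is a sketch over a plain list of names; `pick_dataset` is hypothetical, and raising on a missing requested name is an assumption rather than documented behavior.

```python
def pick_dataset(names, requested=None):
    """Choose a dataset name: explicit request, then 'mov', then 'data', then the first one."""
    names = list(names)
    if requested is not None:
        if requested not in names:
            raise KeyError(f"dataset {requested!r} not found")
        return requested
    for preferred in ("mov", "data"):
        if preferred in names:
            return preferred
    if not names:
        raise ValueError("file contains no datasets")
    return names[0]

print(pick_dataset(["data", "mov"]))                    # mov
print(pick_dataset(["imaging_data", "masks"]))          # imaging_data
print(pick_dataset(["mov", "masks"], requested="masks"))  # masks
```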
ZarrArray#
Zarr v3 stores including OME-Zarr.
arr = mbo.imread("/path/to/data.zarr")
print(arr.shape) # (T, C, Z, Y, X)
print(arr.metadata) # OME-NGFF attributes if present
# multiple zarr stores stacked as z-planes
arr = mbo.imread(["/path/plane01.zarr", "/path/plane02.zarr"])
You can also pass a path to the inner zarr.json (e.g. from a file picker)
and it will resolve to the parent .zarr store automatically.
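One way such a resolution could work is to walk up from zarr.json to the nearest ancestor whose name ends in .zarr. The helper below is a sketch under that assumption; `resolve_zarr_store` is an illustrative name, and the fallback to the containing directory is a guess at reasonable behavior.

```python
from pathlib import PurePosixPath

def resolve_zarr_store(path):
    """Return the enclosing .zarr store for a path to zarr.json, else the path unchanged."""
    p = PurePosixPath(path)
    if p.name != "zarr.json":
        return p
    for parent in p.parents:
        if parent.suffix == ".zarr":
            return parent
    return p.parent  # no .zarr ancestor: fall back to the containing directory

print(resolve_zarr_store("/data/scan.zarr/zarr.json"))    # /data/scan.zarr
print(resolve_zarr_store("/data/scan.zarr/0/zarr.json"))  # /data/scan.zarr
```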
NumpyArray#
Wraps .npy files (memory-mapped) or in-memory numpy arrays. Input of any
rank up to 5D is accepted and presented as a 5D TCZYX array.
import numpy as np
import mbo_utilities as mbo
# from file (memory-mapped)
arr = mbo.imread("/path/to/data.npy")
# from in-memory array
data = np.random.randn(100, 512, 512).astype(np.float32)
arr = mbo.imread(data)
print(arr.shape) # (100, 1, 1, 512, 512)
mbo.imwrite(arr, "output", ext=".zarr") # imwrite to any format
Dimension labels#
Axes are declared with dims; when omitted they are chain-guessed from the
rank:
| Input rank | Inferred |
|---|---|
| 2D | YX |
| 3D | TYX |
| 4D | TZYX |
| 5D | TCZYX |
imread() never errors on dimensions: it logs the order it picked. If a
declared order is unusable (wrong length, duplicate or unknown axis), it
warns and falls back to the rank guess above rather than raising. If the
guess is wrong (e.g. a 4D two-channel movie read as TZYX), declare the axes:
data = np.random.randn(100, 2, 512, 512).astype(np.float32)
mbo.imread(data).shape # (100, 1, 2, 512, 512) -> the 2 is Z
mbo.imread(data, dims="TCYX").shape # (100, 2, 1, 512, 512) -> the 2 is C
dims describes the source axes (length == input ndim, chars from
TCZYX). The array is canonicalized to 5D TCZYX, so .dims always reports
('T', 'C', 'Z', 'Y', 'X') and .shape places each source axis
accordingly; the declared order is kept on .input_dims. Labels can be set
after construction — this is reactive and updates the derived OME axes and
voxel scale:
arr.dims = "TCYX" # or
arr.metadata = {"dims": "TCYX"} # or
arr.metadata = {"dimension_names": ["t","c","y","x"]} # NGFF lowercase form
Because the read step only guesses, labels must be correct before
imwrite — the writer uses whatever dims resolved to for the OME-Zarr
dimension_names. Set dims (or dimension_names) if the rank guess put
your channel axis on Z.
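The warn-and-fall-back rule and the placement of source axes into the 5D shape can be sketched on shape tuples alone. Everything here is illustrative: `resolve_dims` and `canonical_shape` are hypothetical names, the 2D and 5D guesses are assumed to be YX and TCZYX, and a real reader would also transpose the data when the declared order differs from the canonical one.

```python
CANONICAL = "TCZYX"
RANK_GUESS = {2: "YX", 3: "TYX", 4: "TZYX", 5: "TCZYX"}  # 2D and 5D entries are assumed

def resolve_dims(ndim, declared=None):
    """Return the declared axis order if usable, else the guess for this rank."""
    if declared is not None:
        d = declared.upper()
        usable = (
            len(d) == ndim                        # one label per source axis
            and len(set(d)) == len(d)             # no duplicate axes
            and all(ax in CANONICAL for ax in d)  # only known axes
        )
        if usable:
            return d
        print(f"warning: dims {declared!r} unusable for {ndim}D input, guessing from rank")
    return RANK_GUESS[ndim]

def canonical_shape(shape, declared=None):
    """Place each source axis into a 5D TCZYX shape, filling missing axes with 1."""
    dims = resolve_dims(len(shape), declared)
    sizes = dict(zip(dims, shape))
    return tuple(sizes.get(ax, 1) for ax in CANONICAL)

print(canonical_shape((100, 2, 512, 512)))          # (100, 1, 2, 512, 512)
print(canonical_shape((100, 2, 512, 512), "TCYX"))  # (100, 2, 1, 512, 512)
print(canonical_shape((100, 2, 512, 512), "TTYX"))  # warns, then (100, 1, 2, 512, 512)
```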
Adding metadata#
.metadata is a plain dict you can read or replace. Set the frame rate and
voxel size so they flow into OME-Zarr / ImageJ output and downstream tools:
arr = mbo.imread(data, dims="TZYX")
arr.metadata = {**arr.metadata, "fs": 9.6, "dz": 15.0, "dx": 1.0, "dy": 1.0}
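To see why these values matter downstream, here is a rough sketch of how a per-axis scale list, as used by OME-Zarr scale transforms, could be derived from such a dict. It assumes fs is in frames per second and uses 1.0 for anything missing; `ome_scale` is a hypothetical helper, and how mbo_utilities actually derives its scale is not shown in this document.

```python
def ome_scale(metadata, axes="tczyx"):
    """Per-axis scale list: seconds per frame for t, voxel size for z/y/x, 1.0 for c."""
    per_axis = {
        "t": 1.0 / metadata["fs"] if metadata.get("fs") else 1.0,
        "c": 1.0,
        "z": metadata.get("dz", 1.0),
        "y": metadata.get("dy", 1.0),
        "x": metadata.get("dx", 1.0),
    }
    return [per_axis[ax] for ax in axes]

meta = {"fs": 10.0, "dz": 15.0, "dx": 1.0, "dy": 1.0}
print(ome_scale(meta))  # [0.1, 1.0, 15.0, 1.0, 1.0]
```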
Keys use OME-compatible names (fs, dx/dy/dz, PhysicalSizeX, …). A
"dims" key in the dict is applied as the axis order.
Running through a pipeline#
Declare the axes once, write a canonical file, then hand the path to a
pipeline. A TZYX volume becomes a multi-plane OME-Zarr that Suite2p runs
per plane:
import lbm_suite2p_python as lsp
vol = np.random.randn(120, 2, 256, 256).astype(np.float32) # T, Z, Y, X
arr = mbo.imread(vol, dims="TZYX") # (120, 1, 2, 256, 256)
arr.metadata = {**arr.metadata, "fs": 10.0}
zarr_path = mbo.imwrite(arr, "out", ext=".zarr", overwrite=True)
lsp.run_volume(zarr_path, save_path="out/suite2p") # one Suite2p run per z-plane
Correct labels mean the writer tags the OME-Zarr axes correctly (t, z, y, x) and the pipeline extracts the right number of planes.
Common Properties#
All array types provide:
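| Property | Description |
|---|---|
| shape | 5D (T, C, Z, Y, X) |
| dtype | data type |
| ndim | number of dims in shape |
| dims | dim labels, e.g. ('T', 'C', 'Z', 'Y', 'X') |
| nt, nc, nz, ny, nx | individual TCZYX sizes |
| metadata | file/array metadata dict |
| num_planes | number of z-planes (= shape[2]) |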
The .nt/.nc/.nz/.ny/.nx accessors give individual sizes and are
correct for every array type, including BinArray.
Most array types also provide:
| Property | Description |
|---|---|
| close() | release file handles |
ScanImage-specific:
| Property | Description |
|---|---|
| stack_type | 'lbm', 'piezo', 'single_plane', or 'pollen' |
| | number of ROIs |
| roi | ROI selection (None, int, or list) |
| fix_phase | enable/disable phase correction |
PiezoArray-specific:
| Property | Description |
|---|---|
| frames_per_slice | frames per z-position |
| can_average | True if averaging possible |
| average_frames | toggle frame averaging |
Writing Data#
All array types support imwrite():
import mbo_utilities as mbo
arr = mbo.imread("/path/to/data.tif")
# write to different formats
mbo.imwrite(arr, "output", ext=".zarr") # OME-Zarr v3
mbo.imwrite(arr, "output", ext=".tiff") # BigTIFF
mbo.imwrite(arr, "output", ext=".h5") # HDF5
mbo.imwrite(arr, "output", ext=".npy") # NumPy
mbo.imwrite(arr, "output", ext=".bin") # Suite2p binary
# subset selection
mbo.imwrite(arr, "output", ext=".zarr", frames=range(100))
mbo.imwrite(arr, "output", ext=".zarr", planes=[0, 2, 4])
# zarr options
mbo.imwrite(arr, "output", ext=".zarr", sharded=True, compression_level=1)
Metadata is automatically adjusted when subsetting (e.g., dz doubles when selecting every 2nd plane).
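The dz adjustment follows from the spacing of the selected plane indices: keeping every k-th plane multiplies the plane spacing by k. The sketch below illustrates that arithmetic only; `subset_dz` is a hypothetical helper, and returning None for unevenly spaced selections is an assumption, since the document does not say how the library handles them.

```python
def subset_dz(dz, planes):
    """dz after keeping only `planes`; None if the selection is not evenly spaced."""
    planes = sorted(planes)
    if len(planes) < 2:
        return dz  # a single plane has no spacing to rescale
    steps = {b - a for a, b in zip(planes, planes[1:])}
    if len(steps) != 1:
        return None  # uneven selection: no single dz describes it
    return dz * steps.pop()

print(subset_dz(15.0, [0, 2, 4]))  # 30.0
print(subset_dz(15.0, [0, 1, 2]))  # 15.0
print(subset_dz(15.0, [0, 1, 4]))  # None
```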
API Reference#
- mbo_utilities.imread() - unified file reader
- mbo_utilities.imwrite() - unified file writer
- mbo_utilities.arrays - direct access to array classes