Lazy Array Types#
Understanding what imread() returns and when to use each array type.
Overview#
mbo_utilities.imread() is a smart file reader that automatically detects the file type and returns the appropriate lazy array class. All array types provide:
Lazy loading: Data is read on-demand, not loaded entirely into memory
NumPy-like indexing: Standard slicing syntax (arr[0], arr[10:20, :, 100:200])
imwrite() support: All arrays can be written to any output format via imwrite()
Metadata: Accessible via the .metadata property
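For example, a minimal sketch of the common pattern (the input path and output name are placeholders):
import mbo_utilities as mbo
arr = mbo.imread("/path/to/data")        # returned type depends on the input
print(arr.shape, arr.dtype)              # lazy: no pixel data loaded yet
frame = arr[0]                           # only the first frame is read from disk
print(arr.metadata)                      # array/file metadata
mbo.imwrite(arr, "output", ext=".tiff")  # write to any supported format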
Quick Reference#
| Input | Returns | Shape | Use Case |
|---|---|---|---|
| Raw ScanImage TIFF(s) | MboRawArray | (T, Z, Y, X) | Multi-ROI volumetric data with phase correction |
| Standard TIFF file(s) | TiffArray | (T, 1, Y, X) | Standard TIFF files, lazy page access |
| TIFFs with MBO metadata | MBOTiffArray | (T, Z, Y, X) | Dask-backed multi-file TIFF |
| Directory with planeXX.tiff files | TiffVolumeArray | (T, Z, Y, X) | Multi-plane TIFF volume |
| .bin file | BinArray | (T, Y, X) | Direct binary file manipulation |
| Directory with ops.npy | Suite2pArray | (T, Y, X) | Suite2p workflow integration |
| Directory with planeXX/ subdirs (each with ops.npy) | Suite2pVolumeArray | (T, Z, Y, X) | Multi-plane Suite2p output |
| .h5 / .hdf5 file | H5Array | varies | HDF5 datasets |
| .zarr store | ZarrArray | (T, Z, Y, X) | Zarr v3 / OME-Zarr stores |
| .npy file | NumpyArray | varies | NumPy memory-mapped files |
| .nwb file | NWBArray | varies | Neurodata Without Borders files |
| np.ndarray (in-memory) | NumpyArray | varies | Wrap numpy arrays for imwrite support |
| Isoview lightsheet directory | IsoviewArray | (T, Z, V, Y, X) | Multi-view lightsheet data |
Array Type Details#
MboRawArray#
Returned when: Reading raw ScanImage TIFF files with multi-ROI metadata
import mbo_utilities as mbo
# Raw ScanImage TIFFs
scan = mbo.imread("/path/to/raw/*.tif")
# Returns: MboRawArray
print(type(scan)) # <class 'MboRawArray'>
print(scan.shape) # (T, Z, Y, X) - e.g., (10000, 14, 456, 896)
print(scan.num_rois) # Number of ROIs
print(scan.num_planes) # Alias for num_channels (Z planes)
# ROI handling
scan.roi = None # Stitch all ROIs horizontally (default)
scan.roi = 0 # Split into separate ROIs (returns tuple)
scan.roi = 1 # Use only ROI 1 (1-indexed)
scan.roi = [1, 2] # Select specific ROIs
# Phase correction settings
scan.fix_phase = True # Enable bidirectional scan-phase correction
scan.use_fft = True # Use FFT-based phase correction
scan.phasecorr_method = "mean" # "mean", "median", "max"
scan.border = 3 # Border pixels to exclude
scan.upsample = 5 # Subpixel upsampling factor
scan.max_offset = 4 # Maximum phase offset to search
Key Features:
Automatic ROI stitching/splitting via the roi property
Bidirectional scan-phase correction (configurable methods)
Multi-plane volumetric data support
ROI position extraction from ScanImage metadata
Stores all ScanImage metadata in array.metadata["si"] (see the snippet below)
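For instance, assuming scan from the example above, the stored ScanImage header can be inspected directly (a sketch; the exact fields depend on the acquisition):
si_meta = scan.metadata["si"]   # full ScanImage header stored by MboRawArray
print(len(si_meta))             # number of stored header fields (assumes a dict-like mapping)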
TiffArray#
Returned when: Reading processed TIFF file(s) without ScanImage metadata
# Single or multiple standard TIFF files
arr = mbo.imread("/path/to/processed.tif")
arr = mbo.imread(["/path/file1.tif", "/path/file2.tif"])
# Returns: TiffArray
print(type(arr)) # <class 'TiffArray'>
print(arr.shape) # (T, 1, Y, X) - always 4D with Z=1
print(arr.dtype) # Data type from TIFF
# Lazy frame reading
frame = arr[0] # Read first frame
subset = arr[10:20] # Read frames 10-19 (only those pages are loaded)
# Dtype conversion
arr32 = arr.astype(np.float32)
Key Features:
Uses TiffFile handles for lazy page access
Auto-detects frame count from JSON metadata, IFD estimation, or page counting
Multi-file support (concatenated along time axis)
Thread-safe page reading
Always outputs (T, 1, Y, X) format
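As a quick illustration of the multi-file and imwrite() support, several standard TIFFs can be read as one time series and re-written to another container (a sketch; paths are placeholders):
arr = mbo.imread(["/path/file1.tif", "/path/file2.tif"])  # concatenated along T
mbo.imwrite(arr, "converted", ext=".zarr")                # lazily re-write as Zarr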
MBOTiffArray#
Returned when: Reading TIFFs with MBO-specific metadata (uses Dask backend)
# MBO-processed TIFFs with metadata
arr = mbo.imread("/path/to/processed/*.tif")
# Returns: MBOTiffArray if MBO metadata detected
print(type(arr)) # <class 'MBOTiffArray'>
print(arr.shape) # (T, Z, Y, X) - Dask infers shape
print(arr.dask) # Access underlying dask.Array
# Lazy operations via Dask
mean_proj = arr[:100].mean(axis=0).compute()
Key Features:
Dask-backed for truly lazy, chunked access
Uses tifffile.imread(aszarr=True) for memory-mapped access
Automatic dimension handling (2D → TZYX, 3D → TZYX, 4D passthrough)
Preserves file tags from filenames
TiffVolumeArray#
Returned when: Directory contains files matching planeXX.tiff pattern
# Directory with plane TIFF files
arr = mbo.imread("/path/to/tiff_output/")
# Detects plane01.tiff, plane02.tiff, etc.
# Returns: TiffVolumeArray
print(type(arr)) # <class 'TiffVolumeArray'>
print(arr.shape) # (T, Z, Y, X) - e.g., (10000, 14, 512, 512)
print(arr.num_planes) # 14
print(len(arr.planes)) # 14 TiffArray objects
# Access specific plane
plane7_data = arr[:, 6] # All frames from plane 7 (0-indexed)
# Close file handles when done
arr.close()
Key Features:
Stacks individual plane TIFFs into a 4D volume
Each plane is loaded lazily via TiffArray
Auto-sorts by plane number from filename
Validates consistent spatial shapes across planes
BinArray#
Returned when: Explicitly reading a .bin file path
# Reading a specific binary file
arr = mbo.imread("path/to/data_raw.bin")
# Returns: BinArray
print(type(arr)) # <class 'BinArray'>
print(arr.shape) # (nframes, Ly, Lx)
print(arr.nframes) # Number of frames
print(arr.Ly, arr.Lx) # Spatial dimensions
# Access data like numpy array
frame = arr[0] # First frame
subset = arr[0:100] # First 100 frames
# Write access (if opened for writing)
arr[0] = new_frame
# Context manager support
with BinArray("data.bin", shape=(100, 512, 512)) as arr:
    arr[:] = data
# Close when done
arr.close()
Key Features:
Direct binary file access via np.memmap
Auto-infers shape from adjacent ops.npy if present
Can provide shape manually: BinArray("file.bin", shape=(1000, 512, 512))
Read/write access (supports __setitem__)
Useful for creating new binary files from scratch
When to use:
Reading/writing specific binary files in a Suite2p workflow
Creating new binary files from scratch
When you want to work with the file directly, not through Suite2p’s abstraction
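A sketch of the create-from-scratch case, assuming BinArray can be imported from mbo_utilities.arrays (see the API reference at the end) and using placeholder shapes:
import numpy as np
from mbo_utilities.arrays import BinArray  # assumed import path
data = np.zeros((100, 512, 512), dtype=np.int16)         # frames from elsewhere
with BinArray("new_data.bin", shape=data.shape) as out:  # creates/opens the file
    out[:] = data                                        # write all frames to disk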
Suite2pArray#
Returned when: Reading a directory containing ops.npy, or ops.npy directly
# Reading a Suite2p directory
arr = mbo.imread("/path/to/suite2p/plane0")
arr = mbo.imread("/path/to/suite2p/plane0/ops.npy")
# Returns: Suite2pArray
print(type(arr)) # <class 'Suite2pArray'>
print(arr.shape) # (nframes, Ly, Lx) - 3D
print(arr.metadata) # Full ops.npy contents
# File paths
print(arr.raw_file) # Path to data_raw.bin (unregistered)
print(arr.reg_file) # Path to data.bin (registered)
print(arr.active_file) # Currently active file
# Switch between raw and registered
arr.switch_channel(use_raw=True) # Use data_raw.bin
arr.switch_channel(use_raw=False) # Use data.bin (default)
# Visualization with both channels
iw = arr.imshow() # Shows raw and registered side-by-side if both exist
Key Features:
Full Suite2p context (metadata from ops.npy)
Access to both raw (data_raw.bin) and registered (data.bin) data
Memory-mapped via np.memmap for lazy loading
File size validation against ops metadata
Integrates with Suite2p’s processing pipeline
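Because .metadata exposes the full ops.npy dictionary, standard Suite2p fields can be read from it directly (a sketch; key names follow Suite2p conventions and may vary by version):
ops = arr.metadata
print(ops.get("nframes"), ops.get("Ly"), ops.get("Lx"))  # common Suite2p ops fields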
Suite2pVolumeArray#
Returned when: Directory contains multiple planeXX/ subdirectories with ops.npy
# Multi-plane Suite2p output
arr = mbo.imread("/path/to/suite2p_output/")
# Detects plane01_stitched/, plane02_stitched/, etc.
# Returns: Suite2pVolumeArray
print(type(arr)) # <class 'Suite2pVolumeArray'>
print(arr.shape) # (T, Z, Y, X) - e.g., (10000, 14, 512, 512)
print(arr.num_planes) # 14
# Access specific plane
plane7_data = arr[:, 6] # All frames from plane 7 (0-indexed)
# Switch all planes between raw and registered
arr.switch_channel(use_raw=True)
# Close all memory-mapped files
arr.close()
Key Features:
Stacks individual Suite2pArray objects into a 4D volume
Auto-sorts by plane number from directory name
switch_channel() applies to all planes
Validates consistent shapes across planes
H5Array#
Returned when: Reading HDF5 files (.h5, .hdf5)
# HDF5 dataset
arr = mbo.imread("/path/to/data.h5")
# Returns: H5Array
print(type(arr)) # <class 'H5Array'>
print(arr.shape) # Dataset shape
print(arr.dataset_name) # Auto-detected: 'mov', 'data', or first available
# Optionally specify dataset name
arr = mbo.imread("/path/to/data.h5", dataset="imaging_data")
# Access data
frame = arr[0]
subset = arr[10:20, :, 100:200]
# File-level metadata
print(arr.metadata) # HDF5 file attributes
# Close file handle
arr.close()
Key Features:
Auto-detects common dataset names: 'mov', 'data', 'scan_corrections'
Lazy loading via h5py.Dataset
Supports ellipsis indexing (arr[..., 100:200])
File-level attributes exposed via .metadata
ZarrArray#
Returned when: Reading Zarr stores (.zarr directories)
# Zarr store (standard or OME-Zarr)
arr = mbo.imread("/path/to/data.zarr")
# Returns: ZarrArray
print(type(arr)) # <class 'ZarrArray'>
print(arr.shape) # (T, Z, Y, X) - always 4D
# Read multiple zarr stores as z-planes
arr = mbo.imread(["/path/plane01.zarr", "/path/plane02.zarr"])
# Access pre-computed statistics (if available in OME metadata)
print(arr.zstats) # {'mean': [...], 'std': [...], 'snr': [...]}
# Access metadata (OME-NGFF attributes if present)
print(arr.metadata)
Key Features:
Supports both standard Zarr arrays and OME-Zarr groups
Auto-detects OME-Zarr structure (looks for a "0" subarray in groups)
Multi-store support (stacked along Z axis)
Zarr v3 compatible
Exposes OME-NGFF metadata via .metadata
NumpyArray#
Returned when: Reading .npy files OR passing an in-memory numpy array to imread()
This is the most versatile array type - it wraps any numpy array and provides full imwrite() support.
From .npy Files (Memory-Mapped)#
# Read .npy file - memory-mapped for lazy loading
arr = mbo.imread("/path/to/data.npy")
# Returns: NumpyArray
print(type(arr)) # <class 'NumpyArray'>
print(arr.shape) # (T, Y, X) or (T, Z, Y, X)
print(arr.dims) # 'TYX' or 'TZYX' (auto-inferred)
From In-Memory Numpy Arrays#
import numpy as np
import mbo_utilities as mbo
# Create or load a numpy array from anywhere
data = np.random.randn(100, 512, 512).astype(np.float32)
# Wrap with imread - returns NumpyArray
arr = mbo.imread(data)
print(arr)
# NumpyArray(shape=(100, 512, 512), dtype=float32, dims='TYX' (in-memory))
# Now you have full imwrite support with all features
mbo.imwrite(arr, "output", ext=".zarr") # Zarr v3 with chunking/sharding
mbo.imwrite(arr, "output", ext=".tiff") # BigTIFF
mbo.imwrite(arr, "output", ext=".bin") # Suite2p binary + ops.npy
mbo.imwrite(arr, "output", ext=".h5") # HDF5
mbo.imwrite(arr, "output", ext=".npy") # NumPy format
4D Volumetric Data#
# 4D arrays are automatically detected as (T, Z, Y, X)
volume = np.random.randn(100, 15, 512, 512).astype(np.float32)
arr = mbo.imread(volume)
print(arr.dims) # 'TZYX'
print(arr.num_planes) # 15
# Write specific planes
mbo.imwrite(arr, "output", ext=".zarr", planes=[1, 7, 14])
Key Features:
Automatic dimension inference (TYX, TZYX, YX, etc.)
Memory-mapped for .npy files (lazy loading)
Full imwrite() support with all output formats
Chunked reduction operations (mean, std, max, min)
Metadata auto-generation from array shape
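The chunked reductions listed above make quick summary images cheap; a sketch, assuming they accept a NumPy-style axis argument:
mean_img = arr.mean(axis=0)   # chunked mean over time
max_img = arr.max(axis=0)     # chunked max projection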
NWBArray#
Returned when: Reading NWB (Neurodata Without Borders) files
# NWB file with TwoPhotonSeries
arr = mbo.imread("/path/to/experiment.nwb")
# Returns: NWBArray
print(type(arr)) # <class 'NWBArray'>
print(arr.shape) # Shape from TwoPhotonSeries data
# Access data
frame = arr[0]
Key Features:
Reads TwoPhotonSeries acquisition data from NWB files
Requires the pynwb package (pip install pynwb)
Exposes the underlying NWB data object
IsoviewArray#
Returned when: Manually instantiated for isoview lightsheet microscopy data
from mbo_utilities.arrays import IsoviewArray
# Isoview lightsheet data (multi-timepoint)
arr = IsoviewArray("/path/to/output")
# Shape: (T, Z, Views, Y, X) - 5D
print(arr.shape) # (10, 543, 4, 2048, 2048)
print(arr.views) # [(0, 0), (1, 0), (2, 1), (3, 1)] - (camera, channel) pairs
print(arr.num_views) # 4
# Access specific view
frame = arr[0, 100, 0] # timepoint 0, z=100, view 0 (camera 0, channel 0)
# Get view index for camera/channel
idx = arr.view_index(camera=1, channel=0)
# Access labels and projections (consolidated structure only)
labels = arr.get_labels(timepoint=0, camera=0, label_type='segmentation')
proj = arr.get_projection(timepoint=0, camera=0, proj_type='xy')
Key Features:
Supports two data structures:
  Consolidated: data_TM000000_SPM00.zarr/camera_0/0/
  Separate: SPM00_TM000000_CM00_CHN01.zarr
Multi-view (camera/channel combinations)
5D shape: (T, Z, Views, Y, X), or 4D (Z, Views, Y, X) for a single timepoint
Access to segmentation labels and projections
Lazy loading via Zarr
Decision Tree: What Will imread() Return?#
imread(input)
│
├─ isinstance(input, np.ndarray)?
│ └─ Yes → NumpyArray (in-memory wrapper)
│
├─ Is input a Path or string?
│ │
│ ├─ .npy file?
│ │ └─ NumpyArray (memory-mapped)
│ │
│ ├─ .nwb file?
│ │ └─ NWBArray
│ │
│ ├─ .h5 / .hdf5 file?
│ │ └─ H5Array
│ │
│ ├─ .zarr directory?
│ │ └─ ZarrArray
│ │
│ ├─ .bin file (direct path)?
│ │ └─ BinArray
│ │
│ ├─ .tif / .tiff file(s)?
│ │ ├─ Has ScanImage ROI metadata? → MboRawArray
│ │ ├─ Has MBO metadata (multiple files)? → MBOTiffArray
│ │ └─ Standard TIFF → TiffArray
│ │
│ ├─ Directory?
│ │ ├─ Contains planeXX.tiff files? → TiffVolumeArray
│ │ ├─ Contains planeXX/ subdirs with ops.npy? → Suite2pVolumeArray
│ │ ├─ Contains ops.npy? → Suite2pArray
│ │ └─ Contains raw ScanImage TIFFs? → MboRawArray
│ │
│ └─ ops.npy file directly?
│ └─ Suite2pArray
│
└─ List of paths?
├─ All .tif files → TiffArray or MBOTiffArray
└─ All .zarr stores → ZarrArray (stacked along Z)
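If downstream code needs to branch on the returned type, checking the class name (or using isinstance with the classes in mbo_utilities.arrays) is straightforward; a sketch:
arr = mbo.imread("/path/to/input")
print(type(arr).__name__)               # e.g. 'MboRawArray', 'TiffArray', ...
if type(arr).__name__ == "MboRawArray":
    arr.fix_phase = True                # raw ScanImage data: enable phase correction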
Common Properties Across All Array Types#
All lazy array types provide these standard properties:
| Property | Type | Description |
|---|---|---|
| .shape | tuple | Array dimensions |
| .dtype | np.dtype | Data type |
| .ndim | int | Number of dimensions |
| .metadata | dict | Array/file metadata |
|  |  | Source file paths |
| .imwrite() | method | Write to any output format |
Most array types also provide:
| Property | Type | Description |
|---|---|---|
| .num_planes | int | Number of Z-planes |
| .num_rois | int | Number of ROIs (MboRawArray) |
| .close() | method | Release file handles |
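Because these properties are shared, a small helper can summarize any array returned by imread(); a hypothetical sketch:
def describe(arr):
    # Print the properties common to all lazy array types
    print(type(arr).__name__)
    print("shape:", arr.shape, "dtype:", arr.dtype, "ndim:", arr.ndim)
    print("metadata keys:", list(arr.metadata))  # assumes .metadata is dict-like
describe(mbo.imread("/path/to/data"))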
API Reference#
mbo_utilities.imread() - Smart file reader
mbo_utilities.imwrite() - Universal file writer
mbo_utilities.arrays - Direct access to array classes