1. I/O
Functions to help with loading and saving data.
- mbo_utilities.imwrite(lazy_array, outpath: str | Path, ext: str = '.tiff', planes: list | tuple | None = None, num_frames: int | None = None, register_z: bool = False, roi: int | Sequence[int] | None = None, metadata: dict | None = None, overwrite: bool = False, order: list | tuple = None, target_chunk_mb: int = 20, progress_callback: Callable | None = None, debug: bool = False, shift_vectors: ndarray | None = None, output_name: str | None = None, **kwargs)
 Write a supported lazy imaging array (Suite2p, HDF5, TIFF, etc.) to disk.
This function handles writing multi-dimensional imaging data to various formats, with support for ROI selection, z-plane registration, chunked streaming, and format conversion. Use with imread() to load and convert imaging data.
- Parameters:
- lazy_array : object
 One of the supported lazy array readers providing .shape, .metadata, and _imwrite() methods:
MboRawArray : Raw ScanImage/ScanMultiROI TIFF files with phase correction
Suite2pArray : Memory-mapped binary (data.bin or data_raw.bin) + ops.npy
MBOTiffArray : Multi-file TIFF reader using Dask backend
TiffArray : Single or multi-TIFF reader
H5Array : HDF5 dataset wrapper (h5py.File[dataset])
ZarrArray : Collection of z-plane .zarr stores
NpyArray : Single .npy memory-mapped NumPy file
NWBArray : NWB file with “TwoPhotonSeries” acquisition dataset
- outpath : str or Path
 Target directory to write output files. Will be created if it doesn’t exist. Files are named automatically based on plane/ROI (e.g., plane01_roi1.tiff).
- ext : str, default=".tiff"
 Output format extension. Supported formats:
- .tiff, .tif : Multi-page TIFF (BigTIFF for >4 GB)
- .bin : Suite2p-compatible binary format with ops.npy metadata
- .zarr : Zarr v3 array store
- .h5, .hdf5 : HDF5 format
- planes : list | tuple | int | None, optional
 Z-planes to export (1-based indexing). Options:
- None (default) : Export all planes
- int : Single plane, e.g. planes=7 exports only plane 7
- list/tuple : Specific planes, e.g. planes=[1, 7, 14]
- roi : int | Sequence[int] | None, optional
 ROI selection for multi-ROI data (e.g., MboRawArray from ScanImage). Options:
- None (default) : Stitch/fuse all ROIs horizontally into a single FOV
- 0 : Split all ROIs into separate files (one file per ROI per plane)
- int > 0 : Export a specific ROI, e.g. roi=1 exports only ROI 1
- list/tuple : Export specific ROIs, e.g. roi=[1, 3] exports ROIs 1 and 3
- num_frames : int, optional
 Number of frames to export. If None (default), exports all frames. Useful for testing or exporting subsets: num_frames=1000
- register_z : bool, default=False
 Perform z-plane registration using Suite3D before writing. When True:
- Computes rigid shifts between z-planes
- Validates registration results (checks summary.npy for valid plane_shifts)
- Applies shifts during the write to align planes
- Requires Suite3D and CuPy installed: pip install mbo_utilities[suite3d,cuda12]
- Creates/reuses an s3d job directory in outpath
- shift_vectors : np.ndarray, optional
 Pre-computed z-shift vectors with shape (n_planes, 2) for [dy, dx] shifts. Use this to apply previously computed registration without re-running Suite3D. Example: shift_vectors=np.array([[0, 0], [2, -1], [1, 3]])
- metadata : dict, optional
 Additional metadata to merge into output file headers/attributes. Merged with existing metadata from the source array.
- overwrite : bool, default=False
 Whether to overwrite existing output files. If False, skips existing files with a warning.
- order : list | tuple, optional
 Reorder planes before writing. Must have same length as planes. Example: planes=[1,2,3], order=[2,0,1] writes planes in order [3,1,2]
- target_chunk_mb : int, optional
 Target chunk size in MB for streaming writes. Larger chunks may be faster but use more memory. Adjust based on available RAM. Default is 20.
- progress_callback : Callable, optional
 Callback function for progress updates: callback(progress, current_plane). Receives progress as float 0-1 and current plane index.
- debug : bool, default=False
 Enable verbose logging to terminal for troubleshooting.
- output_name : str, optional
 Filename for binary output when ext=".bin". Common options:
- "data_raw.bin" : Raw, unregistered data (default for BinArray)
- "data.bin" : Registered data (typical after Suite2p registration)
 If None, defaults to "data_raw.bin" for new binaries, or preserves the existing name when reading from a BinArray/Suite2pArray. Ignored for non-binary output formats.
- ome : bool, default=False
 Write OME-Zarr metadata when ext=”.zarr”. Creates OME-NGFF v0.5 compliant metadata including multiscales, axes, and coordinate transformations. Enables compatibility with OME-Zarr viewers and analysis tools. If True and ext is not “.zarr”, this parameter is ignored.
- **kwargs
 Additional format-specific options passed to writer backends.
- Returns:
 - Path
 Path to the output directory containing written files.
- Raises:
 - TypeError
 If lazy_array type is unsupported or incompatible with specified options.
- ValueError
 If outpath parent doesn’t exist, metadata is malformed, or parameters are invalid.
- FileNotFoundError
 If expected companion files (e.g., ops.npy, summary.npy) are missing.
- KeyError
 If registration is requested but plane_shifts is missing from summary.
See also
imread : Load imaging data from various formats
register_zplanes_s3d : Compute z-plane registration using Suite3D
validate_s3d_registration : Validate Suite3D registration results
Notes
File Naming Convention:
- Single ROI or stitched: plane{Z:02d}_stitched.{ext}
- Multiple ROIs: plane{Z:02d}_roi{R}.{ext}
- Binary format: plane{Z:02d}_roi{R}/data_raw.bin + ops.npy

Registration (register_z=True):
- Validates existing registration by checking summary/summary.npy for a valid plane_shifts array with shape (n_planes, 2)
- Only reruns Suite3D if validation fails or no existing job is found
- Registration shifts are applied during the write to align planes spatially
- Output files are padded to accommodate all shifts

Memory Management:
- Data is streamed in chunks (controlled by target_chunk_mb)
- Only one chunk is held in memory at a time
- Large files (>4 GB) automatically use the BigTIFF format

Phase Correction (MboRawArray only):
- Set lazy_array.fix_phase = True before calling imwrite
- Corrects bidirectional scanning artifacts
- Methods: 'mean', 'median', 'max' (set via lazy_array.phasecorr_method)
Examples
Basic Usage - Stitch ROIs and save all planes as TIFF:
>>> from mbo_utilities import imread, imwrite
>>> data = imread("path/to/raw/*.tiff")
>>> imwrite(data, "output/session1", roi=None)  # stitches all ROIs
Save specific planes only (first, middle, last for 14-plane volume):
>>> imwrite(data, "output/session1", planes=[1, 7, 14])
# Creates: plane01_stitched.tiff, plane07_stitched.tiff, plane14_stitched.tiff
Split all ROIs into separate files:
>>> imwrite(data, "output/session1", roi=0)
# Creates: plane01_roi1.tiff, plane01_roi2.tiff, ..., plane14_roi1.tiff, ...
Save specific ROIs only:
>>> imwrite(data, "output/session1", roi=[1, 3])  # only ROIs 1 and 3
>>> imwrite(data, "output/session1", roi=2)  # only ROI 2
Z-plane registration with Suite3D:
>>> data = imread("path/to/raw/*.tiff")
>>> imwrite(data, "output/registered", register_z=True, roi=None)
# Computes and applies rigid shifts to align z-planes spatially
Use pre-computed registration shifts:
>>> shifts = np.load("previous_job/summary/summary.npy", allow_pickle=True).item()
>>> shift_vectors = shifts['plane_shifts']  # shape: (n_planes, 2)
>>> imwrite(data, "output/registered", shift_vectors=shift_vectors)
Convert to Suite2p binary format:
>>> data = imread("path/to/raw/*.tiff")
>>> imwrite(data, "output/suite2p", ext=".bin", roi=0)
# Creates: plane01_roi1/data_raw.bin, plane01_roi1/ops.npy, ...
Export subset of frames for testing:
>>> imwrite(data, "output/test", num_frames=1000, planes=[1, 7, 14])
# Exports only the first 1000 frames of planes 1, 7, and 14
Save to Zarr format with compression:
>>> imwrite(data, "output/zarr_store", ext=".zarr", roi=0)
# Creates: output/zarr_store/plane01_roi1.zarr, ...
Enable phase correction (for raw ScanImage data):
>>> data = imread("path/to/raw/*.tiff")
>>> data.fix_phase = True
>>> data.phasecorr_method = "mean"  # or "median", "max"
>>> data.use_fft = True  # use FFT-based correction (faster)
>>> imwrite(data, "output/corrected", roi=None)
Overwrite existing files:
>>> imwrite(data, "output/session1", planes=[1, 2, 3], overwrite=True)
Custom metadata:
>>> custom_meta = {"experimenter": "MBO-User", "Date": "2025-01-15"}
>>> imwrite(data, "output/session1", metadata=custom_meta)
Reorder planes:
>>> imwrite(data, "output/session1", planes=[1, 2, 3], order=[2, 1, 0])
# Writes plane 3 first, then plane 2, then plane 1
Progress callback reporting per-z-plane completion for UIs:
>>> def progress_handler(progress, plane):
...     print(f"Plane {plane}: {progress*100:.1f}% complete")
>>> imwrite(data, "output/session1", progress_callback=progress_handler)
Save as OME-Zarr with NGFF v0.5 metadata:
>>> imwrite(data, "output/session1", ext=".zarr", ome=True)
# Creates OME-Zarr stores with multiscales, axes, and coordinate transformations
# Compatible with OME-Zarr viewers (napari, vizarr, etc.)
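Tune the streaming chunk size (a minimal sketch; 64 MB is only an illustrative value, adjust it to your available RAM):
>>> imwrite(data, "output/session1", target_chunk_mb=64)
# Larger chunks can reduce per-chunk overhead but hold more data in memory at once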
- mbo_utilities.imread(inputs: str | Path | Sequence[str | Path], **kwargs)
 Lazy load imaging data from supported file types.
Currently supported file types:
- .bin : Suite2p binary files (.bin + ops.npy)
- .tif/.tiff : TIFF files (BigTIFF, OME-TIFF, and raw ScanImage TIFFs)
- .h5 : HDF5 files
- .zarr : Zarr v3
- Parameters:
- inputs : str, Path, ndarray, MboRawArray, or sequence of str/Path
 Input source. Can be:
- Path to a file or directory
- List/tuple of file paths
- An existing lazy array
- **kwargs
 Extra keyword arguments passed to specific array readers.
- Returns:
 - array_like
 One of Suite2pArray, TiffArray, MboRawArray, MBOTiffArray, H5Array, or the input ndarray.
Examples
>>> from mbo_utilities import imread
>>> arr = imread("/data/raw")  # directory containing supported files
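As a further sketch (file names are placeholders), the returned lazy array exposes .shape and .metadata and can be passed directly to imwrite():
>>> arr = imread(["scan_00001.tif", "scan_00002.tif"])  # list of TIFF paths
>>> arr.shape, arr.metadata  # lazy readers expose shape and metadata
>>> imwrite(arr, "output/session1")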
- mbo_utilities.get_files(base_dir, str_contains='', max_depth=1, sort_ascending=True, exclude_dirs=None) -> list | Path
 Recursively search for files in a specified directory whose names contain a given substring, limiting the search to a maximum subdirectory depth. Optionally, the resulting list of file paths is sorted in ascending order using numeric parts of the filenames when available.
- Parameters:
- base_dir : str or Path
 The base directory where the search begins. This path is expanded (e.g., ‘~’ is resolved) and converted to an absolute path.
- str_contains : str, optional
 A substring that must be present in a file’s name for it to be included in the result. If empty, all files are matched.
- max_depth : int, optional
 The maximum number of subdirectory levels (relative to the base directory) to search. Defaults to 1. If set to 0, it is automatically reset to 1.
- sort_ascending : bool, optional
 If True (default), the matched file paths are sorted in ascending alphanumeric order. The sort key extracts numeric parts from filenames so that, for example, “file2” comes before “file10”.
- exclude_dirs : iterable of str or Path, optional
 An iterable of directories to exclude from the resulting list of file paths. By default, ".venv", "__pycache__", ".git", and ".github" are excluded.
- Returns:
 - list of str
 A list of full file paths (as strings) for files within the base directory (and its subdirectories up to the specified depth) that contain the provided substring.
- Raises:
 - FileNotFoundError
 If the base directory does not exist.
- NotADirectoryError
 If the specified base_dir is not a directory.
Examples
>>> import mbo_utilities as mbo
>>> # Get all files whose names contain "ops.npy", searching up to 3 levels deep:
>>> ops_files = mbo.get_files("path/to/files", "ops.npy", max_depth=3)
>>> # Get only files containing "tif" in the current directory (max_depth=1):
>>> tif_files = mbo.get_files("path/to/files", "tif")
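Exclude additional directories from the search (a minimal sketch; "old_runs" is a placeholder directory name):
>>> files = mbo.get_files("path/to/files", "tif", max_depth=2, exclude_dirs=["old_runs"])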
- mbo_utilities.files_to_dask(files: list[str | Path], astype=None, chunk_t=250)
 Lazily build a Dask array or list of arrays depending on filename tags.
- Files tagged “plane”, “z”, or “chan” → stacked along Z (TZYX)
- Files tagged “roi” → list of 3D (T, Y, X) arrays, one per ROI
- Otherwise → all files concatenated along time (T)
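Examples
A minimal usage sketch, assuming per-plane files whose names carry a “plane” tag (paths and the chunk size are illustrative):
>>> import mbo_utilities as mbo
>>> files = mbo.get_files("path/to/session", "plane", max_depth=2)
>>> arr = mbo.files_to_dask(files, chunk_t=500)  # lazy TZYX Dask array, chunked along T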
- mbo_utilities.get_metadata(file, z_step=None, verbose=False)
 Extract metadata from a TIFF file or directory of TIFF files produced by ScanImage.
This function handles single files, lists of files, or directories containing TIFF files. When given a directory, it automatically finds and processes all TIFF files in natural sort order. For multiple files, it calculates frames per file accounting for z-planes.
- Parameters:
- file : os.PathLike, str, or list
 Single file path: processes that file
Directory path: processes all TIFF files in the directory
List of file paths: processes all files in the list
- z_step : float, optional
 The z-step size in microns. If provided, it will be included in the returned metadata.
- verbose : bool, optional
 If True, returns extended metadata including all ScanImage attributes. Default is False.
- Returns:
 - dict
A dictionary containing extracted metadata. For multiple files, includes:
- 'frames_per_file' : list of frame counts per file (accounting for z-planes)
- 'total_frames' : total frames across all files
- 'file_paths' : list of processed file paths
- 'tiff_pages_per_file' : raw TIFF page counts per file
- Raises:
 - ValueError
 If no recognizable metadata is found or no TIFF files found in directory.
Examples
>>> # Single file
>>> meta = get_metadata("path/to/rawscan_00001.tif")
>>> print(f"Frames: {meta['num_frames']}")
>>> # Directory of files
>>> meta = get_metadata("path/to/scan_directory/")
>>> print(f"Files processed: {len(meta['file_paths'])}")
>>> print(f"Frames per file: {meta['frames_per_file']}")
>>> # List of specific files
>>> files = ["scan_00001.tif", "scan_00002.tif", "scan_00003.tif"]
>>> meta = get_metadata(files)
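Pass a z-step and request extended metadata (a minimal sketch; the 5.0 µm step is only an illustration):
>>> meta = get_metadata("path/to/scan_directory/", z_step=5.0, verbose=True)
>>> # verbose=True includes all ScanImage attributes in the returned dict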
- mbo_utilities.expand_paths(paths: str | Path | Sequence[str | Path]) -> list[Path]
 Expand a path, list of paths, or wildcard pattern into a sorted list of actual files.
This is a handy wrapper for loading images or data files when you’ve got a folder, some wildcards, or a mix of both.
- Parameters:
- paths : str, Path, or list of (str or Path)
 Can be a single path, a wildcard pattern like “*.tif”, a folder, or a list of those.
- Returns:
 - list of Path
 Sorted list of full paths to matching files.
Examples
>>> expand_paths("data/*.tif")
[Path("data/img_000.tif"), Path("data/img_001.tif"), ...]
>>> expand_paths(Path("data"))
[Path("data/img_000.tif"), Path("data/img_001.tif"), ...]
>>> expand_paths(["data/*.tif", Path("more_data")])
[Path("data/img_000.tif"), Path("more_data/img_050.tif"), ...]
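The expanded list can be handed straight to imread (a minimal sketch; paths are placeholders):
>>> from mbo_utilities import imread
>>> files = expand_paths(["data/*.tif", Path("more_data")])
>>> arr = imread(files)  # imread accepts a list of file paths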
- mbo_utilities.get_mbo_dirs() -> dict
 Ensure ~/mbo and its subdirectories exist.
Returns a dict with paths to the root, settings, and cache directories.
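Examples
A minimal usage sketch (the exact dictionary keys are not listed here, so the loop simply prints whatever is returned):
>>> from mbo_utilities import get_mbo_dirs
>>> dirs = get_mbo_dirs()  # creates ~/mbo and its subdirectories if missing
>>> for name, path in dirs.items():
...     print(name, path)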