1. I/O

Functions to help with loading and saving data.

mbo_utilities.imwrite(lazy_array, outpath: str | Path, planes: list | tuple = None, roi: int | Sequence[int] | None = None, metadata: dict = None, overwrite: bool = False, ext: str = '.tiff', order: list | tuple = None, target_chunk_mb: int = 20, progress_callback: Callable = None, register_z: bool = False, debug: bool = False, shift_vectors: ndarray = None, **kwargs)
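
A minimal usage sketch for imwrite based only on the signature above; the paths and plane selection are placeholders, and the exact plane indexing is an assumption:

>>> from mbo_utilities import imread, imwrite
>>> arr = imread("/data/raw")  # any supported lazy array
>>> imwrite(arr, "/data/assembled", planes=[1, 2, 3], ext=".tiff", overwrite=True)  # placeholder paths and planes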
mbo_utilities.imread(inputs: str | Path | Sequence[str | Path], **kwargs)

Lazy load imaging data from supported file types.

Currently supported file types:

  • .bin: Suite2p binary files (.bin + ops.npy)
  • .tif/.tiff: TIFF files (BigTIFF, OME-TIFF, and raw ScanImage TIFFs)
  • .h5: HDF5 files
  • .zarr: Zarr v3

Parameters:
inputs : str, Path, ndarray, MboRawArray, or sequence of str/Path

Input source. Can be:

  • Path to a file or directory
  • List/tuple of file paths
  • An existing lazy array

**kwargs

Extra keyword arguments passed to specific array readers.

Returns:
array_like

One of Suite2pArray, TiffArray, MboRawArray, MBOTiffArray, H5Array, or the input ndarray.

Examples

>>> from mbo_utilities import imread
>>> arr = imread("/data/raw")  # directory with supported files, or a full filename
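
As noted in the parameters above, a list or tuple of file paths is also accepted; a brief sketch with hypothetical filenames:

>>> files = ["/data/raw/scan_00001.tif", "/data/raw/scan_00002.tif"]
>>> arr = imread(files)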
mbo_utilities.get_files(base_dir, str_contains='', max_depth=1, sort_ascending=True, exclude_dirs=None) → list | Path

Recursively search for files in a specified directory whose names contain a given substring, limiting the search to a maximum subdirectory depth. Optionally, the resulting list of file paths is sorted in ascending order using numeric parts of the filenames when available.

Parameters:
base_dir : str or Path

The base directory where the search begins. This path is expanded (e.g., ‘~’ is resolved) and converted to an absolute path.

str_contains : str, optional

A substring that must be present in a file’s name for it to be included in the result. If empty, all files are matched.

max_depth : int, optional

The maximum number of subdirectory levels (relative to the base directory) to search. Defaults to 1. If set to 0, it is automatically reset to 1.

sort_ascending : bool, optional

If True (default), the matched file paths are sorted in ascending alphanumeric order. The sort key extracts numeric parts from filenames so that, for example, “file2” comes before “file10”.

exclude_dirs : iterable of str or Path, optional

An iterable of directories to exclude from the resulting list of file paths. By default, “.venv/”, “__pycache__/”, “.git”, and “.github” are excluded.

Returns:
list of str

A list of full file paths (as strings) for files within the base directory (and its subdirectories up to the specified depth) that contain the provided substring.

Raises:
FileNotFoundError

If the base directory does not exist.

NotADirectoryError

If the specified base_dir is not a directory.

Examples

>>> import mbo_utilities as mbo
>>> # Get all files that contain "ops.npy" in their names by searching up to 3 levels deep:
>>> ops_files = mbo.get_files("path/to/files", "ops.npy", max_depth=3)
>>> # Get only files containing "tif" in the current directory (max_depth=1):
>>> tif_files = mbo.get_files("path/to/files", "tif")
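
The exclude_dirs argument is not exercised above; a brief sketch that skips a hypothetical output folder:

>>> # Collect TIFFs two levels deep, ignoring a "suite2p" output directory
>>> tif_files = mbo.get_files("path/to/files", "tif", max_depth=2, exclude_dirs=["suite2p"])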
mbo_utilities.npy_to_dask(files, name='', axis=1, astype=None)

Creates a Dask array that lazily stacks multiple .npy files along a specified axis without fully loading them into memory.

Adapted from suite3d (alihaydaroglu/suite3d) for convenience, to avoid an unnecessary import. Very nice function, thanks Ali!

Parameters:
files : list of str or Path

A list of file paths pointing to .npy files containing array data. Each file must have the same shape except possibly along the concatenation axis.

name : str, optional

A string to be appended to a base name (“from-npy-stack-”) to label the resulting Dask array. Default is an empty string.

axis : int, optional

The axis along which to stack/concatenate the arrays from the provided files. Default is 1.

astype : numpy.dtype, optional

If provided, the resulting Dask array will be cast to this data type. Otherwise, the data type is inferred from the first file.

Returns:
dask.array.Array

Examples

>>> # https://www.fastplotlib.org/
>>> import fastplotlib as fpl
>>> import numpy as np
>>> import mbo_utilities as mbo
>>> files = mbo.get_files("path/to/images/", "fused", 3)  # suite3d output
>>> arr = mbo.npy_to_dask(files, name="stack", axis=1)
>>> print(arr.shape)
(nz, nt, ny, nx)
>>> # Optionally, cast the array to float32
>>> arr = mbo.npy_to_dask(files, axis=1, astype=np.float32)
>>> fpl.ImageWidget(arr.transpose(1, 0, 2, 3)).show()
mbo_utilities.expand_paths(paths: str | Path | Sequence[str | Path]) → list[Path]

Expand a path, list of paths, or wildcard pattern into a sorted list of actual files.

This is a handy wrapper for loading images or data files when you’ve got a folder, some wildcards, or a mix of both.

Parameters:
paths : str, Path, or list of (str or Path)

Can be a single path, a wildcard pattern like “*.tif”, a folder, or a list of those.

Returns:
list of Path

Sorted list of full paths to matching files.

Examples

>>> expand_paths("data/\*.tif")
[Path("data/img_000.tif"), Path("data/img_001.tif"), ...]
>>> expand_paths(Path("data"))
[Path("data/img_000.tif"), Path("data/img_001.tif"), ...]
>>> expand_paths(["data/\*.tif", Path("more_data")])
[Path("data/img_000.tif"), Path("more_data/img_050.tif"), ...]
mbo_utilities.get_mbo_dirs() → dict

Ensure ~/mbo and its subdirectories exist.

Returns a dict with paths to the root, settings, and cache directories.
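
A brief sketch of looking up one of the returned paths; the key name below assumes the dict uses the plain names mentioned above (root, settings, cache):

>>> import mbo_utilities as mbo
>>> dirs = mbo.get_mbo_dirs()
>>> cache_dir = dirs["cache"]  # assumed key name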

mbo_utilities.load_ops(ops_input: str | Path | list[str | Path])

Simple utility to load a Suite2p ops.npy file.
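
A brief sketch of loading an ops file located with get_files; the search path is a placeholder:

>>> import mbo_utilities as mbo
>>> ops_file = mbo.get_files("path/to/suite2p", "ops.npy", max_depth=3)[0]
>>> ops = mbo.load_ops(ops_file)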

mbo_utilities.write_ops(metadata, raw_filename)

Write metadata to an ops file alongside the given filename. metadata must contain the ‘shape’, ‘pixel_resolution’, and ‘frame_rate’ keys.
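
A brief sketch of writing an ops file next to a raw recording; the filename and metadata values are placeholders, showing only the required keys named above:

>>> metadata = {
...     "shape": (1000, 512, 512),       # placeholder shape
...     "pixel_resolution": (1.0, 1.0),  # placeholder resolution
...     "frame_rate": 17.0,              # placeholder frame rate (Hz)
... }
>>> write_ops(metadata, "path/to/rawscan_00001.tif")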

mbo_utilities.get_metadata(file, z_step=None, verbose=False)

Extract metadata from a TIFF file or directory of TIFF files produced by ScanImage.

This function handles single files, lists of files, or directories containing TIFF files. When given a directory, it automatically finds and processes all TIFF files in natural sort order. For multiple files, it calculates frames per file accounting for z-planes.

Parameters:
file : os.PathLike, str, or list
  • Single file path: processes that file

  • Directory path: processes all TIFF files in the directory

  • List of file paths: processes all files in the list

z_step : float, optional

The z-step size in microns. If provided, it will be included in the returned metadata.

verbose : bool, optional

If True, returns extended metadata including all ScanImage attributes. Default is False.

Returns:
dict

A dictionary containing extracted metadata. For multiple files, includes:

  • ‘frames_per_file’: list of frame counts per file (accounting for z-planes)
  • ‘total_frames’: total frames across all files
  • ‘file_paths’: list of processed file paths
  • ‘tiff_pages_per_file’: raw TIFF page counts per file

Raises:
ValueError

If no recognizable metadata is found or no TIFF files found in directory.

Examples

>>> # Single file
>>> meta = get_metadata("path/to/rawscan_00001.tif")
>>> print(f"Frames: {meta['num_frames']}")
>>> # Directory of files
>>> meta = get_metadata("path/to/scan_directory/")
>>> print(f"Files processed: {len(meta['file_paths'])}")
>>> print(f"Frames per file: {meta['frames_per_file']}")
>>> # List of specific files
>>> files = ["scan_00001.tif", "scan_00002.tif", "scan_00003.tif"]
>>> meta = get_metadata(files)
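
The z_step and verbose keywords are not exercised above; a brief sketch with a placeholder z-step value:

>>> meta = get_metadata("path/to/rawscan_00001.tif", z_step=5.0, verbose=True)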