1. I/O
Functions to help with loading and saving data.
- mbo_utilities.imread(inputs: str | Path | np.ndarray | Sequence[str | Path], **kwargs)
Lazy load imaging data from supported file types.
Currently supported file types:
  - .bin: Suite2p binary files (.bin + ops.npy)
  - .tif/.tiff: TIFF files (BigTIFF, OME-TIFF and raw ScanImage TIFFs)
  - .h5: HDF5 files
  - .zarr: Zarr v3
  - .npy: NumPy arrays
  - np.ndarray: In-memory numpy arrays (wrapped as NumpyArray)
- Parameters:
- inputs : str, Path, ndarray, or sequence of str/Path
Input source. Can be:
  - Path to a file or directory
  - List/tuple of file paths
  - A numpy array (will be wrapped as NumpyArray for full imwrite support)
  - An existing lazy array (passed through unchanged)
- **kwargs
Extra keyword arguments passed to specific array readers.
- Returns:
- array_like
A lazy array appropriate for the input format. Run the mbo formats CLI command to list all supported formats and their array types.
Examples
>>> import numpy as np
>>> from mbo_utilities import imread, imwrite
>>> arr = imread("/data/raw")  # directory with supported files
>>> arr = imread("data.tiff")  # single file
>>> arr = imread(["file1.tiff", "file2.tiff"])  # multiple files

>>> # Wrap a numpy array for imwrite compatibility
>>> data = np.random.randn(100, 512, 512)
>>> arr = imread(data)  # returns NumpyArray
>>> imwrite(arr, "output", ext=".zarr")  # full write support
- mbo_utilities.get_files(base_dir, str_contains='', max_depth=1, sort_ascending=True, exclude_dirs=None) -> list | Path
Recursively search for files in a specified directory whose names contain a given substring, limiting the search to a maximum subdirectory depth. Optionally, the resulting list of file paths is sorted in ascending order using numeric parts of the filenames when available.
This function intelligently handles zarr stores: it stops recursing into leaf .zarr directories (those that don’t contain nested .zarr subdirs) to avoid traversing thousands of internal chunk directories.
- Parameters:
- base_dir : str or Path
The base directory where the search begins. This path is expanded (e.g., ‘~’ is resolved) and converted to an absolute path.
- str_contains : str, optional
A substring that must be present in a file’s name for it to be included in the result. If empty, all files are matched.
- max_depth : int, optional
The maximum number of subdirectory levels (relative to the base directory) to search. Defaults to 1. If set to 0, it is automatically reset to 1.
- sort_ascending : bool, optional
If True (default), the matched file paths are sorted in ascending alphanumeric order. The sort key extracts numeric parts from filenames so that, for example, “file2” comes before “file10”.
- exclude_dirs : iterable of str or Path, optional
An iterable of directories to exclude from the resulting list of file paths. By default, “.venv/”, “__pycache__/”, “.git”, and “.github” are excluded.
- Returns:
- list of str
A list of full file paths (as strings) for files within the base directory (and its subdirectories up to the specified depth) that contain the provided substring.
- Raises:
- FileNotFoundError
If the base directory does not exist.
- NotADirectoryError
If the specified base_dir is not a directory.
Examples
>>> import mbo_utilities as mbo
>>> # Get all files that contain "ops.npy" in their names by searching up to 3 levels deep:
>>> ops_files = mbo.get_files("path/to/files", "ops.npy", max_depth=3)
>>> # Get only files containing "tif" in the current directory (max_depth=1):
>>> tif_files = mbo.get_files("path/to/files", "tif")
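The numeric-aware ordering described for sort_ascending can be illustrated with a standard natural-sort key. This is a minimal sketch of the general technique, not the library's implementation; natural_key is a hypothetical helper name and the lowercase folding is an assumption.

```python
import re
from pathlib import Path


def natural_key(path):
    """Split a filename into text and integer chunks for numeric-aware sorting.

    Hypothetical helper for illustration; not part of mbo_utilities.
    """
    name = Path(path).name
    # re.split with a capturing group keeps the digit runs in the result,
    # so the list always alternates text, number, text, ...
    return [int(tok) if tok.isdecimal() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


files = ["data/file10.tif", "data/file2.tif", "data/file1.tif"]

# Plain string sorting compares character by character, so file10 precedes file2.
print(sorted(files))
# With the natural key, the numeric parts are compared as integers.
print(sorted(files, key=natural_key))
```

Because every key alternates strings and integers in the same positions, two keys can always be compared element by element without mixing types.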
- mbo_utilities.files_to_dask(files: list[str | Path], astype=None, chunk_t=250)
Lazily build a Dask array or list of arrays depending on filename tags.
“plane”, “z”, or “chan” -> stacked along Z (TZYX)
“roi” -> list of 3D (T,Y,X) arrays, one per ROI
otherwise -> concatenate all files in time (T)
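The tag-based dispatch above can be pictured as a small classifier over filenames. The sketch below only decides which of the three strategies applies and builds no Dask arrays. The helper name classify_by_tag, the tag pattern (a tag followed by an optional separator and digits), the case-insensitive matching, and the order in which tags are checked are all assumptions, not the library's actual rules.

```python
import re
from pathlib import Path

# Assumed tag pattern: a tag name followed by an optional "_" or "-" and digits,
# e.g. "plane01", "z_3", "roi2". The real matching rules may differ.
Z_TAG = re.compile(r"(plane|chan|z)[_-]?\d+", re.IGNORECASE)
ROI_TAG = re.compile(r"roi[_-]?\d+", re.IGNORECASE)


def classify_by_tag(files):
    """Return which assembly strategy the filename tags imply (hypothetical helper)."""
    stems = [Path(f).stem for f in files]
    if any(Z_TAG.search(s) for s in stems):
        return "stack_z"    # stacked along Z (TZYX)
    if any(ROI_TAG.search(s) for s in stems):
        return "per_roi"    # list of (T, Y, X) arrays, one per ROI
    return "concat_t"       # concatenate all files in time


print(classify_by_tag(["plane01.tif", "plane02.tif"]))
print(classify_by_tag(["roi1.tif", "roi2.tif"]))
print(classify_by_tag(["scan_00001.tif", "scan_00002.tif"]))
```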
- mbo_utilities.get_metadata(file, dx: float | None = None, dy: float | None = None, dz: float | None = None, z_step: float | None = None)
Extract metadata from a TIFF file or directory of TIFF files produced by ScanImage.
This function handles single files, lists of files, or directories containing TIFF files. When given a directory, it automatically finds and processes all TIFF files in natural sort order. For multiple files, it calculates frames per file accounting for z-planes.
- Parameters:
- file : os.PathLike, str, or list
  - Single file path: processes that file
  - Directory path: processes all TIFF files in the directory
  - List of file paths: processes all files in the list
- dx : float, optional
X pixel resolution in micrometers. Overrides extracted value.
- dy : float, optional
Y pixel resolution in micrometers. Overrides extracted value.
- dz : float, optional
Z step size in micrometers. Overrides extracted value. Also available as z_step for backward compatibility.
- verbose : bool, optional
If True, returns extended metadata including all ScanImage attributes. Default is False.
- z_step : float, optional
Alias for dz (backward compatibility).
- Returns:
- dict
A dictionary containing extracted metadata with normalized resolution aliases:
  - dx, dy, dz: canonical resolution values in micrometers
  - pixel_resolution: (dx, dy) tuple
  - voxel_size: (dx, dy, dz) tuple
  - umPerPixX, umPerPixY, umPerPixZ: legacy format
  - PhysicalSizeX, PhysicalSizeY, PhysicalSizeZ: OME format
For multiple files, also includes:
  - ‘frames_per_file’: list of frame counts per file (accounting for z-planes)
  - ‘total_frames’: total frames across all files
  - ‘file_paths’: list of processed file paths
  - ‘tiff_pages_per_file’: raw TIFF page counts per file
- Raises:
- ValueError
If no recognizable metadata is found or no TIFF files found in directory.
Examples
>>> from mbo_utilities import get_metadata
>>> # Single file with z-resolution
>>> meta = get_metadata("path/to/rawscan_00001.tif", dz=5.0)
>>> print(f"Voxel size: {meta['voxel_size']}")

>>> # Directory of files
>>> meta = get_metadata("path/to/scan_directory/")
>>> print(f"Files processed: {len(meta['file_paths'])}")
>>> print(f"Frames per file: {meta['frames_per_file']}")

>>> # List of specific files
>>> files = ["scan_00001.tif", "scan_00002.tif", "scan_00003.tif"]
>>> meta = get_metadata(files, dz=5.0)
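To make the resolution aliases in the returned dictionary concrete, the sketch below expands one set of canonical values into the alias keys listed under Returns. The helper name with_resolution_aliases is hypothetical, and the assumption that the legacy and OME keys carry the same numeric values as dx, dy, and dz is inferred from the key names rather than confirmed by the source.

```python
def with_resolution_aliases(dx, dy, dz):
    """Expand canonical resolutions (micrometers) into the documented alias keys.

    Hypothetical helper for illustration; not part of mbo_utilities.
    """
    return {
        # canonical values
        "dx": dx, "dy": dy, "dz": dz,
        # tuple forms
        "pixel_resolution": (dx, dy),
        "voxel_size": (dx, dy, dz),
        # legacy names (assumed to mirror dx, dy, dz)
        "umPerPixX": dx, "umPerPixY": dy, "umPerPixZ": dz,
        # OME names (assumed to mirror dx, dy, dz)
        "PhysicalSizeX": dx, "PhysicalSizeY": dy, "PhysicalSizeZ": dz,
    }


meta = with_resolution_aliases(dx=1.0, dy=1.0, dz=5.0)  # illustrative values
print(meta["voxel_size"])
print(meta["PhysicalSizeZ"] == meta["dz"])
```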
- mbo_utilities.expand_paths(paths: str | Path | Sequence[str | Path]) -> list[Path]
Expand a path, list of paths, or wildcard pattern into a sorted list of actual files.
This is a handy wrapper for loading images or data files when you’ve got a folder, some wildcards, or a mix of both.
- Parameters:
- paths : str, Path, or list of (str or Path)
Can be a single path, a wildcard pattern like “*.tif”, a folder, or a list of those.
- Returns:
- list of Path
Sorted list of full paths to matching files.
Examples
>>> expand_paths("data/*.tif")
[Path("data/img_000.tif"), Path("data/img_001.tif"), ...]
>>> expand_paths(Path("data"))
[Path("data/img_000.tif"), Path("data/img_001.tif"), ...]
>>> expand_paths(["data/*.tif", Path("more_data")])
[Path("data/img_000.tif"), Path("more_data/img_050.tif"), ...]
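The expansion behavior described above can be approximated with the standard library alone. The function expand_paths_sketch below is a hypothetical stand-in, not the library's code. It assumes that a folder contributes only the files directly inside it and that plain lexicographic ordering is an acceptable sort; the real function may recurse or sort differently.

```python
import glob
import tempfile
from pathlib import Path


def expand_paths_sketch(paths):
    """Expand files, folders, and wildcard patterns into a sorted list of Paths."""
    if isinstance(paths, (str, Path)):
        paths = [paths]
    found = set()
    for p in paths:
        p = Path(p)
        if p.is_dir():
            # Assumption: only files directly inside the folder are included.
            found.update(q for q in p.iterdir() if q.is_file())
        else:
            # glob.glob handles wildcard patterns and literal paths alike;
            # paths that do not exist simply produce no matches.
            found.update(Path(m) for m in glob.glob(str(p)) if Path(m).is_file())
    return sorted(found)


# Demonstrate on a throwaway directory so the example has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("img_001.tif", "img_000.tif", "notes.txt"):
        (root / name).touch()
    tifs = expand_paths_sketch(str(root / "*.tif"))
    everything = expand_paths_sketch(root)

print([p.name for p in tifs])  # ['img_000.tif', 'img_001.tif']
print(len(everything))         # 3
```

Collecting matches in a set before sorting removes duplicates when a file is matched by more than one input, for example by both a folder and a pattern.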
- mbo_utilities.get_mbo_dirs() -> dict
Ensure ~/.mbo and its subdirectories exist.
Returns a dict with paths to the root, settings, and cache directories.
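The ensure-then-return pattern described here is commonly written with pathlib. The sketch below runs against a temporary directory instead of the real home directory; the helper name ensure_dirs, the dictionary keys, and the subdirectory names are assumptions for illustration and may not match what get_mbo_dirs actually returns.

```python
import tempfile
from pathlib import Path


def ensure_dirs(base):
    """Create base and its subdirectories if missing, then return their paths.

    Hypothetical sketch; key names and subdirectory names are assumed.
    """
    base = Path(base)
    dirs = {
        "base": base,
        "settings": base / "settings",
        "cache": base / "cache",
    }
    for d in dirs.values():
        # parents=True creates missing ancestors; exist_ok=True makes repeat calls safe.
        d.mkdir(parents=True, exist_ok=True)
    return dirs


with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_dirs(Path(tmp) / ".mbo")
    created = all(p.is_dir() for p in dirs.values())
    # Calling it a second time must not raise.
    dirs_again = ensure_dirs(Path(tmp) / ".mbo")

print(created)
print(sorted(dirs))
```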
- mbo_utilities.load_ops(ops_input: str | Path | list[str | Path])
Simple utility to load a Suite2p .npy file (for example, ops.npy).
- mbo_utilities.write_ops(metadata, raw_filename, **kwargs)
Write metadata to an ops file alongside the given filename.
This creates a Suite2p-compatible ops.npy file from the provided metadata. The ops file is used by Suite2p for processing configuration.
- Parameters:
- metadata : dict
Must contain ‘shape’ key with (T, Y, X) dimensions. Optional keys: ‘pixel_resolution’, ‘frame_rate’, ‘fs’, ‘dx’, ‘dy’, ‘dz’.
- raw_filename : str or Path
Path to the data file (e.g., data_raw.bin). The ops.npy will be written to the same directory.
- **kwargs
Additional arguments. ‘structural=True’ indicates channel 2 data.
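An ops.npy file is a Python dictionary saved with NumPy, so the write-and-reload cycle can be sketched without mbo_utilities. The example below is not what write_ops does internally. The mapping from the ‘shape’ and ‘frame_rate’ metadata keys to the Suite2p-style keys nframes, Ly, Lx, and fs is an assumption, and the numbers are illustrative only.

```python
import tempfile
from pathlib import Path

import numpy as np

# Illustrative metadata with the required 'shape' key in (T, Y, X) order.
metadata = {"shape": (100, 512, 512), "frame_rate": 17.0}

# Assumed mapping from metadata to Suite2p-style ops keys.
T, Ly, Lx = metadata["shape"]
ops = {"nframes": T, "Ly": Ly, "Lx": Lx, "fs": metadata["frame_rate"]}

with tempfile.TemporaryDirectory() as tmp:
    raw_filename = Path(tmp) / "data_raw.bin"
    # The ops file sits next to the raw data file.
    ops_path = raw_filename.parent / "ops.npy"
    # np.save stores the dict as a pickled 0-d object array.
    np.save(ops_path, ops)
    # Reading it back requires allow_pickle=True; .item() unwraps the dict.
    loaded = np.load(ops_path, allow_pickle=True).item()

print(loaded["Ly"], loaded["Lx"], loaded["nframes"])  # 512 512 100
```

The allow_pickle=True flag is needed because the file holds a Python object and not a numeric array, so it should only be used on files from a trusted source.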