DataLoader

Bases: Iterator[tuple[npt.NDArray, Coordinate, Coordinate]]

A data loader is an iterator that yields batches from the dataset. The pipeline uses it to fetch the batches for inference.

Notes
  • A batch contains the data, the minimum x coordinates and the minimum y coordinates of a batch of tiles
  • The data loader uses multiple threads to fetch the samples from the dataset
  • The data loader can prefetch multiple batches
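The notes above can be illustrated with a minimal sketch of a thread-based, prefetching batch iterator. This is not the library's implementation: the class name `SketchDataLoader`, the helper `_collate`, and the list-based sample layout `(data, x_min, y_min)` are assumptions made for illustration, and plain Python lists stand in for NumPy arrays so the example needs only the standard library.

```python
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Iterator, Sequence

# Hypothetical stand-ins: one sample is (tile data, x_min, y_min).
Sample = tuple[list[float], int, int]
Batch = tuple[list[list[float]], list[int], list[int]]


class SketchDataLoader:
    """Illustrative prefetching batch iterator (an assumption, not the real class)."""

    def __init__(
        self,
        dataset: Sequence[Sample],
        batch_size: int = 1,
        num_workers: int = 4,
        num_prefetched_batches: int = 1,
    ) -> None:
        self.dataset = dataset
        self.batch_size = batch_size
        self.num_workers = num_workers
        self.num_prefetched_batches = num_prefetched_batches

    @staticmethod
    def _collate(futures: list[Future]) -> Batch:
        # Wait for every sample of the batch, then split it into three lists.
        samples = [future.result() for future in futures]
        data = [sample[0] for sample in samples]
        x_min = [sample[1] for sample in samples]
        y_min = [sample[2] for sample in samples]
        return data, x_min, y_min

    def __iter__(self) -> Iterator[Batch]:
        # Each queue entry holds the per-sample futures of one batch.
        pending: deque[list[Future]] = deque()
        with ThreadPoolExecutor(max_workers=self.num_workers) as pool:
            for start in range(0, len(self.dataset), self.batch_size):
                stop = min(start + self.batch_size, len(self.dataset))
                # Worker threads fetch the samples of this batch concurrently.
                pending.append(
                    [pool.submit(self.dataset.__getitem__, i) for i in range(start, stop)]
                )
                # Yield the oldest batch once enough batches are queued ahead of it.
                if len(pending) > self.num_prefetched_batches:
                    yield self._collate(pending.popleft())
            # Drain the batches that are still queued.
            while pending:
                yield self._collate(pending.popleft())


# Toy dataset of five tiles; the values are made up for the example.
dataset = [([float(i)], i * 10, i * 20) for i in range(5)]
loader = SketchDataLoader(dataset, batch_size=2, num_workers=2, num_prefetched_batches=1)
batches = list(loader)
```

In this sketch, batches are yielded in dataset order, and while the consumer processes one batch, up to `num_prefetched_batches` further batches are already being fetched by the worker threads.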

Examples:

Assume the dataset is already created.

You can create a data loader and iterate over the batches.

>>> data_loader = DataLoader(
...     dataset=dataset,
...     batch_size=1,
...     num_workers=4,
...     num_prefetched_batches=1,
... )
...
>>> for data, x_min, y_min in data_loader:
...     ...

PARAMETER DESCRIPTION

dataset
    dataset to fetch the samples from
    TYPE: Dataset

batch_size
    number of tiles per batch
    TYPE: int DEFAULT: 1

num_workers
    number of workers used to fetch the samples
    TYPE: int DEFAULT: 4

num_prefetched_batches
    number of batches to prefetch
    TYPE: int DEFAULT: 1