datasets.base_dataset
class mmf.datasets.base_dataset.BaseDataset(dataset_name, config, dataset_type='train', *args, **kwargs)
Base class for implementing a dataset. Inherits from PyTorch's Dataset class and adds some custom functionality on top. Processors mentioned in the configuration are automatically initialized for the end user.
Parameters:
- dataset_name (str) – Name of your dataset, used as its representation in text strings
- dataset_type (str) – Type of your dataset, normally one of train, val, or test
- config (DictConfig) – Configuration for the current dataset
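The constructor arguments above can be illustrated with a small sketch. It does not import MMF: `StandInBaseDataset` is a hypothetical stand-in that mirrors only the documented constructor signature, and `ToyCaptionDataset`, `toy_config`, and the per-type config layout are assumptions made for illustration. A real child class would inherit from `BaseDataset` instead.

```python
class StandInBaseDataset:
    """Hypothetical stand-in mirroring only the documented constructor
    signature of BaseDataset; this is not MMF code."""

    def __init__(self, dataset_name, config, dataset_type="train", *args, **kwargs):
        self.dataset_name = dataset_name
        self.config = config
        self.dataset_type = dataset_type


class ToyCaptionDataset(StandInBaseDataset):
    """Hypothetical child dataset holding a few in-memory captions."""

    def __init__(self, config, dataset_type="train"):
        # The child class fixes the dataset name and forwards the rest.
        super().__init__("toy_captions", config, dataset_type)
        # Assumed config layout: one list of captions per dataset type.
        self.captions = list(config[dataset_type])

    def __len__(self):
        return len(self.captions)

    def __getitem__(self, idx):
        return {"text": self.captions[idx], "dataset_name": self.dataset_name}


toy_config = {"train": ["a cat", "a dog"], "val": ["a bird"]}
train_set = ToyCaptionDataset(toy_config, dataset_type="train")
val_set = ToyCaptionDataset(toy_config, dataset_type="val")
```

The same class serves every split; only `dataset_type` changes which part of the configuration is read.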
load_item(idx)
Implement this method if you need to load the item separately and cache it.
Parameters: idx (int) – Index of the sample to be loaded.
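One way to read "load the item separately and cache it" is sketched below. This is an assumption about typical usage, not MMF's implementation: `CachedToyDataset`, its `_cache` dictionary, and the `load_count` counter are hypothetical names, and the class does not inherit from `BaseDataset` so that the snippet runs without MMF installed.

```python
class CachedToyDataset:
    """Hypothetical dataset standing in for a BaseDataset child class."""

    def __init__(self, raw_samples):
        self.raw_samples = raw_samples
        self._cache = {}
        self.load_count = 0  # counts how often the expensive path runs

    def load_item(self, idx):
        # Build the sample once, then serve later requests from the cache.
        if idx not in self._cache:
            self.load_count += 1
            self._cache[idx] = {"id": idx, "tokens": self.raw_samples[idx].split()}
        return self._cache[idx]

    def __len__(self):
        return len(self.raw_samples)

    def __getitem__(self, idx):
        # Indexing delegates to load_item, so caching stays in one place.
        return self.load_item(idx)


dataset = CachedToyDataset(["hello world", "visual question answering"])
first = dataset[1]
second = dataset[1]  # served from the cache, no second load
```

Keeping the loading logic in `load_item` and calling it from `__getitem__` means the cache policy can change without touching the indexing code.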
prepare_batch(batch)
Can be overridden in your child class.
Prepares the batch for passing to the model. Whatever is returned from here is passed directly to the model's forward function. Currently moves the batch to the proper device.
Parameters: batch (SampleList) – Sample list containing the currently loaded batch
Returns: A sample representing the currently loaded batch
Return type: sample_list (SampleList)
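An override of `prepare_batch` might look like the following sketch. It relies on stand-ins rather than MMF itself: `ToyBatch` is a hypothetical replacement for `SampleList` (a dict with a `to()` method), `StandInBase` imitates only the "move the batch to a device" behaviour described above, and `LowercasingDataset` is an invented child class.

```python
class ToyBatch(dict):
    """Hypothetical stand-in for a SampleList: a dict with a to() method."""

    def to(self, device):
        # Return a copy that records which device it was "moved" to.
        moved = ToyBatch(self)
        moved["device"] = device
        return moved


class StandInBase:
    """Stand-in for the documented default: move the batch to a device."""

    def __init__(self, device="cpu"):
        self.device = device

    def prepare_batch(self, batch):
        return batch.to(self.device)


class LowercasingDataset(StandInBase):
    """Hypothetical child class that adds one step after the default."""

    def prepare_batch(self, batch):
        # Keep the default device handling, then apply the custom step.
        batch = super().prepare_batch(batch)
        batch["text"] = [t.lower() for t in batch["text"]]
        return batch


dataset = LowercasingDataset(device="cpu")
prepared = dataset.prepare_batch(ToyBatch(text=["A Cat", "A DOG"]))
```

Calling `super().prepare_batch(batch)` first preserves the device move, so the override only adds behaviour and whatever it returns is what the model's forward function would receive.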