datasets.base_dataset
class mmf.datasets.base_dataset.BaseDataset(dataset_name, config, dataset_type='train', *args, **kwargs)
Base class for implementing a dataset. Inherits from PyTorch’s Dataset class and adds some custom functionality on top. Processors mentioned in the configuration are automatically initialized for the end user.
Parameters:
- dataset_name (str) – Name of your dataset, used as its representative in text strings
- dataset_type (str) – Type of your dataset. Normally train|val|test
- config (DictConfig) – Configuration for the current dataset
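The constructor arguments above can be illustrated with a minimal subclassing sketch. To stay runnable without MMF installed, it uses `BaseDatasetStandIn`, a hypothetical stand-in that mirrors only the documented constructor signature; the real `BaseDataset` also initializes processors and does more. The `ToyCaptionDataset` class, the `samples` config key, and the use of a plain dict in place of a `DictConfig` are assumptions made for illustration.

```python
class BaseDatasetStandIn:
    """Hypothetical stand-in: mirrors only the documented constructor signature."""

    def __init__(self, dataset_name, config, dataset_type="train", *args, **kwargs):
        self.dataset_name = dataset_name
        self.config = config
        self.dataset_type = dataset_type


class ToyCaptionDataset(BaseDatasetStandIn):
    """Hypothetical child dataset that keeps a few samples in memory."""

    def __init__(self, config, dataset_type="train", *args, **kwargs):
        # The child fixes its own name and forwards everything else to the base.
        super().__init__("toy_caption", config, dataset_type, *args, **kwargs)
        # "samples" is an assumed config key, not part of MMF.
        self.samples = list(config.get("samples", []))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return {"text": self.samples[idx], "dataset_name": self.dataset_name}


# A plain dict stands in for the DictConfig here (assumption).
dataset = ToyCaptionDataset({"samples": ["a cat", "a dog"]}, dataset_type="val")
```

With the real class, the same pattern would subclass `mmf.datasets.base_dataset.BaseDataset` and receive a `DictConfig` from the dataset builder.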
load_item(idx)
Implement if you need to separately load the item and cache it.
Parameters: idx (int) – Index of the sample to be loaded.
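One way to read "separately load the item and cache it" is sketched below. This is not MMF's implementation: `CachedItemDataset`, its in-memory `_cache` dict, and the `load_calls` counter are hypothetical names used only to show `load_item` doing the expensive work once per index while `__getitem__` serves repeated requests from the cache.

```python
class CachedItemDataset:
    """Hypothetical dataset (not MMF's BaseDataset) that caches loaded items."""

    def __init__(self, raw_records):
        self.raw_records = raw_records
        self._cache = {}      # idx -> already loaded item
        self.load_calls = 0   # counts how often load_item actually ran

    def load_item(self, idx):
        # Stand-in for expensive work such as reading and decoding a file.
        self.load_calls += 1
        return {"id": idx, "text": self.raw_records[idx].strip().lower()}

    def __getitem__(self, idx):
        # Load on first access only; later accesses reuse the cached item.
        if idx not in self._cache:
            self._cache[idx] = self.load_item(idx)
        return self._cache[idx]

    def __len__(self):
        return len(self.raw_records)


ds = CachedItemDataset(["  Hello ", "World"])
first = ds[0]
second = ds[0]  # served from the cache, load_item is not called again
```

Keeping the loading logic in `load_item` leaves `__getitem__` free to decide when, and whether, to call it.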
prepare_batch(batch)
Can be overridden in your child class. Not supported with the Lightning trainer.
Prepares the batch for passing to the model. Whatever is returned from here is passed directly to the model’s forward function. Currently moves the batch to the proper device.
Parameters: batch (SampleList) – Sample list containing the currently loaded batch
Returns: A sample list representing the currently loaded batch
Return type: sample_list (SampleList)
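Overriding `prepare_batch` in a child class might look like the sketch below. It runs without MMF or PyTorch, so every class in it is a hypothetical stand-in: `FakeSampleList` imitates only a `to(device)` method, `BaseStandIn.prepare_batch` imitates only the "move the batch to the proper device" behavior described above, and the `num_tokens` field is invented for illustration.

```python
class FakeSampleList(dict):
    """Hypothetical stand-in for a sample list; only imitates .to(device)."""

    def to(self, device):
        moved = FakeSampleList(self)  # shallow copy of the fields
        moved.device = device         # record where the batch "lives"
        return moved


class BaseStandIn:
    """Hypothetical base class: prepare_batch only moves the batch to a device."""

    def __init__(self, device="cpu"):
        self.device = device

    def prepare_batch(self, batch):
        return batch.to(self.device)


class LengthAnnotatingDataset(BaseStandIn):
    """Hypothetical child class that adds a field before the model sees the batch."""

    def prepare_batch(self, batch):
        # Keep the base behavior (device placement), then extend the batch.
        batch = super().prepare_batch(batch)
        batch["num_tokens"] = [len(text.split()) for text in batch["text"]]
        return batch


raw_batch = FakeSampleList(text=["a cat", "a small dog"])
prepared = LengthAnnotatingDataset(device="cpu").prepare_batch(raw_batch)
```

Calling the parent method first preserves the device placement, and whatever the override returns is what the model's forward function would receive.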