allennlp.data.dataset_readers.sharded_dataset_reader

ShardedDatasetReader

ShardedDatasetReader(
    self,
    base_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader,
    **kwargs,
) -> None

Wraps another dataset reader and uses it to read from multiple input files. The file_path passed to read() should be a glob; the reader returns instances from every file that matches it.

Files are processed in a deterministic order so that instances can be filtered by worker rank in the distributed case.

Parameters

  • base_reader : DatasetReader
    Reader with a read method that accepts a single file.
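
The glob expansion and deterministic ordering described above can be illustrated with a minimal, library-free sketch. This is not the AllenNLP implementation: read_sharded, read_lines, demo, and the rank and world_size parameters are hypothetical names, and filtering per item with a modulo is an assumption about how rank-based filtering might work. The point it shows is that sorting the matched paths gives every worker the same sequence, which is what makes index-based filtering safe.

```python
import glob
import os
import tempfile
from typing import Callable, Iterable, Iterator, List


def read_sharded(
    file_pattern: str,
    read_one_file: Callable[[str], Iterable[str]],
    rank: int = 0,
    world_size: int = 1,
) -> Iterator[str]:
    """Yield items from every file matching file_pattern, in a stable order."""
    # glob.glob makes no ordering guarantee, so sort the paths to get
    # the same sequence on every worker.
    shards = sorted(glob.glob(file_pattern))
    index = 0
    for shard in shards:
        for item in read_one_file(shard):
            # Assumed filtering scheme: keep every world_size-th item,
            # offset by this worker's rank.
            if index % world_size == rank:
                yield item
            index += 1


def read_lines(path: str) -> List[str]:
    # Hypothetical stand-in for a base reader's read(): one item per line.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def demo() -> List[List[str]]:
    # Write two small shards, then read them as two simulated workers.
    with tempfile.TemporaryDirectory() as tmp:
        shards = [("shard_b.txt", ["b1", "b2"]), ("shard_a.txt", ["a1", "a2"])]
        for name, lines in shards:
            with open(os.path.join(tmp, name), "w") as f:
                f.write("\n".join(lines) + "\n")
        pattern = os.path.join(tmp, "shard_*.txt")
        return [
            list(read_sharded(pattern, read_lines, rank=r, world_size=2))
            for r in range(2)
        ]
```

Because both simulated workers walk the shards in the same sorted order, their outputs are disjoint and together cover every item exactly once.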

text_to_instance

ShardedDatasetReader.text_to_instance(
    self,
    *args,
    **kwargs,
) -> allennlp.data.instance.Instance

Delegates to the base reader's text_to_instance.
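
As a rough illustration of this delegation pattern, the sketch below forwards all positional and keyword arguments to a wrapped object unchanged. UppercaseReader and ShardedWrapper are hypothetical classes invented for this example, and a plain dict stands in for an Instance; none of this is taken from the AllenNLP source.

```python
from typing import Any, Dict


class UppercaseReader:
    """Hypothetical base reader; stands in for a real DatasetReader."""

    def text_to_instance(self, text: str, label: str = "none") -> Dict[str, Any]:
        # A dict is used here in place of an Instance.
        return {"text": text.upper(), "label": label}


class ShardedWrapper:
    """Hypothetical wrapper that illustrates the delegation pattern only."""

    def __init__(self, base_reader: UppercaseReader) -> None:
        self.reader = base_reader

    def text_to_instance(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        # Forward positional and keyword arguments unchanged, so the wrapper
        # accepts whatever signature the base reader defines.
        return self.reader.text_to_instance(*args, **kwargs)


wrapper = ShardedWrapper(UppercaseReader())
instance = wrapper.text_to_instance("hello", label="greeting")
```

Accepting *args and **kwargs keeps the wrapper independent of the base reader's signature, so it does not need to change when a different reader is wrapped.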