class typing.Dict[str,] = None, domain_identifier: str = None, lazy: bool = False) → None[source]


This DatasetReader is designed to read in the English OntoNotes v5.0 data for semantic role labelling. It returns a dataset of instances with the following fields:

tokens : TextField
The tokens in the sentence.
verb_indicator : SequenceLabelField
A sequence of binary indicators for whether the word is the verb for this frame.
tags : SequenceLabelField
A sequence of Propbank tags for the given verb in a BIO format.
token_indexers : Dict[str, TokenIndexer], optional

We similarly use this for both the premise and the hypothesis. See TokenIndexer. Default is {"tokens": SingleIdTokenIndexer()}.

domain_identifier: ``str``, (default = None)

A string denoting a sub-domain of the Ontonotes 5.0 dataset to use. If present, only conll files under paths containing this domain identifier will be processed.

A ``Dataset`` of ``Instances`` for Semantic Role Labelling.
classmethod from_params(params: allennlp.common.params.Params) →[source]
text_to_instance(tokens: typing.List[], verb_label: typing.List[int], tags: typing.List[str] = None) →[source]

We take pre-tokenized input here, along with a verb label. The verb label should be a one-hot binary vector, the same length as the tokens, indicating the position of the verb to find arguments for.