allennlp.data.dataset_readers.atis

class allennlp.data.dataset_readers.atis.AtisDatasetReader(token_indexers: typing.Dict[str, allennlp.data.token_indexers.token_indexer.TokenIndexer] = None, lazy: bool = False, tokenizer: allennlp.data.tokenizers.tokenizer.Tokenizer = None, database_directory: str = None) → None

Bases: allennlp.data.dataset_readers.dataset_reader.DatasetReader

This DatasetReader reads JSON files and converts them into Instances for the AtisSemanticParser.

Each line in the file is a JSON object that represents an interaction in the ATIS dataset and has the following keys and values:

    "id": The original filepath in the LDC corpus
    "interaction": <list where each element represents a turn in the interaction>
    "scenario": A code that refers to the scenario that served as the prompt for this interaction
    "ut_date": Date of the interaction
    "zc09_path": Path that was used in the original paper "Learning Context-Dependent Mappings from Sentences to Logical Form" <https://www.semanticscholar.org/paper/Learning-Context-Dependent-Mappings-from-Sentences-Zettlemoyer-Collins/44a8fcee0741139fa15862dc4b6ce1e11444878f> by Zettlemoyer and Collins (ACL/IJCNLP 2009)

Each element in the interaction list has the following keys and values:

    "utterance": Natural language input
    "sql": A list of SQL queries that the utterance maps to; there may be several queries or none at all
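As a rough illustration of the format described above, the following sketch parses one such line using only the Python standard library. Only the key names come from the description; every value (paths, scenario code, date, SQL text) is an invented placeholder, and the helper name parse_interaction is hypothetical rather than part of AllenNLP.

```python
import json

# A hypothetical line from an input file. Every value here is an invented
# placeholder; only the key names come from the format described above.
example_line = json.dumps({
    "id": "hypothetical/ldc/path",
    "interaction": [
        {"utterance": "show me flights from boston to denver",
         "sql": ["SELECT placeholder_query_1"]},
        {"utterance": "which of those leave in the morning",
         "sql": []},  # a turn may have no SQL queries at all
    ],
    "scenario": "hypothetical-scenario-code",
    "ut_date": "hypothetical-date",
    "zc09_path": "hypothetical/zc09/path",
})


def parse_interaction(line: str) -> dict:
    """Parse one line of the file and check that the documented keys exist."""
    record = json.loads(line)
    expected_keys = {"id", "interaction", "scenario", "ut_date", "zc09_path"}
    missing = expected_keys - record.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return record


record = parse_interaction(example_line)
for turn in record["interaction"]:
    # Each turn carries the utterance and a (possibly empty) list of queries.
    print(turn["utterance"], "->", len(turn["sql"]), "SQL queries")
```

The key check is only a convenience for catching malformed lines early; the reader itself may handle missing fields differently.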

Parameters:
token_indexers : Dict[str, TokenIndexer], optional

Token indexers for the utterances. Will default to {"tokens": SingleIdTokenIndexer()}.

lazy : bool, optional (default=False)

Passed to DatasetReader. If this is True, training will start sooner, but will take longer per batch.

tokenizer : Tokenizer, optional

Tokenizer to use for the utterances. Will default to WordTokenizer() with spaCy's tagger enabled.

database_directory : str, optional

The directory containing the SQLite database file. The database is queried to find the strings that are allowed.

text_to_instance(utterances: typing.List[str], sql_query: str = None) → allennlp.data.instance.Instance
Parameters:
utterances : List[str], required

List of utterances in the interaction; the last element is the current utterance.

sql_query : str, optional

The SQL query, given as the label during training or validation.
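The sketch below shows one way a caller might assemble these two arguments from the turns of an interaction, so that the last element of each utterance list is the current utterance. It is not the reader's actual implementation: the helper name iter_text_to_instance_args is hypothetical, and choosing the first SQL query of a turn as the label (or None when the turn has no queries) is an assumption made for illustration.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def iter_text_to_instance_args(
    interaction: List[Dict],
) -> Iterator[Tuple[List[str], Optional[str]]]:
    """Yield (utterances, sql_query) pairs, one per turn of an interaction."""
    history: List[str] = []
    for turn in interaction:
        history.append(turn["utterance"])
        queries = turn.get("sql") or []
        # Assumption: the first query serves as the label; None if there is none.
        sql_query = queries[0] if queries else None
        # Copy the history so later turns do not mutate earlier results.
        yield list(history), sql_query


# Hypothetical turns with placeholder SQL strings.
turns = [
    {"utterance": "show me flights from boston to denver",
     "sql": ["SELECT placeholder_query_1", "SELECT placeholder_query_2"]},
    {"utterance": "which of those leave in the morning", "sql": []},
]
args = list(iter_text_to_instance_args(turns))
```

Each resulting pair could then be passed to text_to_instance as (utterances, sql_query), with sql_query omitted at prediction time.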