allennlp.models.semantic_parsing.nlvr

class allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, dropout: float = 0.0, rule_namespace: str = 'rule_labels') → None[source]

Bases: allennlp.models.model.Model

NlvrSemanticParser is a semantic parsing model built for the NLVR domain. This is an abstract class and does not have a forward method implemented. Classes that inherit from this class are expected to define their own logic depending on the kind of supervision they use. Accordingly, they should use the appropriate DecoderTrainer. This class provides some common functionality for things like defining an initial RnnStatelet, embedding actions, evaluating the denotations of completed logical forms, etc. There is a lot of overlap with WikiTablesSemanticParser here. We may want to eventually move the common functionality into a more general transition-based parsing class.

Parameters:
vocab : Vocabulary
sentence_embedder : TextFieldEmbedder

Embedder for sentences.

action_embedding_dim : int

Dimension to use for action embeddings.

encoder : Seq2SeqEncoder

The encoder to use for the input question.

dropout : float, optional (default=0.0)

Dropout on the encoder outputs.

rule_namespace : str, optional (default='rule_labels')

The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this.

decode(output_dict: typing.Dict[str, torch.Tensor]) → typing.Dict[str, torch.Tensor][source]

This method overrides Model.decode, which gets called after Model.forward, at test time, to finalize predictions. We only transform the action string sequences into logical forms here.

forward()[source]

Defines the forward pass of the model. In addition, to facilitate easy training, this method is designed to compute a loss function defined by a user.

The input is comprised of everything required to perform a training update, including labels - you define the signature here! It is down to the user to ensure that inference can be performed without the presence of these labels. Hence, any inputs not available at inference time should only be used inside a conditional block.

The intended sketch of this method is as follows:

def forward(self, input1, input2, targets=None):
    # ... any preprocessing of the inputs goes here ...
    output1 = self.layer1(input1)
    output2 = self.layer2(input2)
    output_dict = {"output1": output1, "output2": output2}
    if targets is not None:
        # Function returning a scalar torch.Tensor, defined by the user.
        loss = self._compute_loss(output1, output2, targets)
        output_dict["loss"] = loss
    return output_dict

Parameters:
inputs:

Tensors comprising everything needed to perform a training update, including labels, which should be optional (i.e. have a default value of None). At inference time, simply pass the relevant inputs, not including the labels.

Returns:
output_dict: Dict[str, torch.Tensor]

The outputs from the model. In order to train a model using the Trainer api, you must provide a “loss” key pointing to a scalar torch.Tensor representing the loss to be optimized.

class allennlp.models.semantic_parsing.nlvr.nlvr_coverage_semantic_parser.NlvrCoverageSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, attention: allennlp.modules.attention.attention.Attention, beam_size: int, max_decoding_steps: int, max_num_finished_states: int = None, dropout: float = 0.0, normalize_beam_score_by_length: bool = False, checklist_cost_weight: float = 0.6, dynamic_cost_weight: typing.Dict[str, typing.Union[int, float]] = None, penalize_non_agenda_actions: bool = False, initial_mml_model_file: str = None) → None[source]

Bases: allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser

NlvrCoverageSemanticParser is an NlvrSemanticParser that gets around the problem of lack of annotated logical forms by maximizing coverage of the output sequences over a prespecified agenda. In addition to the signal from coverage, we also compute the denotations given by the logical forms and define a hybrid cost based on coverage and denotation errors. The training process then minimizes the expected value of this cost over an approximate set of logical forms produced by the parser, obtained by performing beam search.
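The expected-cost objective described above can be illustrated with a small, framework-free sketch. It assumes that the finished beam states carry log-probabilities which are renormalized over the beam before the costs are averaged. The function names and the numbers are hypothetical and are not taken from the library.

```python
import math
from typing import List, Sequence


def renormalize(log_probs: Sequence[float]) -> List[float]:
    """Softmax over the log-probabilities of the finished beam states."""
    max_lp = max(log_probs)  # subtract the max for numerical stability
    exps = [math.exp(lp - max_lp) for lp in log_probs]
    total = sum(exps)
    return [e / total for e in exps]


def expected_cost(log_probs: Sequence[float], costs: Sequence[float]) -> float:
    """Expected cost under the distribution renormalized over the beam."""
    probs = renormalize(log_probs)
    return sum(p * c for p, c in zip(probs, costs))


# Three hypothetical finished logical forms with made-up scores and costs.
beam_log_probs = [-1.0, -2.0, -3.0]
beam_costs = [0.2, 1.0, 0.0]
loss = expected_cost(beam_log_probs, beam_costs)
```

Because the distribution is restricted to the beam, the loss only approximates the expectation over all logical forms, and a larger beam tightens that approximation at the price of more computation.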

Parameters:
vocab : Vocabulary

Passed to super-class.

sentence_embedder : TextFieldEmbedder

Passed to super-class.

action_embedding_dim : int

Passed to super-class.

encoder : Seq2SeqEncoder

Passed to super-class.

attention : Attention

We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the TransitionFunction.

beam_size : int

Beam size for the beam search used during training.

max_num_finished_states : int, optional (default=None)

Maximum number of finished states the trainer should compute costs for.

normalize_beam_score_by_length : bool, optional (default=False)

Should the log probabilities be normalized by length before renormalizing them? Edunov et al. do this in their work, but we found that not doing it works better. It’s possible they did this because their task is NMT, and longer decoded sequences are not necessarily worse, and shouldn’t be penalized, while we will mostly want to penalize longer logical forms.

max_decoding_steps : int

Maximum number of steps for the beam search during training.

dropout : float, optional (default=0.0)

Probability of dropout to apply on encoder outputs, decoder outputs and predicted actions.

checklist_cost_weight : float, optional (default=0.6)

Mixture weight (0-1) for combining coverage cost and denotation cost. As this increases, we weigh the coverage cost higher, with a value of 1.0 meaning that we do not care about denotation accuracy.

dynamic_cost_weight : Dict[str, Union[int, float]], optional (default=None)

A dict containing keys wait_num_epochs and rate indicating the number of epochs after which we should start decreasing the weight on checklist cost in favor of denotation cost, and the rate at which we should do it. We will decrease the weight in the following way: checklist_cost_weight = checklist_cost_weight - rate * checklist_cost_weight, starting at the appropriate epoch. The weight will remain constant if this is not provided.

penalize_non_agenda_actions : bool, optional (default=False)

Should we penalize the model for producing terminal actions that are outside the agenda?

initial_mml_model_file : str, optional (default=None)

If you want to initialize this model using weights from another model trained using MML, pass the path to the model.tar.gz file of that model here.
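As a rough illustration of how checklist_cost_weight and dynamic_cost_weight interact, the sketch below mixes two costs with a convex combination and applies the update rule quoted above once per epoch. Both helper functions are hypothetical. In particular, the exact epoch at which decay begins (here, the first epoch strictly after wait_num_epochs) is an assumption, not something the documentation specifies.

```python
def hybrid_cost(coverage_cost: float,
                denotation_cost: float,
                checklist_cost_weight: float) -> float:
    """Convex combination; a weight of 1.0 ignores the denotation cost."""
    return (checklist_cost_weight * coverage_cost
            + (1.0 - checklist_cost_weight) * denotation_cost)


def decayed_weight(initial_weight: float,
                   rate: float,
                   wait_num_epochs: int,
                   epoch: int) -> float:
    """Apply weight = weight - rate * weight once per epoch after the wait."""
    weight = initial_weight
    # Assumption: no decay until `epoch` exceeds `wait_num_epochs`.
    for _ in range(max(0, epoch - wait_num_epochs)):
        weight = weight - rate * weight
    return weight


# Hypothetical schedule: start at 0.6, halve the weight per epoch after epoch 2.
weights = [decayed_weight(0.6, 0.5, 2, epoch) for epoch in range(5)]
```

Under this rule the weight shrinks geometrically, so it approaches but never reaches zero, and the coverage signal is phased out gradually rather than switched off.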

forward(sentence: typing.Dict[str, torch.LongTensor], worlds: typing.List[typing.List[allennlp.semparse.worlds.nlvr_world.NlvrWorld]], actions: typing.List[typing.List[allennlp.data.fields.production_rule_field.ProductionRule]], agenda: torch.LongTensor, identifier: typing.List[str] = None, labels: torch.LongTensor = None, epoch_num: typing.List[int] = None) → typing.Dict[str, torch.Tensor][source]

Decoder logic for producing type constrained target sequences that maximize coverage of their respective agendas, and minimize a denotation based loss.
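To make "coverage of an agenda" concrete, here is a minimal sketch that counts how many agenda actions a decoded sequence failed to produce. This is a simplified stand-in, assuming actions can be compared as plain strings. The cost the model actually uses may be computed differently (for example over tensors), and the action strings below are invented for illustration.

```python
from typing import List, Set


def coverage_cost(agenda: Set[str], action_sequence: List[str]) -> int:
    """Number of agenda actions that the decoded sequence never produced."""
    return len(agenda - set(action_sequence))


# Hypothetical agenda and decoded action sequence.
agenda = {"color -> yellow", "shape -> square"}
decoded = ["start -> bool", "color -> yellow", "shape -> circle"]
missing = coverage_cost(agenda, decoded)
```

A cost of zero means every agenda item appears somewhere in the decoded sequence, which says nothing about whether the resulting logical form has the correct denotation. That gap is why the coverage signal is combined with a denotation-based cost.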

get_metrics(reset: bool = False) → typing.Dict[str, float][source]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. This is also compatible with Metric objects: metrics should be populated during the call to forward, with the Metric handling the accumulation of the metric until this method is called.
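The accumulate-then-reset pattern described here can be sketched without the library. The RunningAverage and ToyModel classes below are hypothetical stand-ins written for this example. They are not the AllenNLP Metric or Model classes, and only mimic the division of labour between forward and get_metrics.

```python
from typing import Dict


class RunningAverage:
    """Toy accumulator: collects values and reports their mean."""

    def __init__(self) -> None:
        self._total = 0.0
        self._count = 0

    def __call__(self, value: float) -> None:
        self._total += value
        self._count += 1

    def get_metric(self, reset: bool = False) -> float:
        average = self._total / self._count if self._count else 0.0
        if reset:
            # Clear state so the next epoch starts from scratch.
            self._total, self._count = 0.0, 0
        return average


class ToyModel:
    def __init__(self) -> None:
        self._accuracy = RunningAverage()

    def forward(self, correct: bool) -> None:
        # Metrics are populated during the forward call.
        self._accuracy(1.0 if correct else 0.0)

    def get_metrics(self, reset: bool = False) -> Dict[str, float]:
        return {"accuracy": self._accuracy.get_metric(reset)}


model = ToyModel()
for outcome in [True, False, True, True]:
    model.forward(outcome)
epoch_metrics = model.get_metrics(reset=True)
```

Calling get_metrics with reset=True at the end of an epoch returns the accumulated value and clears the state, so the next epoch's numbers are not contaminated by earlier batches.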

class allennlp.models.semantic_parsing.nlvr.nlvr_direct_semantic_parser.NlvrDirectSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, sentence_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, attention: allennlp.modules.attention.attention.Attention, decoder_beam_search: allennlp.state_machines.beam_search.BeamSearch, max_decoding_steps: int, dropout: float = 0.0) → None[source]

Bases: allennlp.models.semantic_parsing.nlvr.nlvr_semantic_parser.NlvrSemanticParser

NlvrDirectSemanticParser is an NlvrSemanticParser that gets around the problem of lack of logical form annotations by maximizing the marginal likelihood of an approximate set of target sequences that yield the correct denotation. The main difference between this parser and NlvrCoverageSemanticParser is that while this parser takes the output of an offline search process as the set of target sequences for training, the latter performs search during training.
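Maximizing marginal likelihood over a set of candidate sequences amounts to summing their probabilities and taking the log. A minimal sketch, assuming the log-probability of each offline-search candidate is already available, follows. The function name and the example values are hypothetical.

```python
import math
from typing import Sequence


def log_marginal_likelihood(sequence_log_probs: Sequence[float]) -> float:
    """log(sum_i p_i), computed stably with the log-sum-exp trick."""
    max_lp = max(sequence_log_probs)
    return max_lp + math.log(
        sum(math.exp(lp - max_lp) for lp in sequence_log_probs)
    )


# Hypothetical log-probabilities of three candidate action sequences
# that all evaluate to the correct denotation.
candidate_log_probs = [math.log(0.1), math.log(0.2), math.log(0.05)]
loss = -log_marginal_likelihood(candidate_log_probs)
```

Because the loss only depends on the total probability mass of the candidate set, the model is free to concentrate that mass on whichever candidates it finds easiest to generate.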

Parameters:
vocab : Vocabulary

Passed to super-class.

sentence_embedder : TextFieldEmbedder

Passed to super-class.

action_embedding_dim : int

Passed to super-class.

encoder : Seq2SeqEncoder

Passed to super-class.

attention : Attention

We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to the TransitionFunction.

decoder_beam_search : BeamSearch

Beam search used to retrieve best sequences after training.

max_decoding_steps : int

Maximum number of steps for beam search after training.

dropout : float, optional (default=0.0)

Probability of dropout to apply on encoder outputs, decoder outputs and predicted actions.

forward(sentence: typing.Dict[str, torch.LongTensor], worlds: typing.List[typing.List[allennlp.semparse.worlds.nlvr_world.NlvrWorld]], actions: typing.List[typing.List[allennlp.data.fields.production_rule_field.ProductionRule]], identifier: typing.List[str] = None, target_action_sequences: torch.LongTensor = None, labels: torch.LongTensor = None) → typing.Dict[str, torch.Tensor][source]

Decoder logic for producing type constrained target sequences, trained to maximize marginal likelihood over a set of approximate logical forms.

get_metrics(reset: bool = False) → typing.Dict[str, float][source]

Returns a dictionary of metrics. This method will be called by allennlp.training.Trainer in order to compute and use model metrics for early stopping and model serialization. We return an empty dictionary here rather than raising as it is not required to implement metrics for a new model. A boolean reset parameter is passed, as frequently a metric accumulator will have some state which should be reset between epochs. This is also compatible with Metric objects: metrics should be populated during the call to forward, with the Metric handling the accumulation of the metric until this method is called.