allennlp.models.semantic_parsing.wikitables

class allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, max_decoding_steps: int, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels', tables_directory: str = '/wikitables/') → None[source]

Bases: allennlp.models.model.Model

A WikiTablesSemanticParser is a Model which takes as input a table and a question, and produces a logical form that answers the question when executed over the table. The logical form is generated by a type-constrained, transition-based parser. This is an abstract class that defines most of the functionality related to the transition-based parser; it does not contain the implementation for actually training the parser. To train with a learning-to-search algorithm, use WikiTablesErmSemanticParser; if you have a set of approximate logical forms that give the correct denotation, use WikiTablesMmlSemanticParser.

Parameters:
vocab : Vocabulary
question_embedder : TextFieldEmbedder

Embedder for questions.

action_embedding_dim : int

Dimension to use for action embeddings.

encoder : Seq2SeqEncoder

The encoder to use for the input question.

entity_encoder : Seq2VecEncoder

The encoder used to average the words of an entity.

max_decoding_steps : int

When we’re decoding with a beam search, what’s the maximum number of steps we should take? This only applies at evaluation time, not during training.

use_neighbor_similarity_for_linking : bool, optional (default=False)

If True, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column feature (a toy sketch of this term follows this parameter list).

dropout : float, optional (default=0)

If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer).

num_linking_features : int, optional (default=10)

We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the KnowledgeGraphField, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question.

rule_namespace : str, optional (default=rule_labels)

The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this.

tables_directory : str, optional (default=/wikitables/)

The directory to find tables when evaluating logical forms. We rely on a call to SEMPRE to evaluate logical forms, and SEMPRE needs to read the table from disk itself. This tells SEMPRE where to find the tables.
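
The use_neighbor_similarity_for_linking term above can be illustrated with a toy computation (the shapes and tensors here are made up for the example; the real term is computed inside the model’s linking logic):

    import torch

    num_question_tokens, num_neighbors, embedding_dim = 6, 3, 50
    question_embeddings = torch.randn(num_question_tokens, embedding_dim)
    # Embeddings of one entity's neighbors in the table's knowledge graph.
    neighbor_embeddings = torch.randn(num_neighbors, embedding_dim)
    # Similarity of every question token to every neighbor, then a max over
    # neighbors: one neighbor-similarity score per question token.
    similarities = question_embeddings @ neighbor_embeddings.t()
    neighbor_similarity, _ = similarities.max(dim=-1)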

decode(output_dict: typing.Dict[str, torch.Tensor]) → typing.Dict[str, torch.Tensor][source]

This method overrides Model.decode, which gets called after Model.forward, at test time, to finalize predictions. This is (confusingly) a separate notion from the “decoder” in “encoder/decoder”, where that decoder logic lives in WikiTablesDecoderStep.

This method trims the output predictions to the first end symbol, replaces indices with corresponding tokens, and adds a field called predicted_tokens to the output_dict.
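
A rough sketch of the trimming this performs (all names and indices here are illustrative; the real method reads its vocabulary and end symbol internally):

    # Suppose decoding produced these prediction indices for one example.
    output_dict = {}                          # in practice, the dict returned by forward()
    predicted_indices = [12, 4, 7, 2, 9]      # suppose 2 is the end symbol index
    index_to_token = {12: 'tok_a', 4: 'tok_b', 7: 'tok_c', 2: '@end@', 9: 'tok_d'}
    end_index = 2
    if end_index in predicted_indices:
        predicted_indices = predicted_indices[:predicted_indices.index(end_index)]
    output_dict['predicted_tokens'] = [index_to_token[i] for i in predicted_indices]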

get_metrics(reset: bool = False) → typing.Dict[str, float][source]

We track three metrics here:

1. dpd_acc, which is the percentage of the time that our best output action sequence is in the set of action sequences provided by DPD (dynamic programming on denotations). This is an easy-to-compute lower bound on denotation accuracy for the set of examples where we actually have DPD output. We only score dpd_acc on that subset.

2. denotation_acc, which is the percentage of examples where we get the correct denotation. This is the typical “accuracy” metric, and it is what you should usually report in an experimental result. You need to be careful, though, that you’re computing this on the full data, and not just the subset that has DPD output (make sure you pass “keep_if_no_dpd=True” to the dataset reader, which we do for validation data, but not training data).

3. lf_percent, which is the percentage of time that decoding actually produces a finished logical form. We might not produce a valid logical form if the decoder gets into a repetitive loop, or we’re trying to produce a super long logical form and run out of time steps, or something.
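
For example, after a validation pass one might read these out as follows (a minimal sketch; model is assumed to be a trained parser, and the keys follow the metric names above):

    metrics = model.get_metrics(reset=True)   # reset=True clears the accumulators
    print(metrics['dpd_acc'], metrics['denotation_acc'], metrics['lf_percent'])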

class allennlp.models.semantic_parsing.wikitables.wikitables_mml_semantic_parser.WikiTablesMmlSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, mixture_feedforward: allennlp.modules.feedforward.FeedForward, decoder_beam_search: allennlp.nn.decoding.beam_search.BeamSearch, max_decoding_steps: int, input_attention: allennlp.modules.attention.attention.Attention, training_beam_size: int = None, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels', tables_directory: str = '/wikitables/') → None[source]

Bases: allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser

A WikiTablesMmlSemanticParser is a WikiTablesSemanticParser which is trained to maximize the marginal likelihood of an approximate set of logical forms which give the correct denotation. This is a re-implementation of the model used for the paper Neural Semantic Parsing with Type Constraints for Semi-Structured Tables, by Jayant Krishnamurthy, Pradeep Dasigi, and Matt Gardner (EMNLP 2017).

WORK STILL IN PROGRESS. We’ll iteratively improve it until we’ve reproduced the performance of the original parser.
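
Concretely, the training objective marginalizes over the approximate logical forms. A minimal sketch of the loss for a single example, assuming made-up scores and pure PyTorch:

    import torch

    # Log-probabilities the model assigns to each candidate action sequence
    # that yields the correct denotation (e.g., the DPD output).
    sequence_log_probs = torch.tensor([-1.2, -2.3, -4.0])
    # Maximizing the marginal likelihood = minimizing its negative log.
    loss = -torch.logsumexp(sequence_log_probs, dim=0)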

Parameters:
vocab : Vocabulary
question_embedder : TextFieldEmbedder

Embedder for questions. Passed to super class.

action_embedding_dim : int

Dimension to use for action embeddings. Passed to super class.

encoder : Seq2SeqEncoder

The encoder to use for the input question. Passed to super class.

entity_encoder : Seq2VecEncoder

The encoder used to average the words of an entity. Passed to super class.

decoder_beam_search : BeamSearch

When we’re not training, this is how we will do decoding.

max_decoding_steps : int

When we’re decoding with a beam search, what’s the maximum number of steps we should take? This only applies at evaluation time, not during training. Passed to super class.

input_attention : Attention

We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to WikiTablesDecoderStep.

training_beam_size : int, optional (default=None)

If given, we will use a constrained beam search of this size during training, so that we use only the top training_beam_size action sequences according to the model in the MML computation. If this is None, we will use all of the provided action sequences in the MML computation.

use_neighbor_similarity_for_linking : bool, optional (default=False)

If True, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column feature. Passed to super class.

dropout : float, optional (default=0)

If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer). Passed to super class.

num_linking_features : int, optional (default=10)

We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the KnowledgeGraphField, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question. Passed to super class.

rule_namespace : str, optional (default=rule_labels)

The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this. Passed to super class.

tables_directory : str, optional (default=/wikitables/)

The directory to find tables when evaluating logical forms. We rely on a call to SEMPRE to evaluate logical forms, and SEMPRE needs to read the table from disk itself. This tells SEMPRE where to find the tables. Passed to super class.

forward(question: typing.Dict[str, torch.LongTensor], table: typing.Dict[str, torch.LongTensor], world: typing.List[allennlp.semparse.worlds.wikitables_world.WikiTablesWorld], actions: typing.List[typing.List[typing.Tuple[[str, bool], typing.Union[torch.Tensor, NoneType]]]], example_lisp_string: typing.List[str] = None, target_action_sequences: torch.LongTensor = None) → typing.Dict[str, torch.Tensor][source]

In this method we encode the table entities, link them to words in the question, then encode the question. Then we set up the initial state for the decoder, and pass that state off to either a DecoderTrainer, if we’re training, or a BeamSearch for inference, if we’re not.

Parameters:
question : Dict[str, torch.LongTensor]

The output of TextField.as_array() applied on the question TextField. This will be passed through a TextFieldEmbedder and then through an encoder.

table : Dict[str, torch.LongTensor]

The output of KnowledgeGraphField.as_array() applied on the table KnowledgeGraphField. This output is similar to a TextField output, where each entity in the table is treated as a “token”, and we will use a TextFieldEmbedder to get embeddings for each entity.

world : List[WikiTablesWorld]

We use a MetadataField to get the World for each input instance. Because of how MetadataField works, this gets passed to us as a List[WikiTablesWorld].

actions : List[List[ProductionRuleArray]]

A list of all possible actions for each World in the batch, indexed into a ProductionRuleArray using a ProductionRuleField. We will embed all of these and use the embeddings to determine which action to take at each timestep in the decoder.

example_lisp_string : List[str], optional (default=None)

The example (lisp-formatted) string corresponding to the given input. This comes directly from the .examples file provided with the dataset. We pass this to SEMPRE when evaluating denotation accuracy; it is otherwise unused.

target_action_sequences : torch.Tensor, optional (default=None)

A list of possibly valid action sequences, where each action is an index into the list of possible actions. This tensor has shape (batch_size, num_action_sequences, sequence_length).
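
Schematically, the control flow described above looks something like this (pseudocode only; every helper name below is a hypothetical stand-in for logic that lives inside the model):

    # Encode the table entities and link them to question tokens.
    entity_encodings, linking_scores = encode_and_link(table, question)
    encoder_outputs = encode_question(question)
    # Set up the initial decoder state from the encodings.
    initial_state = build_initial_state(encoder_outputs, linking_scores, actions)
    if training:
        # Maximize marginal likelihood over the target action sequences.
        outputs = decoder_trainer.decode(initial_state, decoder_step,
                                         target_action_sequences)
    else:
        outputs = beam_search.search(max_decoding_steps, initial_state, decoder_step)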

classmethod from_params(vocab, params: allennlp.common.params.Params) → allennlp.models.semantic_parsing.wikitables.wikitables_mml_semantic_parser.WikiTablesMmlSemanticParser[source]
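
As with other AllenNLP models, this is typically built from a configuration file. A rough sketch of what the params might look like (the key names mirror the constructor arguments and are assumptions; consult the experiment configs shipped with the repository for the authoritative structure):

    from allennlp.common.params import Params

    params = Params({
        'question_embedder': {'tokens': {'type': 'embedding', 'embedding_dim': 200}},
        'action_embedding_dim': 100,
        'encoder': {'type': 'lstm', 'input_size': 200, 'hidden_size': 100},
        'entity_encoder': {'type': 'boe', 'embedding_dim': 200},
        'decoder_beam_search': {'beam_size': 10},
        'max_decoding_steps': 40,
    })
    model = WikiTablesMmlSemanticParser.from_params(vocab, params)  # vocab built elsewhere
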
class allennlp.models.semantic_parsing.wikitables.wikitables_erm_semantic_parser.WikiTablesErmSemanticParser(vocab: allennlp.data.vocabulary.Vocabulary, question_embedder: allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder, action_embedding_dim: int, encoder: allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder, entity_encoder: allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder, mixture_feedforward: allennlp.modules.feedforward.FeedForward, input_attention: allennlp.modules.attention.attention.Attention, decoder_beam_size: int, decoder_num_finished_states: int, max_decoding_steps: int, normalize_beam_score_by_length: bool = False, checklist_cost_weight: float = 0.6, use_neighbor_similarity_for_linking: bool = False, dropout: float = 0.0, num_linking_features: int = 10, rule_namespace: str = 'rule_labels', tables_directory: str = '/wikitables/', initial_mml_model_file: str = None) → None[source]

Bases: allennlp.models.semantic_parsing.wikitables.wikitables_semantic_parser.WikiTablesSemanticParser

A WikiTablesErmSemanticParser is a WikiTablesSemanticParser that learns to search for logical forms that yield the correct denotations.

Parameters:
vocab : Vocabulary
question_embedder : TextFieldEmbedder

Embedder for questions. Passed to super class.

action_embedding_dim : int

Dimension to use for action embeddings. Passed to super class.

encoder : Seq2SeqEncoder

The encoder to use for the input question. Passed to super class.

entity_encoder : Seq2VecEncoder

The encoder used to average the words of an entity. Passed to super class.

input_attention : Attention

We compute an attention over the input question at each step of the decoder, using the decoder hidden state as the query. Passed to WikiTablesDecoderStep.

decoder_beam_size : int

Beam size to be used by the ExpectedRiskMinimization algorithm.

decoder_num_finished_states : int

Number of finished states for which costs will be computed by the ExpectedRiskMinimization algorithm.

max_decoding_steps : int

Maximum number of steps the decoder should take before giving up. Used both during training and evaluation. Passed to super class.

normalize_beam_score_by_length : bool, optional (default=False)

Should we normalize the log-probabilities by length before renormalizing the beam? This was shown to work better for NMT by Edunov et al., but that may not be the case for semantic parsing.

checklist_cost_weight : float, optional (default=0.6)

Mixture weight (0-1) for combining coverage cost and denotation cost. As this increases, we weigh the coverage cost higher, with a value of 1.0 meaning that we do not care about denotation accuracy; that is, cost = checklist_cost_weight * coverage_cost + (1 - checklist_cost_weight) * denotation_cost.

use_neighbor_similarity_for_linking : bool, optional (default=False)

If True, we will compute a max similarity between a question token and the neighbors of an entity as a component of the linking scores. This is meant to capture the same kind of information as the related_column feature. Passed to super class.

dropout : float, optional (default=0)

If greater than 0, we will apply dropout with this probability after all encoders (pytorch LSTMs do not apply dropout to their last layer). Passed to super class.

num_linking_features : int, optional (default=10)

We need to construct a parameter vector for the linking features, so we need to know how many there are. The default of 10 here matches the default in the KnowledgeGraphField, which is to use all ten defined features. If this is 0, another term will be added to the linking score. This term contains the maximum similarity value from the entity’s neighbors and the question. Passed to super class.

rule_namespace : str, optional (default=rule_labels)

The vocabulary namespace to use for production rules. The default corresponds to the default used in the dataset reader, so you likely don’t need to modify this. Passed to super class.

tables_directory : str, optional (default=/wikitables/)

The directory to find tables when evaluating logical forms. We rely on a call to SEMPRE to evaluate logical forms, and SEMPRE needs to read the table from disk itself. This tells SEMPRE where to find the tables. Passed to super class.

initial_mml_model_file : str, optional (default=None)

If you want to initialize this model using weights from another model trained using MML, pass the path to the model.tar.gz file of that model here.
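
The weight transfer implied here amounts to roughly the following (a sketch of the idea only; the model performs this internally when the parameter is given, and erm_model stands in for an already-constructed parser):

    from allennlp.models.archival import load_archive

    mml_archive = load_archive('/path/to/mml/model.tar.gz')   # path is illustrative
    # Copy matching parameters from the MML-trained parser into this one.
    erm_model.load_state_dict(mml_archive.model.state_dict(), strict=False)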

forward(question: typing.Dict[str, torch.LongTensor], table: typing.Dict[str, torch.LongTensor], world: typing.List[allennlp.semparse.worlds.wikitables_world.WikiTablesWorld], actions: typing.List[typing.List[typing.Tuple[[str, bool], typing.Union[torch.Tensor, NoneType]]]], agenda: torch.LongTensor, example_lisp_string: typing.List[str]) → typing.Dict[str, torch.Tensor][source]
Parameters:
question : Dict[str, torch.LongTensor]

The output of TextField.as_array() applied on the question TextField. This will be passed through a TextFieldEmbedder and then through an encoder.

table : Dict[str, torch.LongTensor]

The output of KnowledgeGraphField.as_array() applied on the table KnowledgeGraphField. This output is similar to a TextField output, where each entity in the table is treated as a “token”, and we will use a TextFieldEmbedder to get embeddings for each entity.

world : List[WikiTablesWorld]

We use a MetadataField to get the World for each input instance. Because of how MetadataField works, this gets passed to us as a List[WikiTablesWorld].

actions : List[List[ProductionRuleArray]]

A list of all possible actions for each World in the batch, indexed into a ProductionRuleArray using a ProductionRuleField. We will embed all of these and use the embeddings to determine which action to take at each timestep in the decoder.

example_lisp_string : List[str]

The example (lisp-formatted) string corresponding to the given input. This comes directly from the .examples file provided with the dataset. We pass this to SEMPRE when evaluating denotation accuracy; it is otherwise unused.

classmethod from_params(vocab, params: allennlp.common.params.Params) → allennlp.models.semantic_parsing.wikitables.wikitables_erm_semantic_parser.WikiTablesErmSemanticParser[source]
get_metrics(reset: bool = False) → typing.Dict[str, float][source]

The base class returns a dict with dpd accuracy, denotation accuracy, and logical form percentage metrics. We add the agenda coverage metric here.

class allennlp.models.semantic_parsing.wikitables.wikitables_decoder_state.WikiTablesDecoderState(batch_indices: typing.List[int], action_history: typing.List[typing.List[int]], score: typing.List[torch.Tensor], rnn_state: typing.List[allennlp.nn.decoding.rnn_state.RnnState], grammar_state: typing.List[allennlp.nn.decoding.grammar_state.GrammarState], action_embeddings: torch.Tensor, output_action_embeddings: torch.Tensor, action_biases: torch.Tensor, action_indices: typing.Dict[typing.Tuple[int, int], int], possible_actions: typing.List[typing.List[typing.Tuple[[str, bool], typing.Union[torch.Tensor, NoneType]]]], flattened_linking_scores: torch.FloatTensor, actions_to_entities: typing.Dict[typing.Tuple[int, int], int], entity_types: typing.Dict[int, int], world: typing.List[allennlp.semparse.worlds.wikitables_world.WikiTablesWorld] = None, example_lisp_string: typing.List[str] = None, checklist_state: typing.List[allennlp.nn.decoding.checklist_state.ChecklistState] = None, debug_info: typing.List = None) → None[source]

Bases: allennlp.nn.decoding.decoder_state.DecoderState

Parameters:
batch_indices : List[int]

Passed to super class; see docs there.

action_history : List[List[int]]

Passed to super class; see docs there.

score : List[torch.Tensor]

Passed to super class; see docs there.

rnn_state : List[RnnState]

An RnnState for every group element. This keeps track of the current decoder hidden state, the previous decoder output, the output from the encoder (for computing attentions), and other state that a typical seq2seq decoder tracks.

grammar_state : List[GrammarState]

This holds the current grammar state for each element of the group. The GrammarState keeps track of which actions are currently valid.

action_embeddings : torch.Tensor

The global action embeddings tensor. Has shape (num_global_embeddable_actions, action_embedding_dim).

output_action_embeddings : torch.Tensor

The global output action embeddings tensor. Has shape (num_global_embeddable_actions, action_embedding_dim).

action_biases : torch.Tensor

A vector of biases for each action. Has shape (num_global_embeddable_actions, 1).

action_indices : Dict[Tuple[int, int], int]

A mapping from (batch_index, action_index) to global_action_index.

possible_actions : List[List[ProductionRuleArray]]

The list of all possible actions that was passed to model.forward(). We need this so we can recover production strings, which we need to update grammar states.

flattened_linking_scores : torch.FloatTensor

Linking scores between table entities and question tokens. The unflattened version has shape (batch_size, num_entities, num_question_tokens), though this version is flattened to have shape (batch_size * num_entities, num_question_tokens), for easier lookups with index_select (a toy illustration follows this parameter list).

actions_to_entities : Dict[Tuple[int, int], int]

A mapping from (batch_index, action_index) to a flattened entity index (a row index into flattened_linking_scores, in the range [0, batch_size * num_entities)), for actions that are terminal entity productions.

entity_types : Dict[int, int]

A mapping from flattened entity indices (same as the values in the actions_to_entities dictionary) to entity type indices. This represents what type each entity has, which we will use for getting type embeddings in certain circumstances.

world : List[WikiTablesWorld], optional (default=None)

The worlds corresponding to elements in the batch. We store them here because they’re required for executing logical forms to determine costs while training, if we’re learning to search. Otherwise, they’re not required. Note that the worlds are batched, and they will be passed around unchanged during the decoding process.

example_lisp_string : List[str], optional (default=None)

The lisp strings that come from example files. Like the worlds, these are only required for evaluating logical forms when we’re learning to search. These too are batched, and will be passed around unchanged.

checklist_state : List[ChecklistState], optional (default=None)

If you are using this state within a parser being trained for coverage, we need to store a ChecklistState which keeps track of the coverage information. Not needed if you are using a non-coverage based training algorithm.
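
A toy illustration of the flattening and lookup described for flattened_linking_scores and actions_to_entities (the shapes and the mapping are made up for the example):

    import torch

    batch_size, num_entities, num_question_tokens = 2, 3, 4
    linking_scores = torch.randn(batch_size, num_entities, num_question_tokens)
    flattened = linking_scores.view(batch_size * num_entities, num_question_tokens)
    # (batch_index, action_index) -> row of `flattened`, for terminal entity productions.
    actions_to_entities = {(0, 17): 2, (1, 5): 4}
    rows = torch.LongTensor([actions_to_entities[(0, 17)], actions_to_entities[(1, 5)]])
    entity_scores = flattened.index_select(0, rows)   # shape (2, num_question_tokens)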

classmethod combine_states(states: typing.List[_ForwardRef('WikiTablesDecoderState')]) → allennlp.models.semantic_parsing.wikitables.wikitables_decoder_state.WikiTablesDecoderState[source]
get_valid_actions() → typing.List[typing.List[int]][source]

Returns a list of valid actions for each element of the group.

is_finished() → bool[source]
print_action_history(group_index: int = None) → None[source]
class allennlp.models.semantic_parsing.wikitables.wikitables_decoder_step.WikiTablesDecoderStep(encoder_output_dim: int, action_embedding_dim: int, input_attention: allennlp.modules.attention.attention.Attention, num_start_types: int, num_entity_types: int, mixture_feedforward: allennlp.modules.feedforward.FeedForward = None, dropout: float = 0.0, unlinked_terminal_indices: typing.List[int] = None) → None[source]

Bases: allennlp.nn.decoding.decoder_step.DecoderStep

Parameters:
encoder_output_dim : int
action_embedding_dim : int
input_attention : Attention
num_start_types : int
num_entity_types : int
mixture_feedforward : FeedForward (optional, default=None)
dropout : float (optional, default=0.0)
unlinked_terminal_indices : List[int], (optional, default=None)

If we are training a parser to maximize coverage using a checklist, we need to know the global indices of the unlinked terminal productions to be able to compute the checklist corresponding to those terminals, and project a concatenation of the current hidden state, attended encoder input and the current checklist balance into the action space. This is not needed if we are training the parser using target action sequences.
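
A toy illustration of the checklist balance this enables (the representation is assumed for the example; see ChecklistState for the real bookkeeping):

    import torch

    # One entry per unlinked terminal production on the agenda.
    target_checklist = torch.tensor([1., 1., 0.])   # the agenda wants terminals 0 and 1
    checklist = torch.tensor([1., 0., 0.])          # terminal 0 has been produced so far
    # Positive balance marks productions the agenda still wants.
    checklist_balance = (target_checklist - checklist).clamp(min=0)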

attend_on_question(query: torch.Tensor, encoder_outputs: torch.Tensor, encoder_output_mask: torch.Tensor) → typing.Tuple[torch.Tensor, torch.Tensor][source]

Given a query (which is typically the decoder hidden state), compute an attention over the output of the question encoder, and return a weighted sum of the question representations given this attention. We also return the attention weights themselves.

This is a simple computation, but we have it as a separate method so that the forward method on the main parser module can call it on the initial hidden state, to simplify the logic in take_step.
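
A minimal dot-product version of this computation (the model’s actual Attention module may score the query differently):

    import torch

    def attend_on_question(query, encoder_outputs, encoder_output_mask):
        # query: (group_size, encoder_output_dim)
        # encoder_outputs: (group_size, num_question_tokens, encoder_output_dim)
        # encoder_output_mask: (group_size, num_question_tokens), 1 for real tokens
        scores = encoder_outputs.bmm(query.unsqueeze(-1)).squeeze(-1)
        scores = scores.masked_fill(encoder_output_mask == 0, float('-inf'))
        attention_weights = torch.softmax(scores, dim=-1)
        attended_question = attention_weights.unsqueeze(1).bmm(encoder_outputs).squeeze(1)
        return attended_question, attention_weights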

take_step(state: allennlp.models.semantic_parsing.wikitables.wikitables_decoder_state.WikiTablesDecoderState, max_actions: int = None, allowed_actions: typing.List[typing.Set[int]] = None) → typing.List[allennlp.models.semantic_parsing.wikitables.wikitables_decoder_state.WikiTablesDecoderState][source]

The main method in the DecoderStep API. This function defines the computation done at each step of decoding and returns a ranked list of next states.

The input state is grouped, to allow for efficient computation, but the output states should all have a group_size of 1, to make things easier on the decoding algorithm. They will get regrouped later as needed.

Because of the way we handle grouping in the decoder states, constructing a new state is actually a relatively expensive operation. If you know a priori that only some of the states will be needed (either because you have a set of gold action sequences, or you have a fixed beam size), passing that information into this function will keep us from constructing more states than we need, which will greatly speed up your computation.

IMPORTANT: This method must return states already sorted by their score, otherwise BeamSearch and other methods will break. For efficiency, we do not perform an additional sort in those methods.

Parameters:
state : DecoderState

The current state of the decoder, which we will take a step from. We may be grouping together computation for several states here. Because we can have several states for each instance in the original batch being evaluated at the same time, we use group_size for this kind of batching, and batch_size for the original batch in model.forward.

max_actions : int, optional

If you know that you will only need a certain number of states out of this (e.g., in a beam search), you can pass in the max number of actions that you need, and we will only construct that many states (for each batch instance - not for each group instance!). This can save a whole lot of computation if you have an action space that’s much larger than your beam size.

allowed_actions : List[Set], optional

If the DecoderTrainer has constraints on which actions need to be evaluated (e.g., maximum marginal likelihood only needs to evaluate action sequences in a given set), you can pass those constraints here, to avoid constructing state objects unnecessarily. If there are no constraints from the trainer, passing a value of None here will allow all actions to be considered.

This is a list because it is batched - every instance in the batch has a set of allowed actions. Note that the size of this list is the group_size in the DecoderState, not the batch_size of model.forward. The training algorithm needs to convert from the batched allowed action sequences that it has to a grouped allowed action sequence list.

Returns:
next_states : List[DecoderState]

A list of next states, ordered by score.
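
Sketch of how a decoding algorithm might drive this method (initial_state, decoder_step, and the step limit are assumed names; BeamSearch in allennlp.nn.decoding does something along these lines):

    from allennlp.models.semantic_parsing.wikitables.wikitables_decoder_state import (
        WikiTablesDecoderState)

    beam_size, max_decoding_steps = 10, 40
    states, finished = [initial_state], []
    for _ in range(max_decoding_steps):
        if not states:
            break
        # Group states for efficient batched computation, then take one step.
        grouped_state = WikiTablesDecoderState.combine_states(states)
        next_states = decoder_step.take_step(grouped_state, max_actions=beam_size)
        states = []
        for state in next_states:   # each has group_size 1, already sorted by score
            (finished if state.is_finished() else states).append(state)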