allennlp.semparse.worlds

exception allennlp.semparse.worlds.world.ExecutionError(message)[source]

Bases: Exception

This exception gets raised when you’re trying to execute a logical form that your executor does not understand. This may be because your logical form contains a function with an invalid name or a set of arguments whose types do not match those that the function expects.

exception allennlp.semparse.worlds.world.ParsingError(message)[source]

Bases: Exception

This exception gets raised when there is a parsing error during logical form processing. This might happen because you’re not handling the full set of possible logical forms, for instance, and having this error provides a consistent way to catch those errors and log how frequently this occurs.
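As a rough illustration of the "catch and log how frequently this occurs" pattern, the sketch below wraps each parse in a try/except and keeps a tally. It is self-contained: the ParsingError class here is a local stand-in for the one documented above, and toy_parse and parse_all are hypothetical helpers, not part of the library.

```python
from collections import Counter
from typing import List, Tuple


class ParsingError(Exception):
    """Local stand-in for allennlp.semparse.worlds.world.ParsingError."""


def toy_parse(logical_form: str) -> List[str]:
    # Hypothetical parser: it only accepts forms that start with "(".
    if not logical_form.startswith("("):
        raise ParsingError(f"cannot parse: {logical_form}")
    return logical_form.strip("()").split()


def parse_all(logical_forms: List[str]) -> Tuple[List[List[str]], Counter]:
    """Parse every form, counting failures instead of crashing on them."""
    stats = Counter()
    parsed = []
    for form in logical_forms:
        try:
            parsed.append(toy_parse(form))
            stats["ok"] += 1
        except ParsingError:
            # One consistent place to record unsupported logical forms.
            stats["parsing_error"] += 1
    return parsed, stats
```

Catching only ParsingError (rather than a bare Exception) keeps genuine bugs visible while still letting unsupported logical forms be skipped and counted.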

class allennlp.semparse.worlds.world.World(constant_type_prefixes: typing.Dict[str, nltk.sem.logic.BasicType] = None, global_type_signatures: typing.Dict[str, nltk.sem.logic.Type] = None, global_name_mapping: typing.Dict[str, str] = None, num_nested_lambdas: int = 0) → None[source]

Bases: object

Base class for defining a world in a new domain. This class defines a method to translate a logical form as per a naming convention that works with NLTK’s LogicParser. The sub-classes can decide on the convention by overriding the _map_name method that does token level mapping. This class also defines methods for transforming logical form strings into parsed Expressions, and Expressions into action sequences.

Parameters:
constant_type_prefixes : Dict[str, BasicType] (optional)

If you have an unbounded number of constants in your domain, you are required to add prefixes to their names to denote their types. This is the mapping from prefixes to types.

global_type_signatures : Dict[str, Type] (optional)

A mapping from translated names to their types.

global_name_mapping : Dict[str, str] (optional)

A name mapping from the original names in the domain to the translated names.

num_nested_lambdas : int (optional)

Does the language used in this World permit lambda expressions? And if so, how many nested lambdas do we need to worry about? This is important when considering the space of all possible actions, which we need to enumerate a priori for the parser.

all_possible_actions() → typing.List[str][source]
get_action_sequence(expression: nltk.sem.logic.Expression) → typing.List[str][source]

Returns the sequence of actions (as strings) that resulted in the given expression.

get_basic_types() → typing.Set[nltk.sem.logic.Type][source]

Returns the set of basic types (types of entities) in the world.

get_logical_form(action_sequence: typing.List[str], add_var_function: bool = True) → str[source]

Takes an action sequence and constructs a logical form from it. This is useful if you want to get a logical form from a decoded sequence of actions generated by a transition-based semantic parser.

Parameters:
action_sequence : List[str]

The sequence of actions as strings (e.g., ['{START_SYMBOL} -> t', 't -> <e,t>', ...]).

add_var_function : bool (optional)

var is a special function that some languages use within lambda functions to indicate the use of a variable (e.g., (lambda x (fb:row.row.year (var x)))). Due to the way constrained decoding is currently implemented, it is easier for the decoder to not produce these functions. In that case, setting this flag adds the function in the logical form even though it is not present in the action sequence.
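The general idea of turning an action sequence back into a logical form can be sketched without the library. The function below is a simplified, hypothetical reconstruction, not the library's implementation: it assumes actions are listed in depth-first, left-to-right order, that a bracketed right-hand side such as [<e,t>, e] lists the children of a node, and that any symbol appearing on a left-hand side is a non-terminal. The literal "@start@" stands in for START_SYMBOL, and the var handling controlled by add_var_function is not modelled.

```python
from typing import List


def actions_to_logical_form(action_sequence: List[str]) -> str:
    """Rebuild a lisp-like logical form from depth-first production rules."""
    rules = [tuple(action.split(" -> ")) for action in action_sequence]
    # Assumption: anything that is ever expanded is a non-terminal.
    nonterminals = {lhs for lhs, _ in rules}
    position = 1  # rules[0] is the start production

    def expand(symbol: str) -> str:
        nonlocal position
        if symbol not in nonterminals:
            return symbol  # terminal: emit it as-is
        lhs, rhs = rules[position]
        if lhs != symbol:
            raise ValueError(f"expected a rule for {symbol}, got {lhs}")
        position += 1
        if rhs.startswith("["):
            # Bracketed right-hand side: one parenthesised group of children.
            children = [expand(child) for child in rhs[1:-1].split(", ")]
            return "(" + " ".join(children) + ")"
        return expand(rhs)

    return expand(rules[0][1])
```

For example, the four actions "@start@ -> t", "t -> [<e,t>, e]", "<e,t> -> is_red" and "e -> apple" are consumed in order and yield the string "(is_red apple)".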

get_name_mapping() → typing.Dict[str, str][source]
get_paths_to_root(action: str, max_path_length: int = 20, beam_size: int = 30, max_num_paths: int = 10) → typing.List[typing.List[str]][source]

For a given action, returns at most max_num_paths paths to the root (production with START_SYMBOL) that are not longer than max_path_length.
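A bounded search of this kind can be sketched as a breadth-first walk from the action's left-hand side up to the start symbol. The code below is an illustrative assumption rather than the library's code: it takes the full list of productions as an explicit argument (all_actions, a hypothetical parameter), uses "@start@" as a stand-in for START_SYMBOL, and leaves out the pruning that beam_size suggests.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

START = "@start@"  # illustrative stand-in for START_SYMBOL


def paths_to_root(action: str,
                  all_actions: List[str],
                  max_path_length: int = 20,
                  max_num_paths: int = 10) -> List[List[str]]:
    # Index every production by each symbol on its right-hand side.
    by_rhs: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for production in all_actions:
        lhs, rhs = production.split(" -> ")
        symbols = rhs[1:-1].split(", ") if rhs.startswith("[") else [rhs]
        for symbol in symbols:
            by_rhs[symbol].append((lhs, production))

    frontier = [(action.split(" -> ")[0], [action])]
    completed: List[List[str]] = []
    while frontier and len(completed) < max_num_paths:
        next_frontier = []
        for lhs, path in frontier:
            if lhs == START:
                completed.append(path)  # reached the root production
                continue
            if len(path) >= max_path_length:
                continue  # give up on paths that would grow too long
            for parent_lhs, parent in by_rhs[lhs]:
                if parent not in path:  # avoid cycles
                    next_frontier.append((parent_lhs, path + [parent]))
        frontier = next_frontier
    return completed[:max_num_paths]
```

Each returned path starts with the given action and ends with a production whose left-hand side is the start symbol.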

get_type_signatures() → typing.Dict[str, str][source]
get_valid_actions() → typing.Dict[str, typing.List[str]][source]
get_valid_starting_types() → typing.Set[nltk.sem.logic.Type][source]

Returns the set of all types t, such that actions {START_SYMBOL} -> t are valid. In other words, these are all the possible types of complete logical forms in this world.

is_terminal(symbol: str) → bool[source]

This function will be called on nodes of a logical form tree, which are either non-terminal symbols that can be expanded or terminal symbols that must be leaf nodes. Returns True if the given symbol is a terminal symbol.

parse_logical_form(logical_form: str, remove_var_function: bool = True) → nltk.sem.logic.Expression[source]

Takes a logical form as a string, maps its tokens using the mapping and returns a parsed expression.

Parameters:
logical_form : str

Logical form to parse

remove_var_function : bool (optional)

var is a special function that some languages use within lambda functions to indicate the usage of a variable. If your language uses it, and you do not want to include it in the parsed expression, set this flag. You may want to do this if you are generating an action sequence from this parsed expression, because it is easier to let the decoder not produce this function due to the way constrained decoding is currently implemented.

allennlp.semparse.worlds.world.nltk_tree_to_logical_form(tree: nltk.tree.Tree) → str[source]

Given an nltk.Tree representing the syntax tree that generates a logical form, this method produces the actual (lisp-like) logical form, with all of the non-terminal symbols converted into the correct number of parentheses.
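The conversion described above can be illustrated with a small stand-in for nltk.Tree. In this sketch, which is an assumption-laden simplification and not the library function, a tree is a (label, children) tuple; a node with no children prints its label, a node with exactly one child is replaced by that child, and a node with several children becomes a parenthesised group.

```python
from typing import List, Tuple

# Hypothetical stand-in for nltk.Tree: (label, list of child trees).
SimpleTree = Tuple[str, List["SimpleTree"]]


def tree_to_logical_form(tree: SimpleTree) -> str:
    label, children = tree
    if not children:
        return label  # leaf: a terminal symbol
    if len(children) == 1:
        # A unary expansion adds no parentheses; descend into the child.
        return tree_to_logical_form(children[0])
    inner = " ".join(tree_to_logical_form(child) for child in children)
    return "(" + inner + ")"
```

Only nodes with more than one child contribute parentheses, which is how the non-terminal type symbols disappear from the final string.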

We store all the information related to a world (i.e. the context in which logical forms will be executed) here. For WikiTableQuestions, this includes a representation of a table, mapping from Sempre variables in all logical forms to NLTK variables, and the types of all predicates and entities.

class allennlp.semparse.worlds.wikitables_world.WikiTablesWorld(table_graph: allennlp.semparse.contexts.table_question_knowledge_graph.TableQuestionKnowledgeGraph) → None[source]

Bases: allennlp.semparse.worlds.world.World

World representation for the WikitableQuestions domain.

Parameters:
table_graph : TableQuestionKnowledgeGraph

Context associated with this world.

curried_functions = {<n,<n,<#1,<<#2,#1>,#1>>>>: 4, <#1,<#1,#1>>: 2, <n,<n,<n,d>>>: 3, <n,<n,n>>: 2}
get_agenda()[source]
get_basic_types() → typing.Set[nltk.sem.logic.Type][source]

Returns the set of basic types (types of entities) in the world.

get_valid_actions() → typing.Dict[str, typing.List[str]][source]
get_valid_starting_types() → typing.Set[nltk.sem.logic.Type][source]

Returns the set of all types t, such that actions {START_SYMBOL} -> t are valid. In other words, these are all the possible types of complete logical forms in this world.

is_table_entity(entity_name: str) → bool[source]

Returns True if the given entity is one of the entities in the table.

This module defines classes Object and Box (the two entities in the NLVR domain) and an NlvrWorld, which mainly contains an execution method and related helper methods.

class allennlp.semparse.worlds.nlvr_world.Box(objects_list: typing.List[typing.Dict[str, typing.Any]], box_id: int) → None[source]

Bases: object

This class represents each box containing objects in NLVR.

Parameters:
objects_list : List[JsonDict]

List of objects in the box, as given by the json file.

box_id : int

An integer identifying the box index (0, 1 or 2).

class allennlp.semparse.worlds.nlvr_world.NlvrWorld(world_representation: typing.List[typing.List[typing.Dict[str, typing.Any]]]) → None[source]

Bases: allennlp.semparse.worlds.world.World

Class defining the world representation of NLVR. Defines an execution logic for logical forms in NLVR. We just take the structured_rep from the JSON file to initialize this.

Parameters:
world_representation : JsonDict

structured_rep from the JSON file.

above(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Returns the set of objects in the same boxes that are above the given objects. That is, if the input is a set of two objects, one in each box, we will return a union of the objects above the first object in the first box, and those above the second object in the second box.
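The per-box behaviour can be sketched with plain named tuples. This is a hypothetical illustration, not the library's code: Shape and the boxes dictionary are stand-ins for Object and Box, smaller y_loc is taken to mean higher up (consistent with the description of top and bottom), and when several input objects share a box the topmost one is used as the reference, which is an assumption.

```python
from typing import Dict, List, NamedTuple, Set


class Shape(NamedTuple):
    """Hypothetical stand-in for an NLVR object."""
    name: str
    box_id: int
    y_loc: int


def above(objects: Set[Shape], boxes: Dict[int, List[Shape]]) -> Set[Shape]:
    result: Set[Shape] = set()
    for box_id, contents in boxes.items():
        selected = [obj for obj in objects if obj.box_id == box_id]
        if not selected:
            continue  # no input object lives in this box
        # Smaller y_loc means higher up in the box.
        top_edge = min(obj.y_loc for obj in selected)
        result.update(obj for obj in contents if obj.y_loc < top_edge)
    return result
```

A matching below would flip the comparison and use the largest y_loc among the selected objects.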

below(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Returns the set of objects in the same boxes that are below the given objects. That is, if the input is a set of two objects, one in each box, we will return a union of the objects below the first object in the first box, and those below the second object in the second box.

classmethod big(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod black(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod blue(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
bottom(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Return the bottommost objects (i.e. maximum y_loc). The comparison is done separately for each box.

classmethod circle(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
curried_functions = {<b,<c,b>>: 2, <b,<s,b>>: 2, <b,<e,b>>: 2, <o,<c,t>>: 2, <o,<s,t>>: 2, <b,<e,t>>: 2, <o,<e,t>>: 2}
execute(logical_form: str) → bool[source]

Execute the logical form. The top level function is an assertion function (see below). We just parse the string into a list and pass the whole thing to _execute_assertion and let the method deal with it. This is because the dataset contains sentences (instead of questions), and they evaluate to either true or false.

The language we defined here contains six types of functions, five of which return sets, and one of which returns booleans.

1) Assertion Function : These occur only at the root node of the logical form trees. They take a set of entities, and compare their attributes to a given value, and return true or false. The entities they take can be boxes or objects. If the assertion function takes objects, it may compare their colors or shapes with the given value; if it takes boxes, the attributes it compares are only the counts. The comparison operator can be any of equals, not equals, greater than, etc. So, the function specifies what kind of entities it takes, the attribute being compared and the comparison operator. For example, “object_count_not_equals” takes a set of objects, compares their count to the given value and returns true iff they are not equal. They have names like “object_*” or “box_*”.

2) Object Attribute Functions: They take sets of objects and return sets of attributes. color and shape are the attribute functions.

3) Box Membership Function : This takes a box as an argument and returns the objects in it. This is a special kind of attribute function for two reasons. Firstly, it returns a set of objects instead of attributes, and secondly it occurs only within the second argument of a box filtering function (see below). It provides a way to query boxes based on the attributes of objects contained within it. The function is called object_in_box, and it gets executed within _execute_box_filter.

4) Box Filtering Functions : These are of the form filter(set_of_boxes, attribute_function, target_attribute). The idea is that we take a set of boxes, an attribute function that extracts the relevant attribute from a box, and a target attribute that we compare against. The logic is that we execute the attribute function on each of the given boxes and return only those whose attribute value, in comparison with the target attribute, satisfies the filtering criterion (i.e., equal to the target, less than, greater than etc.). The filtering function defines the comparison operator. All the functions in this class with names filter_* belong to this category.

5) Object Filtering Functions : These are of the form filter(set_of_objects). These are similar to box filtering functions, but they operate on objects instead. Also, note that they take just one argument instead of three. This is because while box filtering functions typically query complex attributes, object filtering functions query the properties of the objects alone. These are simple and finite in number. Thus, we essentially let the filtering function define the attribute function, and the target attribute as well, along with the comparison operator. That is, these are functions like black (which takes a set of objects, and returns those whose “color” (attribute function) “equals” (comparison operator) “black” (target attribute)), or “square” (which returns objects that are squares).

6) Negate Object Filter : Takes an object filter and a set of objects and applies the negation of the object filter on the set.
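To make a few of these categories concrete, here is a minimal sketch of two object filters, a negated filter, and a count assertion. It only mirrors the descriptions above: Shape is a hypothetical stand-in for Object, and the argument order of the assertion function is an assumption.

```python
from typing import Callable, NamedTuple, Set


class Shape(NamedTuple):
    """Hypothetical stand-in for an NLVR object."""
    color: str
    shape: str


ObjectFilter = Callable[[Set[Shape]], Set[Shape]]


def black(objects: Set[Shape]) -> Set[Shape]:
    # Object filter: attribute "color", operator "equals", target "black".
    return {obj for obj in objects if obj.color == "black"}


def square(objects: Set[Shape]) -> Set[Shape]:
    return {obj for obj in objects if obj.shape == "square"}


def negate_filter(filter_function: ObjectFilter, objects: Set[Shape]) -> Set[Shape]:
    # Negate object filter: keep whatever the filter would have dropped.
    return objects - filter_function(objects)


def object_count_not_equals(objects: Set[Shape], count: int) -> bool:
    # Assertion function: sits at the root and returns a boolean.
    return len(objects) != count
```

Because every object filter maps a set of objects to a set of objects, filters compose freely, for example square(black(objects)), and only the assertion at the root produces the final truth value.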

get_agenda_for_sentence(sentence: str, add_paths_to_agenda: bool = False) → typing.List[str][source]

Given a sentence, returns a list of actions the sentence triggers as an agenda. The agenda can be used by a parser to guide the decoder. This is a simplistic mapping at this point, and can be expanded.

Parameters:
sentence : str

The sentence for which an agenda will be produced.

add_paths_to_agenda : bool (optional)

If set, the agenda will also include nonterminal productions that lead to the terminals from the root node (default = False).

get_basic_types() → typing.Set[nltk.sem.logic.Type][source]

Returns the set of basic types (types of entities) in the world.

get_valid_starting_types() → typing.Set[nltk.sem.logic.Type][source]

Returns the set of all types t, such that actions {START_SYMBOL} -> t are valid. In other words, these are all the possible types of complete logical forms in this world.

classmethod medium(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
static negate_filter(filter_function: typing.Callable[typing.Set[allennlp.semparse.worlds.nlvr_world.Object], typing.Set[allennlp.semparse.worlds.nlvr_world.Object]], objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
static object_in_box(box: typing.Set[allennlp.semparse.worlds.nlvr_world.Box]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod same_color(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Filters the set of objects, and returns those objects whose color is the most frequent color in the initial set of objects, if the highest frequency is greater than 1, or an empty set otherwise.

This is an unusual name for what the method does, but just as blue filters objects to those that are blue, this filters objects to those that are of the same color.
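The "most frequent color, if it occurs more than once" rule can be sketched as follows. This is an illustrative assumption about the logic, not the library's code; Shape is a hypothetical stand-in for Object, and which color wins a tie is left unspecified here.

```python
from collections import Counter
from typing import NamedTuple, Set


class Shape(NamedTuple):
    """Hypothetical stand-in for an NLVR object."""
    name: str
    color: str


def same_color(objects: Set[Shape]) -> Set[Shape]:
    counts = Counter(obj.color for obj in objects)
    if not counts:
        return set()
    color, frequency = counts.most_common(1)[0]
    if frequency <= 1:
        # No color is shared by at least two objects.
        return set()
    return {obj for obj in objects if obj.color == color}
```

A same_shape counterpart would be identical except that it counts the shape attribute instead of the color.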

classmethod same_shape(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Filters the set of objects, and returns those objects whose shape is the most frequent shape in the initial set of objects, if the highest frequency is greater than 1, or an empty set otherwise.

This is an unusual name for what the method does, but just as triangle filters objects to those that are triangles, this filters objects to those that are of the same shape.

classmethod small(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod square(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
top(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Return the topmost objects (i.e. minimum y_loc). The comparison is done separately for each box.
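The "separately for each box" comparison used by top and bottom can be sketched by grouping the objects by box before taking the extreme y_loc. The code is a hypothetical illustration: Shape stands in for Object, and _extreme is a helper invented for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple, Set


class Shape(NamedTuple):
    """Hypothetical stand-in for an NLVR object."""
    name: str
    box_id: int
    y_loc: int


def _extreme(objects: Set[Shape], pick: Callable[[Iterable[int]], int]) -> Set[Shape]:
    # Group by box first so that each box is compared on its own.
    by_box: Dict[int, Set[Shape]] = defaultdict(set)
    for obj in objects:
        by_box[obj.box_id].add(obj)
    result: Set[Shape] = set()
    for contents in by_box.values():
        target = pick(obj.y_loc for obj in contents)
        result.update(obj for obj in contents if obj.y_loc == target)
    return result


def top(objects: Set[Shape]) -> Set[Shape]:
    return _extreme(objects, min)  # minimum y_loc is the topmost


def bottom(objects: Set[Shape]) -> Set[Shape]:
    return _extreme(objects, max)  # maximum y_loc is the bottommost
```

An object that is alone in its box is both its topmost and its bottommost object, so it appears in both results.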

classmethod touch_bottom(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod touch_corner(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod touch_left(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
touch_object(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]

Returns all objects that touch the given set of objects.

classmethod touch_right(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod touch_top(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod touch_wall(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod triangle(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
classmethod yellow(objects: typing.Set[allennlp.semparse.worlds.nlvr_world.Object]) → typing.Set[allennlp.semparse.worlds.nlvr_world.Object][source]
class allennlp.semparse.worlds.nlvr_world.Object(attributes: typing.Dict[str, typing.Any], box_id: str) → None[source]

Bases: object

Objects are the geometric shapes in the NLVR domain. They have values for attributes shape, color, x_loc, y_loc and size. We take a dict read from the JSON file and store it here, and define a get method for getting the attribute values. We need this to be hashable because we need to make sets of Objects during execution, which get passed around between functions.

Parameters:
attributes : JsonDict

The dict for each object from the json file.
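The hashability requirement mentioned above can be met in several ways. One minimal sketch, assuming all attribute values are themselves hashable (strings and numbers), derives a tuple key from the attributes and uses it for both equality and hashing. HashableObject is a hypothetical class written for illustration and is not the library's Object.

```python
from typing import Any, Dict


class HashableObject:
    """Illustrative wrapper that makes a JSON-style dict usable in sets."""

    def __init__(self, attributes: Dict[str, Any], box_id: str) -> None:
        self.attributes = dict(attributes)
        self.box_id = box_id
        # Sorting by key gives a stable, order-independent identity.
        self._key = (box_id, tuple(sorted(self.attributes.items())))

    def get(self, name: str) -> Any:
        return self.attributes[name]

    def __eq__(self, other: object) -> bool:
        return isinstance(other, HashableObject) and self._key == other._key

    def __hash__(self) -> int:
        return hash(self._key)
```

With __eq__ and __hash__ defined consistently, two wrappers built from the same dict and box collapse to a single element when placed in a set.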