allennlp.semparse.contexts

A KnowledgeGraph is a graph representation of a structured knowledge source, such as a table, a figure, or an explicit knowledge base.

class allennlp.semparse.contexts.knowledge_graph.KnowledgeGraph(entities: typing.Set[str], neighbors: typing.Dict[str, typing.List[str]], entity_text: typing.Dict[str, str] = None) → None

Bases: object

A KnowledgeGraph represents a collection of entities and their relationships.

The KnowledgeGraph currently stores (untyped) neighborhood information and text representations of each entity (if there is any).

The knowledge base itself can be a table (like in WikitableQuestions), a figure (like in NLVR) or some other structured knowledge source. Subclass this abstract class to implement the functionality appropriate for a given KB.

All of the parameters listed below are stored as public attributes.

Parameters:
entities : Set[str]

The string identifiers of the entities in this knowledge graph. We sort this set and store it as a list, which guarantees a consistent ordering across separate runs of the code.

neighbors : Dict[str, List[str]]

A mapping from string identifiers to other string identifiers, denoting which entities are neighbors in the graph.

entity_text : Dict[str, str]

If you have additional text associated with each entity (other than its string identifier), you can store that here. This might be, e.g., the text in a table cell, or the description of a Wikipedia entity.
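As a minimal sketch of the data this class holds, the stand-in below mirrors the documented constructor without importing AllenNLP. The class name SimpleKnowledgeGraph and the entity identifiers are hypothetical; only the three parameters and the sorting behavior come from the description above.

```python
from typing import Dict, List, Optional, Set


class SimpleKnowledgeGraph:
    """Hypothetical stand-in that mirrors the documented constructor."""

    def __init__(self,
                 entities: Set[str],
                 neighbors: Dict[str, List[str]],
                 entity_text: Optional[Dict[str, str]] = None) -> None:
        # Sorting gives the same entity order on every run, as documented.
        self.entities: List[str] = sorted(entities)
        self.neighbors = neighbors
        self.entity_text = entity_text


# Illustrative identifiers only; the library's own naming scheme may differ.
graph = SimpleKnowledgeGraph(
    entities={"column:nation", "cell:usa", "cell:china"},
    neighbors={
        "column:nation": ["cell:usa", "cell:china"],
        "cell:usa": ["column:nation"],
        "cell:china": ["column:nation"],
    },
    entity_text={"column:nation": "Nation", "cell:usa": "USA", "cell:china": "China"},
)
print(graph.entities)  # ['cell:china', 'cell:usa', 'column:nation']
```

Note that entities goes in as a set and comes out as a sorted list, so code that indexes into graph.entities sees a stable order.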

class allennlp.semparse.contexts.table_question_knowledge_graph.TableQuestionKnowledgeGraph(entities: typing.Set[str], neighbors: typing.Dict[str, typing.List[str]], entity_text: typing.Dict[str, str], question_tokens: typing.List[allennlp.data.tokenizers.token.Token]) → None

Bases: allennlp.semparse.contexts.knowledge_graph.KnowledgeGraph

A TableQuestionKnowledgeGraph represents the linkable entities in a table and a question about the table. The linkable entities in a table are the cells and the columns of the table, and the linkable entities from the question are the numbers in the question. We use the question to define our space of allowable numbers, because there are infinitely many numbers that we could otherwise include in our action space. Additionally, we have a method that returns the set of entities in the graph that are relevant to the question, and we keep the question for this method. See get_linked_agenda_items for more information.

To represent the table as a graph, we make each cell and column a node in the graph, and consider a column’s neighbors to be all cells in that column (and thus each cell has just one neighbor: the column it belongs to). This is a rather simplistic view of the table. For example, we don’t store the order of rows.

We represent numbers as standalone nodes in the graph, without any neighbors.
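The graph layout described above can be sketched as follows. This is an illustration, not the library's implementation: the function table_to_graph and the "column:" and "cell:" identifier prefixes are hypothetical, and the sketch assumes that no cell text is repeated within the table.

```python
from typing import Dict, List, Set, Tuple


def table_to_graph(columns: List[str],
                   rows: List[List[str]],
                   question_numbers: List[str]
                   ) -> Tuple[Set[str], Dict[str, List[str]]]:
    """Build entities and neighbors for a table plus the question's numbers."""
    entities: Set[str] = set()
    neighbors: Dict[str, List[str]] = {}
    # Every column is a node whose neighbors are the cells in that column.
    for column in columns:
        column_id = "column:" + column.lower()
        entities.add(column_id)
        neighbors[column_id] = []
    # Every cell is a node with exactly one neighbor: its column.
    # Assumption: cell texts are unique, so identifiers never collide.
    for row in rows:
        for column, cell in zip(columns, row):
            column_id = "column:" + column.lower()
            cell_id = "cell:" + cell.lower()
            entities.add(cell_id)
            neighbors[column_id].append(cell_id)
            neighbors[cell_id] = [column_id]
    # Numbers from the question are standalone nodes without neighbors.
    for number in question_numbers:
        entities.add(number)
        neighbors[number] = []
    return entities, neighbors


entities, neighbors = table_to_graph(
    columns=["Nation", "Olympics", "Medals"],
    rows=[["USA", "1896", "8"], ["China", "1932", "9"]],
    question_numbers=["8"],
)
print(neighbors["column:nation"])  # ['cell:usa', 'cell:china']
print(neighbors["cell:1932"])      # ['column:olympics']
print(neighbors["8"])              # []
```

Row order is not recorded anywhere in the result, which matches the simplification noted above.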

Additionally, when we encounter cells that can be split, we create fb:part.[something] entities, also without any neighbors.
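To show which cells count as splittable, the sketch below applies the same pattern as the cell_part_regex attribute listed below. The helper split_cell is hypothetical, and we assume here that the attribute is used with split; how the resulting parts are turned into fb:part.[something] identifiers is not covered.

```python
import re
from typing import List

# Same pattern as the cell_part_regex attribute: split on a comma followed
# by whitespace, on a newline, or on a slash.
cell_part_regex = re.compile(r",\s|\n|/")


def split_cell(cell_text: str) -> List[str]:
    """Return the parts of a cell, or an empty list if it cannot be split."""
    parts = [part for part in cell_part_regex.split(cell_text) if part]
    return parts if len(parts) > 1 else []


print(split_cell("Los Angeles, CA/USA"))  # ['Los Angeles', 'CA', 'USA']
print(split_cell("China"))                # []
```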

cell_part_regex = re.compile(',\\s|\\n|/')

get_linked_agenda_items() → typing.List[str]

Returns the entities that can be linked to spans in the question and that should be in the agenda for training a coverage-based semantic parser. This method essentially does heuristic entity linking, to provide weak supervision for a learning-to-search parser.
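To make "heuristic entity linking" concrete, here is a deliberately naive sketch that links an entity when its text appears as a whole-word span in the question. This is not the heuristic the method itself uses, which is not specified here; the function link_entities and the entity identifiers are hypothetical.

```python
from typing import Dict, List


def link_entities(question_tokens: List[str],
                  entity_text: Dict[str, str]) -> List[str]:
    """Return entities whose text occurs as a token span in the question."""
    # Pad with spaces so that only whole-token matches count.
    padded = " " + " ".join(token.lower() for token in question_tokens) + " "
    linked = []
    for entity in sorted(entity_text):
        if f" {entity_text[entity].lower()} " in padded:
            linked.append(entity)
    return linked


agenda = link_entities(
    ["how", "many", "medals", "did", "china", "win", "?"],
    {"column:medals": "Medals", "cell:china": "China", "cell:usa": "USA"},
)
print(agenda)  # ['cell:china', 'column:medals']
```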

classmethod read_from_file(filename: str, question: typing.List[allennlp.data.tokenizers.token.Token]) → allennlp.semparse.contexts.table_question_knowledge_graph.TableQuestionKnowledgeGraph

We read tables formatted as TSV files here. We assume the first line in the file is a tab-separated list of column headers, and all subsequent lines are content rows. For example, if the TSV file is:

Nation      Olympics    Medals
USA         1896        8
China       1932        9

we read “Nation”, “Olympics” and “Medals” as column headers, “USA” and “China” as cells under the “Nation” column and so on.
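The header-then-rows convention can be sketched with plain string splitting. The helper parse_table_lines is hypothetical and only illustrates the file layout; it assumes at least a header line is present and does not build a graph.

```python
from typing import List, Tuple


def parse_table_lines(lines: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Split TSV lines into column headers and content rows."""
    # Drop blank lines, strip the trailing newline, then split on tabs.
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    # The first line holds the headers; everything after it is content.
    return rows[0], rows[1:]


tsv_lines = [
    "Nation\tOlympics\tMedals\n",
    "USA\t1896\t8\n",
    "China\t1932\t9\n",
]
columns, cells = parse_table_lines(tsv_lines)
print(columns)  # ['Nation', 'Olympics', 'Medals']
print(cells)    # [['USA', '1896', '8'], ['China', '1932', '9']]
```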

classmethod read_from_json(json_object: typing.Dict[str, typing.Any]) → allennlp.semparse.contexts.table_question_knowledge_graph.TableQuestionKnowledgeGraph

We read tables formatted as JSON objects (dicts) here. This is useful when you are reading data from a demo. The expected format is:

{"question": [token1, token2, ...],
 "columns": [column1, column2, ...],
 "cells": [[row1_cell1, row1_cell2, ...],
           [row2_cell1, row2_cell2, ...],
           ... ]}
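As a concrete instance of this format, the sketch below builds an object for a small table and checks its shape. The helper check_table_json is hypothetical. The format does not say whether each question token is a plain string or a Token object; plain strings are assumed here purely for illustration.

```python
from typing import Any, Dict


def check_table_json(json_object: Dict[str, Any]) -> bool:
    """Return True if the three keys exist and each row matches the columns."""
    if not all(key in json_object for key in ("question", "columns", "cells")):
        return False
    width = len(json_object["columns"])
    # Each row needs exactly one cell per column.
    return all(len(row) == width for row in json_object["cells"])


table_json = {
    # Assumption: tokens are given as plain strings.
    "question": ["how", "many", "medals", "did", "china", "win", "?"],
    "columns": ["Nation", "Olympics", "Medals"],
    "cells": [["USA", "1896", "8"],
              ["China", "1932", "9"]],
}
print(check_table_json(table_json))  # True
```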

classmethod read_from_lines(lines: typing.List[str], question: typing.List[allennlp.data.tokenizers.token.Token]) → allennlp.semparse.contexts.table_question_knowledge_graph.TableQuestionKnowledgeGraph