allennlp.predictors

A Predictor is a wrapper for an AllenNLP Model that makes JSON predictions using JSON inputs. If you want to serve up a model through the web service (or using allennlp.commands.predict), you’ll need a Predictor that wraps it.

class allennlp.predictors.predictor.Predictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.common.registrable.Registrable

A Predictor is a thin wrapper around an AllenNLP model that handles JSON -> JSON predictions. It can be used to serve models through the web API or to make predictions in bulk.

capture_model_internals(self) → Iterator[dict][source]

Context manager that captures the internal-module outputs of this predictor’s model. The idea is that you could use it as follows:

with predictor.capture_model_internals() as internals:
    outputs = predictor.predict_json(inputs)

return {**outputs, "model_internals": internals}
dump_line(self, outputs:Dict[str, Any]) → str[source]

If you don’t want your outputs in JSON-lines format you can override this function to output them differently.

classmethod from_archive(archive:allennlp.models.archival.Archive, predictor_name:str=None) → 'Predictor'[source]

Instantiate a Predictor from an Archive; that is, from the result of training a model. Optionally specify which Predictor subclass; otherwise, the default one for the model will be used.

classmethod from_path(archive_path:str, predictor_name:str=None) → 'Predictor'[source]

Instantiate a Predictor from an archive path.

If you need more detailed configuration options, such as running the predictor on the GPU, please use from_archive.

Parameters
archive_path The path to the archive.
Returns
A Predictor instance.
load_line(self, line:str) → Dict[str, Any][source]

If your inputs are not in JSON-lines format (e.g. you have a CSV) you can override this function to parse them correctly.
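
The two override points, load_line and dump_line, can be illustrated without AllenNLP installed. The sketch below uses a hypothetical stand-in class (LinePredictorSketch is not part of the library) and assumes that the default behavior is one JSON object per line in both directions. CsvPredictorSketch, the question,passage column layout, and the tab-separated output are likewise made up for illustration.

```python
import csv
import io
import json
from typing import Any, Dict


class LinePredictorSketch:
    """Hypothetical stand-in for a Predictor; only the line I/O hooks are modeled."""

    def load_line(self, line: str) -> Dict[str, Any]:
        # Assumed default: each input line holds one JSON object.
        return json.loads(line)

    def dump_line(self, outputs: Dict[str, Any]) -> str:
        # Assumed default: one JSON object per output line.
        return json.dumps(outputs) + "\n"


class CsvPredictorSketch(LinePredictorSketch):
    """Reads question,passage CSV rows and writes tab-separated values."""

    def load_line(self, line: str) -> Dict[str, Any]:
        # csv.reader copes with quoted fields that contain commas.
        question, passage = next(csv.reader(io.StringIO(line)))
        return {"question": question, "passage": passage}

    def dump_line(self, outputs: Dict[str, Any]) -> str:
        # Emit the output values in insertion order, separated by tabs.
        return "\t".join(str(value) for value in outputs.values()) + "\n"


parsed = CsvPredictorSketch().load_line('"Who wrote it?","Ada wrote it, in 1843."\n')
```

In a real subclass, the same two methods would be overridden on a Predictor subclass instead of on the stand-in.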

predict_batch_instance(self, instances:List[allennlp.data.instance.Instance]) → List[Dict[str, Any]][source]
predict_batch_json(self, inputs:List[Dict[str, Any]]) → List[Dict[str, Any]][source]
predict_instance(self, instance:allennlp.data.instance.Instance) → Dict[str, Any][source]
predict_json(self, inputs:Dict[str, Any]) → Dict[str, Any][source]
class allennlp.predictors.bidaf.BidafPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the BidirectionalAttentionFlow model.

predict(self, question:str, passage:str) → Dict[str, Any][source]

Make a machine comprehension prediction on the supplied input. See https://rajpurkar.github.io/SQuAD-explorer/ for more information about the machine comprehension task.

Parameters
question : str

A question about the content in the supplied paragraph. The question must be answerable by a span in the paragraph.

passage : str

A paragraph of information relevant to the question.

Returns
A dictionary that represents the prediction made by the system. The answer string will be under the “best_span_str” key.
class allennlp.predictors.decomposable_attention.DecomposableAttentionPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the DecomposableAttention model.

predict(self, premise:str, hypothesis:str) → Dict[str, Any][source]

Predicts whether the hypothesis is entailed by the premise text.

Parameters
premise : str

A passage representing what is assumed to be true.

hypothesis : str

A sentence that may be entailed by the premise.

Returns
A dictionary in which the “label_probs” key holds the probabilities of each of [entailment, contradiction, neutral].
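
As a small illustration of reading that return value, the sketch below picks the most likely label from a prediction dictionary. It assumes the probabilities follow the order listed above (entailment, contradiction, neutral); most_likely_label is a hypothetical helper, and the numbers are made up rather than real model output.

```python
from typing import Any, Dict, List, Tuple

# Label order as listed in the description of the return value.
LABELS: List[str] = ["entailment", "contradiction", "neutral"]


def most_likely_label(prediction: Dict[str, Any]) -> Tuple[str, float]:
    """Return the highest-probability label and its probability."""
    probs = prediction["label_probs"]
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


# Made-up values for illustration only.
example = {"label_probs": [0.91, 0.02, 0.07]}
label, prob = most_likely_label(example)
```
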
class allennlp.predictors.dialog_qa.DialogQAPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]

Bases: allennlp.predictors.predictor.Predictor

predict(self, jsonline:str) → Dict[str, Any][source]

Make a dialog-style question answering prediction on the supplied input. The supplied input JSON must contain a list of question-answer pairs, each with question, answer, yesno, followup, and id fields, as well as the context (passage).

Parameters
jsonline : str

A JSON line in the same format as the QuAC data file.

Returns
A dictionary that represents the prediction made by the system. The answer string will be under the “best_span_str” key.
class allennlp.predictors.semantic_role_labeler.SemanticRoleLabelerPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the SemanticRoleLabeler model.

static make_srl_string(words:List[str], tags:List[str]) → str[source]
predict(self, sentence:str) → Dict[str, Any][source]

Predicts the semantic roles of the supplied sentence and returns a dictionary with the results.

{"words": [...],
 "verbs": [
    {"verb": "...", "description": "...", "tags": [...]},
    ...
    {"verb": "...", "description": "...", "tags": [...]},
]}
Parameters
sentence : str

The sentence to parse via semantic role labeling.

Returns
A dictionary representation of the semantic roles in the sentence.
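
To make the relationship between "words", "tags", and "description" more concrete, here is a minimal sketch of how a bracketed description string could be built from one verb's BIO tags. The helper describe_frame is hypothetical, and the exact bracket format is an assumption for illustration; it is not presented as the library's own implementation.

```python
from typing import List


def describe_frame(words: List[str], tags: List[str]) -> str:
    """Render BIO tags as an inline bracketed string (hypothetical helper)."""
    pieces: List[str] = []
    chunk: List[str] = []

    def flush() -> None:
        # Close the currently open labeled span, if there is one.
        if chunk:
            pieces.append("[" + " ".join(chunk) + "]")
            chunk.clear()

    for word, tag in zip(words, tags):
        if tag.startswith("I-") and chunk:
            chunk.append(word)  # continue the open span
            continue
        flush()
        if tag.startswith("B-"):
            chunk.append(f"{tag[2:]}: {word}")  # open a new labeled span
        else:
            pieces.append(word)  # "O" (or a stray "I-") stays unbracketed
    flush()
    return " ".join(pieces)


words = ["The", "cat", "sat", "on", "the", "mat", "."]
tags = ["B-ARG0", "I-ARG0", "B-V", "B-ARGM-LOC", "I-ARGM-LOC", "I-ARGM-LOC", "O"]
description = describe_frame(words, tags)
# description == "[ARG0: The cat] [V: sat] [ARGM-LOC: on the mat] ."
```
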
predict_batch_json(self, inputs:List[Dict[str, Any]]) → List[Dict[str, Any]][source]

Expects JSON that looks like [{"sentence": "..."}, {"sentence": "..."}, ...] and returns JSON that looks like

[
    {"words": [...],
     "verbs": [
        {"verb": "...", "description": "...", "tags": [...]},
        ...
        {"verb": "...", "description": "...", "tags": [...]},
    ]},
    {"words": [...],
     "verbs": [
        {"verb": "...", "description": "...", "tags": [...]},
        ...
        {"verb": "...", "description": "...", "tags": [...]},
    ]}
]
predict_instances(self, instances:List[allennlp.data.instance.Instance]) → Dict[str, Any][source]
predict_json(self, inputs:Dict[str, Any]) → Dict[str, Any][source]

Expects JSON that looks like {"sentence": "..."} and returns JSON that looks like

{"words": [...],
 "verbs": [
    {"verb": "...", "description": "...", "tags": [...]},
    ...
    {"verb": "...", "description": "...", "tags": [...]},
]}
predict_tokenized(self, tokenized_sentence:List[str]) → Dict[str, Any][source]

Predicts the semantic roles of the supplied sentence tokens and returns a dictionary with the results.

Parameters
tokenized_sentence : List[str]

The sentence tokens to parse via semantic role labeling.

Returns
A dictionary representation of the semantic roles in the sentence.
tokens_to_instances(self, tokens)[source]
class allennlp.predictors.sentence_tagger.SentenceTaggerPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for any model that takes in a sentence and returns a single set of tags for it. In particular, it can be used with the CrfTagger model and also the SimpleTagger model.

predict(self, sentence:str) → Dict[str, Any][source]
class allennlp.predictors.coref.CorefPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the CoreferenceResolver model.

coref_resolved(self, document:str) → str[source]

Produce a document where each coreference is replaced by its main mention.

Parameters
document : str

A string representation of a document.

Returns
A string with each coreference replaced by its main mention.
predict(self, document:str) → Dict[str, Any][source]

Predict the coreference clusters in the given document.

{
"document": [tokenised document text]
"clusters":
  [
    [
      [start_index, end_index],
      [start_index, end_index]
    ],
    [
      [start_index, end_index],
      [start_index, end_index],
      [start_index, end_index],
    ],
    ....
  ]
}
Parameters
document : str

A string representation of a document.

Returns
A dictionary representation of the predicted coreference clusters.
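
The cluster indices are easier to read once they are mapped back to text. The sketch below does that for a prediction dictionary of the shape shown above, assuming that end_index is inclusive and that indices refer to positions in the "document" token list. cluster_mentions is a hypothetical helper and the sample prediction is made up.

```python
from typing import Any, Dict, List


def cluster_mentions(prediction: Dict[str, Any]) -> List[List[str]]:
    """Return the text of every mention, grouped by cluster."""
    tokens = prediction["document"]
    return [
        # Assumption: end indices are inclusive, hence the +1.
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in prediction["clusters"]
    ]


# Made-up prediction for illustration only.
sample = {
    "document": ["Ada", "Lovelace", "wrote", "that", "she", "liked", "math", "."],
    "clusters": [[[0, 1], [4, 4]]],
}
mentions = cluster_mentions(sample)
# mentions == [["Ada Lovelace", "she"]]
```
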
predict_tokenized(self, tokenized_document:List[str]) → Dict[str, Any][source]

Predict the coreference clusters in the given document.

Parameters
tokenized_document : List[str]

A tokenized document, represented as a list of words.

Returns
A dictionary representation of the predicted coreference clusters.
static replace_corefs(document:spacy.tokens.doc.Doc, clusters:List[List[List[int]]]) → str[source]

Uses a list of coreference clusters to convert a spacy document into a string, where each coreference is replaced by its main mention.
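
The idea can be sketched on a plain list of tokens instead of a spacy document. This simplified version assumes that the first span in each cluster is the main mention, that end indices are inclusive, and that tokens are rejoined with single spaces; it ignores details such as possessives and original whitespace. resolve_tokens is a hypothetical helper, not the library function.

```python
from typing import List


def resolve_tokens(tokens: List[str], clusters: List[List[List[int]]]) -> str:
    """Replace every non-main mention with the text of its cluster's main mention."""
    resolved = list(tokens)
    for cluster in clusters:
        # Assumption: the first span of a cluster is its main mention.
        main_start, main_end = cluster[0]
        main_mention = " ".join(tokens[main_start:main_end + 1])
        for start, end in cluster[1:]:
            resolved[start] = main_mention
            # Blank out the remaining tokens of a multi-token mention.
            for index in range(start + 1, end + 1):
                resolved[index] = ""
    return " ".join(token for token in resolved if token)


tokens = ["Ada", "Lovelace", "wrote", "that", "she", "liked", "math", "."]
resolved_text = resolve_tokens(tokens, [[[0, 1], [4, 4]]])
# resolved_text == "Ada Lovelace wrote that Ada Lovelace liked math ."
```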

class allennlp.predictors.constituency_parser.ConstituencyParserPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the SpanConstituencyParser model.

predict(self, sentence:str) → Dict[str, Any][source]

Predict a constituency parse for the given sentence.

Parameters
sentence : str

The sentence to parse.

Returns
A dictionary representation of the constituency tree.
predict_batch_instance(self, instances:List[allennlp.data.instance.Instance]) → List[Dict[str, Any]][source]
predict_instance(self, instance:allennlp.data.instance.Instance) → Dict[str, Any][source]
class allennlp.predictors.seq2seq.Seq2SeqPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for sequence to sequence models, including simple_seq2seq and copynet_seq2seq.

predict(self, source:str) → Dict[str, Any][source]
class allennlp.predictors.simple_seq2seq.SimpleSeq2SeqPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.seq2seq.Seq2SeqPredictor

Predictor for the simple_seq2seq model.

class allennlp.predictors.wikitables_parser.WikiTablesParserPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Wrapper for the WikiTablesSemanticParser model.

predict_batch_instance(self, instances:List[allennlp.data.instance.Instance]) → List[Dict[str, Any]][source]
predict_instance(self, instance:allennlp.data.instance.Instance) → Dict[str, Any][source]
predict_json(self, inputs:Dict[str, Any]) → Dict[str, Any][source]

We need to override this because of the interactive beam search aspects.

class allennlp.predictors.nlvr_parser.NlvrParserPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

dump_line(self, outputs:Dict[str, Any]) → str[source]

If you don’t want your outputs in JSON-lines format you can override this function to output them differently.

class allennlp.predictors.quarel_parser.QuarelParserPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Wrapper for the quarel_semantic_parser model.

predict_json(self, inputs:Dict[str, Any]) → Dict[str, Any][source]
class allennlp.predictors.biaffine_dependency_parser.BiaffineDependencyParserPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader, language: str = 'en_core_web_sm')[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the BiaffineDependencyParser model.

predict(self, sentence:str) → Dict[str, Any][source]

Predict a dependency parse for the given sentence.

Parameters
sentence : str

The sentence to parse.

Returns
A dictionary representation of the dependency tree.
predict_batch_instance(self, instances:List[allennlp.data.instance.Instance]) → List[Dict[str, Any]][source]
predict_instance(self, instance:allennlp.data.instance.Instance) → Dict[str, Any][source]
class allennlp.predictors.open_information_extraction.OpenIePredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the SemanticRoleLabeler model (in its Open Information Extraction variant). Used by the online demo and for prediction on an input file from the command line.

predict_json(self, inputs:Dict[str, Any]) → Dict[str, Any][source]

Create instance(s) after predicting the format. One sentence containing multiple verbs will lead to multiple instances.

Expects JSON that looks like {"sentence": "..."}

Returns a JSON that looks like

{"tokens": [...],
 "tag_spans": [{"ARG0": "...",
                "V": "...",
                "ARG1": "...",
                 ...}]}
allennlp.predictors.open_information_extraction.consolidate_predictions(outputs:List[List[str]], sent_tokens:List[allennlp.data.tokenizers.token.Token]) → Dict[str, List[str]][source]

Identify predicates that are part of a multiword predicate (e.g., “decided to run”); in that case, the embedded predicate (“run”) does not need to be returned.

allennlp.predictors.open_information_extraction.get_coherent_next_tag(prev_label:str, cur_label:str) → str[source]

Generate a coherent tag, given previous tag and current label.

allennlp.predictors.open_information_extraction.get_predicate_indices(tags:List[str]) → List[int][source]

Return the word indices of a predicate in BIO tags.

allennlp.predictors.open_information_extraction.get_predicate_text(sent_tokens:List[allennlp.data.tokenizers.token.Token], tags:List[str]) → str[source]

Get the predicate in this prediction.

allennlp.predictors.open_information_extraction.join_mwp(tags:List[str]) → List[str][source]

Join multi-word predicates to a single predicate (‘V’) token.
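
One way to picture this step: consecutive predicate tags are rewritten so that they form a single BIO span. The sketch below assumes that predicate tags are exactly those ending in "-V" and that a run of them should become "B-V" followed by "I-V" tags. The library's actual tag inventory and prefix handling may differ, and join_mwp_sketch is a hypothetical name.

```python
from typing import List


def join_mwp_sketch(tags: List[str]) -> List[str]:
    """Merge adjacent predicate tags into one continuous V span."""
    joined: List[str] = []
    in_predicate = False
    for tag in tags:
        if tag.endswith("-V"):
            # The first predicate tag opens the span; later ones continue it.
            joined.append("I-V" if in_predicate else "B-V")
            in_predicate = True
        else:
            joined.append(tag)
            in_predicate = False
    return joined


merged = join_mwp_sketch(["B-ARG0", "B-V", "B-V", "I-V", "B-ARG1"])
# merged == ["B-ARG0", "B-V", "I-V", "I-V", "B-ARG1"]
```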

allennlp.predictors.open_information_extraction.make_oie_string(tokens:List[allennlp.data.tokenizers.token.Token], tags:List[str]) → str[source]

Converts a list of model outputs (i.e., a list of lists of BIO tags, each pertaining to a single word) and returns an inline bracket representation of the prediction.

allennlp.predictors.open_information_extraction.merge_overlapping_predictions(tags1:List[str], tags2:List[str]) → List[str][source]

Merge two predictions into one. Assumes the predicate in tags1 overlaps with the predicate of tags2.

allennlp.predictors.open_information_extraction.predicates_overlap(tags1:List[str], tags2:List[str]) → bool[source]

Tests whether the predicate in BIO tags1 overlaps with that of tags2.
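
A minimal sketch of such an overlap test, under the assumption that predicate tokens are the ones whose tag ends in "-V": collect the predicate positions of each tag sequence and check whether the two sets intersect. Both helper names are hypothetical and the code is an illustration, not the library's implementation.

```python
from typing import List


def predicate_indices_sketch(tags: List[str]) -> List[int]:
    """Word indices whose BIO tag marks part of the predicate."""
    # Assumption: predicate tags end in "-V" (for example "B-V" or "I-V").
    return [index for index, tag in enumerate(tags) if tag.endswith("-V")]


def predicates_overlap_sketch(tags1: List[str], tags2: List[str]) -> bool:
    """True if the two tag sequences share at least one predicate position."""
    return bool(set(predicate_indices_sketch(tags1)) & set(predicate_indices_sketch(tags2)))


tags_a = ["B-ARG0", "B-V", "I-V", "B-ARG1"]
tags_b = ["B-ARG0", "O", "B-V", "B-ARG1"]
tags_c = ["B-V", "O", "O", "O"]
overlap_ab = predicates_overlap_sketch(tags_a, tags_b)  # both mark index 2
overlap_ac = predicates_overlap_sketch(tags_a, tags_c)  # no shared index
```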

allennlp.predictors.open_information_extraction.sanitize_label(label:str) → str[source]

Sanitize a BIO label. This deals with OIE labels that sometimes contain noise, such as parentheses.

class allennlp.predictors.event2mind.Event2MindPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the event2mind model.

predict(self, source:str) → Dict[str, Any][source]

Given a source string of some event, returns a JSON dictionary containing, for each target type, the top predicted sequences as indices, as tokens and the log probability of each.

The JSON dictionary looks like:

{
    `${target_type}_top_k_predictions`: [[1, 2, 3], [4, 5, 6], ...],
    `${target_type}_top_k_predicted_tokens`: [["to", "feel", "brave"], ...],
    `${target_type}_top_k_log_probabilities`: [-0.301, -0.046, ...]
}

By default, target_type can be xreact, oreact, or xintent.
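
As an illustration of consuming this dictionary, the sketch below selects, for each target type, the predicted token sequence with the highest log probability and converts that log probability to a probability. best_phrases is a hypothetical helper, and the sample values (including the second token sequence) are made up.

```python
import math
from typing import Any, Dict, Sequence, Tuple


def best_phrases(
    prediction: Dict[str, Any],
    target_types: Sequence[str] = ("xintent", "xreact", "oreact"),
) -> Dict[str, Tuple[str, float]]:
    """Map each target type to its most probable phrase and that phrase's probability."""
    best: Dict[str, Tuple[str, float]] = {}
    for target_type in target_types:
        tokens = prediction[f"{target_type}_top_k_predicted_tokens"]
        log_probs = prediction[f"{target_type}_top_k_log_probabilities"]
        # Pick the index with the highest log probability.
        top = max(range(len(log_probs)), key=lambda i: log_probs[i])
        best[target_type] = (" ".join(tokens[top]), math.exp(log_probs[top]))
    return best


# Made-up prediction covering a single target type.
sample = {
    "xreact_top_k_predicted_tokens": [["to", "feel", "brave"], ["happy"]],
    "xreact_top_k_log_probabilities": [-0.301, -0.046],
}
result = best_phrases(sample, target_types=["xreact"])
```
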

class allennlp.predictors.atis_parser.AtisParserPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for the AtisSemanticParser model.

class allennlp.predictors.text_classifier.TextClassifierPredictor(model: allennlp.models.model.Model, dataset_reader: allennlp.data.dataset_readers.dataset_reader.DatasetReader)[source]

Bases: allennlp.predictors.predictor.Predictor

Predictor for any model that takes in a sentence and returns a single class for it. In particular, it can be used with the BasicClassifier model.

predict(self, sentence:str) → Dict[str, Any][source]