allennlp.tools

Modules containing official evaluators of various tasks for which we build models.

allennlp.tools.drop_eval.answer_json_to_strings(answer: Dict[str, Any]) → Tuple[Tuple[str, ...], str][source]

Takes an answer JSON blob from the DROP data release and converts it into strings used for evaluation.

allennlp.tools.drop_eval.evaluate_json(annotations: Dict[str, Any], predicted_answers: Dict[str, Any]) → Tuple[float, float][source]

Takes gold annotations and predicted answers and evaluates the predictions for each question in the gold annotations. Both JSON dictionaries must have query_id keys, which are used to match predictions to gold annotations (note that these are somewhat deep in the JSON for the gold annotations, but must be top-level keys in the predicted answers).

The annotations are assumed to have the format of the dev set in the DROP data release. The predicted_answers JSON must be a dictionary keyed by query id, where the value is a string (or list of strings) that is the answer.

allennlp.tools.drop_eval.evaluate_prediction_file(prediction_path: str, gold_path: str) → Tuple[float, float][source]

Takes a prediction file and a gold file and evaluates the predictions for each question in the gold file. Both files must be JSON-formatted and must have query_id keys, which are used to match predictions to gold annotations. The gold file is assumed to have the format of the dev set in the DROP data release. The prediction file must be a JSON dictionary keyed by query id, where each value is either a JSON dictionary with an “answer” key, or just a string (or list of strings) that is the answer.
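As a rough illustration of the prediction-file shape described above, the sketch below builds a dictionary keyed by query id and reads it back through JSON. The query ids, the answers, and the extract_answer helper are hypothetical and not part of drop_eval; only the three value shapes come from the description.

```python
import json

# Hypothetical query ids and answers, used only to show the three accepted value shapes.
predictions = {
    "query-0001": "4",                        # a plain string answer
    "query-0002": ["Alice", "Bob"],           # a list of strings
    "query-0003": {"answer": "12 May 1999"},  # a dictionary with an "answer" key
}


def extract_answer(value):
    """Return the answer whether it is stored bare or under an "answer" key."""
    if isinstance(value, dict):
        return value["answer"]
    return value


# Round-trip through JSON, as the file on disk would be read.
loaded = json.loads(json.dumps(predictions))
answers = {query_id: extract_answer(value) for query_id, value in loaded.items()}
```

Writing the predictions dictionary with json.dump would produce a file of the kind evaluate_prediction_file expects as prediction_path.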

allennlp.tools.drop_eval.get_metrics(predicted: Union[str, List[str], Tuple[str, ...]], gold: Union[str, List[str], Tuple[str, ...]]) → Tuple[float, float][source]

Takes a predicted answer and a gold answer (each either a string or a list of strings) and returns the exact match score and the DROP F1 metric for the prediction. If you are writing a script that evaluates objects in memory (say, predictions produced during validation or training), this is the function to call, after using answer_json_to_strings() when reading the gold answer from the released data file.

Official evaluation script for v1.1 of the SQuAD dataset.

allennlp.tools.squad_eval.evaluate(dataset, predictions)[source]
allennlp.tools.squad_eval.exact_match_score(prediction, ground_truth)[source]
allennlp.tools.squad_eval.f1_score(prediction, ground_truth)[source]
allennlp.tools.squad_eval.metric_max_over_ground_truths(metric_fn, prediction, ground_truths)[source]
allennlp.tools.squad_eval.normalize_answer(s)[source]

Lower text and remove punctuation, articles and extra whitespace.
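The normalization described above can be sketched roughly as follows. This is an illustrative re-implementation rather than the code in squad_eval: the name normalize_answer_sketch is hypothetical, and the order of the steps and the restriction to ASCII punctuation are assumptions.

```python
import re
import string

_PUNCTUATION = set(string.punctuation)


def normalize_answer_sketch(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, and collapse whitespace."""
    text = text.lower()
    # Remove ASCII punctuation characters.
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    # Replace the standalone articles "a", "an", and "the" with a space.
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())
```

Normalizing both the prediction and the ground truth this way before comparing them keeps casing and punctuation differences from affecting the comparison.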

This is the official evaluator taken from the original dataset. I made minimal changes to make it Python 3 compatible and to conform to our style guidelines.

class allennlp.tools.wikitables_evaluator.DateValue(year, month, day, original_string=None)[source]

Bases: allennlp.tools.wikitables_evaluator.Value

match(other)[source]

Return True if the value matches the other value.

Args:
other (Value)
Returns:
a boolean
static parse(text)[source]

Try to parse into a date.

Return:
tuple (year, month, day) if successful; otherwise None.
ymd
class allennlp.tools.wikitables_evaluator.NumberValue(amount, original_string=None)[source]

Bases: allennlp.tools.wikitables_evaluator.Value

amount
match(other)[source]

Return True if the value matches the other value.

Args:
other (Value)
Returns:
a boolean
static parse(text)[source]

Try to parse into a number.

Return:
the number (int or float) if successful; otherwise None.
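A minimal sketch of the parsing contract described above: return an int or a float on success, and None otherwise. The function name parse_number_sketch is hypothetical, and treating NaN and infinity as failures is an assumption of this sketch, not something the description states.

```python
import math
from typing import Optional, Union


def parse_number_sketch(text: str) -> Optional[Union[int, float]]:
    """Return an int or float parsed from text, or None if parsing fails."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        amount = float(text)
    except ValueError:
        return None
    # Assumption: NaN and infinity are treated as parse failures.
    if math.isnan(amount) or math.isinf(amount):
        return None
    return amount
```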
class allennlp.tools.wikitables_evaluator.StringValue(content)[source]

Bases: allennlp.tools.wikitables_evaluator.Value

match(other)[source]

Return True if the value matches the other value.

Args:
other (Value)
Returns:
a boolean
class allennlp.tools.wikitables_evaluator.Value[source]

Bases: object

match(other)[source]

Return True if the value matches the other value.

Args:
other (Value)
Returns:
a boolean
normalized
allennlp.tools.wikitables_evaluator.check_denotation(target_values, predicted_values)[source]

Return True if the predicted denotation is correct.

Args:
target_values (list[Value])
predicted_values (list[Value])
Returns:
bool
allennlp.tools.wikitables_evaluator.main()[source]
allennlp.tools.wikitables_evaluator.normalize(x)[source]
allennlp.tools.wikitables_evaluator.to_value(original_string, corenlp_value=None)[source]

Convert the string to a Value object.

Args:
original_string (basestring): Original string
corenlp_value (basestring): Optional value returned from CoreNLP
Returns:
Value
allennlp.tools.wikitables_evaluator.to_value_list(original_strings, corenlp_values=None)[source]

Convert a list of strings to a list of Values

Args:
original_strings (list[basestring])
corenlp_values (list[basestring or None])
Returns:
list[Value]
allennlp.tools.wikitables_evaluator.tsv_unescape(x)[source]

Unescape strings in the TSV file. Escaped characters include:
- newline (0x0A) -> backslash + n
- vertical bar (0x7C) -> backslash + p
- backslash (0x5C) -> backslash + backslash

Parameters:
x : str
Returns:
str
allennlp.tools.wikitables_evaluator.tsv_unescape_list(x)[source]

Unescape a list in the TSV file. List items are joined with vertical bars (0x7C).

Args:
x (str or unicode)
Returns:
a list of unicodes
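The escaping scheme used by tsv_unescape and tsv_unescape_list can be sketched as below. This is not the library's implementation: the function names are hypothetical, the single-pass regular expression is a design choice of this sketch, and leaving unknown escape sequences untouched is an assumption.

```python
import re
from typing import List

# The escape sequences named in the description: backslash + n, backslash + p,
# and backslash + backslash.
_UNESCAPE = {"n": "\n", "p": "|", "\\": "\\"}


def tsv_unescape_sketch(x: str) -> str:
    """Undo the three escapes in a single left-to-right pass."""
    # Assumption: an unrecognized escape sequence is left exactly as written.
    return re.sub(r"\\(.)", lambda m: _UNESCAPE.get(m.group(1), m.group(0)), x)


def tsv_unescape_list_sketch(x: str) -> List[str]:
    """Split on vertical bars, then unescape each item."""
    # Literal vertical bars inside items are escaped as backslash + p,
    # so every raw "|" in the input is a separator.
    return [tsv_unescape_sketch(item) for item in x.split("|")]
```

Decoding in one pass means an escaped backslash followed by the letter n comes out as a backslash and an n, not as a newline, which chained string replacements can get wrong.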

Helper script for modifying config.json files that are locked inside model.tar.gz archives. This is useful if you need to rename things or add or remove values, usually because of changes to the library.

This script will untar the archive to a temp directory, launch an editor to modify the config.json, and then re-tar everything to a new archive. If your $EDITOR environment variable is not set, you’ll have to explicitly specify which editor to use.
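The untar, edit, re-tar flow described above might look roughly like the sketch below. It is not the archive_surgery script itself. The function edit_archive_config and its modify callback are hypothetical, the callback stands in for launching an interactive editor, and config.json is assumed to sit at the root of the archive.

```python
import json
import os
import tarfile
import tempfile
from typing import Any, Callable, Dict


def edit_archive_config(
    input_path: str,
    output_path: str,
    modify: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> None:
    """Untar input_path, rewrite its config.json via modify, and re-tar to output_path."""
    with tempfile.TemporaryDirectory() as temp_dir:
        with tarfile.open(input_path, "r:gz") as archive:
            archive.extractall(temp_dir)

        # Assumption: config.json lives at the top level of the archive.
        config_path = os.path.join(temp_dir, "config.json")
        with open(config_path) as config_file:
            config = json.load(config_file)
        with open(config_path, "w") as config_file:
            json.dump(modify(config), config_file, indent=2)

        # Re-tar every top-level entry into a new archive; the input is left untouched.
        with tarfile.open(output_path, "w:gz") as archive:
            for name in os.listdir(temp_dir):
                archive.add(os.path.join(temp_dir, name), arcname=name)
```

Writing to a separate output path, as the description says the script does, keeps the original archive available if the edited config turns out to be wrong.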

allennlp.tools.archive_surgery.main()[source]
allennlp.tools.create_elmo_embeddings_from_vocab.main(vocab_path: str, elmo_config_path: str, elmo_weights_path: str, output_dir: str, batch_size: int, device: int, use_custom_oov_token: bool = False)[source]

Creates ELMo word representations from a vocabulary file. These word representations are context-independent: they are the result of running the CNN and Highway layers of the ELMo model, but not the Bidirectional LSTM. ELMo requires 2 additional tokens: <S> and </S>. The first token in the vocabulary file is assumed to be an unknown token.

This script produces two artifacts: a new vocabulary file with the <S> and </S> tokens inserted, and a GloVe-formatted embedding file containing one word and its vector per line, with all values separated by spaces.
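To make the embedding file layout concrete, the sketch below writes and reads lines in the space-separated form described above. Both helper names are hypothetical and the vectors are made up; the script's own output code may differ in number formatting and other details.

```python
from typing import Dict, Iterable, List


def format_embedding_lines(vectors: Dict[str, List[float]]) -> List[str]:
    """Render each token and its vector as one space-separated line."""
    return [
        " ".join([token] + [str(value) for value in vector])
        for token, vector in vectors.items()
    ]


def parse_embedding_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Read space-separated lines back into a token -> vector mapping."""
    parsed = {}
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        # The first field is the token; the remaining fields are the vector values.
        parsed[fields[0]] = [float(value) for value in fields[1:]]
    return parsed


# Made-up two-dimensional vectors, for illustration only.
example_vectors = {"<S>": [0.1, -0.2], "cat": [1.0, 2.5]}
example_lines = format_embedding_lines(example_vectors)
```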

allennlp.tools.inspect_cache.main()[source]