The evaluate subcommand can be used to evaluate a trained model against a dataset and report any metrics calculated by the model.

$ allennlp evaluate --help
usage: allennlp evaluate [-h] [--output-file OUTPUT_FILE]
                         [--weights-file WEIGHTS_FILE]
                         [--cuda-device CUDA_DEVICE] [-o OVERRIDES]
                         [--include-package INCLUDE_PACKAGE]
                         archive_file input_file

Evaluate the specified model + dataset

positional arguments:
  archive_file          path to an archived trained model
  input_file            path to the file containing the evaluation data

optional arguments:
  -h, --help            show this help message and exit
  --output-file OUTPUT_FILE
                        path to output file to save metrics
  --weights-file WEIGHTS_FILE
                        a path that overrides which weights file to use
  --cuda-device CUDA_DEVICE
                        id of GPU to use (if any)
  -o OVERRIDES, --overrides OVERRIDES
                        a JSON structure used to override the experiment
                        configuration
  --include-package INCLUDE_PACKAGE
                        additional packages to include
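As a sketch, a typical invocation might look like the following. The archive, dataset, and output paths are placeholders for your own files, not files shipped with AllenNLP:

```shell
# Evaluate a trained model archive against a held-out dataset on GPU 0,
# writing the computed metrics to a JSON file.
# All three paths below are hypothetical examples.
allennlp evaluate /path/to/model.tar.gz /path/to/dev_data.jsonl \
    --output-file /path/to/metrics.json \
    --cuda-device 0
```

Omit --cuda-device (or pass -1) to evaluate on CPU; omit --output-file to print the metrics to the console instead of saving them.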