allennlp.commands.elmo

The elmo subcommand allows you to make bulk ELMo predictions.

Given a pre-processed input text file, this command outputs the internal layers used to compute ELMo representations to a single (potentially large) file.

The input file should contain previously tokenized, whitespace-separated text, one sentence per line. The output is an HDF5 file (<http://docs.h5py.org/en/latest/>) where, with the --all flag, each sentence is a size (3, num_tokens, 1024) array containing the biLM representations.

For more information, see "Deep contextualized word representations", Peters et al., 2018. https://arxiv.org/abs/1802.05365

$ allennlp elmo --help
usage: allennlp [command] elmo [-h] [--vocab-path VOCAB_PATH]
                                    [--options-file OPTIONS_FILE]
                                    [--weight-file WEIGHT_FILE]
                                    [--batch-size BATCH_SIZE]
                                    [--cuda-device CUDA_DEVICE]
                                    input_file output_file

Create word vectors using ELMo.

positional arguments:
  input_file            path to input file
  output_file           path to output file

optional arguments:
  -h, --help            show this help message and exit
  --vocab-path VOCAB_PATH
                        A path to a vocabulary file to generate
  --options-file OPTIONS_FILE
                        The path to the ELMo options file.
  --weight-file WEIGHT_FILE
                        The path to the ELMo weight file.
  --batch-size BATCH_SIZE
                        The batch size to use.
  --cuda-device CUDA_DEVICE
                        The cuda_device to run on.
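
Once the command has run, the output file can be inspected with h5py. The sketch below is illustrative, not a definitive description of the file layout: it assumes each sentence is stored as a dataset keyed by its zero-based line index as a string, with shape (3, num_tokens, 1024) when --all is used. It builds a tiny synthetic file with that layout rather than running the elmo command, so it is self-contained; inspect your own file's keys to confirm the structure.

```python
import h5py
import numpy as np

# Stand-in for the command's output file (hypothetical path); we write
# a tiny file with the assumed layout instead of invoking allennlp elmo.
path = "elmo_embeddings.hdf5"
with h5py.File(path, "w") as f:
    # One dataset per sentence, keyed by its line index as a string.
    # Shape (3, num_tokens, 1024): three biLM layers, 1024-dim vectors.
    f.create_dataset("0", data=np.zeros((3, 5, 1024), dtype=np.float32))

# Read the representations back, one sentence at a time.
with h5py.File(path, "r") as f:
    for key in f.keys():
        embeddings = f[key][...]  # numpy array, (3, num_tokens, 1024)
        print(key, embeddings.shape)
```

Reading datasets lazily like this (rather than loading the whole file) matters because the output file can be large when the input has many sentences.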