Helper functions for Trainers


class allennlp.training.util.HasBeenWarned

Bases: object

tqdm_ignores_underscores = False

allennlp.training.util.create_serialization_dir(params: allennlp.common.params.Params, serialization_dir: str, recover: bool, force: bool) → None

This function creates the serialization directory if it doesn’t exist. If it already exists and is non-empty, then it verifies that we’re recovering from a training with an identical configuration.

params: ``Params``

A parameter object specifying an AllenNLP Experiment.

serialization_dir: ``str``

The directory in which to save results and logs.

recover: ``bool``

If True, we will try to recover from an existing serialization directory, and crash if the directory doesn’t exist, or doesn’t match the configuration we’re given.

force: ``bool``

If True, we will overwrite the serialization directory if it already exists.

allennlp.training.util.data_parallel(batch_group: List[Dict[str, Union[torch.Tensor, Dict[str, torch.Tensor]]]], model: allennlp.models.model.Model, cuda_devices: List) → Dict[str, torch.Tensor]

Performs a forward pass using multiple GPUs. This is a simplification of torch.nn.parallel.data_parallel to support the allennlp model interface.

allennlp.training.util.datasets_from_params(params: allennlp.common.params.Params) → Dict[str, Iterable[allennlp.data.instance.Instance]]

Load all the datasets specified by the config.

allennlp.training.util.description_from_metrics(metrics: Dict[str, float]) → str

allennlp.training.util.enable_gradient_clipping(model: allennlp.models.model.Model, grad_clipping: Optional[float]) → None

allennlp.training.util.evaluate(model: allennlp.models.model.Model, instances: Iterable[allennlp.data.instance.Instance], data_iterator: allennlp.data.iterators.data_iterator.DataIterator, cuda_device: int, batch_weight_key: str) → Dict[str, Any]

allennlp.training.util.get_batch_size(batch: Union[Dict, torch.Tensor]) → int

Returns the size of the batch dimension. Assumes a well-formed batch, returns 0 otherwise.
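A torch-free sketch of this behavior, where anything exposing a `size(dim)` method stands in for torch.Tensor. The duck typing and the recursive dict search are illustrative assumptions, not the library's actual implementation:

```python
def get_batch_size(batch) -> int:
    # Sketch: a "tensor" is anything with a .size(dim) method,
    # mirroring torch.Tensor's interface.
    if hasattr(batch, "size"):
        return batch.size(0)
    # Dicts (e.g. a token-indexer output) are searched via their first value.
    if isinstance(batch, dict):
        return get_batch_size(next(iter(batch.values()))) if batch else 0
    # Anything else is not a well-formed batch.
    return 0
```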

allennlp.training.util.get_metrics(model: allennlp.models.model.Model, total_loss: float, num_batches: int, reset: bool = False) → Dict[str, float]

Gets the metrics but sets "loss" to the total loss divided by num_batches so that the "loss" metric is “average loss per batch”.
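The “average loss per batch” computation can be sketched as follows; `model_metrics` is a hypothetical stand-in for the dictionary the model itself returns, since this sketch has no model object:

```python
from typing import Dict

def get_metrics(model_metrics: Dict[str, float], total_loss: float, num_batches: int) -> Dict[str, float]:
    # Copy the model's own metrics and overwrite "loss" with the running
    # average, guarding against division by zero early in training.
    metrics = dict(model_metrics)
    metrics["loss"] = float(total_loss / num_batches) if num_batches > 0 else 0.0
    return metrics
```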

allennlp.training.util.move_optimizer_to_cuda(optimizer)

Move the optimizer state to GPU, if necessary. After calling, any parameter-specific state in the optimizer will be located on the same device as the parameter.

allennlp.training.util.rescale_gradients(model: allennlp.models.model.Model, grad_norm: Optional[float] = None) → Optional[float]

Performs gradient rescaling. Is a no-op if gradient rescaling is not enabled.

allennlp.training.util.sparse_clip_norm(parameters, max_norm, norm_type=2) → float

Clips gradient norm of an iterable of parameters.

The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in-place. Supports sparse gradients.

parameters : Iterable[torch.Tensor]

An iterable of Tensors that will have gradients normalized.

max_norm : float

The max norm of the gradients.

norm_type : float

The type of the used p-norm. Can be 'inf' for infinity norm.

Returns the total norm of the parameters (viewed as a single vector).
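The procedure above (one norm over all gradients as if concatenated, in-place rescaling) can be sketched with plain Python lists standing in for gradient tensors. The sparse-gradient handling is elided, and the small epsilon in the denominator is an assumption of this sketch:

```python
from typing import List

def clip_grad_norm(grads: List[List[float]], max_norm: float, norm_type: float = 2) -> float:
    # Total norm over all gradients together, as if they were
    # concatenated into a single vector.
    if norm_type == float("inf"):
        total_norm = max(abs(g) for grad in grads for g in grad)
    else:
        total_norm = sum(abs(g) ** norm_type for grad in grads for g in grad) ** (1.0 / norm_type)
    # Rescale in place only if the norm exceeds the budget.
    clip_coef = max_norm / (total_norm + 1e-6)
    if clip_coef < 1:
        for grad in grads:
            for i in range(len(grad)):
                grad[i] *= clip_coef
    return total_norm
```

Because a single coefficient scales every gradient, the relative directions of the parameter updates are preserved; only the overall step size shrinks.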

allennlp.training.util.str_to_time(time_str: str) → datetime.datetime

Convert a human-readable string to a datetime.datetime.

allennlp.training.util.time_to_str(timestamp: int) → str

Convert seconds past the epoch to a human-readable string.
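A round-trip sketch of these two conversions; the `TIME_FORMAT` string here is a hypothetical choice for illustration, not necessarily the format the library uses:

```python
import datetime

# Hypothetical format string; the library's actual format may differ.
TIME_FORMAT = "%Y-%m-%d-%H-%M-%S"

def time_to_str(timestamp: int) -> str:
    # Seconds past the epoch -> human-readable string (local time).
    return datetime.datetime.fromtimestamp(timestamp).strftime(TIME_FORMAT)

def str_to_time(time_str: str) -> datetime.datetime:
    # Inverse of time_to_str, to second precision.
    return datetime.datetime.strptime(time_str, TIME_FORMAT)
```

Strings in this format sort lexicographically in chronological order (within a fixed timezone), which makes them convenient as directory or checkpoint names.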