allennlp.training.util

Helper functions for Trainers

class allennlp.training.util.HasBeenWarned[source]

Bases: object

tqdm_ignores_underscores = False
allennlp.training.util.create_serialization_dir(params: allennlp.common.params.Params, serialization_dir: str, recover: bool, force: bool) → None[source]

This function creates the serialization directory if it doesn’t exist. If it already exists and is non-empty, then it verifies that we’re recovering from a training with an identical configuration.

Parameters:
params: ``Params``

A parameter object specifying an AllenNLP Experiment.

serialization_dir: ``str``

The directory in which to save results and logs.

recover: ``bool``

If True, we will try to recover from an existing serialization directory, and crash if the directory doesn’t exist, or doesn’t match the configuration we’re given.

force: ``bool``

If True, we will overwrite the serialization directory if it already exists.
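
A minimal usage sketch; the configuration fragment and the directory path below are illustrative placeholders, not values prescribed by this function:

    from allennlp.common.params import Params
    from allennlp.training.util import create_serialization_dir

    # Illustrative config; a real experiment config would also define the
    # model, dataset reader, iterator, and trainer.
    params = Params({"train_data_path": "/path/to/train.txt"})

    # Creates the directory if it does not exist. With recover=True it would
    # instead require the directory to exist and to have been produced by an
    # identical configuration; with force=True an existing directory is
    # overwritten.
    create_serialization_dir(params,
                             serialization_dir="/tmp/my_experiment",
                             recover=False,
                             force=False)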

allennlp.training.util.data_parallel(batch_group: List[Dict[str, Union[torch.Tensor, Dict[str, torch.Tensor]]]], model: allennlp.models.model.Model, cuda_devices: List) → Dict[str, torch.Tensor][source]

Performs a forward pass using multiple GPUs. This is a simplification of torch.nn.parallel.data_parallel to support the allennlp model interface.
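
A rough sketch of how a trainer might call this helper. The two-GPU device list is an illustrative assumption, and the batch group would normally be produced by a DataIterator rather than built by hand:

    from typing import Dict, List, Union

    import torch

    from allennlp.models.model import Model
    from allennlp.training import util as training_util

    def multi_gpu_forward(batch_group: List[Dict[str, Union[torch.Tensor, Dict[str, torch.Tensor]]]],
                          model: Model) -> torch.Tensor:
        # Each element of batch_group is one batch; data_parallel scatters the
        # batches across the listed devices, runs model.forward on each, and
        # gathers the losses into a single output dict.
        output_dict = training_util.data_parallel(batch_group, model, cuda_devices=[0, 1])
        return output_dict["loss"]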

allennlp.training.util.datasets_from_params(params: allennlp.common.params.Params) → Dict[str, Iterable[allennlp.data.instance.Instance]][source]

Load all the datasets specified by the config.
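
A usage sketch; the reader type and file paths are illustrative, and the "validation" and "test" keys only appear when the corresponding paths are present in the config:

    from allennlp.common.params import Params
    from allennlp.training.util import datasets_from_params

    params = Params({
        "dataset_reader": {"type": "sequence_tagging"},
        "train_data_path": "/path/to/train.tsv",
        "validation_data_path": "/path/to/dev.tsv",
    })
    datasets = datasets_from_params(params)
    train_instances = datasets["train"]          # Iterable[Instance]
    dev_instances = datasets.get("validation")   # only present if a path was given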

allennlp.training.util.description_from_metrics(metrics: Dict[str, float]) → str[source]
allennlp.training.util.enable_gradient_clipping(model: allennlp.models.model.Model, grad_clipping: Optional[float]) → None[source]
allennlp.training.util.evaluate(model: allennlp.models.model.Model, instances: Iterable[allennlp.data.instance.Instance], data_iterator: allennlp.data.iterators.data_iterator.DataIterator, cuda_device: int, batch_weight_key: str) → Dict[str, Any][source]
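
A hedged sketch of calling evaluate directly; model, vocab, and test_instances are assumed to exist already (e.g. a trained Model, its Vocabulary, and instances read by a DatasetReader):

    from allennlp.data.iterators import BasicIterator
    from allennlp.training.util import evaluate

    iterator = BasicIterator(batch_size=32)
    iterator.index_with(vocab)                 # vocab: the model's Vocabulary
    metrics = evaluate(model,                  # model: a trained allennlp Model
                       test_instances,         # an Iterable[Instance]
                       iterator,
                       cuda_device=-1,         # -1 means run on CPU
                       batch_weight_key="")    # "" means every batch gets weight 1
    print(metrics)
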
allennlp.training.util.get_batch_size(batch: Union[Dict, torch.Tensor]) → int[source]

Returns the size of the batch dimension. Assumes a well-formed batch; returns 0 otherwise.
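
For example, given a typical text batch (the field names here are only illustrative), the batch size is read off the first dimension of the first tensor found:

    import torch
    from allennlp.training.util import get_batch_size

    batch = {"tokens": {"tokens": torch.zeros(32, 20, dtype=torch.long)}}
    assert get_batch_size(batch) == 32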

allennlp.training.util.get_metrics(model: allennlp.models.model.Model, total_loss: float, num_batches: int, reset: bool = False) → Dict[str, float][source]

Gets the metrics but sets "loss" to the total loss divided by num_batches, so that the "loss" metric is "average loss per batch".
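
A small sketch, assuming model is an allennlp Model whose metrics have been updated by some number of forward passes; the loss values are illustrative:

    from allennlp.training.util import get_metrics

    # total_loss is the running sum of per-batch losses over 5 batches.
    metrics = get_metrics(model, total_loss=12.5, num_batches=5, reset=False)
    print(metrics["loss"])   # 2.5, i.e. average loss per batch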

allennlp.training.util.move_optimizer_to_cuda(optimizer)[source]

Move the optimizer state to GPU, if necessary. After calling, any parameter specific state in the optimizer will be located on the same device as the parameter.
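
A minimal sketch, assuming a CUDA device is available. The optimizer only has per-parameter state after its first step, so one step is taken on the CPU before moving everything to the GPU:

    import torch
    from allennlp.training.util import move_optimizer_to_cuda

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.Adam(model.parameters())

    # One CPU step so that the optimizer accumulates per-parameter state.
    model(torch.randn(4, 10)).sum().backward()
    optimizer.step()

    # Move the parameters to the GPU, then bring the optimizer state along.
    model.cuda()
    move_optimizer_to_cuda(optimizer)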

allennlp.training.util.rescale_gradients(model: allennlp.models.model.Model, grad_norm: Optional[float] = None) → Optional[float][source]

Performs gradient rescaling. This is a no-op if gradient rescaling is not enabled (i.e. if grad_norm is None).
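
A sketch of where this call sits in a hypothetical training step; the model, optimizer, and batch are placeholders, and grad_norm=5.0 is an illustrative threshold:

    from typing import Dict, Optional

    import torch

    from allennlp.models.model import Model
    from allennlp.training import util as training_util

    def training_step(model: Model,
                      optimizer: torch.optim.Optimizer,
                      batch: Dict[str, torch.Tensor],
                      grad_norm: Optional[float] = 5.0) -> Optional[float]:
        optimizer.zero_grad()
        loss = model(**batch)["loss"]
        loss.backward()
        # Returns None when grad_norm is None; otherwise rescales all gradients
        # so that their combined norm is at most grad_norm and returns the norm
        # measured before rescaling.
        batch_grad_norm = training_util.rescale_gradients(model, grad_norm)
        optimizer.step()
        return batch_grad_norm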

allennlp.training.util.sparse_clip_norm(parameters, max_norm, norm_type=2) → float[source]

Clips gradient norm of an iterable of parameters.

The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in-place. Supports sparse gradients.

Parameters:
parameters: ``Iterable[torch.Tensor]``

An iterable of Tensors that will have gradients normalized.

max_norm: ``float``

The max norm of the gradients.

norm_type: ``float``

The type of p-norm to use. Can be 'inf' for the infinity norm.

Returns:
Total norm of the parameters (viewed as a single vector).
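
A small self-contained sketch:

    import torch
    from allennlp.training.util import sparse_clip_norm

    weight = torch.nn.Parameter(torch.randn(3, 3))
    (weight * 2.0).sum().backward()

    # Rescale the gradient so its L2 norm is at most 1.0; the norm measured
    # before clipping is returned.
    total_norm = sparse_clip_norm([weight], max_norm=1.0, norm_type=2)
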
allennlp.training.util.str_to_time(time_str: str) → datetime.datetime[source]

Convert a human-readable string to a datetime.datetime object.

allennlp.training.util.time_to_str(timestamp: int) → str[source]

Convert seconds past the epoch to a human-readable string.
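
The two functions are inverses of each other; the exact timestamp format shown in the comment is illustrative:

    import time
    from allennlp.training.util import str_to_time, time_to_str

    stamp = time_to_str(int(time.time()))   # e.g. "2019-07-01-12-30-05"
    parsed = str_to_time(stamp)             # back to a datetime.datetime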