allennlp.training.trainer

A Trainer is responsible for training a Model.

Typically you would create a configuration file specifying the model and training parameters and then use the train command rather than instantiating a Trainer yourself.

class allennlp.training.trainer.TensorboardWriter(train_log: tensorboardX.writer.SummaryWriter = None, validation_log: tensorboardX.writer.SummaryWriter = None) → None[source]

Bases: object

Wraps a pair of SummaryWriter instances, but is a no-op if they are None. This allows Tensorboard logging without having to check for None first.

add_train_histogram(name: str, values: torch.Tensor, global_step: int) → None[source]
add_train_scalar(name: str, value: float, global_step: int) → None[source]
add_validation_scalar(name: str, value: float, global_step: int) → None[source]
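
For example, a minimal sketch of how these hooks might be used; the log directories and metric names are illustrative, not part of the API:

from tensorboardX import SummaryWriter
from allennlp.training.trainer import TensorboardWriter

# Hypothetical log directories; any SummaryWriter configuration will do.
train_log = SummaryWriter(log_dir="logs/train")
validation_log = SummaryWriter(log_dir="logs/validation")
tensorboard = TensorboardWriter(train_log, validation_log)

tensorboard.add_train_scalar("loss", 0.42, global_step=100)
tensorboard.add_validation_scalar("accuracy", 0.87, global_step=100)

# With no writers supplied, the same calls above would be silent no-ops:
TensorboardWriter().add_train_scalar("loss", 0.42, global_step=100)
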
class allennlp.training.trainer.Trainer(model: allennlp.models.model.Model, optimizer: torch.optim.optimizer.Optimizer, iterator: allennlp.data.iterators.data_iterator.DataIterator, train_dataset: typing.Iterable[allennlp.data.instance.Instance], validation_dataset: typing.Union[typing.Iterable[allennlp.data.instance.Instance], NoneType] = None, patience: typing.Union[int, NoneType] = None, validation_metric: str = '-loss', validation_iterator: allennlp.data.iterators.data_iterator.DataIterator = None, shuffle: bool = True, num_epochs: int = 20, serialization_dir: typing.Union[str, NoneType] = None, num_serialized_models_to_keep: int = 20, keep_serialized_model_every_num_seconds: int = None, model_save_interval: float = None, cuda_device: typing.Union[int, typing.List] = -1, grad_norm: typing.Union[float, NoneType] = None, grad_clipping: typing.Union[float, NoneType] = None, learning_rate_scheduler: typing.Union[allennlp.training.learning_rate_schedulers.LearningRateScheduler, NoneType] = None, summary_interval: int = 100, histogram_interval: int = None, should_log_parameter_statistics: bool = True, should_log_learning_rate: bool = False) → None[source]

Bases: allennlp.common.registrable.Registrable

batch_loss(batch: torch.Tensor, for_training: bool) → torch.Tensor[source]

Does a forward pass on the given batch and returns the loss value in the result. If for_training is True, it also applies the regularization penalty.

default_implementation = 'default'
find_latest_checkpoint() → typing.Tuple[str, str][source]

Returns the location of the latest model and training state files. If there isn't a valid checkpoint, returns None.

classmethod from_params(model: allennlp.models.model.Model, serialization_dir: str, iterator: allennlp.data.iterators.data_iterator.DataIterator, train_data: typing.Iterable[allennlp.data.instance.Instance], validation_data: typing.Union[typing.Iterable[allennlp.data.instance.Instance], NoneType], params: allennlp.common.params.Params, validation_iterator: allennlp.data.iterators.data_iterator.DataIterator = None) → allennlp.training.trainer.Trainer[source]
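
As a rough sketch, the params argument is typically just the "trainer" section of a training configuration. The keys below mirror the constructor arguments, while model, iterator, train_data, and validation_data are assumed to have been built elsewhere; all values are illustrative:

from allennlp.common.params import Params
from allennlp.training.trainer import Trainer

trainer_params = Params({
    "num_epochs": 10,
    "patience": 3,
    "validation_metric": "+accuracy",
    "cuda_device": -1,
    "optimizer": {"type": "adam", "lr": 0.001},
})
trainer = Trainer.from_params(model=model,
                              serialization_dir="/tmp/example_model",  # hypothetical path
                              iterator=iterator,
                              train_data=train_data,
                              validation_data=validation_data,
                              params=trainer_params)
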
rescale_gradients() → typing.Union[float, NoneType][source]

Performs gradient rescaling. Is a no-op if gradient rescaling is not enabled.

train() → typing.Dict[str, typing.Any][source]

Trains the supplied model with the supplied parameters.
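
For instance, a minimal sketch of driving training directly from Python; model, train_dataset, and validation_dataset are assumed to exist already, and the particular iterator, optimizer, and hyperparameters below are illustrative:

import torch.optim as optim
from allennlp.data.iterators import BasicIterator
from allennlp.training.trainer import Trainer

# Batch the instances and index them with the model's vocabulary.
iterator = BasicIterator(batch_size=32)
iterator.index_with(model.vocab)

trainer = Trainer(model=model,
                  optimizer=optim.Adam(model.parameters(), lr=1e-3),
                  iterator=iterator,
                  train_dataset=train_dataset,
                  validation_dataset=validation_dataset,
                  patience=5,
                  num_epochs=20,
                  serialization_dir="/tmp/example_model")  # hypothetical directory

metrics = trainer.train()  # a dict of final training and validation metrics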

allennlp.training.trainer.is_sparse(tensor)[source]
allennlp.training.trainer.move_optimizer_to_cuda(optimizer)[source]

Move the optimizer state to the GPU, if necessary. After calling, any parameter-specific state in the optimizer will be located on the same device as the parameter.
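
A small sketch of the typical call site, assuming a model that has just been moved to GPU 0 and an optimizer whose state was created (or restored) on the CPU:

import torch.optim as optim
from allennlp.training.trainer import move_optimizer_to_cuda

optimizer = optim.Adam(model.parameters(), lr=1e-3)
# ... optionally restore optimizer.state_dict() from a CPU checkpoint here ...
model.cuda(0)
move_optimizer_to_cuda(optimizer)  # per-parameter state now lives on the same device as the parameters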

allennlp.training.trainer.sparse_clip_norm(parameters, max_norm, norm_type=2) → float[source]

Clips the gradient norm of an iterable of parameters.

The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in-place. Supports sparse gradients.

Parameters:

parameters : Iterable[torch.Tensor]
    An iterable of Tensors that will have gradients normalized.
max_norm : float
    The max norm of the gradients.
norm_type : float
    The type of p-norm to use. Can be 'inf' for the infinity norm.

Returns:
Total norm of the parameters (viewed as a single vector).
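
A hedged sketch of the usual call site, between the backward pass and the optimizer step; model, loss, and optimizer are assumed to come from an existing training loop, and the max norm of 5.0 is arbitrary:

from allennlp.training.trainer import sparse_clip_norm

loss.backward()
# Only parameters that actually received gradients need to be clipped.
parameters_with_grads = [p for p in model.parameters() if p.grad is not None]
total_norm = sparse_clip_norm(parameters_with_grads, max_norm=5.0)
optimizer.step()
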
allennlp.training.trainer.str_to_time(time_str: str) → datetime.datetime[source]

Convert a human-readable string to a datetime.datetime.

allennlp.training.trainer.time_to_str(timestamp: int) → str[source]

Convert seconds past the epoch to a human-readable string.
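
A tiny round-trip sketch; the exact string format is an internal detail used for naming checkpoints, so the example output is only indicative:

import time
from allennlp.training.trainer import time_to_str, str_to_time

now = int(time.time())
as_string = time_to_str(now)         # e.g. something like '2018-08-23-15-04-05'
as_datetime = str_to_time(as_string)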