A Trainer is responsible for training a Model.

Typically you might create a configuration file specifying the model and training parameters, and then use the train command rather than instantiating a Trainer yourself.
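As an illustration, a trainer section of such a configuration file might look like the sketch below. The keys mirror the constructor parameters documented further down, but treat this as an illustrative fragment rather than a complete, verified config.

```json
{
  "trainer": {
    "num_epochs": 20,
    "patience": 10,
    "validation_metric": "-loss",
    "cuda_device": -1,
    "grad_norm": 5.0,
    "optimizer": {
      "type": "adam",
      "lr": 0.001
    }
  }
}
```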

class allennlp.training.trainer.TensorboardWriter(train_log: tensorboardX.writer.SummaryWriter = None, validation_log: tensorboardX.writer.SummaryWriter = None) → None[source]

Bases: object

Wraps a pair of SummaryWriter instances but is a no-op if they’re None. Allows Tensorboard logging without always checking for Nones first.

add_train_histogram(name: str, values: torch.Tensor, global_step: int) → None[source]
add_train_scalar(name: str, value: float, global_step: int) → None[source]
add_validation_scalar(name: str, value: float, global_step: int) → None[source]
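The "no-op if None" wrapper pattern these methods rely on can be sketched in a few lines of plain Python. The StubSummaryWriter below is a hypothetical stand-in for tensorboardX's SummaryWriter, used only to keep the example self-contained.

```python
class StubSummaryWriter:
    """Hypothetical stand-in for tensorboardX.writer.SummaryWriter."""
    def __init__(self):
        self.scalars = []

    def add_scalar(self, name, value, global_step):
        self.scalars.append((name, value, global_step))


class TensorboardWriter:
    """Wraps a pair of writers; every method is a no-op if its writer is None."""
    def __init__(self, train_log=None, validation_log=None):
        self._train_log = train_log
        self._validation_log = validation_log

    def add_train_scalar(self, name, value, global_step):
        # Silently do nothing when no writer was supplied.
        if self._train_log is not None:
            self._train_log.add_scalar(name, value, global_step)

    def add_validation_scalar(self, name, value, global_step):
        if self._validation_log is not None:
            self._validation_log.add_scalar(name, value, global_step)


train_log = StubSummaryWriter()
writer = TensorboardWriter(train_log=train_log, validation_log=None)
writer.add_train_scalar("loss", 0.5, global_step=1)
writer.add_validation_scalar("loss", 0.7, global_step=1)  # no-op: no validation writer
print(train_log.scalars)
```

Callers never have to check for None themselves; the wrapper absorbs the check once.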
class allennlp.training.trainer.Trainer(model: allennlp.models.model.Model, optimizer: torch.optim.optimizer.Optimizer, iterator: allennlp.data.iterators.data_iterator.DataIterator, train_dataset: typing.Iterable[allennlp.data.instance.Instance], validation_dataset: typing.Union[typing.Iterable[allennlp.data.instance.Instance], NoneType] = None, patience: typing.Union[int, NoneType] = None, validation_metric: str = '-loss', validation_iterator: allennlp.data.iterators.data_iterator.DataIterator = None, shuffle: bool = True, num_epochs: int = 20, serialization_dir: typing.Union[str, NoneType] = None, num_serialized_models_to_keep: int = 20, keep_serialized_model_every_num_seconds: int = None, model_save_interval: float = None, cuda_device: typing.Union[int, typing.List] = -1, grad_norm: typing.Union[float, NoneType] = None, grad_clipping: typing.Union[float, NoneType] = None, learning_rate_scheduler: typing.Union[allennlp.training.learning_rate_schedulers.LearningRateScheduler, NoneType] = None, summary_interval: int = 100, histogram_interval: int = None, should_log_parameter_statistics: bool = True, should_log_learning_rate: bool = False, log_batch_size_period: typing.Union[int, NoneType] = None) → None[source]

Bases: allennlp.common.registrable.Registrable

batch_loss(batch: torch.Tensor, for_training: bool) → torch.Tensor[source]

Does a forward pass on the given batch and returns the loss value in the result. If for_training is True, it also applies the regularization penalty.
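The logic can be sketched in plain Python. The ToyModel below, its forward output dict, and the get_regularization_penalty method are all hypothetical stand-ins (torch is omitted to keep the example self-contained); the point is only that the penalty is added to the loss during training and skipped during evaluation.

```python
class ToyModel:
    def forward(self, batch):
        # Pretend the loss is the mean of the batch values.
        return {"loss": sum(batch) / len(batch)}

    def get_regularization_penalty(self):
        return 0.1  # e.g. an L2 penalty on the weights


def batch_loss(model, batch, for_training):
    output = model.forward(batch)
    loss = output["loss"]
    if for_training:
        # The regularization penalty is only applied during training.
        loss = loss + model.get_regularization_penalty()
    return loss


model = ToyModel()
eval_loss = batch_loss(model, [1.0, 2.0, 3.0], for_training=False)   # 2.0
train_loss = batch_loss(model, [1.0, 2.0, 3.0], for_training=True)   # 2.0 + penalty
print(eval_loss, train_loss)
```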

default_implementation = 'default'
find_latest_checkpoint() → typing.Tuple[str, str][source]

Return the location of the latest model and training state files. If there isn’t a valid checkpoint then return None.
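Locating the latest checkpoint amounts to scanning the serialization directory for the highest epoch number. The sketch below assumes checkpoint files are named like model_state_epoch_&lt;n&gt;.th and training_state_epoch_&lt;n&gt;.th; the exact naming scheme is an assumption of this example.

```python
import os
import re
import tempfile

def find_latest_checkpoint(serialization_dir):
    # Collect the epoch numbers of all saved model states.
    epochs = []
    for filename in os.listdir(serialization_dir):
        match = re.match(r"model_state_epoch_(\d+)\.th", filename)
        if match:
            epochs.append(int(match.group(1)))
    if not epochs:
        return None  # no valid checkpoint
    latest = max(epochs)
    return (os.path.join(serialization_dir, f"model_state_epoch_{latest}.th"),
            os.path.join(serialization_dir, f"training_state_epoch_{latest}.th"))


with tempfile.TemporaryDirectory() as tmp:
    # Simulate three saved epochs.
    for epoch in (0, 1, 2):
        for prefix in ("model_state", "training_state"):
            open(os.path.join(tmp, f"{prefix}_epoch_{epoch}.th"), "w").close()
    model_path, training_path = find_latest_checkpoint(tmp)
    print(os.path.basename(model_path))  # model_state_epoch_2.th
```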

classmethod from_params(model: allennlp.models.model.Model, serialization_dir: str, iterator:, train_data: typing.Iterable[], validation_data: typing.Union[typing.Iterable[], NoneType], params: allennlp.common.params.Params, validation_iterator: = None) →[source]
rescale_gradients() → typing.Union[float, NoneType][source]

Performs gradient rescaling. Is a no-op if gradient rescaling is not enabled.
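The arithmetic behind gradient rescaling is ordinary norm clipping: compute the total L2 norm over all gradients as if they were one vector, and scale everything down when it exceeds the configured maximum. The sketch below uses plain Python lists standing in for parameter gradients; the real method operates on torch tensors and is driven by the trainer's grad_norm setting.

```python
import math

def rescale_gradients(gradients, max_norm):
    # Total L2 norm over all gradients, as if concatenated into one vector.
    total_norm = math.sqrt(sum(g * g for grads in gradients for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for grads in gradients:
            for i in range(len(grads)):
                grads[i] *= scale  # rescale in place
    return total_norm


grads = [[3.0, 0.0], [0.0, 4.0]]          # total norm = 5.0
total = rescale_gradients(grads, max_norm=1.0)
print(total)                               # 5.0
print(grads)                               # now has total norm 1.0 (up to float error)
```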

train() → typing.Dict[str, typing.Any][source]

Trains the supplied model with the supplied parameters.

allennlp.training.trainer.move_optimizer_to_cuda(optimizer) → None[source]

Move the optimizer state to GPU, if necessary. After calling, any parameter specific state in the optimizer will be located on the same device as the parameter.

allennlp.training.trainer.sparse_clip_norm(parameters, max_norm, norm_type=2) → float[source]

Clips gradient norm of an iterable of parameters.

The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in-place. Supports sparse gradients.

parameters : (Iterable[torch.Tensor])

An iterable of Tensors that will have gradients normalized.

max_norm : float

The max norm of the gradients.

norm_type : float

The type of the used p-norm. Can be 'inf' for infinity norm.

Returns

Total norm of the parameters (viewed as a single vector).

allennlp.training.trainer.str_to_time(time_str: str) → datetime.datetime[source]

Convert human readable string to datetime.datetime.

allennlp.training.trainer.time_to_str(timestamp: int) → str[source]

Convert seconds past Epoch to human readable string.
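These two helpers are inverses of each other. The sketch below round-trips a timestamp through a readable string using the standard library; the "%Y-%m-%d-%H-%M-%S" format string is an assumption of this example, not necessarily the one the library uses.

```python
import datetime

FORMAT = "%Y-%m-%d-%H-%M-%S"  # assumed format, for illustration only

def time_to_str(timestamp):
    """Convert seconds past Epoch to a human readable string."""
    return datetime.datetime.fromtimestamp(timestamp).strftime(FORMAT)

def str_to_time(time_str):
    """Convert a human readable string back to a datetime.datetime."""
    return datetime.datetime.strptime(time_str, FORMAT)


stamp = 1_500_000_000
as_str = time_to_str(stamp)
round_tripped = str_to_time(as_str)
print(round_tripped == datetime.datetime.fromtimestamp(stamp))  # True
```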