allennlp.training.learning_rate_schedulers

AllenNLP just uses the PyTorch learning rate schedulers, with a thin wrapper that allows registering them and instantiating them via from_params.
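For example, a scheduler can be built from a Params object whose "type" key selects the registered scheduler and whose remaining keys are passed on to that scheduler's constructor. The sketch below is illustrative only; the registered name "step" (wrapping torch.optim.lr_scheduler.StepLR) and the keys after "type" are assumptions, not taken from this page.

    import torch
    from allennlp.common.params import Params
    from allennlp.training.learning_rate_schedulers import LearningRateScheduler

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # "type" picks the registered scheduler; the remaining keys are forwarded to the
    # underlying PyTorch scheduler's constructor (names assumed here for illustration).
    scheduler = LearningRateScheduler.from_params(
        optimizer,
        Params({"type": "step", "step_size": 10, "gamma": 0.5}),
    )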

The available learning rate schedulers are listed below.

class allennlp.training.learning_rate_schedulers.LearningRateScheduler(lr_scheduler) → None[source]

Bases: allennlp.common.registrable.Registrable

This class just allows us to implement Registrable for PyTorch LR schedulers.

classmethod from_params(optimizer: torch.optim.optimizer.Optimizer, params: allennlp.common.params.Params)[source]
step(metric: float, epoch: typing.Union[int, NoneType] = None)[source]
step_batch(batch_num_total: typing.Union[int, NoneType])[source]
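During training, step_batch is the per-batch hook (called with the running total of batches seen) and step is the per-epoch hook (called with a validation metric). A rough, illustrative loop follows; the registered name "reduce_on_plateau" is assumed, and how the actual AllenNLP Trainer drives these hooks is not specified on this page.

    import torch
    from allennlp.common.params import Params
    from allennlp.training.learning_rate_schedulers import LearningRateScheduler

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = LearningRateScheduler.from_params(
        optimizer, Params({"type": "reduce_on_plateau", "patience": 2}))

    batch_num_total = 0
    for epoch in range(3):
        for _ in range(5):  # stand-in for the training batches
            optimizer.zero_grad()
            loss = model(torch.randn(8, 4)).pow(2).mean()
            loss.backward()
            optimizer.step()
            batch_num_total += 1
            scheduler.step_batch(batch_num_total)  # per-batch hook
        scheduler.step(loss.item(), epoch)  # per-epoch hook; loss stands in for a validation metric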
class allennlp.training.learning_rate_schedulers.LearningRateWithMetricsWrapper(lr_scheduler: torch.optim.lr_scheduler.ReduceLROnPlateau) → None[source]

Bases: allennlp.training.learning_rate_schedulers.LearningRateScheduler

A wrapper around learning rate schedulers that require metrics. At the moment the only such scheduler is ReduceLROnPlateau.

step(metric: float, epoch: typing.Union[int, NoneType] = None)[source]
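A minimal sketch of constructing this wrapper directly around a torch ReduceLROnPlateau; the ReduceLROnPlateau options shown are ordinary PyTorch arguments chosen here only for illustration.

    import torch
    from allennlp.training.learning_rate_schedulers import LearningRateWithMetricsWrapper

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    plateau = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=2)
    scheduler = LearningRateWithMetricsWrapper(plateau)

    scheduler.step(0.42, epoch=0)  # the validation metric is required for this wrapper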
class allennlp.training.learning_rate_schedulers.LearningRateWithoutMetricsWrapper(lr_scheduler: torch.optim.lr_scheduler._LRScheduler) → None[source]

Bases: allennlp.training.learning_rate_schedulers.LearningRateScheduler

A wrapper around learning rate schedulers that do not require metrics.

step(metric: float, epoch: typing.Union[int, NoneType] = None)[source]
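The corresponding sketch for a scheduler that needs no metric, wrapping a torch StepLR; passing None for the unused metric argument is an assumption made here for illustration.

    import torch
    from allennlp.training.learning_rate_schedulers import LearningRateWithoutMetricsWrapper

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    step_lr = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
    scheduler = LearningRateWithoutMetricsWrapper(step_lr)

    scheduler.step(None, epoch=0)  # metric is not needed by this wrapper (assumption)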
class allennlp.training.learning_rate_schedulers.NoamLR(optimizer: torch.optim.optimizer.Optimizer, model_size: int, warmup_steps: int, factor: float = 1.0, last_epoch: int = -1) → None[source]

Bases: torch.optim.lr_scheduler._LRScheduler

Implements the Noam learning rate schedule. This corresponds to increasing the learning rate linearly for the first warmup_steps training steps, and decreasing it thereafter proportionally to the inverse square root of the step number, scaled by the inverse square root of the dimensionality of the model. Time will tell if this is just madness or it’s actually important.

Parameters:
model_size : int, required.

The hidden size parameter which dominates the number of parameters in your model.

warmup_steps : int, required.

The number of steps to linearly increase the learning rate.

factor : float, optional (default = 1.0).

The overall scale factor for the learning rate decay.

get_lr()[source]
step(epoch=None)[source]
step_batch(epoch=None)[source]
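For reference, the description above corresponds to a learning-rate scale of factor * model_size^(-0.5) * min(t^(-0.5), t * warmup_steps^(-1.5)) at training step t. A small illustrative sketch follows; the exact expression used inside get_lr, and how it interacts with the optimizer's base learning rate, are assumptions here.

    import torch
    from allennlp.training.learning_rate_schedulers import NoamLR

    def noam_scale(step: int, model_size: int, warmup_steps: int, factor: float = 1.0) -> float:
        # Linear warmup for the first warmup_steps steps, then decay proportional to the
        # inverse square root of the step, scaled by the inverse square root of model_size.
        step = max(step, 1)
        return factor * model_size ** (-0.5) * min(step ** (-0.5), step * warmup_steps ** (-1.5))

    # Scale at a few representative steps (model_size = 512, warmup_steps = 4000).
    for step in (1, 1000, 4000, 16000):
        print(step, noam_scale(step, model_size=512, warmup_steps=4000))

    # Driving the scheduler itself: step_batch is called once per batch.
    model = torch.nn.Linear(512, 512)
    optimizer = torch.optim.Adam(model.parameters())
    scheduler = NoamLR(optimizer, model_size=512, warmup_steps=4000)
    for batch_num_total in range(1, 5001):
        scheduler.step_batch(batch_num_total)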