allennlp.modules.stacked_bidirectional_lstm

class allennlp.modules.stacked_bidirectional_lstm.StackedBidirectionalLstm(input_size: int, hidden_size: int, num_layers: int, recurrent_dropout_probability: float = 0.0, use_highway: bool = True) → None[source]

Bases: torch.nn.modules.module.Module

A standard stacked bidirectional LSTM in which the outputs of the forward and backward directions are concatenated between layers. The only difference between this and a regular bidirectional LSTM is the application of variational dropout to the hidden states of the LSTM. Note that this will be slower, as it does not use CUDNN.

Parameters:
input_size : int, required

The dimension of the inputs to the LSTM.

hidden_size : int, required

The dimension of the outputs of the LSTM.

num_layers : int, required

The number of stacked Bidirectional LSTMs to use.

recurrent_dropout_probability : float, optional (default = 0.0)

The dropout probability to be used in a dropout scheme as stated in A Theoretically Grounded Application of Dropout in Recurrent Neural Networks.

use_highway : bool, optional (default = True)

Whether to use highway connections between layers.
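
A minimal construction sketch, assuming illustrative sizes (the values below are not library defaults):

    from allennlp.modules.stacked_bidirectional_lstm import StackedBidirectionalLstm

    # Illustrative sizes chosen for this example.
    lstm = StackedBidirectionalLstm(input_size=100,
                                    hidden_size=200,
                                    num_layers=2,
                                    recurrent_dropout_probability=0.3,
                                    use_highway=True)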

forward(inputs: torch.nn.utils.rnn.PackedSequence, initial_state: typing.Union[typing.Tuple[torch.Tensor, torch.Tensor], NoneType] = None)[source]
Parameters:
inputs : PackedSequence, required.

A batch-first PackedSequence to run the stacked LSTM over.

initial_state : Tuple[torch.Tensor, torch.Tensor], optional (default = None)

A tuple (state, memory) representing the initial hidden state and memory of the LSTM. Each tensor has shape (1, batch_size, output_dimension * 2).

Returns:
output_sequence : PackedSequence

The encoded sequence of shape (batch_size, sequence_length, hidden_size * 2).

final_states : Tuple[torch.Tensor, torch.Tensor]

The per-layer final (state, memory) states of the LSTM, each with shape (num_layers, batch_size, hidden_size * 2).
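
A usage sketch for forward, continuing the hypothetical module constructed above; the batch size and sequence lengths are illustrative assumptions:

    import torch
    from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

    # A batch of 4 sequences padded to length 10, each timestep of size 100
    # (matching the input_size used in the construction sketch above).
    inputs = torch.randn(4, 10, 100)
    lengths = [10, 9, 7, 5]  # lengths must be sorted in decreasing order for packing
    packed = pack_padded_sequence(inputs, lengths, batch_first=True)

    output_sequence, final_states = lstm(packed)

    # Unpack the returned PackedSequence into a padded tensor of shape
    # (batch_size, sequence_length, hidden_size * 2) == (4, 10, 400) here.
    padded_output, _ = pad_packed_sequence(output_sequence, batch_first=True)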