class allennlp.modules.sampled_softmax_loss.SampledSoftmaxLoss(num_words: int, embedding_dim: int, num_samples: int, sparse: bool = False, unk_id: int = None, use_character_inputs: bool = True, use_fast_sampler: bool = False)

Bases: torch.nn.modules.module.Module

Based on the default log_uniform_candidate_sampler in TensorFlow.

NOTE: num_words DOES NOT include the padding id.

NOTE: In all cases except (tie_embeddings=True and use_character_inputs=False), the weights are dimensioned as num_words and do not include an entry for the padding (0) id. In the (tie_embeddings=True and use_character_inputs=False) case, the embeddings DO include the extra 0 padding entry, to be consistent with the word embedding layer.

num_words, ``int``

The number of words in the vocabulary, not counting the padding id.

embedding_dim, ``int``

The dimension of the input embeddings to softmax over.

num_samples, ``int``

The number of samples to draw during training. Must be less than num_words.

sparse, ``bool``, optional (default = False)

If True, use a sparse embedding matrix.

unk_id, ``int``, optional (default = None)

If provided, the id that represents unknown characters.

use_character_inputs, ``bool``, optional (default = True)

Whether to use character inputs.

use_fast_sampler, ``bool``, optional (default = False)

Whether to use the fast cython sampler.

forward(embeddings: torch.Tensor, targets: torch.Tensor, target_token_embedding: torch.Tensor = None) → torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.


Although the recipe for the forward pass is defined within this method, call the Module instance itself rather than forward() directly: calling the instance runs the registered hooks, whereas calling forward() directly silently ignores them.
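The difference between calling a module instance and calling forward() directly can be seen with a forward hook. The following is a minimal sketch using a hypothetical toy module (`Doubler`) rather than SampledSoftmaxLoss itself; the names `Doubler`, `record_output`, and `hook_calls` are made up for illustration.

```python
import torch


class Doubler(torch.nn.Module):
    """Hypothetical toy module used only to show hook behaviour."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 2 * x


hook_calls = []


def record_output(module, inputs, output):
    # Forward hooks receive the module, its positional inputs, and its output.
    hook_calls.append(output.clone())


module = Doubler()
module.register_forward_hook(record_output)

x = torch.ones(3)
via_call = module(x)             # runs forward() and then the registered hook
calls_after_instance = len(hook_calls)
via_forward = module.forward(x)  # runs forward() only; the hook is skipped
calls_after_forward = len(hook_calls)
```

Both calls return the same tensor, but only the instance call adds an entry to `hook_calls`.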

log_uniform_candidate_sampler(targets, choice_func=<function _choice>)
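
For orientation, the log-uniform (Zipfian) distribution used by TensorFlow's sampler of the same name assigns class k in [0, range_max) the probability (log(k + 2) - log(k + 1)) / log(range_max + 1), so low ids are drawn more often. This suits vocabularies sorted by decreasing word frequency. The sketch below is a pure-Python illustration of that distribution and of inverse-CDF sampling from it. It is not this method's implementation, and the helper names `log_uniform_prob` and `sample_log_uniform` are hypothetical.

```python
import math
import random


def log_uniform_prob(k: int, range_max: int) -> float:
    """Probability of class k under the log-uniform distribution on [0, range_max)."""
    return (math.log(k + 2) - math.log(k + 1)) / math.log(range_max + 1)


def sample_log_uniform(range_max: int, rng: random.Random) -> int:
    """Draw one class id by inverting the cumulative distribution."""
    u = rng.random()  # uniform in [0, 1)
    k = int(math.exp(u * math.log(range_max + 1))) - 1
    # Guard against floating-point rounding pushing k up to range_max.
    return min(k, range_max - 1)


num_words = 1000  # illustrative vocabulary size, an assumption for this sketch
probs = [log_uniform_prob(k, num_words) for k in range(num_words)]
rng = random.Random(0)
samples = [sample_log_uniform(num_words, rng) for _ in range(10000)]
```

The probabilities telescope to sum to 1, and the empirical frequency of id 0 in `samples` should be close to `probs[0]`, which is log(2) / log(1001).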