Instance(fields: MutableMapping[str, allennlp.data.fields.field.Field])
An Instance is a collection of Field objects, specifying the inputs and outputs to some model. We don't distinguish between inputs and outputs here, though: all operations are done on all fields, and when we return arrays, we return them as dictionaries keyed by field name. A model can then decide which fields it wants to use as inputs and which as outputs.
An Instance can start out either indexed or un-indexed. During the data processing pipeline, all fields will be indexed, after which multiple instances can be combined into a Batch and then converted into padded arrays.
fields: The Field objects that will be used to produce data arrays for this instance.
add_field(self, field_name: str, field: allennlp.data.fields.field.Field, vocab: allennlp.data.vocabulary.Vocabulary = None) → None
Add the field to the existing fields mapping. If we have already indexed the Instance, then we also index field, so it is necessary to supply the vocab.
as_tensor_dict(self, padding_lengths: Dict[str, Dict[str, int]] = None) → Dict[str, ~DataArray]
Pads each Field in this instance to the lengths given in padding_lengths (which is keyed by field name, then by padding key, the same as the return value of get_padding_lengths()), returning a dictionary of torch tensors keyed by field name.
If padding_lengths is omitted, we will call self.get_padding_lengths() to get the sizes of the tensors to create.
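The two-level keying of padding_lengths (field name, then padding key) can be sketched without the library. The snippet below is a toy stand-in, not AllenNLP's implementation: each "field" is assumed to be a plain list of token ids, the single padding key "num_tokens" and the helper name as_padded_dict are hypothetical, and Python lists take the place of torch tensors.

```python
from typing import Dict, List, Optional

# Toy stand-in (assumption): a field is a list of token ids whose only
# padding key is the hypothetical "num_tokens".
def get_padding_lengths(fields: Dict[str, List[int]]) -> Dict[str, Dict[str, int]]:
    # Field name -> {padding key -> actual length}
    return {name: {"num_tokens": len(ids)} for name, ids in fields.items()}

def as_padded_dict(
    fields: Dict[str, List[int]],
    padding_lengths: Optional[Dict[str, Dict[str, int]]] = None,
    pad_id: int = 0,
) -> Dict[str, List[int]]:
    # Fall back to the instance's own lengths when none are supplied.
    if padding_lengths is None:
        padding_lengths = get_padding_lengths(fields)
    padded = {}
    for name, ids in fields.items():
        target = padding_lengths[name]["num_tokens"]
        # Pad on the right with pad_id, then truncate to the target length.
        padded[name] = (ids + [pad_id] * (target - len(ids)))[:target]
    return padded

fields = {"premise": [4, 9, 2], "hypothesis": [7, 5]}
default_padded = as_padded_dict(fields)
wider_padded = as_padded_dict(
    fields,
    {"premise": {"num_tokens": 5}, "hypothesis": {"num_tokens": 4}},
)
```

With no padding_lengths the fields come back at their own lengths; passing larger lengths (as a batch would, after taking the maximum over its instances) right-pads each field to the requested size.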
count_vocab_items(self, counter: Dict[str, Dict[str, int]])
Increments counts in the given counter for all of the vocabulary items in all of the Fields in this Instance.
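As a rough illustration of the nested counter, the sketch below assumes the outer key is a vocabulary namespace and the inner key is the item being counted. The tuple-based field representation is hypothetical and stands in for real Field objects.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical field representation: (namespace, list of string items).
ToyField = Tuple[str, List[str]]

def count_vocab_items(
    fields: Dict[str, ToyField],
    counter: Dict[str, Dict[str, int]],
) -> None:
    # Mutates the counter in place, mirroring the None return of the real method.
    for namespace, items in fields.values():
        for item in items:
            counter[namespace][item] += 1

# A nested defaultdict lets unseen namespaces and items start at zero.
counter: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
instance_fields = {
    "sentence": ("tokens", ["the", "cat", "the"]),
    "label": ("labels", ["animal"]),
}
count_vocab_items(instance_fields, counter)
```

Because the counter is shared and updated in place, calling the function once per instance accumulates counts over a whole dataset.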
get_padding_lengths(self) → Dict[str, Dict[str, int]]
Returns a dictionary of padding lengths, keyed by field name. Each Field returns a mapping from padding keys to actual lengths, and we just key that dictionary by field name.
index_fields(self, vocab: allennlp.data.vocabulary.Vocabulary) → None
Indexes all fields in this Instance using the provided Vocabulary. This mutates the current object; it does not return a new Instance. A DataIterator will call this on each pass through a dataset; we use the indexed flag to make sure that indexing only happens once.
This means that if for some reason you modify your vocabulary after you’ve indexed your instances, you might get unexpected behavior.
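The effect of the indexed flag can be seen in a small toy model. ToyInstance below is a hypothetical class written for illustration, not the library's Instance, and a plain dict stands in for the Vocabulary.

```python
from typing import Dict, List, Optional

class ToyInstance:
    """Hypothetical stand-in that indexes its tokens at most once."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.token_ids: Optional[List[int]] = None
        self.indexed = False

    def index_fields(self, vocab: Dict[str, int]) -> None:
        # Later calls are no-ops, even if a different vocab is passed.
        if self.indexed:
            return
        self.token_ids = [vocab[token] for token in self.tokens]
        self.indexed = True

instance = ToyInstance(["a", "b"])
instance.index_fields({"a": 1, "b": 2})
first_ids = list(instance.token_ids)

# Re-indexing with a changed vocabulary leaves the stale ids in place.
instance.index_fields({"a": 10, "b": 20})
second_ids = list(instance.token_ids)
```

In this sketch the second call is silently ignored, so the ids still reflect the first vocabulary; getting fresh ids would require building a new instance.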