allennlp.data.instance

class allennlp.data.instance.Instance(fields: typing.Dict[str, allennlp.data.fields.field.Field]) → None[source]

Bases: object

An Instance is a collection of Field objects, specifying the inputs and outputs to some model. We don’t make a distinction between inputs and outputs here, though - all operations are done on all fields, and when we return arrays, we return them as dictionaries keyed by field name. A model can then decide which fields it wants to use as inputs and which as outputs.

The Fields in an Instance can start out either indexed or un-indexed. During the data processing pipeline, all fields will end up as IndexedFields, and will then be converted into padded arrays by a DataGenerator.

Parameters:

fields : Dict[str, Field]

The Field objects that will be used to produce data arrays for this instance.
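The dict-of-fields pattern can be sketched with toy stand-ins (ToyField and ToyInstance below are hypothetical, not the real allennlp classes; the real Field has indexing and padding methods as well):

```python
from typing import Dict, List

class ToyField:
    """Hypothetical stand-in for allennlp.data.fields.Field."""
    def __init__(self, tokens: List[str]):
        self.tokens = tokens

class ToyInstance:
    """Minimal sketch of the Instance pattern: a dict of named fields,
    with every operation applied uniformly across all of them."""
    def __init__(self, fields: Dict[str, ToyField]):
        self.fields = fields

instance = ToyInstance({
    "premise": ToyField(["a", "cat"]),
    "hypothesis": ToyField(["it", "sat", "down"]),
})
print(sorted(instance.fields))  # ['hypothesis', 'premise']
```

The model later picks which of these named fields serve as inputs and which as outputs; the Instance itself is agnostic.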

as_array_dict(padding_lengths: typing.Dict[str, typing.Dict[str, int]] = None) → typing.Dict[str, DataArray][source]

Pads each Field in this instance to the lengths given in padding_lengths (which is keyed by field name, then by padding key, the same as the return value of get_padding_lengths()), returning a dictionary of numpy arrays keyed by field name.

If padding_lengths is omitted, we will call self.get_padding_lengths() to get the sizes of the arrays to create.
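A minimal sketch of the padding step, assuming already-indexed fields that hold token id lists and a single hypothetical "num_tokens" padding key (pad_field is an illustrative helper, not part of the library):

```python
from typing import Dict, List

def pad_field(token_ids: List[int], padding_lengths: Dict[str, int]) -> List[int]:
    # Hypothetical stand-in for one field's padding logic: right-pad
    # with zeros up to the requested "num_tokens" length.
    length = padding_lengths["num_tokens"]
    return token_ids + [0] * (length - len(token_ids))

# as_array_dict pads every field and keys the result by field name:
indexed_fields = {"premise": [4, 7], "hypothesis": [4, 9, 3]}
padding_lengths = {"premise": {"num_tokens": 3}, "hypothesis": {"num_tokens": 3}}
arrays = {name: pad_field(ids, padding_lengths[name])
          for name, ids in indexed_fields.items()}
print(arrays)  # {'premise': [4, 7, 0], 'hypothesis': [4, 9, 3]}
```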

count_vocab_items(counter: typing.Dict[str, typing.Dict[str, int]])[source]

Increments counts in the given counter for all of the vocabulary items in all of the Fields in this Instance.
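The shared counter is a nested dictionary, conventionally keyed by vocabulary namespace and then by item. A sketch of the increment pattern each Field follows (the "tokens" namespace and the helper function are illustrative assumptions):

```python
from collections import defaultdict
from typing import Dict, List

def count_tokens(tokens: List[str], counter: Dict[str, Dict[str, int]]) -> None:
    # Hypothetical sketch of one field's contribution: bump the count
    # of every vocabulary item it contains, in a shared namespace.
    for token in tokens:
        counter["tokens"][token] += 1

counter = defaultdict(lambda: defaultdict(int))
for field_tokens in [["a", "cat"], ["a", "dog"]]:  # tokens from each field
    count_tokens(field_tokens, counter)
print(counter["tokens"]["a"])  # 2
```

A Vocabulary can then be built from these counts before indexing.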

get_padding_lengths() → typing.Dict[str, typing.Dict[str, int]][source]

Returns a dictionary of padding lengths, keyed by field name. Each Field returns a mapping from padding keys to actual lengths, and we just key that dictionary by field name.
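When several instances are batched, a downstream step (such as the DataGenerator mentioned above) would typically take the elementwise maximum of these per-instance dictionaries before padding everything to a common shape. A sketch of that merge, assuming a single field with one hypothetical "num_tokens" padding key:

```python
from typing import Dict, List

# Shaped like get_padding_lengths() output: field name -> padding key -> length.
instance_lengths = [
    {"tokens": {"num_tokens": 2}},
    {"tokens": {"num_tokens": 5}},
]

def max_padding_lengths(all_lengths: List[Dict[str, Dict[str, int]]]) -> Dict[str, Dict[str, int]]:
    # Illustrative helper: keep the maximum value seen for each
    # (field name, padding key) pair across all instances.
    merged: Dict[str, Dict[str, int]] = {}
    for lengths in all_lengths:
        for field_name, keys in lengths.items():
            field_max = merged.setdefault(field_name, {})
            for key, value in keys.items():
                field_max[key] = max(field_max.get(key, 0), value)
    return merged

print(max_padding_lengths(instance_lengths))  # {'tokens': {'num_tokens': 5}}
```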

index_fields(vocab: allennlp.data.vocabulary.Vocabulary)[source]

Converts all UnindexedFields in this Instance to IndexedFields, given the Vocabulary. This mutates the current object rather than returning a new Instance.
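The in-place contract can be sketched for a single field: a plain dict stands in for the real Vocabulary, and ToyField with its index() method is a hypothetical illustration of the token-to-id conversion:

```python
from typing import Dict, List, Optional

class ToyField:
    """Hypothetical un-indexed field holding raw tokens."""
    def __init__(self, tokens: List[str]):
        self.tokens = tokens
        self.token_ids: Optional[List[int]] = None  # filled in by index()

    def index(self, vocab: Dict[str, int]) -> None:
        # Mutates the field in place, mirroring index_fields' contract:
        # nothing is returned; the field gains its indexed representation.
        self.token_ids = [vocab[token] for token in self.tokens]

vocab = {"a": 2, "cat": 3}  # toy stand-in for allennlp.data.vocabulary.Vocabulary
field = ToyField(["a", "cat"])
field.index(vocab)
print(field.token_ids)  # [2, 3]
```

index_fields simply applies this conversion to every un-indexed field in the instance.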