Various utilities that don’t fit anywhere else.
add_noise_to_dict_values(dictionary: typing.Dict[A, float], noise_param: float) → typing.Dict[A, float]
Returns a new dictionary with noise added to every value in
dictionary. The noise is uniformly distributed within
noise_param percent of the value for every value in the dictionary.
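The behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name add_uniform_noise is hypothetical, and it assumes noise_param is given as a fraction (0.1 for ten percent).

```python
import random
from typing import Dict, Hashable


def add_uniform_noise(values: Dict[Hashable, float], noise_param: float) -> Dict[Hashable, float]:
    """Return a new dict where each value v is shifted by uniform noise
    drawn from [-|v * noise_param|, +|v * noise_param|]."""
    noisy = {}
    for key, value in values.items():
        # Assumption: noise_param is a fraction of the value, e.g. 0.1 == 10%.
        bound = abs(value * noise_param)
        noisy[key] = value + random.uniform(-bound, bound)
    return noisy


scores = {"a": 10.0, "b": -4.0}
noisy_scores = add_uniform_noise(scores, 0.1)
```

The input dictionary is left untouched, and each noisy value stays within ten percent of its original.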
get_spacy_model(spacy_model_name: str, pos_tags: bool, parse: bool, ner: bool) → spacy.language.Language
To avoid loading spaCy models repeatedly, we save references to them, keyed by the options used to create each model, so any particular configuration is loaded only once.
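The caching idea can be sketched without spaCy itself. In this hedged example, get_cached_model, _LOADED_MODELS and fake_loader are all hypothetical names, and the loader is a stand-in for whatever actually builds the pipeline.

```python
from typing import Any, Callable, Dict, Tuple

# Cache of loaded models, keyed by the options used to create them.
_LOADED_MODELS: Dict[Tuple[str, bool, bool, bool], Any] = {}


def get_cached_model(name: str, pos_tags: bool, parse: bool, ner: bool,
                     loader: Callable[[str, bool, bool, bool], Any]) -> Any:
    """Load a model at most once per (name, pos_tags, parse, ner) configuration."""
    options = (name, pos_tags, parse, ner)
    if options not in _LOADED_MODELS:
        _LOADED_MODELS[options] = loader(name, pos_tags, parse, ner)
    return _LOADED_MODELS[options]


# Stand-in loader that records each call (hypothetical; a real loader
# would construct the spaCy pipeline here).
load_calls = []


def fake_loader(name, pos_tags, parse, ner):
    load_calls.append((name, pos_tags, parse, ner))
    return object()


first = get_cached_model("en_core_web_sm", True, False, False, fake_loader)
second = get_cached_model("en_core_web_sm", True, False, False, fake_loader)
third = get_cached_model("en_core_web_sm", True, True, False, fake_loader)
```

The first two calls share one object because their options match. The third call changes parse, so it triggers a second load.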
group_by_count(iterable: typing.List[typing.Any], count: int, default_value: typing.Any) → typing.List[typing.List[typing.Any]]
Takes a list and groups it into sublists of size count, using
default_value to pad the list at the end if the list is not divisible by count.
For example:
>>> group_by_count([1, 2, 3, 4, 5, 6, 7], 3, 0)
[[1, 2, 3], [4, 5, 6], [7, 0, 0]]
This is a short method, but it’s complicated and hard to remember as a one-liner, so we just make a function out of it.
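One common way to write such a one-liner uses itertools.zip_longest over a shared iterator. The sketch below uses the hypothetical name group_into_sublists and is not necessarily how the library implements it.

```python
from itertools import zip_longest
from typing import Any, List


def group_into_sublists(items: List[Any], count: int, default_value: Any) -> List[List[Any]]:
    # Repeating the *same* iterator `count` times makes zip_longest pull
    # `count` consecutive items per group; fillvalue pads the last group.
    shared_iterator = iter(items)
    return [list(group)
            for group in zip_longest(*[shared_iterator] * count, fillvalue=default_value)]


groups = group_into_sublists([1, 2, 3, 4, 5, 6, 7], 3, 0)
print(groups)
```

The trick relies on every argument to zip_longest being the same iterator object, which is why it is easy to get wrong from memory.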
namespace_match(pattern: str, namespace: str)
Matches a namespace pattern against a namespace string. For example,
pad_sequence_to_length(sequence: typing.List, desired_length: int, default_value: typing.Callable[typing.Any] = <function <lambda>>, padding_on_right: bool = True) → typing.List
Takes a list of objects and pads it to the desired length, returning the padded list. The original list is not modified.
sequence : List
A list of objects to be padded.
desired_length : int
Maximum length of each sequence. Longer sequences are truncated to this length, and shorter ones are padded to it.
default_value : Callable, default=lambda: 0
Callable that outputs a default value (of any type) to use as padding values. This is a lambda to avoid using the same object when the default value is more complex, like a list.
padding_on_right : bool, default=True
Whether to add padding tokens (or truncate the sequence) on the right or on the left.
padded_sequence : List
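The parameters above can be illustrated with a small sketch. The name pad_to_length is hypothetical, and the sketch assumes that left-side truncation keeps the rightmost items; this is an illustration under those assumptions, not the library's code.

```python
from typing import Any, Callable, List


def pad_to_length(sequence: List[Any], desired_length: int,
                  default_value: Callable[[], Any] = lambda: 0,
                  padding_on_right: bool = True) -> List[Any]:
    missing = max(desired_length - len(sequence), 0)
    # Call default_value() once per slot so mutable defaults are never shared.
    padding = [default_value() for _ in range(missing)]
    if padding_on_right:
        # Truncate on the right, then pad on the right; slicing copies the list.
        return list(sequence[:desired_length]) + padding
    # Assumption: working on the left drops the leftmost items when truncating.
    start = max(len(sequence) - desired_length, 0)
    return padding + list(sequence[start:])


right_padded = pad_to_length([1, 2, 3], 5)
left_padded = pad_to_length([1, 2, 3], 5, padding_on_right=False)
truncated = pad_to_length([1, 2, 3, 4], 2)
left_truncated = pad_to_length([1, 2, 3, 4], 2, padding_on_right=False)
nested = pad_to_length([[1]], 3, default_value=list)
```

Passing a callable such as list yields a fresh empty list for each padded slot, which is the reason default_value is a callable and not a plain value.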
prepare_environment(params: typing.Union[allennlp.common.params.Params, typing.Dict[str, typing.Any]])
Sets random seeds for reproducible experiments. This may not work as expected if you call it from within a Python project in which you have already imported PyTorch. If you use the scripts/run_model.py entry point to train models with this library, your experiments should be reasonably reproducible. If you are using this from your own project, you will want to call this function before importing PyTorch. Complete determinism is very difficult to achieve with libraries doing optimized linear algebra due to massively parallel execution, which is exacerbated by using GPUs.
params : Params object or dict, required.
A Params object or dict holding the JSON parameters.
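A rough sketch of seeding from a parameter dict is shown below. The function name seed_everything, the key names (random_seed, numpy_seed, pytorch_seed) and the fallback values are assumptions made for illustration. NumPy and PyTorch are seeded only if they happen to be installed.

```python
import random
from typing import Any, Dict


def seed_everything(params: Dict[str, Any]) -> None:
    # Key names and fallback seeds are assumptions made for this sketch.
    random.seed(params.get("random_seed", 13370))
    try:
        import numpy
        numpy.random.seed(params.get("numpy_seed", 1337))
    except ImportError:
        pass  # NumPy is not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(params.get("pytorch_seed", 133))
    except ImportError:
        pass  # PyTorch is not installed; nothing to seed.


seed_everything({"random_seed": 7})
first_draw = random.random()
seed_everything({"random_seed": 7})
second_draw = random.random()
```

Re-seeding with the same value reproduces the same draw from Python's random module. As noted above, this alone does not guarantee bit-for-bit determinism on GPUs.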
sanitize(x: typing.Any) → typing.Any
Sanitize turns PyTorch and NumPy types into basic Python types so they can be serialized into JSON.
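The conversion can be sketched with duck typing, so the example runs without either library installed. The names to_json_friendly and FakeArray are hypothetical. The sketch relies only on the tolist() method that NumPy arrays and PyTorch tensors both provide, and it is not the library's implementation.

```python
import json
from typing import Any


def to_json_friendly(x: Any) -> Any:
    if x is None or isinstance(x, (str, bool, int, float)):
        return x  # Already a basic Python type.
    if hasattr(x, "tolist"):
        # NumPy arrays/scalars and PyTorch tensors expose tolist(); duck
        # typing keeps this sketch free of hard dependencies on either.
        return to_json_friendly(x.tolist())
    if isinstance(x, dict):
        return {key: to_json_friendly(value) for key, value in x.items()}
    if isinstance(x, (list, tuple, set)):
        return [to_json_friendly(item) for item in x]
    raise ValueError(f"Cannot convert {type(x).__name__} to a JSON-friendly type")


class FakeArray:
    """Stand-in for an array type; it only mimics tolist()."""

    def __init__(self, data):
        self._data = data

    def tolist(self):
        return list(self._data)


payload = {"scores": FakeArray([0.1, 0.9]), "labels": ("a", "b"), "n": 2}
clean = to_json_friendly(payload)
encoded = json.dumps(clean)
```

After conversion the payload contains only dicts, lists, strings and numbers, so json.dumps accepts it directly.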