Olmo 3 & Olmo Hybrid

For details on reproducing models, see:

Tokenizer Settings

When releasing multiple models (instruct and think) and using two different codebases for post-training (allenai/olmo-core for SFT and allenai/open-instruct for DPO and RL), there are many steps needed to get exact chat templates right. The final step is getting the chat templates right for public release, which can entail different system prompts to maintain model identity.

This document is a reference for the settings used for Olmo 3, based on the best available information.

Olmo 3 Instruct Models (7b, 32b): Tokenized AND intermediately evaluated with allenai/olmo-3-tokenizer-instruct-dev, released with:

7b instruct: the above tokenizer on HuggingFace (no Olmo identity modified system prompt, appropriate tool use chat template/special tokens), the above tokenizer and Olmo identity system prompt modification for demos such as Ai2's playground.
32b instruct: the above tokenizer on HuggingFace for SFT and DPO, and for RL/Playground the final model has the chat template with modified Olmo system prompt on playground.
Reason for the above discrepancy: We needed to match the chat templates for the final models relative to what was used in demos, which in the case for these models involved system prompt edits to improve model reliability. Matching models in demos and on HuggingFace is ideal.

Olmo 3 Thinking Models:

7b: Training data tokenized with the olmo_thinker_no_think_7b chat template (that has the olmo identity in the prompt), but there was a minor miscommunication in transition to the next training stages, so the DPO and RL models have a slightly different chat template, all reflected in the final released models.
32b: Training data tokenized with the olmo_thinker_no_think_sft_tokenization chat template (otherwise identical, doesn't have olmo identity in the prompt), released with that chat template + the think token in add_generation_prompt.
Reason for the difference between 7b and 32b: we learned as we went to not have the identity baked into the prompt (so it was easier to fix at the time of the demo in the form of a system prompt) but couldn't afford to retrain 7b thinking model at that point.

Olmo 3.2+ models (also used for Olmo Hybrid):

Think SFT data is tokenized with the Instruct chat template allenai/olmo-3-tokenizer-instruct-dev. This template does not include <think>, which prevents <think> from being masked out during tokenization so the model learns to generate it. (We plan to fix the underlying masking bug so this workaround is no longer needed.)
Think evaluation should use allenai/olmo-3.2-tokenizer-think-dev, which is the instruct chat template plus <think> in add_generation_prompt (new models should combine tool use abilities from the instruct template with <think> for reasoning). Named 3.2 to distinguish from the original Olmo 3 think tokenizers, which did not include function calling.
Think release models should use allenai/olmo-3.2-tokenizer-think-release, which is the same as the think-dev template but with the Olmo identity system prompt.
Instruct release models should use allenai/olmo-3-tokenizer-instruct-release, which is the same as instruct-dev but with the Olmo identity system prompt. This is analogous to how think-release differs from think-dev.

Note on chat_template.jinja vs tokenizer_config.json: When a HuggingFace repo contains both a chat_template.jinja file and a chat_template field in tokenizer_config.json, transformers prioritizes chat_template.jinja. Keep both in sync, or only use one. The diff_tokenizers.py script compares both files.

There are two main issues that lead to all the floating chat templates: one, the token chopping in the tokenization script where our code incorrectly masks the first token as part of the prompt, and two, the identity issue which means we should train and release with different system prompts.

TLDR until these two issues are resolved:

allenai/olmo-3-tokenizer-instruct-dev is the primary chat template for tokenizing both instruct and think models that have tool use abilities.
For Instruct evaluation/training, use allenai/olmo-3-tokenizer-instruct-dev. For release, use allenai/olmo-3-tokenizer-instruct-release (adds Olmo identity).
For Think SFT tokenization, use allenai/olmo-3-tokenizer-instruct-dev (avoids the <think> masking bug). For Think evaluation and post-SFT stages (DPO, RL), use allenai/olmo-3.2-tokenizer-think-dev (adds <think> to add_generation_prompt). For release, use allenai/olmo-3.2-tokenizer-think-release (adds Olmo identity).

To verify that two tokenizer repos differ only where expected, use the diff tool:

python scripts/tokenizers/diff_tokenizers.py allenai/olmo-3-tokenizer-instruct-dev allenai/olmo-3-tokenizer-instruct-release