Welcome to Open Instruct

This repo is an open effort to instruction-tune and post-train popular pretrained language models on publicly available datasets. We release this repo and will keep updating it with:

  1. Code for finetuning language models with the latest techniques, using instruction datasets in a unified format.
  2. Code for preference finetuning with DPO and for reinforcement learning with verifiable rewards (RLVR); a sketch of the DPO objective follows this list.
  3. Checkpoints or other useful artifacts that we build in our exploration.
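
For readers unfamiliar with DPO, the sketch below shows the standard DPO objective (Rafailov et al., 2023) in PyTorch. This is an illustrative sketch, not this repo's implementation; the tensor names and the `beta=0.1` default are assumptions.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective: each argument is a tensor of per-sequence
    log-probabilities, summed over the response tokens."""
    # Log-ratio of the policy vs. the frozen reference model, per response.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Push the chosen response's log-ratio above the rejected one's.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(margin).mean()
```

RLVR, by contrast, replaces the learned pairwise preference signal with a programmatic check (e.g., verifying a final answer) that yields a reward for reinforcement-learning-based training.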

We also support some evaluations natively in the codebase, but these are now unmaintained; we instead suggest using OLMES, which we used for Tülu 3. Our papers are listed in the Citation section below.

Tülu (a hybrid camel) represents a suite of Llama models that we built by fully finetuning them on a strong mix of datasets.

Try some of the models we train with Open Instruct in the free demo, or download them from Hugging Face:

| Stage | Llama 3.1 8B | Llama 3.1 70B | OLMo-2 7B | OLMo-2 13B |
|---|---|---|---|---|
| Base Model | meta-llama/Llama-3.1-8B | meta-llama/Llama-3.1-70B | allenai/OLMo-2-1124-7B | allenai/OLMo-2-1124-13B |
| SFT | allenai/Llama-3.1-Tulu-3-8B-SFT | allenai/Llama-3.1-Tulu-3-70B-SFT | allenai/OLMo-2-1124-7B-SFT | allenai/OLMo-2-1124-13B-SFT |
| DPO | allenai/Llama-3.1-Tulu-3-8B-DPO | allenai/Llama-3.1-Tulu-3-70B-DPO | allenai/OLMo-2-1124-7B-DPO | allenai/OLMo-2-1124-13B-DPO |
| Final Models (RLVR) | allenai/Llama-3.1-Tulu-3-8B | allenai/Llama-3.1-Tulu-3-70B | allenai/OLMo-2-1124-7B-Instruct | allenai/OLMo-2-1124-13B-Instruct |
| Final Models (RLVR) (🔥 New, trained with GRPO) | allenai/Llama-3.1-Tulu-3.1-8B | | | |
| Reward Model (RM) | allenai/Llama-3.1-Tulu-3-8B-RM | (same as 8B) | allenai/OLMo-2-1124-7B-RM | (same as 7B) |
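
As a minimal usage sketch, any checkpoint from the table can be loaded with the Hugging Face transformers library; the prompt and generation settings below are illustrative, not recommended defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Llama-3.1-Tulu-3-8B"  # any checkpoint ID from the table above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# These are chat models, so format the prompt with the model's chat template.
messages = [{"role": "user", "content": "What is instruction tuning?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```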

Citation

If you use this repository or our models, please cite our work:

Tulu 1:

@misc{wang2023far,
   title={How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources},
   author={Yizhong Wang and Hamish Ivison and Pradeep Dasigi and Jack Hessel and Tushar Khot and Khyathi Raghavi Chandu and David Wadden and Kelsey MacMillan and Noah A. Smith and Iz Beltagy and Hannaneh Hajishirzi},
   year={2023},
   eprint={2306.04751},
   archivePrefix={arXiv},
   primaryClass={cs.CL}
}

Tulu 2:

@misc{ivison2023camels,
      title={Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2},
      author={Hamish Ivison and Yizhong Wang and Valentina Pyatkin and Nathan Lambert and Matthew Peters and Pradeep Dasigi and Joel Jang and David Wadden and Noah A. Smith and Iz Beltagy and Hannaneh Hajishirzi},
      year={2023},
      eprint={2311.10702},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Tulu 2.5:

@misc{ivison2024unpacking,
      title={Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback},
      author={Hamish Ivison and Yizhong Wang and Jiacheng Liu and Zeqiu Wu and Valentina Pyatkin and Nathan Lambert and Noah A. Smith and Yejin Choi and Hannaneh Hajishirzi},
      year={2024},
      eprint={2406.09279},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
}

Tulu 3:

@article{lambert2024tulu3,
  title = {Tülu 3: Pushing Frontiers in Open Language Model Post-Training},
  author = {Nathan Lambert and Jacob Morrison and Valentina Pyatkin and Shengyi Huang and Hamish Ivison and Faeze Brahman and Lester James V. Miranda and Alisa Liu and Nouha Dziri and Shane Lyu and Yuling Gu and Saumya Malik and Victoria Graf and Jena D. Hwang and Jiangjiang Yang and Ronan Le Bras and Oyvind Tafjord and Chris Wilhelm and Luca Soldaini and Noah A. Smith and Yizhong Wang and Pradeep Dasigi and Hannaneh Hajishirzi},
  year = {2024},
  email = {tulu@allenai.org}
}

OLMo 3:

@misc{olmo2025olmo3,
      title={OLMo 3},
      author={Team OLMo and Allyson Ettinger and Amanda Bertsch and Bailey Kuehl and David Graham and David Heineman and Dirk Groeneveld and Faeze Brahman and Finbarr Timbers and Hamish Ivison and Jacob Morrison and Jake Poznanski and Kyle Lo and Luca Soldaini and Matt Jordan and Mayee Chen and Michael Noukhovitch and Nathan Lambert and Pete Walsh and Pradeep Dasigi and Robert Berry and Saumya Malik and Saurabh Shah and Scott Geng and Shane Arora and Shashank Gupta and Taira Anderson and Teng Xiao and Tyler Murray and Tyler Romero and Victoria Graf and Akari Asai and Akshita Bhagia and Alexander Wettig and Alisa Liu and Aman Rangapur and Chloe Anastasiades and Costa Huang and Dustin Schwenk and Harsh Trivedi and Ian Magnusson and Jaron Lochner and Jiacheng Liu and Lester James V. Miranda and Maarten Sap and Malia Morgan and Michael Schmitz and Michal Guerquin and Michael Wilson and Regan Huff and Ronan Le Bras and Rui Xin and Rulin Shao and Sam Skjonsberg and Shannon Zejiang Shen and Shuyue Stella Li and Tucker Wilde and Valentina Pyatkin and Will Merrill and Yapei Chang and Yuling Gu and Zhiyuan Zeng and Ashish Sabharwal and Luke Zettlemoyer and Pang Wei Koh and Ali Farhadi and Noah A. Smith and Hannaneh Hajishirzi},
      year={2025},
      eprint={2512.13961},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.13961},
}