MolmoSpaces

Large-scale assets and benchmarks for vision-language policies.

Overview

MolmoSpaces is a large-scale simulation environment and benchmark suite for training and evaluating vision-language policies in robotics. It provides:

  • High-fidelity 3D asset libraries (objects, environments, robots)
  • Multi-simulator support (MuJoCo, Isaac Sim, ManiSkill)
  • Standardized evaluation protocols for vision-language policies
  • Data generation pipelines for imitation learning

Resource         Description
Assets           Asset management and resource usage
Data Format      Episode and observation data specifications
Data Processing  Preprocessing and postprocessing pipelines
Code Structure   Repository layout and module organization
Development      Contributing guidelines and tooling setup
API Reference    Auto-generated Python API documentation

Installation

# Clone the repository
git clone https://github.com/allenai/molmospaces.git
cd molmospaces

# Install with uv (recommended)
uv pip install -e ".[mujoco]"

See the README on GitHub for full installation instructions, including conda setup and optional dependency groups.
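
After installing, a quick sanity check can confirm that the simulator backend is importable in the active environment. The sketch below is not part of MolmoSpaces: `check_module` is a hypothetical helper, and it assumes that the `mujoco` extra pulls in the `mujoco` Python package and that `python3` points at the environment you installed into.

```shell
# Hypothetical helper (not provided by MolmoSpaces): report whether a
# Python module can be found in the active environment.
check_module() {
    if python3 -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('$1') else 1)"; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumption: the "mujoco" extra installs the `mujoco` Python package.
check_module mujoco || echo "mujoco not found; re-run the install step above"
```

If the check reports the module as missing, the most likely causes are an inactive virtual environment or an install without the `mujoco` extra.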