MolmoSpaces

Large-scale assets and benchmarks for vision-language policies.

Overview

MolmoSpaces is a large-scale simulation environment and benchmark suite for training and evaluating vision-language policies in robotics. It provides:

High-fidelity 3D asset libraries (objects, environments, robots)
Multi-simulator support (MuJoCo, Isaac Sim, ManiSkill)
Standardized evaluation protocols for vision-language policies
Data generation pipelines for imitation learning

Quick Links

Resource	Description
Assets	Asset management and resource usage
Data Format	Episode and observation data specifications
Data Processing	Preprocessing and postprocessing pipelines
Code Structure	Repository layout and module organization
Development	Contributing guidelines and tooling setup
API Reference	Auto-generated Python API documentation

Installation

# Clone the repository
git clone https://github.com/allenai/molmospaces.git
cd molmospaces

# Install with uv (recommended)
uv pip install -e ".[mujoco]"

See the README on GitHub for full installation instructions, including conda setup and the optional dependency groups.
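After installing, a quick import check can confirm that the optional MuJoCo dependency resolved. The sketch below is not part of MolmoSpaces. It uses only the Python standard library, and `check_imports` is a hypothetical helper written for this example. `mujoco` is the import name of the official MuJoCo Python bindings. The `molmospaces` import name is an assumption based on the repository name, so check the README for the actual package name.

```python
from importlib.util import find_spec


def check_imports(module_names):
    """Return {name: bool} saying whether each top-level module can be found."""
    return {name: find_spec(name) is not None for name in module_names}


if __name__ == "__main__":
    # "mujoco" is the import name of the official MuJoCo Python bindings.
    # "molmospaces" is an ASSUMED import name; the real one may differ.
    for name, found in check_imports(["mujoco", "molmospaces"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A module reported as missing usually means the install ran in a different environment from the one running the check, or the optional dependency group was not selected.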