AllenAI: Molmo2 8B
vision, allenai

Molmo2-8B is an open vision-language model developed by the Allen Institute for AI (Ai2) as part of the Molmo2 family. It supports image, video, and multi-image understanding and grounding. The model is based on Qwen3-8B and uses SigLIP 2 as its vision backbone. It outperforms other open-weight, open-data models on short-video, counting, and captioning tasks, and remains competitive on long-video tasks.

Input Price: $0.2000 per 1M tokens
Output Price: $0.2000 per 1M tokens
Context: 36.9K tokens
Parameters: 8B (8,000M)

Features: Image Support
Model ID: allenai/molmo-2-8b
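
The listed prices and context size translate directly into a per-request cost estimate. The following is a minimal sketch, not an official calculator: the function names are hypothetical, 36,900 is only an approximation of the "36.9K" figure, and it assumes that input and output tokens share the same context window. How images or video frames are converted to tokens is not stated on this page, so token counts must come from the caller.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.20

# The page lists the context as "36.9K tokens"; the exact limit is not given,
# so this value is an approximation (assumption).
CONTEXT_TOKENS_APPROX = 36_900


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the approximate context size.

    Assumes input and output tokens count toward the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_TOKENS_APPROX


if __name__ == "__main__":
    # Example: a 30,000-token prompt (text plus image tokens) and a 2,000-token reply.
    print(f"fits: {fits_context(30_000, 2_000)}")
    print(f"cost: ${estimate_cost_usd(30_000, 2_000):.4f}")
```

Because input and output are priced identically here, the cost depends only on the total token count. The two rates are kept as separate constants so the sketch still works if the prices diverge.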