Open Source AI

SANA-WM 2.6B OSS world model for video

Key Questions

What is SANA-WM and its key capabilities?

SANA-WM is a 2.6B-parameter open-source world model from NVIDIA that generates up to one minute of 720p video from a single image and a camera path. It emphasizes controllability and the potential for local deployment.
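To make "a single image and a camera path" more concrete, here is a minimal sketch of a camera path as one pose per output frame. The pose format, the `orbit_camera_path` helper, and the frame rate are all assumptions chosen for illustration. They are not SANA-WM's actual input format, which is not described here.

```python
import math

# Assumed values for illustration only; SANA-WM's real frame rate and
# camera-path format are not stated in this text.
ASSUMED_FPS = 16
CLIP_SECONDS = 60  # "up to one minute" of video


def orbit_camera_path(num_frames, radius=2.0, height=0.5, sweep_deg=90.0):
    """Hypothetical helper: one camera pose per frame on a circular arc.

    The camera starts on the +z axis, sweeps `sweep_deg` degrees around
    the y axis, and always looks at the origin.
    """
    path = []
    for i in range(num_frames):
        t = i / max(num_frames - 1, 1)          # progress in [0, 1]
        angle = math.radians(sweep_deg) * t
        position = (radius * math.sin(angle), height, radius * math.cos(angle))
        path.append({"frame": i, "position": position, "look_at": (0.0, 0.0, 0.0)})
    return path


path = orbit_camera_path(ASSUMED_FPS * CLIP_SECONDS)
print(len(path), path[0]["position"])
```

The point of the sketch is only that a controllable world model needs a pose for every frame it generates, so a one-minute clip implies a path with hundreds of poses at any realistic frame rate.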

How does SANA-WM fit into the broader OSS multimodal trend?

It extends the current wave of open-source multimodal models to controllable video generation. It also lines up with tools such as Hermes and ComfyUI, which makes it a candidate for agentic workflows.

What are the hardware requirements for running SANA-WM?

At 2.6B parameters, the model is small enough that local inference on a consumer GPU is plausible. That makes video generation accessible without large cloud resources.
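A rough back-of-the-envelope estimate shows why the parameter count matters. The sketch below computes only the memory needed to hold 2.6B weights at common precisions. It does not account for activations, auxiliary components such as encoders or decoders, or any caching during generation, so actual requirements will be higher. The `weight_memory_gib` helper is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed just to store the weights, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


PARAMS = 2.6e9  # parameter count stated in the text

# Bytes per parameter for a few common numeric formats.
for name, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name:>9}: {weight_memory_gib(PARAMS, nbytes):.1f} GiB")
```

At 16-bit precision the weights alone come to a little under 5 GiB, which is within the memory of many consumer GPUs. Whether a full one-minute 720p generation fits on a given card depends on the overheads this estimate leaves out.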

Where can developers access SANA-WM code and models?

The model and code are available on GitHub at NVlabs/Sana under an open-source license. The repository includes support for controllable video synthesis tasks.

What real-world applications does SANA-WM enable?

It targets minute-long video generation for simulation, content creation, and agent planning. The open weights invite community extensions in multimodal AI pipelines.

In short: SANA-WM is a compact 2.6B open-source world model that generates up to one minute of 720p video, fits the open-source multimodal surge with potential for local deployment and agent workflows, and aligns with tools such as Hermes and ComfyUI.

Updated May 16, 2026