Mamba-3 (Together.ai) — inference-first SSM release
Key Questions
What is Mamba-3 from Together.ai?
Mamba-3 is an inference-first state-space model (SSM) release that claims Transformer-level decode speeds at context lengths up to 16K. It reportedly hybridizes with RWKV v8 and connects to fused-architecture work.
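To make the "inference-first" framing concrete, the sketch below is an illustrative diagonal linear SSM decode loop, not Mamba-3's actual (unreleased) kernels; all names and sizes are hypothetical. It shows why SSM decoding costs O(1) per token: the model carries a fixed-size state vector, so per-token work does not grow with context length the way a Transformer's KV-cache attention does.

```python
import numpy as np

# Minimal sketch of SSM-style decoding (assumed, not Mamba-3's real code).
# State h has fixed size d_state, so each decode step costs the same
# regardless of how many tokens have been processed.

d_state = 16  # hypothetical state size
rng = np.random.default_rng(0)
A = rng.uniform(0.9, 0.99, d_state)  # diagonal state transition (decay)
B = rng.standard_normal(d_state)     # input projection
C = rng.standard_normal(d_state)     # output projection

def decode_step(h, x):
    """One decode step: fold scalar input x into fixed-size state h."""
    h = A * h + B * x   # recurrence: h_t = A h_{t-1} + B x_t
    y = float(C @ h)    # readout:    y_t = C h_t
    return h, y

h = np.zeros(d_state)
for x in [0.5, -1.0, 2.0]:
    h, y = decode_step(h, x)  # constant cost per token, at any position
```

By contrast, a Transformer decode step must attend over all cached keys and values, so its per-token cost grows linearly with context length; this is the gap the 16K decode-speed claim targets.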
How does Mamba-3 compare to Transformers?
Mamba-3 aims to match or exceed Transformers, Gemma4, and GLM at LLM scale, with a focus on optimized kernels. A full paper and benchmarks are still pending.
What are Neural Computers in this context?
Meta AI's Neural Computers fold computation, memory, and I/O into learned models. They relate to Mamba-3's fused-architecture approach to efficient inference.
Summary: claims Transformer-level decode at 16K context; RWKV v8 hybrid; ties to Neural Computers fused architectures. No full paper or benchmarks yet. Watch kernel work and LLM-scale results versus Transformers, Gemma4, GLM, and HISA.
Updated Apr 13, 2026