Mamba-3 (Together.ai) — inference-first SSM release
Key Questions
What is Mamba-3 from Together.ai?
Mamba-3 is an inference-first state-space model (SSM) release that claims Transformer-level decode speeds at context lengths up to 16K. It reportedly hybridizes with RWKV v8 and connects to fused-architecture work.
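To make the "inference-first" framing concrete, the sketch below is an illustrative diagonal linear SSM decode loop, not Mamba-3's actual (unreleased) kernels; all names and sizes are hypothetical. It shows why SSM decoding costs O(1) per token: the model carries a fixed-size state vector, so per-token work does not grow with context length the way a Transformer's KV-cache attention does.

```python
import numpy as np

# Minimal sketch of SSM-style decoding (assumed, not Mamba-3's real code).
# State h has fixed size d_state, so each decode step costs the same
# regardless of how many tokens have been processed.

d_state = 16  # hypothetical state size
rng = np.random.default_rng(0)
A = rng.uniform(0.9, 0.99, d_state)  # diagonal state transition (decay)
B = rng.standard_normal(d_state)     # input projection
C = rng.standard_normal(d_state)     # output projection

def decode_step(h, x):
    """One decode step: fold scalar input x into fixed-size state h."""
    h = A * h + B * x   # recurrence: h_t = A h_{t-1} + B x_t
    y = float(C @ h)    # readout:    y_t = C h_t
    return h, y

h = np.zeros(d_state)
for x in [0.5, -1.0, 2.0]:
    h, y = decode_step(h, x)  # constant cost per token, at any position
```

By contrast, a Transformer decode step must attend over all cached keys and values, so its per-token cost grows linearly with context length; this is the gap the 16K decode-speed claim targets.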
How does Mamba-3 compare to Transformers?
Mamba-3 aims to match or exceed Transformers, Gemma4, and GLM at LLM scale, with a focus on optimized kernels. A full paper and benchmarks are still pending.
What are Neural Computers in this context?
Meta AI's Neural Computers fold computation, memory, and I/O into learned models. They relate to Mamba-3's fused-architecture approach to efficient inference.
Summary: claims Transformer-level decode at 16K context; RWKV v8 hybrid; ties to Neural Computers fused architectures. No full paper or benchmarks yet. Watch kernel work and LLM-scale results versus Transformers, Gemma4, GLM, and HISA.
Updated Apr 13, 2026