Open LLM Deploy

SubQ: the first fully subquadratic LLM, with a 12M-token context

Key Questions

What is SubQ and its key achievement?

SubQ is the first fully subquadratic LLM, released as a preview and seed-funded with $29M. Its sparse attention enables linear scaling to a 12M-token context length, and it supports open-source local deployment on 32-64 GB of VRAM, surpassing the roughly 1M-token quadratic limits of DeepSeek, MiMo, and Qwen.

How does SubQ improve efficiency over existing models?

SubQ uses sparse attention to scale linearly with context length, allowing far longer contexts than models with quadratic attention. The release has drawn Hacker News buzz as a major efficiency leap, though GGUF quantizations and benchmarks against Mistral and Qwen are still pending amid a surge in agentic evaluations.
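To make the quadratic-vs-linear distinction concrete, the sketch below compares the attention-score cost of full attention with a sliding-window sparse pattern. This is purely illustrative: SubQ's actual sparsity scheme is not documented here, and the head dimension and window size are hypothetical values chosen for the example.

```python
# Hedged sketch: FLOP count for attention score computation under
# full (quadratic) attention vs. a sliding-window sparse pattern,
# one common subquadratic scheme. SubQ's real mechanism may differ.

def full_attention_cost(n: int, d: int) -> int:
    """Full attention: every query attends to every key,
    so n * n dot products of length d."""
    return n * n * d

def sliding_window_cost(n: int, d: int, window: int) -> int:
    """Sliding-window attention: each query attends to at most
    `window` keys, so cost grows linearly in n."""
    return n * min(window, n) * d

if __name__ == "__main__":
    d = 128        # head dimension (illustrative)
    window = 4096  # local attention window (illustrative)
    for n in (1_000_000, 12_000_000):  # 1M vs. 12M tokens
        full = full_attention_cost(n, d)
        sparse = sliding_window_cost(n, d, window)
        print(f"n={n:>10,}: full={full:.2e}, sparse={sparse:.2e}, "
              f"ratio={full / sparse:,.0f}x")
```

At 12M tokens the quadratic term dominates completely, which is why full attention becomes impractical long before such context lengths; a linear pattern's cost instead grows in direct proportion to n.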

What concerns surround SubQ's hype?

While the funding and buzz highlight its potential, some critics liken the hype to Theranos, citing unproven claims. Detailed benchmarks and real-world performance figures remain pending.

The $29M seed-funded SubQ preview enables linear scaling to a 12M-token context via sparse attention, supporting open-source local deployment on 32-64 GB of VRAM and surpassing the 1M-token quadratic limits of DeepSeek, MiMo, and Qwen. Hacker News buzz and the funding signal a major efficiency leap, but critics flag the hype as Theranos-like; GGUF quantizations and benchmarks against Mistral and Qwen are still pending amid a surge in agentic evaluations.

Sources (2)
Updated May 7, 2026