Qwen 3.6-27B Open-Source: Dense Model Matches 397B MoE on Single GPU

Key Questions

How does Qwen 3.6-27B compare to much larger models?

Qwen 3.6-27B is a dense model with 27.8B parameters that matches the performance of a prior 397B MoE model, scoring 77.2% on SWE-bench. It runs on a single RTX 3080, at a cited cost of roughly $7 per hour.

Is Qwen 3.6-27B suitable for local agentic workflows?

Yes. Its Apache-2.0 license, strong benchmark results, and ability to run on a single GPU make it practical for high-quality local agent deployments and cost-sensitive applications.

What quantized releases support local Qwen deployment?

Quantized Qwen3.5 checkpoints, co-designed with inference engines, have been released to enable efficient local inference on consumer hardware.

Alibaba released Qwen 3.6-27B, a dense 27.8B-parameter model under the Apache-2.0 license that matches the performance of a previous 397B MoE model while running on a single RTX 3080. It scores 77.2% on SWE-bench at a cost of roughly $7 per hour, which makes it a strong option for local agentic workflows and for anyone who needs an affordable, high-quality open model. Separately, quantized Qwen3.5 checkpoints, co-designed with inference engines, have been released to make local deployment easier. A related article frames these releases as part of a broader shift in focus from model strength to deployment cost, citing Alibaba's Tongyi Qianwen (Qwen) as a key example of open-source edge applications.
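To make the link between quantization and local deployment concrete, here is a minimal back-of-the-envelope sketch of how much memory the weights alone of a 27.8B-parameter model would occupy at different precisions. The function name and the list of precisions are illustrative assumptions, not details from the release. The estimate ignores the KV cache, activations, and runtime overhead, and real quantized formats store extra scaling metadata, so actual requirements will be somewhat higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes).

    Hypothetical helper for illustration: it counts only the raw weight
    storage and ignores KV cache, activations, and quantization metadata.
    """
    return n_params * bits_per_param / 8 / 1e9


# Parameter count as stated in the text above.
N_PARAMS = 27.8e9

# Assumed precisions for comparison; the actual released formats may differ.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>6}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB of weights")
```

Comparing these figures against the VRAM of a given card gives a first indication of which precision, if any, could fit on it before accounting for context length and runtime overhead.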

Sources (2)

Updated Jun 11, 2026

Open LLM Playbook

Running Claude Code Offline on an M3 Pro with Qwen3.6

Frontier of AI shifts from model strength to deployment, cost ...