Nemotron-Cascade 2 / Mamba-3 SSMs + GLM-5.1 + ModelScope + new architectures (Neural Computers/Flow map LMs/IHA)
Key Questions
What are Neural Computers?
Neural Computers, from Meta and KAUST, treat a learned model as the runtime itself, fusing compute, memory, and I/O in a single model. The Wan2.1 prototypes NCCLIGen and NCGUIWorld report high PSNR, SSIM, and OCR scores but low arithmetic scores, and their generation can be steered by prompts.
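For readers unfamiliar with PSNR, one of the image-fidelity metrics cited for the prototypes above, here is a minimal sketch of the standard definition (10 * log10(MAX^2 / MSE)). The function name psnr and the flat-list pixel representation are illustrative assumptions; this is not the evaluation code used for the prototypes.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration: pixels are given as flat sequences
    of numbers in the range [0, max_value].
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference frame and the generated frame.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the generated frame is closer to the reference; identical frames give infinity under this definition.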
What is GLM-5.1's performance?
GLM-5.1 reaches state-of-the-art results on the agentic benchmarks SWE-Bench Pro and TerminalBench. It is released under the MIT license with Hugging Face and vLLM integration.
What advancements does Nemotron-Cascade 2 offer?
Nemotron-Cascade 2 is a 30B mixture-of-experts (MoE) model that ranks first on math and agent benchmarks. It appears alongside Nemotron-OCR.
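Architectures: Neural Computers (Meta/KAUST) treat models as learned runtimes that fuse compute, memory, and I/O; the Wan2.1 prototypes NCCLIGen and NCGUIWorld show high PSNR/SSIM/OCR scores, low arithmetic scores, and prompt-steerable generation. Flow map LMs bring an update on non-autoregressive generation with continuous flows. IHA ships CUDA kernels for Hopper that speed up training. An Attention Sink survey covers Transformer pathologies.
Models: Nemotron-3 Super is an open MoE hybrid Mamba-Transformer model for agentic reasoning. GLM-5.1 is SOTA on the agentic benchmarks SWE-Bench Pro and TerminalBench, MIT-licensed, with Hugging Face and vLLM support. Nemotron-Cascade 2 is a 30B MoE ranked #1 on math and agent benchmarks. Also listed: Nemotron-OCR.
Tooling: ModelScope is Alibaba's open-source platform for model search, fine-tuning, and evaluation, positioned as a Hugging Face rival. Meta describes a replay buffer for RL efficiency. Also listed: Lightning OPD.
Benches: TCO vs Gemma4/Llama3.5/Muse Spark+vLLM; SWE-CI; Lightning OPD.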