Large Model Frontier Dispatch

Compute Heavy Industry & Financing & Chip Competition & Edge AI

Key Questions

How does NVIDIA Blackwell perform on agentic AI workloads versus AMD?

The AA-AgentPerf benchmark shows NVIDIA Blackwell dominating AMD's MI355X on agentic AI tasks. Blackwell also set MLPerf training records at a scale of 8,192 GPUs, with GB300 NVL72 delivering a 1.6x gain over GB200.

What is the purpose of Anthropic's $35B financing round?

Anthropic secured $35B from Apollo and Blackstone to pay for Google chips and compute expansion. The round supports scaling as infrastructure demands for frontier models rise.

What efficiency does Google's DiffusionGemma achieve?

DiffusionGemma delivers over 1,000 tokens per second on an 18GB GPU and is released under the Apache 2.0 license. It provides 4x speed gains over prior models in inference scenarios.

How is China investing in AI data centers?

China announced a $280B five-year investment in AI data centers. Hong Kong is also planning its largest domestic AI computing center to bolster national infrastructure.

What funding is Baseten raising and why?

Baseten is reportedly raising $1.5B at an $11-13B valuation as an AI inference provider. This reflects booming demand for inference infrastructure and token optimization tools.

What new chip architecture is 算苗科技 developing?

算苗科技 taped out its first 3D TokenPU chip optimized for inference with stacked memory and logic wafers. It targets extreme performance in large-model cloud scenarios.

Why did NVIDIA issue its first bond in five years?

NVIDIA's $25B bond issuance was oversubscribed 3.4x, signaling that AI infrastructure capex is entering a balance-sheet era. The funds support continued scaling of Blackwell and related platforms.

What delays are affecting OpenAI's Stargate data centers?

OpenAI's Stargate project faces longer timelines and higher costs than competitors. This highlights broader challenges in financing and building frontier-scale AI infrastructure.

The AA-AgentPerf benchmark shows NVIDIA Blackwell dominating AMD's MI355X on agentic AI workloads. Anthropic secures $35B from Apollo and Blackstone for Google chips and compute expansion. Google releases DiffusionGemma, which runs at 1,000+ tok/s on an 18GB GPU under Apache 2.0. The DS-4 local inference engine enables a 100K-token context for DeepSeek-V4-Flash on a 96GB Mac. OpenAI's IPO plans signal a capital race. China announces a $280B five-year investment in AI data centers. 速石科技's FAAP platform adapts to Huawei Ascend. PixelRAG reduces agent token costs 10x. DeepSeek's $7.4B financing targets compute expansion; SiliconFlow's 20B RMB Series B targets inference infrastructure. Hong Kong Science Park and SenseTime plan a 40,000P+ (40,000+ petaflops) computing center by 2030.

Latest: Llama 4 Scout is available on Gemini Enterprise Agent Platform with a 10M context, expanding enterprise agent deployment options. NVIDIA issues its first bond in five years, oversubscribed 3.4x, signaling that AI infrastructure capital expenditure is entering a balance-sheet era.

New: NVIDIA Blackwell achieves record MLPerf training scale with 8,192 GPUs, a 1.6x gain for GB300 NVL72 over GB200, and NVFP4 precision. Amazon AI outlines its Nova2 and Trainium strategy, challenging NVIDIA's dominance.

Today: Hong Kong plans its largest domestic AI computing center. Beijing startup 算苗科技 tapes out a 3D TokenPU chip for inference. OpenAI's Stargate data centers face delays and cost overruns. AIEC 2026 emphasizes inference-phase demand and ecosystem competition.

New today: Baseten is reportedly raising $1.5B at an $11-13B valuation, signaling an inference infrastructure boom.

Sources (16)
Updated Jun 23, 2026