Large Model Frontier Dispatch (大模型前沿速递)

Compute Heavy Industry & Financing & Chip Competition & Edge AI

Key Questions

How does NVIDIA Blackwell perform on agentic AI workloads?

The AA-AgentPerf benchmark shows NVIDIA Blackwell dominating AMD MI355X on agentic AI workloads. Blackwell also set an MLPerf training scale record with 8,192 GPUs, and GB300 NVL72 systems delivered a 1.6x gain over GB200.

What major compute financing deals were announced?

Anthropic secured $35B from Apollo and Blackstone for Google chips and expansion. DeepSeek raised $7.4B and SiliconFlow completed a 20B RMB Series B for inference infrastructure.

Which new chips or architectures target AI inference?

算苗科技 taped out its 3D TokenPU chip for inference, DS-4 engine supports 100K context on Macs, and Google released DiffusionGemma achieving over 1000 tok/s on 18GB GPUs.

What signals the shift to inference-focused AI infrastructure?

AIEC 2026 emphasized inference-phase ecosystem competition, Baseten is reportedly raising $1.5B, and NVIDIA issued its first bond in five years to fund AI infrastructure.

How are data center investments scaling globally?

China announced a $280B five-year AI data center plan, Hong Kong is building its largest domestic AI computing center, and OpenAI's Stargate project faces delays and cost overruns.

What open or efficient inference models were released?

Google open-sourced DiffusionGemma under Apache 2.0, reaching over 1000 tok/s on an 18GB GPU, and Llama 4 Scout became available on the Gemini Enterprise Agent Platform with 10M context support.

Which companies are adapting platforms for domestic Chinese chips?

速石科技's FAAP platform now supports Huawei Ascend, and Chinese firms are advancing Ascend NPU-based recommendation systems amid US-China chip competition.

What does NVIDIA's bond issuance indicate about AI capex?

The bond was 3.4x oversubscribed, with $85B in demand for the $25B issuance, signaling that AI infrastructure capital expenditure is entering a balance-sheet era.

The AA-AgentPerf benchmark shows NVIDIA Blackwell dominating AMD MI355X on agentic AI workloads. Anthropic secured $35B from Apollo and Blackstone for Google chips and compute expansion. Google released DiffusionGemma under Apache 2.0, reaching 1000+ tok/s on an 18GB GPU. The DS-4 local inference engine enables a 100K-token context on a 96GB Mac for DeepSeek-V4-Flash. OpenAI's IPO plans signal a capital race. China announced a $280B five-year AI data center investment. 速石科技's FAAP platform now supports Huawei Ascend. PixelRAG reduces agent token costs by 10x. DeepSeek raised $7.4B for compute expansion, and SiliconFlow completed a 20B RMB Series B for inference infrastructure. Hong Kong Science Park and SenseTime plan a computing center of 40,000+ PFLOPS (4万P+) by 2030.

Latest: Llama 4 Scout is available on the Gemini Enterprise Agent Platform with 10M context, expanding enterprise agent deployment options. NVIDIA issued its first bond in five years, which was 3.4x oversubscribed, signaling that AI infrastructure capital expenditure is entering a balance-sheet era.

New: NVIDIA Blackwell achieved a record MLPerf training scale with 8,192 GPUs, with GB300 NVL72 delivering a 1.6x gain over GB200 using NVFP4 precision. Amazon AI outlined its Nova2 and Trainium strategy, challenging NVIDIA's dominance.

Today: Hong Kong plans its largest domestic AI computing center. Beijing startup 算苗科技 taped out its 3D TokenPU chip for inference. OpenAI's Stargate data centers are facing delays and cost overruns. AIEC 2026 emphasized inference-phase demand and ecosystem competition.

New today: Baseten is reportedly raising $1.5B at an $11-13B valuation, signaling an inference infrastructure boom.

Sources (17)
Updated Jun 23, 2026