DeepSeek V4 Pro/Flash (1.6T/862B parameters, MIT-licensed MoE, 1M-token context, SOTA on coding) now carries a permanent 75% price cut ($0.435/M input, $0.0036/M on cache hits). Reasonix achieves 94-99.8% cache hit rates. The MiMo V2.5 API also got a permanent price cut of up to 99% ($0.0028/M on cache hits), and usage is up 5-8x. New: MiMo-V2.5-Pro-UltraSpeed hits 1000+ tps on a 1T model using 8 standard GPUs, but premium pricing (3x the cost for 10x the speed) and application-based access limit adoption by indie wrappers. MiMo Code is now open-source – a code-specific variant for self-hosted coding agents, further enabling low-cost indie wrappers.

New coding agent pricing: Cursor Composer 2.5 at $0.50/M input, Qwen 3.7 Max at $2.50/M (90% cache discount), plus Anthropic's self-hosted sandboxes. Alibaba is shifting toward closed weights. The MiniMax M3 open-weight model is now released on Hugging Face, enabling self-hosting or easy access via Spaces, and has been validated by Vercel's CEO: it leads Next.js agent evals and is 10x cheaper than Opus/GPT-5, with a 20x discount on AI Gateway; the launch discount is 50% ($0.60/M input). Qwen3.7-Plus multimodal API pricing is disclosed at $0.40/$1.60 per 1M tokens; the model is proprietary and strong on vision, but its coding and reasoning lag. MAI-Code-1-Flash (137B, 51% SWE-bench Pro) from Microsoft has no published pricing yet.

New cost optimization pattern: use Opus/GPT for planning and DeepSeek Flash/Gemma for execution – a 10x cost reduction in multi-agent loops.

Seedance 2.0 video generation pricing: FAL.AI $0.04/s, Kie $0.155/s – useful for indie video wrapper builders. New: Dreamina Seedance 2.0 Mini – 30% cheaper and 2x faster than the standard Fast tier, with comparable quality. "Building AI video apps with coding agents" is a practical guide for turning APIs into sustainable SaaS.

The intensifying price war enables extremely low-cost indie wrappers for coding/agent SaaS. New: Claude Fable 5 at half the price of Mythos Preview ($10/$50 per M tokens), demonstrated on a 50M-line Stripe migration completed in a day and on vision-based app reconstruction – premium, but it enables high-value wrappers. 
Free window until June 22. Early tests show only marginal real-world improvement over Opus 4.8, with aggressive fallback. Fable 5 is safety-locked. Practical guides are now available: accessing Fable 5 via a single API provider, and a Python tutorial for building a developer assistant.

AtlasCloud now hosts GLM-5.1 (#1 on SWE-Bench Pro, 8-hour autonomous coding agent) – another API aggregator option for coding agent SaaS. New: GLM-5.2 is open-weight on Hugging Face with 1M context and an MIT license; it tops Terminal-Bench at 80%+ (a first for an open-weight model) and ranks second globally on Code Arena. No pricing yet, but early analysis suggests it may be expensive compared to other open-weight models – a caution for indie wrappers.

New: Grok Imagine 1.5 now has pricing: $4.20/min for 720p, 86% cheaper than Sora 2 Pro, and it tops the leaderboard.

Aggregators: a new API aggregator offers $100/month in free credits to call 100+ models. Blackmagic AI launched as a cheaper OpenRouter alternative ($10 prepaid, no subscription, 13 providers). The Rewind AI API aggregator offers 400+ tools with one key, free self-hosted models, and token billing.

Replicate's CTO revealed that they automated their platform using cloud agents, reducing the team from 30 to 3 – practical patterns for scaling AI model hosting on Cloudflare Workers. ZeroGPU (separate from HF ZeroGPU) is a cost-efficient API for routing routine inference off frontier models; it claims to be 10x faster and 50% cheaper, with an OpenAI-compatible API.

Other models: Step 3.7 Flash (Apache 2.0, 11B active, 400 TPS, agent model) is claimed to beat GPT-4 on all benchmarks. Grok Imagine 1.5 Preview image-to-video API (listed without pricing at preview; see the pricing above). Gemma 4 12B open-weight multimodal (Apache 2.0, runs on 16GB RAM, no API pricing yet).

Platforms: HF Spaces/Endpoints, Replicate, DeepInfra, OpenRouter, fal.ai, Groq, Modal. Infrastructure tip: HF Spaces + a Cloudflare Worker can cost as little as $0.83/month. DeepSeek is moving from model provider to coding platform. Xiaomi's MiMo price cut drove a 111% usage surge on OpenRouter. 
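
As a rough worked example of the cache pricing quoted earlier, the blended input price is a weighted average of the cache-hit and cache-miss prices. The sketch below plugs in the DeepSeek V4 figures from these notes ($0.435/M on a miss, $0.0036/M on a hit) and assumes that the 94-99.8% hit rates reported for Reasonix apply to that pricing; the function name is illustrative, and output tokens are ignored.

```python
def blended_input_price(miss_price: float, hit_price: float, hit_rate: float) -> float:
    """Return the average USD price per 1M input tokens at a given cache hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


# Prices taken from the notes above (USD per 1M input tokens).
MISS_PRICE = 0.435
HIT_PRICE = 0.0036

for rate in (0.0, 0.94, 0.998):
    price = blended_input_price(MISS_PRICE, HIT_PRICE, rate)
    print(f"hit rate {rate:.1%}: ${price:.4f} per 1M input tokens")
```

Under these assumptions, a 94% hit rate brings the blended input price to roughly $0.029/M and a 99.8% hit rate to roughly $0.0045/M, so the cache hit rate matters more than the headline input price.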
Cautionary note on API key security: an old key that had been scraped led to a $500 charge, so indie wrappers need key rotation and spending limits. Rook AI offers a web search API at a flat £9.99/month for indie hackers – a practical utility for search/RAG pipelines.

New: the Gemini Omni Flash video API is coming soon. It tops Video Arena with a +158 pt improvement over Veo 3.1 and is strong at image-to-video, text-to-video, and video editing – a potential low-cost building block for video SaaS. No pricing yet. Also, a reminder to put a gateway between your code and the model providers for cost control, key rotation, and fallback routing – essential operational hygiene for indie wrappers.

New: Kimi K2.7-Code, an open-source coding model with better token efficiency, is now available on Puter.js, with open weights on Hugging Face under a Modified MIT license. No API pricing yet, but it is worth monitoring for low-cost coding wrappers. Recent tweets claim Kimi K2.7 is 100x cheaper than Claude Fable 5 and solves 70% of its tasks, reinforcing its potential for indie wrappers. UnslothAI achieved a 48% size reduction via dynamic 2-bit quantization, enabling >40 tok/s on high-end setups for local self-hosting.

Zonos 2 (an update to Zyphra's TTS) is now on HF Spaces with voice cloning – it strengthens the voice surge options.

New: the OpenRouter Fusion API launched – multi-model blending for frontier-level performance at half the cost, scoring 69% on DRACO. This directly enables indie wrappers to build cheaper, higher-quality SaaS. OpenRouter hit unicorn status ($1.3B valuation, Alphabet investment, 25T tokens/week), which validates the API aggregation model.

MiniMax Speech Turbo 2.6 TTS pricing and specs are now available – a speed-optimized TTS, relevant for voice surge wrappers. Still to do: compare its latency and cost against ElevenLabs, Zonos, etc.

New: Edgee Turbo Models – fallback models for coding agent SaaS that address the Anthropic credit cap with chainable fallback and token compression, enabling indie wrappers to keep services running on alternative models like Kimi K2.6, GLM, Qwen. 
Zero-code integration.

New: the Luma Ray 3.2 API is documented with text-to-video, image-to-video, and keyframes – no pricing yet, but it adds to the video API options. New: "DeepSeek Introduces Vision" – potentially a new multimodal API for indie wrappers; no details yet.
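
The gateway advice in these notes (cost control and fallback routing between your code and the model providers) can be sketched as below. This is a minimal illustration under stated assumptions, not any vendor's API: `Provider`, `Gateway`, the stub functions, and the prices are all hypothetical, providers are plain callables so the example runs without network access, and key rotation is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]   # hypothetical client: prompt -> completion
    price_per_m_tokens: float    # placeholder price, USD per 1M tokens


@dataclass
class Gateway:
    providers: List[Provider]    # tried in order: primary first, then fallbacks
    monthly_budget_usd: float
    spent_usd: float = 0.0

    def complete(self, prompt: str, est_tokens: int) -> str:
        errors = []
        for provider in self.providers:
            cost = est_tokens / 1_000_000 * provider.price_per_m_tokens
            # Spending limit: skip any provider that would exceed the budget.
            if self.spent_usd + cost > self.monthly_budget_usd:
                errors.append(f"{provider.name}: would exceed budget")
                continue
            try:
                result = provider.call(prompt)
            except Exception as exc:  # fall through to the next provider
                errors.append(f"{provider.name}: {exc}")
                continue
            self.spent_usd += cost    # charge only for successful calls
            return result
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real API clients.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary unavailable")


def cheap_backup(prompt: str) -> str:
    return f"echo: {prompt}"


gateway = Gateway(
    providers=[
        Provider("primary", flaky_primary, price_per_m_tokens=10.0),
        Provider("backup", cheap_backup, price_per_m_tokens=0.5),
    ],
    monthly_budget_usd=1.0,
)
print(gateway.complete("hello", est_tokens=1000))
```

In this sketch the primary fails, so the request falls back to the cheaper provider and only the successful call counts against the budget. A real gateway would also rotate keys and count actual tokens from the provider's response rather than relying on an estimate.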