**MetalRT, MLX, and Ollama accelerate on-device AI on Apple Silicon**
Key Questions
How does MetalRT compare to MLX for on-device AI on Apple Silicon?
MetalRT outperforms MLX on M3, M4, and Max-class chips, delivering up to 658 tokens/second across LLM, speech-to-text (STT), and text-to-speech (TTS) workloads.
What performance boosts does Ollama 0.19 provide on Apple Silicon?
Ollama 0.19 adds acceleration across Apple Silicon from M1 through M5, delivering 7x faster decoding on the M1 Max (23 tokens/s) and, via NVFP4 quantization, support for running Qwen3.5 35B on 32GB of RAM.
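To put those figures in context, here is a rough back-of-the-envelope sketch. The ~4.5 effective bits per parameter (folding in NVFP4's block scale factors) and the derived pre-0.19 baseline are my assumptions, not numbers from the release notes:

```python
# Rough arithmetic behind the Ollama 0.19 claims above.
# Assumption: NVFP4 stores 4-bit weights plus per-block scale
# factors, so we budget ~4.5 bits per parameter.

def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

qwen_gb = weight_footprint_gb(35, 4.5)   # Qwen3.5 35B under NVFP4
print(f"Qwen3.5 35B @ ~4.5 bits/param: {qwen_gb:.1f} GB")  # ~19.7 GB, within 32 GB

# The 7x decode speedup at 23 tokens/s implies a pre-0.19 baseline:
baseline = 23 / 7
print(f"Implied M1 Max baseline: {baseline:.1f} tokens/s")  # ~3.3 tokens/s
```

Note this counts weights only; the KV cache and activations add further overhead on top.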
Is the M4 Mac Mini with 24GB RAM suitable for AI models?
It handles lighter models such as Gemma 26B effectively, but its 24GB of RAM limits it to smaller models than M5 desktops can run.
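A quick way to sanity-check the RAM claims: unified memory is shared with the OS and the GPU's working set, so only part of it is realistically available for model weights. The 75% usable fraction below is an illustrative assumption, not a documented Apple limit:

```python
# Hedged sketch: does a quantized model's weight footprint fit in
# unified memory? The 0.75 usable fraction is an assumption that
# leaves headroom for the OS, KV cache, and activations.
USABLE_FRACTION = 0.75

def fits_in_ram(params_billion: float, bits_per_param: float, ram_gb: float) -> bool:
    weights_gb = params_billion * bits_per_param / 8  # 1e9 params * bits / 8 / 1e9 bytes
    return weights_gb <= ram_gb * USABLE_FRACTION

print(fits_in_ram(26, 4.0, 24))  # Gemma 26B @ 4-bit ~= 13 GB -> True on a 24 GB Mini
print(fits_in_ram(35, 4.5, 24))  # Qwen3.5 35B @ ~4.5-bit ~= 19.7 GB -> False
print(fits_in_ram(35, 4.5, 32))  # ...but True with 32 GB, matching the Ollama claim
```

This matches the article's framing: the 24GB Mini is comfortable for Gemma 26B-class models, while larger models need the 32GB+ configurations.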
What is Google's Eloquent app?
Eloquent is an offline AI dictation app built on Gemma; it runs on iOS 16+ and M1+ hardware, extending on-device dictation to older devices.
What is 'apfel' for Mac users?
'Apfel' simplifies access to built-in Mac AI with no setup, downloads, or token fees required.
Summary
MetalRT outperforms MLX on M3/M4/Max chips (up to 658 tokens/s across LLM, STT, and TTS workloads). Ollama 0.19 accelerates M1 through M5, with 7x faster decoding on the M1 Max (23 tokens/s) and Qwen3.5 35B on 32GB RAM via NVFP4. The M4 Mac Mini with 24GB is viable for lighter models such as Gemma 26B but RAM-limited for larger models compared with M5 desktops. Google's Eloquent, an offline Gemma-based dictation app for iOS 16+ and M1+ hardware, extends on-device AI to older devices.