Generative AI Pulse

**Agent velocity: Cursor 3, GLM-5V-Turbo, Claude MS365, HF agent traces, Karpathy LLM Wiki, GEN-1/Poke robotics/agents, Qwen3.6 [developing]**

Key Questions

What performance did Cursor 3 achieve on Terminal2?

Cursor 3 scored 61.7% on the Terminal2 benchmark, highlighting its capabilities on agentic terminal tasks as part of the ongoing acceleration in agent development.

How does GLM-5V perform on Design2Code?

GLM-5V achieved 94.8% on Design2Code. The related GLM-5.1 is designed for long-horizon tasks and can work continuously and autonomously, positioning the model family strongly on coding and design benchmarks.

What integrations are available with Claude v2.1.88?

Claude v2.1.88 integrates with MS365 and Skills for enhanced agentic capabilities. Atlassian has also launched visual AI tools and third-party agents in Confluence, signaling broader ecosystem support for real-world skill usage.

What is the HF crowdsourced agent traces dataset?

Hugging Face released a crowdsourced dataset of agent traces to support open-source frontier agents. As Clement Delangue noted, the goal is to build the high-quality training data that advanced agent development requires.

What is Karpathy's LLM Wiki and how does it relate to RAG?

Andrej Karpathy's LLM Wiki is positioned as an alternative to RAG workflows, highlighted as potentially replacing many RAG setups by focusing on efficient knowledge retrieval for LLMs.
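For context on what such a tool would replace: a typical RAG setup scores documents against a query and stuffs the best matches into the prompt. A generic, dependency-free sketch of that retrieval step (not Karpathy's implementation; document contents and keys here are invented for illustration):

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus standing in for a real embedded document store.
docs = {
    "wiki/agents": "agents plan and call tools in a loop",
    "wiki/rag": "rag retrieves chunks and stuffs them into the prompt",
}
query = Counter("how does rag retrieve chunks".split())
best = max(docs, key=lambda k: cosine(query, Counter(docs[k].split())))
print(best)  # wiki/rag
```

A wiki-style alternative would instead organize knowledge as curated pages the model navigates directly, rather than retrieving embedded chunks per query.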

What are the key features of Generalist's GEN-1 robotics model?

Generalist's GEN-1 is a robotics foundation model reported to achieve 99% performance and 3x faster improvisation, targeting embodied robotic intelligence and marking progress in agentic robotics.

What is Poke and its connection to OpenClaw?

Poke is described as "OpenClaw for normies," scaling via iMessage and raising a $10M round. OpenClaw itself faced a paywall from Anthropic, impacting AI model evaluation; Poke represents a more accessible path to agent scaling.

What is the training scale for Qwen3.6?

Qwen3.6 processes 1T tokens per day and is under development alongside benchmarks such as Agentic-MME and Xpertbench, supporting its competitiveness in agentic tasks.
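To put the 1T-tokens-per-day figure in perspective, a quick back-of-envelope conversion to per-second throughput:

```python
# Convert the reported 1T tokens/day into a per-second rate.
TOKENS_PER_DAY = 1_000_000_000_000   # 1 trillion
SECONDS_PER_DAY = 24 * 60 * 60       # 86,400

tokens_per_second = TOKENS_PER_DAY / SECONDS_PER_DAY
print(f"{tokens_per_second:,.0f} tokens/s")  # 11,574,074 tokens/s
```

That is roughly 11.6 million tokens every second, sustained around the clock.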

Cursor 3 (61.7% Terminal2); GLM-5V 94.8% Design2Code; Claude v2.1.88 + MS365/Skills; HF crowdsourced agent traces dataset; LLM Wiki RAG alt; Copilot DRACO/SLMs; GEN-1 robotics 99%/3x faster improv; Poke OpenClaw iMessage scaling ($10M round); Qwen3.6 1T tokens/day; Unsloth MLX/Self-Exec; Agentic-MME/Xpertbench.

Sources (34)
Updated Apr 8, 2026