Research and startups in embodied robotics, 3D perception, and sim‑to‑real world models
Embodied Robotics & World Modeling
Advancements in embodied robotics and physical AI are accelerating rapidly, driven by innovations in perception, world modeling, and hardware acceleration. These developments are enabling robots to operate reliably in unstructured, complex environments, transforming industries such as construction, logistics, healthcare, and disaster response.
Robotics systems, vision models, and reinforcement learning frameworks are at the forefront of this shift. Object-centric world models like Causal-JEPA use self-supervised learning and multi-view consistency to build detailed scene representations from visual data alone. This high-level understanding lets robots perform latent interventions (counterfactual edits applied directly to the learned scene representation), manipulate objects, and reason about their surroundings without retraining. Models such as PhyCritic complement this by predicting scene geometry and safety parameters from multi-view analysis, which is crucial for navigation and manipulation tasks. SARAH, a spatially aware real-time agent, combines flow matching with transformer-based autoencoders to support conversational motion planning, improving human-robot collaboration in manufacturing and disaster zones.
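The latent-prediction idea behind JEPA-style world models can be sketched in a few lines: two views of the same scene are encoded with a shared encoder, and a predictor is trained to match one view's latent from the other's, with the loss computed entirely in latent space rather than in pixels. The toy linear encoder, dimensions, and noise levels below are illustrative assumptions, not details of Causal-JEPA:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: two camera views of one scene, flattened to vectors.
DIM_OBS, DIM_LATENT = 32, 8
W_enc = rng.normal(scale=0.1, size=(DIM_LATENT, DIM_OBS))      # shared encoder
W_pred = rng.normal(scale=0.1, size=(DIM_LATENT, DIM_LATENT))  # latent predictor

def encode(x):
    return W_enc @ x

def jepa_loss(view_a, view_b):
    """Predict the latent of view_b from the latent of view_a.

    The loss lives entirely in latent space -- no pixel reconstruction --
    which is the defining trait of JEPA-style objectives."""
    z_a = encode(view_a)
    z_b = encode(view_b)   # target latent (treated as fixed during training)
    z_pred = W_pred @ z_a  # predicted latent
    return float(np.mean((z_pred - z_b) ** 2))

scene = rng.normal(size=DIM_OBS)
view_a = scene + 0.01 * rng.normal(size=DIM_OBS)  # two noisy views of one scene
view_b = scene + 0.01 * rng.normal(size=DIM_OBS)
loss = jepa_loss(view_a, view_b)
```

Because supervision comes only from multi-view consistency between latents, no labels or pixel-level reconstruction targets are needed, which is what makes the approach practical on raw visual data.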
Zero-shot and sim-to-real transfer frameworks like SimToolReal have become foundational in reducing deployment costs and accelerating development cycles. Policies trained entirely in simulation can now be transferred directly to physical robots, enabling dexterous manipulation of tools and objects in real-world environments without extensive retraining. Startups such as EgoPush exemplify this progress, pioneering perception-driven policies for multi-object rearrangement in cluttered settings; such capabilities are vital for logistics, healthcare, and autonomous construction.
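Sim-to-real pipelines of this kind typically rely on domain randomization: physics and sensing parameters are resampled every simulated episode, so the trained policy sees a wide band of dynamics and the unknown real-world parameters are likely covered at deployment. A minimal sketch, with entirely illustrative parameter names and ranges (nothing here comes from SimToolReal itself):

```python
import random

def randomized_sim_params(rng):
    """Sample one simulated episode's physics/sensing parameters.

    The ranges are illustrative assumptions, not values from any
    particular simulator; real pipelines tune them per task."""
    return {
        "friction":    rng.uniform(0.4, 1.2),
        "object_mass": rng.uniform(0.05, 0.50),  # kg
        "motor_gain":  rng.uniform(0.8, 1.2),    # actuator strength multiplier
        "cam_latency": rng.uniform(0.00, 0.05),  # seconds of observation delay
    }

rng = random.Random(42)
# One parameter set per training episode; the policy never sees the same
# dynamics twice, which discourages overfitting to any single simulator setup.
episodes = [randomized_sim_params(rng) for _ in range(1000)]
frictions = [e["friction"] for e in episodes]
```

In practice the randomization ranges are the main tuning knob: too narrow and the policy overfits the simulator, too wide and training becomes unnecessarily hard.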
Hardware advancements are equally critical. FLEXOO, for example, has raised €11 million to expand sensor platforms capable of high-fidelity perception in challenging environments, while startups such as Mirai and KiloClaw are developing specialized low-power edge chips optimized for on-device inference. These chips enable local decision-making with minimal latency and energy consumption, making autonomous operation scalable even where connectivity is limited. Accelerators such as NVIDIA's Vera Rubin GPUs, which promise tenfold throughput improvements, further support real-time perception and planning.
The European robotics ecosystem continues to grow robustly, with investments in physical AI doubling to €1.45 billion in 2025. This surge reflects a strategic push towards integrating large language models (LLMs) with robotics, exemplified by Anthropic's acquisition of Vercept, which aims to enhance autonomous decision-making and multi-modal reasoning in embodied systems.
On the deployment front, Scopey Onsite is seeking €5 million to develop edge-computing solutions for robots operating in unstructured environments. These systems enable autonomous site inspections, material handling, and real-time planning, demonstrating readiness for industry-scale applications. Datasets such as DeepVision-103K and RoboCurate underpin efforts to build trustworthy perception models for safety-critical sectors like healthcare and construction by enabling formal safety verification and robust transfer learning.
Finally, safety and trustworthiness are central to the widespread adoption of embodied AI. Initiatives like Epismo Skills provide community-curated libraries of proven, reliable behaviors, facilitating standardization and safety assurance across diverse scenarios. Vision-language models such as MedCLIPSeg demonstrate how probabilistic perception can support medical image segmentation, underscoring the importance of robust, interpretable AI in sensitive applications. Evoke Security has secured funding to develop fleet-security and cybersecurity solutions addressing vulnerabilities such as prompt injection, while open-source efforts like IronClaw aim to establish industry standards for trustworthy autonomous operation.
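Probabilistic perception of the kind attributed to MedCLIPSeg is often approximated with Monte Carlo sampling: keep the network stochastic at inference time, average several forward passes into a per-pixel probability map, and treat the per-pixel variance as an uncertainty estimate. The sketch below uses a hypothetical noise-injected "network" as a stand-in for a real segmentation model; only the averaging pattern is the point:

```python
import numpy as np

rng = np.random.default_rng(7)

def stochastic_seg_logits(image, rng):
    """Stand-in for a segmentation network kept stochastic at inference
    (e.g. Monte Carlo dropout). The 'network' here is a hypothetical
    placeholder: logits = image + noise."""
    return image + rng.normal(scale=0.5, size=image.shape)

def mc_segmentation(image, rng, n_samples=20):
    """Average several stochastic forward passes into a probability map,
    and use the per-pixel variance as an uncertainty estimate."""
    probs = np.stack([
        1.0 / (1.0 + np.exp(-stochastic_seg_logits(image, rng)))  # sigmoid
        for _ in range(n_samples)
    ])
    return probs.mean(axis=0), probs.var(axis=0)

image = rng.normal(size=(16, 16))  # toy single-channel input
prob_map, uncertainty = mc_segmentation(image, rng)
# High-uncertainty pixels can be flagged for human review, which is the
# behavior safety-critical settings such as medical imaging rely on.
```

The value of the variance map is operational: rather than forcing a hard yes/no segmentation, the system can escalate ambiguous regions to a clinician or a slower verification pipeline.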
In summary, the convergence of advanced perception models, hardware acceleration, and strategic investments is transforming embodied robotics into reliable, scalable systems capable of operating safely in complex environments. These innovations are not only accelerating research but also paving the way for widespread industrial deployment—making trustworthy, scalable physical AI an integral part of society’s future.