MIND: Multi-Scale Intent Diffusion for Text-Driven Physics-Based Humanoid Control

Enabling physics-based humanoids to execute diverse behaviors from high-level textual commands remains a significant challenge. Existing methods typically follow either a two-stage paradigm that combines kinematic motion generation with physics-based tracking, or an end-to-end imitation-learning paradigm that directly generates actions from text. However, the former suffers from the inherent domain shift between kinematic generation and physics-based tracking, while the latter struggles with the

Paper · arXiv

cs.CV

Authors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang + 2 more
Published: 2026-05-25
Categories: cs.CV, cs.GR, cs.RO

via arXiv · 2605.26006