“MaskControl: Spatio-Temporal Control for Masked Motion Synthesis” has been accepted as an oral at ICCV 2025

🚨 Our paper “MaskControl: Spatio-Temporal Control for Masked Motion Synthesis” has been accepted as an oral at ICCV 2025, with a 🌟 perfect review score! 🎉
__________________________________________________________________
While Masked Motion Models have recently outperformed Motion Diffusion Models in both quality and speed, most state-of-the-art controllable motion generation methods still rely on diffusion-based approaches in motion space.
MaskControl is the first to bring controllability to Masked Motion Models via two novel components:
🧠 Logits Regularizer – a ControlNet-like component for Masked Models
🎯 Logits Optimization – Inference-time guidance tailored for Masked Models
To address the non-differentiable nature of quantized models, we propose Differentiable Expectation Sampling, which relaxes quantization and enables conversion across multiple representation spaces:
→ 🧱 VQ-VAE codebook
→ 🔳 Masked Transformer tokens
→ 🌀 Residual Transformer tokens
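
For readers curious how a quantized model can be made differentiable, here is a minimal sketch of one common relaxation: replace the hard codebook lookup with the probability-weighted average of all codebook entries. This illustrates the general idea only and is not taken from the MaskControl code; the function names, shapes, and temperature parameter are all assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over the last axis.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def expected_embedding(logits, codebook, temperature=1.0):
    # Hypothetical helper: instead of picking one code with argmax
    # (non-differentiable), take the expectation of the codebook
    # entries under the token distribution. The result is a smooth
    # function of the logits, so gradients can flow back to them.
    probs = softmax(logits, temperature)
    return probs @ codebook

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))      # 8 illustrative codes, 4 dims each
logits = 5.0 * np.eye(8)[[1, 4, 6]]     # 3 token positions, peaked logits

soft = expected_embedding(logits, codebook)                      # relaxed lookup
hard = codebook[logits.argmax(axis=-1)]                          # usual hard lookup
sharp = expected_embedding(logits, codebook, temperature=0.01)   # near-hard
```

In this sketch, lowering the temperature makes the relaxed lookup approach the hard one, which is why a relaxation like this can stand in for quantization during optimization.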
📈 MaskControl consistently outperforms existing methods on motion control tasks in both generation quality and control precision, while enabling a wide range of real-world applications:
🎯 Any-joint-any-frame control
🕺 Body-part timeline control
🧠 Zero-shot objective control
__________________________________________________________________
🔗 Project Page: ekkasit.com/ControlMM-page
💻 Code: https://lnkd.in/etr6mMCV
📄 Paper: arxiv.org/abs/2410.10780
Congratulations to everyone involved, including our own Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Dr. Pu Wang, and Dr. Hongfei Xue!