“MaskControl: Spatio-Temporal Control for Masked Motion Synthesis” has been accepted as an oral at ICCV 2025
Our paper “MaskControl: Spatio-Temporal Control for Masked Motion Synthesis” has been accepted as an oral at ICCV 2025, with a perfect review score!
__________________________________________________________________
While Masked Motion Models have recently outperformed Motion Diffusion Models in both quality and speed, most state-of-the-art controllable motion generation methods still rely on diffusion-based approaches in motion space.
MaskControl is the first to bring controllability to Masked Motion Models via two novel components:
- Logits Regularizer: ControlNet-like conditioning for Masked Models
- Logits Optimization: inference-time guidance tailored for Masked Models
To address the non-differentiable nature of quantized models, we propose Differentiable Expectation Sampling, which relaxes quantization and enables conversion across multiple representation spaces:
- VQ-VAE codebook
- Masked Transformer tokens
- Residual Transformer tokens
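The core idea of relaxing quantization can be illustrated with a minimal sketch (assumptions: numpy in place of a deep-learning framework, toy shapes, and a generic softmax-weighted codebook average; this is an illustration of the expectation trick, not the authors' exact formulation):

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the codebook dimension.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def expected_embedding(logits, codebook):
    """Differentiable relaxation of token sampling (sketch).

    Picking the argmax codebook entry is non-differentiable, so instead
    we take the probability-weighted average of codebook vectors, which
    admits gradients with respect to the logits.
    """
    probs = softmax(logits)   # (T, K): distribution over K codes per token
    return probs @ codebook   # (T, D): "soft" codebook vectors

rng = np.random.default_rng(0)
K, D, T = 8, 4, 3                       # codebook size, code dim, sequence length
codebook = rng.normal(size=(K, D))      # stand-in for a learned VQ-VAE codebook
logits = rng.normal(size=(T, K))        # stand-in for transformer token logits
soft = expected_embedding(logits, codebook)
print(soft.shape)  # (3, 4)
```

Because the expectation is a smooth function of the logits, gradients from a control objective (e.g. a joint-position loss) can flow back through it, which is what makes inference-time logits optimization possible on a quantized model.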
MaskControl consistently outperforms existing methods on motion control tasks in both generation quality and control precision, while enabling a wide range of real-world applications:
- Any-joint-any-frame control
- Body-part timeline control
- Zero-shot objective control
__________________________________________________________________
Project Page: ekkasit.com/ControlMM-page
Code: https://lnkd.in/etr6mMCV
Paper: arxiv.org/abs/2410.10780
Congratulations to all who were involved, including our own Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Dr. Pu Wang, and Dr. Hongfei Xue.