MotionMatcher: Motion Customization of Text-to-Video Diffusion Models via Motion Feature Matching
Authors: Y Wu, C Huang, F Yang, F Wang
Published: 2025
Publication: arXiv
MotionMatcher is a framework for motion customization in text-to-video (T2V) diffusion models. It matches high-level spatio-temporal motion features rather than optimizing pixel-level objectives, and reports state-of-the-art performance.
Methods: Fine-tunes a pre-trained text-to-video diffusion model at the feature level, comparing spatio-temporal motion features instead of using pixel-level objectives, in order to customize motion from reference videos.
Key Findings: The approach is effective for motion customization in T2V models; it accurately captures complex motion and avoids content leakage from reference videos.
Institution: NVIDIA, National Taiwan University
Research Area: Motion Customization of Text-to-Video Diffusion Models
Discipline: Computer Vision, Pattern Recognition
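
Illustration: The contrast drawn in the Methods entry, between a pixel-level objective and a feature-level one, can be sketched with a toy example. This is only a minimal sketch under stated assumptions: this record does not describe the feature extractor MotionMatcher uses, so `toy_motion_features` (plain frame-to-frame differences) is a hypothetical stand-in, and `pixel_loss` and `feature_loss` are illustrative names rather than anything from the paper. The point is that two clips with the same motion but different appearance are far apart in pixel space yet identical in this motion-feature space.

```python
from typing import List

Frame = List[float]   # one flattened frame of pixel intensities
Video = List[Frame]   # a sequence of frames


def mse(a: Video, b: Video) -> float:
    """Mean squared error over all entries of two equally shaped videos."""
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for xa, xb in zip(frame_a, frame_b):
            total += (xa - xb) ** 2
            count += 1
    return total / count


def toy_motion_features(video: Video) -> Video:
    """Hypothetical stand-in for a motion feature extractor.

    ASSUMPTION: frame-to-frame differences are used purely for illustration;
    they are not the features described in the paper.
    """
    return [
        [x1 - x0 for x0, x1 in zip(f0, f1)]
        for f0, f1 in zip(video, video[1:])
    ]


def pixel_loss(generated: Video, reference: Video) -> float:
    """Pixel-level objective: compares raw frames, so appearance matters."""
    return mse(generated, reference)


def feature_loss(generated: Video, reference: Video) -> float:
    """Feature-level objective: compares motion features only."""
    return mse(toy_motion_features(generated), toy_motion_features(reference))


# Reference clip: every pixel brightens by 1.0 per frame.
reference = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
# Generated clip: same motion, different appearance (uniformly brighter).
generated = [[x + 5.0 for x in frame] for frame in reference]

print(pixel_loss(generated, reference))    # 25.0
print(feature_loss(generated, reference))  # 0.0
```

In this toy setting the pixel-level loss penalizes the appearance difference even though the motion is identical, while the feature-level loss is zero. That mirrors, in a very simplified way, why a feature-level objective can help avoid copying content from the reference video.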