Modality-agnostic robust tracking under missing or degraded modalities
Develop a unified visual object tracking algorithm that maintains high accuracy and temporal consistency across RGB-only, RGB-plus-auxiliary, and auxiliary-only inputs (where the auxiliary modality is, e.g., depth, thermal, or event data), and that remains resilient when modalities are missing or degraded because of sensor failure, occlusion, or bandwidth constraints.
References
Consequently, designing a tracker that is accurate, modality-agnostic, and resilient to missing or degraded inputs continues to be a critical open problem in visual object tracking.
— Video Understanding: From Geometry and Semantics to Unified Models
(2603.17840 - An et al., 18 Mar 2026) in Summary paragraph, Section 3.2 (Video object tracking)