Visual Object Tracking (VOT) is a fundamental task with widespread
applications in autonomous navigation, surveillance, and maritime robotics.
Despite significant advances in generic object tracking, maritime environments
continue to present unique challenges, including specular water reflections,
low-contrast targets, dynamically changing backgrounds, and frequent
occlusions. These complexities significantly degrade the performance of
state-of-the-art tracking algorithms, highlighting the need for domain-specific
datasets. To address this gap, we introduce the Maritime Visual Tracking
Dataset (MVTD), a comprehensive and publicly available benchmark specifically
designed for maritime VOT. MVTD comprises 182 high-resolution video sequences,
totaling approximately 150,000 frames, and includes four representative object
classes: boat, ship, sailboat, and unmanned surface vehicle (USV). The dataset
captures a diverse range of operational conditions and maritime scenarios,
reflecting the real-world complexities of maritime environments. We evaluate
14 recent state-of-the-art tracking algorithms on the MVTD benchmark and
observe substantial performance degradation relative to their results on
general-purpose datasets. However, when fine-tuned on MVTD, these models
demonstrate significant performance gains, underscoring the effectiveness of
domain adaptation and the importance of transfer learning in specialized
tracking contexts. MVTD fills a critical gap in the visual tracking
community by providing a realistic and challenging benchmark for maritime
scenarios. The dataset and source code are available at this https URL.