Unveiling the power of unpaired multi-modal data for RGBT tracking
19 October 2023
Shen Qing, Wang Yifan, Guo Yu, Mengmeng Yang
Proceedings Volume 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023); 127092N (2023) https://doi.org/10.1117/12.2685082
Event: Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 2023, Nanjing, China
Abstract
RGBT tracking has received increasing interest owing to its applicability in all-day and all-weather environments. However, training deep RGBT trackers usually relies on large-scale aligned RGBT pairs, which are costly in human labor and time to collect. Considering the strong commonality and specificity of multi-modal data, we propose a novel two-stage learning framework that captures modality-shared and modality-specific features using large-scale unpaired RGBT data and thereby achieves state-of-the-art performance in RGBT tracking. Specifically, in the first stage, we aim to learn modality-shared representations and design a generic transformer network that requires only mixed-modal data for training. In the second stage, we aim to learn modality-specific representations and achieve adaptive back-propagation using unpaired data. To this end, we design a modality transformer network in which two modality encoders capture modality-specific features and a modality-adaptive attention module enforces the exchange of information between modalities in a separate-gather way. Since neither training stage relies on paired or aligned multi-modal data, the power of unpaired multi-modal data is unveiled in the training of deep RGBT trackers. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our method against state-of-the-art RGBT trackers.
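The abstract gives no implementation details, so the following is only a minimal sketch of the data flow it describes, under stated assumptions: unpaired RGB and thermal samples are pooled into mixed batches (as a shared network in the first stage might consume them), and each sample is then routed to the encoder of its own modality (one possible reading of "adaptive back-propagation" in the second stage). The names `Sample`, `mixed_batches`, and `encoder_updates` are hypothetical and do not come from the paper; no actual network or attention module is modeled.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One unpaired training frame (hypothetical structure, not from the paper)."""
    modality: str  # "rgb" or "thermal"
    frame_id: int


def mixed_batches(rgb, thermal, batch_size, seed=0):
    """Pool both modalities and cut the shuffled pool into batches.

    No RGB frame is matched to a thermal frame: the two lists may differ
    in length and content, which is what "unpaired" means here.
    """
    pool = list(rgb) + list(thermal)
    random.Random(seed).shuffle(pool)
    return [pool[i:i + batch_size] for i in range(0, len(pool), batch_size)]


def encoder_updates(batches):
    """Count how many samples would reach each modality-specific encoder.

    Assumption: a sample contributes gradients only to the encoder that
    matches its own modality.
    """
    updates = Counter()
    for batch in batches:
        for sample in batch:
            updates[f"{sample.modality}_encoder"] += 1
    return updates


# Deliberately unequal set sizes: nothing requires aligned pairs.
rgb_frames = [Sample("rgb", i) for i in range(5)]
thermal_frames = [Sample("thermal", i) for i in range(3)]

batches = mixed_batches(rgb_frames, thermal_frames, batch_size=4)
updates = encoder_updates(batches)
print(dict(updates))
```

The point of the sketch is only that neither step needs an index linking an RGB frame to a thermal frame; how the modality-adaptive attention module exchanges information between the two encoders is not specified in the abstract and is left out.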
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shen Qing, Wang Yifan, Guo Yu, and Mengmeng Yang "Unveiling the power of unpaired multi-modal data for RGBT tracking", Proc. SPIE 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 127092N (19 October 2023); https://doi.org/10.1117/12.2685082
KEYWORDS
Transformers
Education and training
Design and modelling
Detection and tracking algorithms
Feature extraction
Matrices
Visualization