Presentation + Paper
Exploring action recognition in endoscopy video datasets
7 June 2024
Abstract
Surgical image and video applications based on endoscopic datasets have been actively investigated for the development of advanced surgical assistant systems. These applications are particularly important for understanding surgical scenes during procedures. Specifically, segmentation techniques identify anatomical structures and surgical instruments, quality control methods help refine surgical techniques, and action recognition aids in discerning surgical steps. Advances in deep neural networks and the availability of large training datasets have significantly improved performance across these downstream tasks. However, surgical action recognition remains comparatively unexplored, and existing methods face challenges in real-world settings, mainly because they adapt poorly to a dynamic imaging environment. In this study, we present a framework for surgical action recognition in endoscopic datasets that leverages video masked autoencoders (VideoMAE), which have shown promise in video analysis with limited data. Additionally, we incorporate a temporal data augmentation technique to represent diverse imaging conditions and to address the limitations of single-source, low-quality data. For our experiments, we use VideoMAE v2 pre-trained on the Unlabeled Hybrid dataset and fine-tune the model on the CholecT45 dataset for validation. Our results demonstrate the effectiveness of the VideoMAE structure combined with focal loss, particularly for action recognition tasks in surgical scenarios.
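The abstract names focal loss but does not give its formulation or hyperparameters. As a minimal sketch only, the following shows the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), applied per class label. The function name `binary_focal_loss` and the values `gamma=2.0` and `alpha=0.25` are assumptions made for illustration (they are commonly used defaults), not settings reported by the authors.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over predicted probabilities.

    probs   -- predicted probabilities in [0, 1], one per label
    targets -- ground-truth labels (0 or 1), same length as probs
    gamma   -- focusing parameter (assumed value, not from the paper)
    alpha   -- weight for the positive class (assumed value, not from the paper)
    """
    if len(probs) == 0 or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and equal in length")

    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights examples that are already well classified.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confident correct prediction contributes far less than a confident wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted binary cross-entropy, which is one way to see why focal loss is often chosen when rare action classes would otherwise be dominated by frequent, easy ones.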
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yuchen Tian, Sidike Paheding, Ehsan Azimi, and Eung-Joo Lee "Exploring action recognition in endoscopy video datasets", Proc. SPIE 13034, Real-Time Image Processing and Deep Learning 2024, 130340D (7 June 2024); https://doi.org/10.1117/12.3014345
KEYWORDS
Video
Action recognition
Data modeling
Endoscopy
Education and training
Light sources and illumination
Equipment