Paper
11 July 2024
ROD-YOLO: improved YOLOv8 semantic segmentation of obstacles in complex road scenes based on Swin Transformer
Baoxiang Jiang, Jingbo Xia, Tairui Meng, Yusong Hu, Kai Zhang, Daoqin Lei
Abstract
Deep learning is widely used in autonomous driving. In complex road environments, however, diverse obstacles such as irregularly shaped objects, children's toys, animals, and other unconventional entities pose significant challenges. Convolutional neural network (CNN)-based road detectors struggle to meet real-time demands because of the difficulty of handling multi-scale objects and intricate backgrounds. To address road obstacle detection for autonomous driving, we propose ROD-YOLO (Road Obstacle Detection), a YOLOv8-based method with better multi-scale adaptability that segments obstacles on the road. Compared with the original network, ROD-YOLO adds a detection head, and we propose adding a Transformer with the GAM attention mechanism to the C2f module. To help the model adapt to multi-scale obstacles, we add a new small-scale segmentation head and a dedicated feature fusion part. Specifically, we propose a new GlobalCSP C2FGAM module together with a C2STR module that incorporates the Transformer idea, to obtain faster segmentation and better accuracy across different obstacles. The algorithm performs well in real-time object segmentation tasks while maintaining high accuracy. It improves mAP by 1.9% over the original YOLOv8 network and significantly improves the segmentation of small objects. These results are significant for improving the safety and efficiency of self-driving vehicles.
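The abstract does not spell out the attention computation, so the following is only an illustrative sketch. It assumes that "GAM" refers to the Global Attention Mechanism and shows just a channel-attention gate in that style: a two-layer MLP applied across the channel axis at every spatial position, followed by a sigmoid gate. The function name, the reduction ratio r, the tensor sizes, and the random weights are all hypothetical; the spatial-attention branch and the way the gate is wired into C2f are not reproduced here.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def gam_channel_attention(x, w1, b1, w2, b2):
    """GAM-style channel gate (assumed form, not the authors' code).

    x  : feature map of shape (C, H, W)
    w1 : (C, C // r) weights of the reducing layer
    w2 : (C // r, C) weights of the expanding layer
    """
    tokens = x.transpose(1, 2, 0)               # (H, W, C): channels last
    hidden = np.maximum(tokens @ w1 + b1, 0.0)  # ReLU over reduced channels
    logits = hidden @ w2 + b2                   # back to C channels
    gate = sigmoid(logits).transpose(2, 0, 1)   # (C, H, W), values in (0, 1)
    return x * gate                             # reweight the input features


# Hypothetical sizes: 8 channels, 4x4 feature map, reduction ratio r = 4.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C, C // r)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C // r, C)) * 0.1
b2 = np.zeros(C)

out = gam_channel_attention(x, w1, b1, w2, b2)
print(out.shape)
```

Because the gate lies strictly between 0 and 1, the output has the same shape as the input and every activation is scaled down rather than amplified. In a trained network the weights would be learned jointly with the rest of the model.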
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Baoxiang Jiang, Jingbo Xia, Tairui Meng, Yusong Hu, Kai Zhang, and Daoqin Lei "ROD-YOLO: improved YOLOv8 semantic segmentation of obstacles in complex road scenes based on Swin Transformer", Proc. SPIE 13210, Third International Symposium on Computer Applications and Information Systems (ISCAIS 2024), 1321022 (11 July 2024); https://doi.org/10.1117/12.3034934
KEYWORDS
Object detection
Roads
Transformers
Detection and tracking algorithms
Data modeling
Semantics
Target detection