Dongyang Zhao, Guoyong Su, Gang Cheng, Pengyu Wang, Wei Chen
Journal of Electronic Imaging, Vol. 34, Issue 01, 013010 (January 2025) https://doi.org/10.1117/1.JEI.34.1.013010
TOPICS: Inspection, Mining, Head, Land mines, Target detection, Detection and tracking algorithms, Feature fusion, Feature extraction, Mathematical optimization, Inspection equipment
Key targets are hard to perceive during inspection of complex coal mine operation scenes because of harsh factors such as heavy dust and fog, uneven illumination, mixed arrangement, and cross-scale variation of personnel and equipment. To address these problems, a machine-vision-based inspection perception method for coal mining and excavating faces is proposed. First, taking the YOLOv5s algorithm as the baseline model, three components are designed to construct the coal mine complex inspection scene key objective detection network (CMCIS-Net): the C3-DRES feature extraction module, which incorporates a dual attention mechanism and a dual residual structure; the CDS-FFN feature fusion network, which is based on cross-layer dense connection and cross-scale connection; and the task-specific context decoupling head. Second, using the coal mine complex scene data set, performance verification experiments for the optimization strategies, ablation experiments, and comparison experiments are carried out, and CMCIS-Net is deployed to the visual perception terminal platform of the inspection robot to test its detection performance on equipment and personnel.
The experimental results show that, in complex coal mine operation scenes, the three optimization strategies (C3-DRES, CDS-FFN, and the task-specific context decoupling head) effectively improve the network's focus on target feature regions and its feature information fusion ability and reduce the classification and positioning losses, increasing the detection accuracy of the network by 1.4%, 1.6%, and 1.9%, respectively. The detection accuracy of CMCIS-Net reaches 95.0%, which is 4.9% higher than that of the YOLOv5s algorithm, and CMCIS-Net outperforms the eight mainstream target detection algorithms compared, from SSD to YOLOv9c. On the vision control board, the real-time detection speed and inference time of CMCIS-Net reach 36.3 frames/s and 26.3 ms, respectively, and the detection accuracy for all targets is above 90.0%, enabling stable identification and accurate positioning of equipment and personnel.
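The reported figures can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on two assumptions not stated in the abstract: that the accuracy gains are expressed in percentage points rather than as relative improvements, and that the gap between the frame period and the inference time reflects per-frame work outside inference. All function and constant names are hypothetical.

```python
# Sanity-check arithmetic on the figures reported in the abstract.
# Assumption: accuracy gains are percentage points, not relative gains.

STRATEGY_GAINS = {             # reported gain per optimization strategy
    "C3-DRES": 1.4,
    "CDS-FFN": 1.6,
    "decoupling head": 1.9,
}
CMCIS_NET_ACCURACY = 95.0      # %, reported
GAIN_OVER_YOLOV5S = 4.9        # reported; read here as percentage points
FPS = 36.3                     # frames/s on the vision control board, reported
INFERENCE_MS = 26.3            # ms, reported


def implied_baseline(accuracy: float, gain_points: float) -> float:
    """Baseline accuracy implied by a gain given in percentage points."""
    return round(accuracy - gain_points, 1)


def frame_period_ms(fps: float) -> float:
    """Average wall-clock time available per frame at a given frame rate."""
    return 1000.0 / fps


summed_gains = round(sum(STRATEGY_GAINS.values()), 1)
baseline = implied_baseline(CMCIS_NET_ACCURACY, GAIN_OVER_YOLOV5S)
period = frame_period_ms(FPS)
# Assumption: whatever is left of the frame period is non-inference work.
overhead = period - INFERENCE_MS

print(f"sum of individual gains: {summed_gains:.1f}")
print(f"implied YOLOv5s accuracy: {baseline:.1f}%")
print(f"frame period at {FPS} frames/s: {period:.2f} ms")
print(f"non-inference time per frame: {overhead:.2f} ms")
```

Under these assumptions, the three individual gains sum to the reported overall gain, and the frame period at 36.3 frames/s is slightly longer than the 26.3 ms inference time, which would be consistent with a small amount of pre- and post-processing per frame.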