Deep-learning-based object detection has received extensive research attention in the field of power grid inspection, achieving high detection accuracy and recognition precision. However, pre-trained object detection models lack holistic perception and reasoning capabilities, producing more false positives and missed detections on challenging samples because they do not understand the image as a whole. Recently, multi-modal large language models, which combine natural language models with image understanding, have gained significant attention. In this paper, we propose Grid-Blip, a multi-modal large model enhanced with general knowledge, to study wildfire detection in grid inspection. Grid-Blip follows the BLIP architecture and comprises a natural language model, a visual generation model, and a fusion model. We conduct large-scale annotation of grid inspection images at the whole-image semantic level, providing crucial training samples for multi-modal large-model research. Furthermore, we investigate the design of the fusion network and train the model to effectively integrate the pre-trained natural language and visual generation models. Experimental results demonstrate that, compared with object detection models, the proposed multi-modal large model achieves overall semantic perception and reasoning: Grid-Blip reduces the false alarm rate for wildfire smoke trend prediction from 20% to 10% and the missed detection rate from 18% to 13%.
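The abstract does not give implementation details of the fusion model, so the following is only a minimal sketch of how a trainable cross-attention block could combine features from a frozen language model and a frozen visual encoder; the module name FusionBlock, the 768-dimensional features, and the two-class smoke head are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Trainable cross-attention fusion of frozen text and visual features (assumed design)."""
    def __init__(self, dim: int = 768, num_heads: int = 8, num_classes: int = 2):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.head = nn.Linear(dim, num_classes)  # e.g. {no wildfire smoke, wildfire smoke}

    def forward(self, text_tokens: torch.Tensor, visual_tokens: torch.Tensor) -> torch.Tensor:
        # Text tokens query the visual tokens (query = text, key/value = image patches).
        attended, _ = self.cross_attn(text_tokens, visual_tokens, visual_tokens)
        fused = self.norm1(text_tokens + attended)
        fused = self.norm2(fused + self.ffn(fused))
        return self.head(fused.mean(dim=1))  # pooled, image-level prediction

# Toy usage with random tensors standing in for frozen encoder outputs.
text = torch.randn(2, 16, 768)    # [batch, text tokens, dim]
image = torch.randn(2, 196, 768)  # [batch, visual patches, dim]
print(FusionBlock()(text, image).shape)  # torch.Size([2, 2])

Only the fusion block is trainable in this sketch; the pre-trained encoders that produce the text and image features would stay frozen, which mirrors the paper's stated goal of integrating pre-trained language and visual models.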
Infrared smoke interference seriously degrades the combat effectiveness of electro-optically guided weapons in modern warfare. The occlusion caused by a smoke screen reduces the robustness of image matching guidance algorithms, so judging whether smoke interference is present in an image and extracting the smoke screen area are both important for the accuracy of such algorithms. However, most existing smoke detection methods target fire early warning and therefore focus only on whether smoke exists, whereas we are concerned with both discriminating smoke interference and extracting the smoke screen area. In this paper, a smoke detection method based on superpixel segmentation and region merging is proposed. First, the input infrared image is over-segmented into superpixel regions. Then, a fusion texture feature of the image is computed. Finally, the superpixel regions are merged according to the fusion features of each superpixel block obtained in the previous step, completing the smoke screen area extraction.
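The three-step pipeline (superpixel over-segmentation, per-region features, region merging) can be illustrated with a short Python sketch; SLIC superpixels, the simple mean/variance statistics, and the greedy distance-threshold merge below are assumptions standing in for the paper's fusion texture feature and merging criterion.

import numpy as np
from skimage.segmentation import slic
from skimage.util import img_as_float

def extract_smoke_regions(ir_image: np.ndarray,
                          n_segments: int = 300,
                          merge_thresh: float = 0.05) -> np.ndarray:
    """Return a boolean mask of candidate smoke-screen pixels (illustrative sketch)."""
    img = img_as_float(ir_image)
    # 1) Over-segment the infrared image into superpixels.
    labels = slic(img, n_segments=n_segments, compactness=0.1, channel_axis=None)

    # 2) Per-superpixel features: mean intensity and variance as a crude texture proxy.
    feats = {}
    for lab in np.unique(labels):
        region = img[labels == lab]
        feats[lab] = np.array([region.mean(), region.var()])

    # 3) Greedy merge: grow from the lowest-texture seed region, assuming the
    #    smoke screen appears relatively smooth in the infrared image (heuristic).
    seed = min(feats, key=lambda k: feats[k][1])
    mask = np.zeros_like(labels, dtype=bool)
    for lab, f in feats.items():
        if np.linalg.norm(f - feats[seed]) < merge_thresh:
            mask |= labels == lab
    return mask

In practice the merge criterion would use the paper's fusion texture feature rather than the two-value statistic used here, but the control flow of segment, describe, and merge is the same.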
Sea-land segmentation is an important research topic in remote sensing image processing, and edge-aware sea-land segmentation is a current hot spot: edge information serves as an auxiliary learning signal that provides additional cues for segmentation. In this paper, we propose a novel model for sea-land segmentation that performs edge detection in the lower layers and segmentation in the higher layers, which proves to be an effective way to fuse the two tasks. We use a pre-trained VGG16 model to initialize the backbone and evaluate the segmentation output with the F-score. On our own test dataset, the model achieves an F-score of 0.9929 for land and 0.9937 for sea, the highest among the five methods included in our comparison.
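A minimal sketch of the described design, with a VGG16 backbone whose lower layers feed an edge head and whose higher layers feed the sea-land segmentation head, is given below; the layer split points, the 1x1 convolution heads, and the upsampling choice are assumptions, since the abstract does not specify the decoder or loss design.

import torch
import torch.nn as nn
from torchvision.models import vgg16

class EdgeAwareSeaLandNet(nn.Module):
    def __init__(self, num_classes: int = 2, pretrained_backbone: bool = False):
        super().__init__()
        weights = "IMAGENET1K_V1" if pretrained_backbone else None
        features = vgg16(weights=weights).features
        self.low = features[:16]     # conv1_1 .. conv3_3: fine detail for edges
        self.high = features[16:30]  # conv4_1 .. conv5_3: semantic context
        self.edge_head = nn.Conv2d(256, 1, kernel_size=1)           # auxiliary edge map
        self.seg_head = nn.Conv2d(512, num_classes, kernel_size=1)  # sea/land logits

    def forward(self, x):
        low = self.low(x)
        high = self.high(low)
        edge = self.edge_head(low)   # supervised with edge labels (auxiliary task)
        seg = self.seg_head(high)    # supervised with sea-land masks
        # Upsample both outputs back to the input resolution.
        edge = nn.functional.interpolate(edge, size=x.shape[-2:], mode="bilinear", align_corners=False)
        seg = nn.functional.interpolate(seg, size=x.shape[-2:], mode="bilinear", align_corners=False)
        return seg, edge

seg, edge = EdgeAwareSeaLandNet()(torch.randn(1, 3, 256, 256))
print(seg.shape, edge.shape)  # (1, 2, 256, 256) (1, 1, 256, 256)

Supervising the edge output from the lower layers and the segmentation output from the higher layers lets the auxiliary edge task sharpen the shoreline without a separate edge network, which is the fusion idea the abstract describes.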