With the rapid development of deep learning, object detection methods have made significant progress, but small object detection remains a very challenging task. Taking the Single Shot MultiBox Detector (SSD) as the base detector, we conjecture that this is due to two factors: (1) in convolutional neural networks (CNNs), shallow feature maps contain strong spatial position information but insufficient semantic information, so single-layer feature detectors cannot make full use of the limited information that small objects provide in an image; (2) the loss function is not a good measure of the relationship between the predicted box and the ground truth and, moreover, it easily leads to unbalanced loss attribution, which impairs the accuracy of small object detection. To solve these problems, we propose an improved SSD algorithm, named PASSD, that includes a multi-scale feature fusion module, which provides contextual semantic information and enhances high-resolution spatial information for shallow layers. Furthermore, a feedback-based loss function built on the Gaussian Wasserstein distance is introduced to measure the similarity between boxes in a new way. Experimental results show that the proposed method outperforms the previous method while achieving a good trade-off between speed and accuracy. For 300×300 input, the proposed method achieves 86.28% mAP on the PASCAL VOC dataset. In particular, AP_S, the average precision on small objects, is improved by 4% on the MS COCO dataset compared to the benchmark model SSD.
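To illustrate how a Gaussian Wasserstein distance can compare two boxes, the following is a minimal sketch and not the PASSD loss itself, whose feedback mechanism is not described here. It assumes axis-aligned boxes given as (cx, cy, w, h), each modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), for which the squared 2-Wasserstein distance has a closed form. The function names and the scale constant `c` are hypothetical and chosen only for this example.

```python
import math


def gwd_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between two boxes modeled as Gaussians.

    Assumption: each box is (cx, cy, w, h) and is modeled as a Gaussian with
    mean (cx, cy) and diagonal covariance diag(w^2 / 4, h^2 / 4).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Distance between the Gaussian means (the box centers).
    center_term = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    # Frobenius distance between the square roots of the covariances,
    # which for diagonal covariances reduces to half-width/half-height gaps.
    shape_term = ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2
    return center_term + shape_term


def gwd_similarity(box_a, box_b, c=10.0):
    """Map the distance to a similarity in (0, 1].

    Assumption: the exponential mapping and the scale c are illustrative
    choices, not values taken from the paper.
    """
    return math.exp(-math.sqrt(gwd_squared(box_a, box_b)) / c)


if __name__ == "__main__":
    ground_truth = (10.0, 10.0, 4.0, 4.0)
    prediction = (13.0, 14.0, 4.0, 4.0)
    print(gwd_squared(ground_truth, prediction))
    print(gwd_similarity(ground_truth, prediction))
```

Unlike an overlap-based measure, this distance stays finite and varies smoothly even when the two boxes do not intersect, which is one reason such a measure is attractive for small objects.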