1 February 2023 EPformer: an efficient transformer-based approach for retail product detection in fisheye images
Yunfei Yang, Hongwei Deng
Abstract

Retail product detection in scenes captured by fisheye cameras frequently suffers from heavy object occlusion and deformation, and from the difficulty of distinguishing products that differ only in small, fine-grained details, so accurately classifying and localizing products in these images remains a challenge for computer vision. We propose an efficient product detection network, EPformer, which fuses a visual transformer with a convolutional neural network to reliably detect retail products in fisheye images. To handle the dense occlusion of products, we employ a shifted window strategy that lets feature information interact across windows, so that products are detected more precisely. To address the excessive product deformation introduced by fisheye cameras, we develop a deformation image processing module that requires no explicit correction and embed it into the path aggregation network structure. This enables the model to efficiently capture geometric changes in products and perform feature fusion. To differentiate fine-grained products, we design an effective coordinate squeeze-excitation (ECSE) attention module that captures differences in fine-grained texture and boundary information between individual products in terms of spatial and channel relationships; training the ECSE module in tandem with the decoupled head addresses the fine-grained differentiation problem. The experimental results demonstrate that EPformer achieves a mean average precision 4.9% higher than the state-of-the-art method (YOLOX) on the fisheye product image dataset. In addition, EPformer can effectively detect products in fisheye images on the Jetson Xavier NX embedded device, meeting the application requirements of realistic scenarios.
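The shifted window strategy mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the Swin-Transformer-style formulation, in which the feature map is cyclically shifted before being split into non-overlapping windows, and the function names `window_partition` and `shifted_window_partition` are hypothetical. The attention computation and the masking of wrapped-around regions are omitted.

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    Assumes H and W are multiples of window_size.
    """
    H, W, C = x.shape
    assert H % window_size == 0 and W % window_size == 0
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    # Bring the two window-grid axes together, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

def shifted_window_partition(x, window_size, shift):
    """Cyclically shift the map up and left by `shift`, then partition it.

    After the shift, each window straddles the borders of the regular
    windows, so features from neighbouring windows end up together.
    """
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)

# Toy 8x8 single-channel map whose values encode their own position.
feat = np.arange(8 * 8).reshape(8, 8, 1)
regular = window_partition(feat, window_size=4)
shifted = shifted_window_partition(feat, window_size=4, shift=2)
```

In this toy example, the first shifted window covers rows and columns 2 to 5 of the original map, so it contains values from all four regular windows. That overlap is what allows information to pass between windows when regular and shifted layers alternate.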

© 2023 SPIE and IS&T
Yunfei Yang and Hongwei Deng "EPformer: an efficient transformer-based approach for retail product detection in fisheye images," Journal of Electronic Imaging 32(1), 013017 (1 February 2023). https://doi.org/10.1117/1.JEI.32.1.013017
Received: 8 October 2022; Accepted: 17 January 2023; Published: 1 February 2023
KEYWORDS
Deformation
Windows
Object detection
Education and training
Image processing
Feature extraction
Cameras