Food recognition plays a vital role in domains including dietary monitoring, nutrition analysis and food service automation. Real-world food recognition is challenging because the contents of a plate are often complex, intermixed objects whose individual structures are difficult to define. Technology now offers a wide range of feasible options for dietary assessment, and image-based methods have the potential to replace traditional methods such as food records, food frequency questionnaires, and 24-hour recalls, which can be inaccurate and unreliable. For image classification, deep learning methods have shown better accuracy and a greater ability to identify ingredients and food types than traditional approaches. However, many deep learning methods rely on powerful computational resources, which impose limitations in terms of cost, energy consumption, and size. The rapid evolution of embedded hardware has significantly influenced the field of computer vision and offers promising solutions to these challenges.
This paper presents a method that uses deep learning networks for detection and segmentation, optimised for resource-constrained embedded platforms. These networks are tailored to process food images efficiently while ensuring low latency and energy efficiency. Additionally, strategies such as model quantisation, pruning, and compression are employed to reduce computational complexity and memory footprint, making the networks suitable for deployment on embedded devices with limited resources, such as a Raspberry Pi. The method consists of a custom recognition pipeline that uses YOLOv8 [1] for detection and EdgeSAM [2] for segmentation, both trained on the FoodSeg103 [3] dataset. The resulting system provides a fast, accurate way to recognise foods without requiring expensive, energy-intensive hardware.
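To illustrate how quantisation reduces memory footprint, the following is a minimal sketch of symmetric per-tensor int8 quantisation. It is not the implementation used in this work: the function names `quantise_int8` and `dequantise` and the example weights are hypothetical, and a deployed model would normally rely on the quantisation tooling of its inference framework rather than hand-written code.

```python
from array import array


def quantise_int8(weights):
    """Map floats to int8 values in [-127, 127] with one shared scale.

    Illustrative only: symmetric, per-tensor, no calibration data.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantised = array(
        "b", (max(-127, min(127, round(w / scale))) for w in weights)
    )
    return quantised, scale


def dequantise(quantised, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in quantised]


# Hypothetical weights standing in for one layer of a network.
weights = [0.5, -1.27, 0.0, 0.9999, -0.3]
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)

# Storage cost of the same values as 32-bit floats versus 8-bit integers.
fp32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = q.itemsize * len(q)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Under these assumptions the int8 representation needs a quarter of the storage of 32-bit floats, and the rounding error per weight is bounded by half the scale. That trade of a small accuracy loss for a smaller footprint is what makes quantisation attractive on devices such as a Raspberry Pi.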