14 August 2019 An image caption model incorporating high-level semantic features
Proceedings Volume 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019); 1117917 (2019) https://doi.org/10.1117/12.2540579
Event: Eleventh International Conference on Digital Image Processing (ICDIP 2019), 2019, Guangzhou, China
Abstract
The encoder-decoder framework has attracted great interest in image captioning. It focuses on the extraction of low-level features and achieves good results. Performance can be further improved if high-level semantics are considered. In this work, we propose a new image caption model that incorporates high-level semantic features through a revised Convolutional Neural Network (CNN). Both the low-level image features and the high-level semantic features are fed into Long Short-Term Memory networks (LSTMs) to generate natural sentence descriptions. We show in a number of experiments on the Flickr8K and Flickr30K datasets that our method outperforms most standard network baselines for image captioning.
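The abstract does not say how the two feature streams are combined before they reach the LSTM. As a rough illustration only, the sketch below assumes the simplest option: concatenating a low-level CNN feature vector with a high-level semantic vector and feeding the result to a single LSTM step. The function names (`fuse_features`, `init_lstm`, `lstm_step`), the toy vector sizes, and the random weights are all hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def fuse_features(low_level, semantic):
    # ASSUMPTION: fusion by concatenation; the paper may fuse differently.
    return list(low_level) + list(semantic)


def init_lstm(input_size, hidden_size, seed=0):
    # One (weights, bias) pair per gate: input, forget, output, candidate.
    rng = random.Random(seed)
    cols = input_size + hidden_size
    params = {}
    for gate in ("i", "f", "o", "g"):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                   for _ in range(hidden_size)]
        params[gate] = (weights, [0.0] * hidden_size)
    return params


def lstm_step(params, x, h_prev, c_prev):
    # Standard LSTM cell update on the concatenated [x, h_prev] vector.
    z = list(x) + list(h_prev)
    pre = {gate: [a + b for a, b in zip(matvec(weights, z), bias)]
           for gate, (weights, bias) in params.items()}
    i_gate = [sigmoid(v) for v in pre["i"]]
    f_gate = [sigmoid(v) for v in pre["f"]]
    o_gate = [sigmoid(v) for v in pre["o"]]
    cand = [math.tanh(v) for v in pre["g"]]
    c_new = [f * c + i * g for f, c, i, g in zip(f_gate, c_prev, i_gate, cand)]
    h_new = [o * math.tanh(c) for o, c in zip(o_gate, c_new)]
    return h_new, c_new


# Toy stand-ins for the two feature streams (hypothetical values).
low_level = [0.2, -0.5, 0.9, 0.1]   # e.g. a CNN feature vector
semantic = [0.8, 0.1, 0.05]         # e.g. scores for semantic concepts

hidden_size = 5
x0 = fuse_features(low_level, semantic)
params = init_lstm(len(x0), hidden_size)
h, c = [0.0] * hidden_size, [0.0] * hidden_size
h, c = lstm_step(params, x0, h, c)
```

In a full captioning model the hidden state `h` would then be projected onto the vocabulary to predict the next word, and the step would be repeated with word embeddings as input; those parts are omitted here.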
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhiwang Luo, Jiwei Hu, Quan Liu, and Jiamei Deng "An image caption model incorporating high-level semantic features", Proc. SPIE 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019), 1117917 (14 August 2019); https://doi.org/10.1117/12.2540579
KEYWORDS
Associative arrays
Data modeling
Feature extraction
Principal component analysis
Convolutional neural networks
Computing systems
Image processing