To achieve fast and flexible retrieval across heterogeneous modalities, unsupervised methods are more flexible and easier to use than supervised ones, and among unsupervised methods the generative adversarial network (GAN) is the most popular. However, GANs suffer from a lack of diversity in generated samples, difficult debugging, and unstable training. This paper proposes a cross-modal hashing method based on a diffusion model. Specifically: (1) the diffusion model is applied to cross-modal retrieval for the first time, targeting mutual retrieval among three modalities; (2) combining the GAN with the diffusion model improves sample quality and sample diversity, and mitigates the GAN's debugging complexity and training instability. Experiments on three datasets and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed method.
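For readers unfamiliar with hashing-based retrieval, the following minimal sketch illustrates only the generic retrieval step that cross-modal hashing methods share: items from each modality are mapped to short binary codes, and a query is answered by ranking database items by Hamming distance. It is not the paper's diffusion or GAN model, which the abstract does not detail. The function names `to_hash_codes` and `hamming_rank`, the random linear projection, and the toy feature sizes are all assumptions made for illustration.

```python
import numpy as np


def to_hash_codes(features, projection):
    """Binarize real-valued features into {0, 1} codes.

    Assumption: a plain linear projection followed by thresholding at zero
    stands in for whatever learned encoder a real method would use.
    """
    return (features @ projection > 0).astype(np.uint8)


def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to a single query code.

    Returns (order, distances): indices sorted from nearest to farthest,
    and the per-item count of differing bits.
    """
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    order = np.argsort(distances, kind="stable")  # ties keep database order
    return order, distances


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_bits = 16

    # Hypothetical features: 5 database items of one modality (dimension 32)
    # and 1 query of another modality (dimension 20).
    image_features = rng.standard_normal((5, 32))
    text_query = rng.standard_normal((1, 20))

    # Each modality gets its own projection into the shared code space.
    image_projection = rng.standard_normal((32, n_bits))
    text_projection = rng.standard_normal((20, n_bits))

    image_codes = to_hash_codes(image_features, image_projection)
    query_code = to_hash_codes(text_query, text_projection)[0]

    order, distances = hamming_rank(query_code, image_codes)
    print("ranking:", order.tolist())
    print("distances:", distances[order].tolist())
```

With random projections the ranking carries no semantic meaning; in a trained method the encoders would be learned so that related items from different modalities receive nearby codes.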