Multi-modal guided attention for live video comments generation

Yuchen Ren; Yuan Yuan; Lei Chen

doi:10.1117/12.2631006

18 March 2022

Proceedings Volume 12168, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2021); 1216819 (2022) https://doi.org/10.1117/12.2631006
Event: International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2021), 2021, Harbin, China

Abstract

As online video applications proliferate, live commenting has become an emerging feature of online video sites. The live video comments generation (LVCG) task aims to generate live comments for a video while considering both the video itself and the surrounding comments made by other viewers. In this work, we aim to improve the relevance between live comments and videos by modeling the interactions among different modalities. To overcome insufficient multimodal interaction in live video comments generation, we build two basic attention blocks: the self-attention (SA) block, which models dense intramodal interactions, and the x-guided attention (XGA) block, which models dense intermodal interactions. Through modular composition of the SA and XGA blocks, we then propose several multimodal transformer architectures to handle the multimodal features. Experiments show that the proposed multimodal guided attention models significantly outperform previous methods on most metrics.
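
The abstract does not give the internals of the SA and XGA blocks, so the following is only a minimal sketch of the general idea, assuming both are built on standard scaled dot-product attention. In this reading, self-attention draws queries, keys, and values from one modality, while guided attention takes queries from one modality and keys and values from another. The function names, feature sizes, and the choice of video features guiding comment features are all hypothetical, and learned projections, multiple heads, residual connections, and layer normalization are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    # Scaled dot-product attention over 2-D arrays of shape (length, dim).
    dim = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights


def self_attention(x):
    # Intramodal: queries, keys, and values all come from the same modality.
    return attention(x, x, x)


def guided_attention(x, guide):
    # Intermodal (assumed form): x supplies the queries,
    # the guiding modality supplies keys and values.
    return attention(x, guide, guide)


# Hypothetical toy features: 5 comment tokens and 3 video frames, 8 dims each.
rng = np.random.default_rng(0)
comment_feats = rng.normal(size=(5, 8))
video_feats = rng.normal(size=(3, 8))

sa_out, sa_weights = self_attention(comment_feats)
xga_out, xga_weights = guided_attention(comment_feats, video_feats)
```

Under these assumptions, each comment token receives one attention distribution over the comment tokens (SA) and one over the video frames (guided attention). Stacking or interleaving such blocks is one way the modular compositions mentioned in the abstract could be arranged.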

Citation

Yuchen Ren, Yuan Yuan, and Lei Chen "Multi-modal guided attention for live video comments generation", Proc. SPIE 12168, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2021), 1216819 (18 March 2022); https://doi.org/10.1117/12.2631006

PROCEEDINGS
7 PAGES

KEYWORDS

Video

Visual process modeling

Computer programming

Transformers

Visualization

Head

Information visualization
