Paper
18 July 2024
LKCA: large kernel convolutional attention
Chenghao Li, Pengbo Shi, Qingzi Chen, Jirui Liu, Lingyun Zhu
Proceedings Volume 13179, International Conference on Optics and Machine Vision (ICOMV 2024); 131790M (2024) https://doi.org/10.1117/12.3031589
Event: International Conference on Optics and Machine Vision (ICOMV 2024), 2024, Nanchang, China
Abstract
We revisit the relationship between attention mechanisms and large kernel ConvNets in visual transformers and propose a new spatial attention named Large Kernel Convolution Attention (LKCA). It simplifies the attention operation by replacing it with a single large kernel convolution. LKCA combines the advantages of convolutional neural networks and visual transformers, possessing a large receptive field, locality, and parameter sharing. We explain the superiority of LKCA from both the convolution and attention perspectives, providing equivalent code implementations for each view. Experiments confirm that LKCA implemented from the convolutional and attention perspectives exhibits equivalent performance. We conducted experiments with LKCA on a wide range of ViT variants, consistently improving classification performance compared to the original models and showcasing the capabilities of LKCA. Our code will be made publicly available.
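To make the idea concrete, the sketch below shows one plausible reading of the abstract: the spatial attention of a ViT block is replaced by a single large kernel convolution over the token grid, giving a large receptive field with locality and weight sharing. The kernel size, the use of a depthwise convolution, and the normalization/residual layout are assumptions for illustration and are not specified in the abstract; the paper's released code is the authoritative reference.

```python
import torch
import torch.nn as nn

class LKCA(nn.Module):
    """Minimal sketch of a Large Kernel Convolution Attention block.

    Assumptions (not taken from the abstract): kernel_size=7, a depthwise
    convolution, LayerNorm, and a residual connection.
    """

    def __init__(self, dim: int, kernel_size: int = 7):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        # One large-kernel (depthwise) convolution standing in for spatial attention.
        self.large_kernel_conv = nn.Conv2d(
            dim, dim, kernel_size, padding=kernel_size // 2, groups=dim
        )

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # x: (B, N, C) patch tokens with N = h * w.
        b, n, c = x.shape
        y = self.norm(x)
        y = y.transpose(1, 2).reshape(b, c, h, w)   # tokens -> 2D feature map
        y = self.large_kernel_conv(y)               # large receptive field, shared weights
        y = y.reshape(b, c, n).transpose(1, 2)      # feature map -> tokens
        return x + y                                # residual connection


if __name__ == "__main__":
    x = torch.randn(2, 14 * 14, 64)                 # a batch of 196 patch tokens
    print(LKCA(64)(x, 14, 14).shape)                # torch.Size([2, 196, 64])
```

In a ViT variant, this module would take the place of the multi-head self-attention sublayer while the MLP sublayer is left unchanged, which is how one would reproduce the "drop-in spatial attention" usage described above.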
KEYWORDS: Convolution, Transformers, Visualization, Matrices, Visual process modeling, Windows, Modeling