Paper
22 February 2023
Growing neural networks using orthogonal initialization
Xinglin Pan
Proceedings Volume 12587, Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022); 1258723 (2023) https://doi.org/10.1117/12.2667654
Event: Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022), 2022, Shanghai, China
Abstract
In the training of neural networks, the architecture is usually fixed first and the parameters are then selected by an optimizer, so the choices of architecture and parameters are made independently. Whenever the architecture is modified, an expensive retraining of the parameters is required. In this work, we focus on growing the architecture instead of performing this expensive retraining. There are two main ways to grow new neurons: splitting and adding. In this paper, we propose orthogonal initialization, obtained via QR decomposition, to mitigate gradient vanishing in the newly added neurons. We performed detailed experiments on two datasets (CIFAR-10 and CIFAR-100), and the experimental results demonstrate the effectiveness of our method.
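The QR-based orthogonal initialization mentioned above can be sketched in generic NumPy as follows. This is an illustrative sketch of the standard technique (QR decomposition of a random Gaussian matrix), not the paper's actual implementation; the function name and details are assumptions.

```python
import numpy as np

def orthogonal_init(rows, cols, rng=None):
    """Return a (rows, cols) weight matrix whose columns (or rows, if
    rows < cols) are orthonormal, obtained from the QR decomposition of
    a random Gaussian matrix. Generic sketch, not the authors' code."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw a tall Gaussian matrix and orthogonalize it via QR.
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    # Multiply each column by the sign of the corresponding diagonal of R,
    # a standard correction that makes Q uniformly distributed.
    q *= np.sign(np.diag(r))
    if rows < cols:
        q = q.T
    return q

# Example: initialize the weights of a newly added layer of shape (6, 3).
W = orthogonal_init(6, 3)
```

Because the columns of `W` are orthonormal, the linear map preserves the norm of activations and gradients in the spanned subspace, which is the property used to mitigate gradient vanishing in newly added neurons.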
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xinglin Pan "Growing neural networks using orthogonal initialization", Proc. SPIE 12587, Third International Seminar on Artificial Intelligence, Networking, and Information Technology (AINIT 2022), 1258723 (22 February 2023); https://doi.org/10.1117/12.2667654
KEYWORDS
Education and training
Deep learning