26 August 2020 Two-stream spatial-temporal neural networks for pose-based action recognition
Zixuan Wang, Aichun Zhu, Fangqiang Hu, Qianyu Wu, Yifeng Li
Abstract

With recent advances in human pose estimation and skeleton capture systems, pose-based action recognition has drawn considerable attention from researchers. Most existing action recognition methods are based on convolutional neural networks and long short-term memory networks and perform well, but they cannot explicitly exploit the rich spatial-temporal information between the skeletons in an action, which limits further gains in recognition accuracy. To address this issue, we introduce two-stream spatial-temporal neural networks for pose-based action recognition. First, the pose features extracted from the raw video are processed by an action modeling module. Then, the temporal information and the spatial information, in the form of relative speed and relative distance, are fed into the temporal neural network and the spatial neural network, respectively. Afterward, the outputs of the two streams are fused for better action recognition. Finally, we perform comprehensive experiments on the SUB-JHMDB, SYSU, MPII-Cooking, and NTU RGB+D datasets, the results of which demonstrate the effectiveness of the proposed model.
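
The abstract does not define how relative distance and relative speed are computed or how the two streams are fused, so the following is only an illustrative sketch under stated assumptions. It treats relative distance as the pairwise Euclidean distance between joints within a frame, relative speed as the frame-to-frame change in those distances, and fusion as a weighted average of per-class scores. The function names (`relative_distances`, `relative_speeds`, `late_fusion`) and the `weight` parameter are hypothetical, and the spatial and temporal networks themselves are omitted.

```python
import math


def relative_distances(frame):
    """Pairwise Euclidean distances between joints in one frame.

    frame: list of (x, y) joint coordinates.
    Returns a flat list over joint pairs (i, j) with i < j.
    Assumption: this stands in for the "relative distance" spatial feature.
    """
    return [
        math.dist(frame[i], frame[j])
        for i in range(len(frame))
        for j in range(i + 1, len(frame))
    ]


def relative_speeds(sequence):
    """Frame-to-frame change of each pairwise joint distance.

    sequence: list of frames, each a list of (x, y) joint coordinates.
    Assumption: this stands in for the "relative speed" temporal feature.
    """
    dists = [relative_distances(frame) for frame in sequence]
    return [
        [curr_d - prev_d for prev_d, curr_d in zip(prev, curr)]
        for prev, curr in zip(dists, dists[1:])
    ]


def late_fusion(spatial_scores, temporal_scores, weight=0.5):
    """Weighted average of per-class scores from the two streams.

    Assumption: the source does not state the fusion rule; averaging
    is used here purely for illustration.
    """
    return [
        weight * s + (1.0 - weight) * t
        for s, t in zip(spatial_scores, temporal_scores)
    ]


if __name__ == "__main__":
    # Two joints over two frames; the second joint moves away from the first.
    frames = [[(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (6.0, 8.0)]]
    print(relative_distances(frames[0]))
    print(relative_speeds(frames))
    print(late_fusion([1.0, 0.0], [0.0, 1.0]))
```

In a full pipeline, the per-frame distance vectors would feed a spatial network and the speed vectors a temporal network, with `late_fusion` applied to their class scores.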

© 2020 SPIE and IS&T 1017-9909/2020/$28.00
Zixuan Wang, Aichun Zhu, Fangqiang Hu, Qianyu Wu, and Yifeng Li "Two-stream spatial-temporal neural networks for pose-based action recognition," Journal of Electronic Imaging 29(4), 043025 (26 August 2020). https://doi.org/10.1117/1.JEI.29.4.043025
Received: 3 March 2020; Accepted: 11 August 2020; Published: 26 August 2020
CITATIONS
Cited by 3 scholarly publications.
KEYWORDS
Neural networks

Video

Data modeling

RGB color model

Data fusion

Information fusion

Performance modeling
