In this paper, we describe an ontology of Partially Observable Group Activities (POGA) in the context of In-Vehicle Group Activity (IVGA) recognition. We first describe the ontology pertaining to IVGA and show how an ontology grounded in the realization of in-vehicle volumetric sub-spaces and in the limitations of human postural motion can serve as a priori knowledge for inferring human actions inside the confined space of a vehicle. In particular, we treat this problem as an “action-object” duality: an observed human action suggests the probable object being utilized, and conversely, a detected object suggests the probable action being performed. Furthermore, we use partially observable human postural sequences to recognize actions. Inspired by the deep learning capability of convolutional neural networks (CNNs), we present the architecture of a new CNN model for learning “action-object” perception from continuous surveillance videos. In this study, we apply a sequential Deep Hidden Markov Model (DHMM) as a post-processor to the CNN to decode realized observations into recognized actions and activities. To generate the imagery data set needed for training and testing the newly developed techniques, the IRIS virtual simulation software is employed to construct dynamic, high-fidelity animations of scenarios representing in-vehicle group activities under different operational contexts. The results of our comparative investigation are presented and discussed in detail.
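The CNN-to-DHMM decoding stage summarized above can be sketched as follows: per-frame action scores (standing in for CNN softmax outputs) are decoded into a most-likely action sequence via Viterbi inference over a Markov chain on actions. This is a minimal illustrative sketch; the two-state example, the transition matrix, and all numeric values are assumptions for demonstration, not values from the paper.

```python
import numpy as np

def viterbi(log_emissions, log_trans, log_prior):
    """Decode the most likely hidden action sequence from per-frame
    observation scores (e.g., CNN softmax outputs, given in log space).

    log_emissions: (T, S) log-score of each state at each frame
    log_trans:     (S, S) log transition matrix, [prev, cur]
    log_prior:     (S,)   log initial state distribution
    """
    T, S = log_emissions.shape
    dp = np.full((T, S), -np.inf)          # best log-score ending in state s at t
    back = np.zeros((T, S), dtype=int)     # backpointers for path recovery
    dp[0] = log_prior + log_emissions[0]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_trans        # (S, S): prev -> cur
        back[t] = np.argmax(scores, axis=0)
        dp[t] = scores[back[t], np.arange(S)] + log_emissions[t]
    # Backtrack from the best final state
    path = [int(np.argmax(dp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical two-state example: 0 = "idle", 1 = "reach-object".
# frame_probs stands in for per-frame CNN class probabilities.
frame_probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
decoded = viterbi(np.log(frame_probs),
                  np.log(np.array([[0.8, 0.2], [0.2, 0.8]])),  # sticky transitions
                  np.log(np.array([0.5, 0.5])))                # uniform prior
# decoded == [0, 0, 1, 1]: two idle frames, then two reach-object frames
```

The temporal smoothing from the transition matrix is what the sequential post-processor contributes: noisy per-frame classifications are reconciled into a coherent action sequence.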