Augmented Reality (AR) enhances user interaction with digital content through real-world overlays and finds applications across many fields. AR glasses, an excellent AR platform, have been developed in both optical see-through and video see-through forms. Professional video see-through devices and advanced optical see-through devices with vision systems can perform environment recognition and hand detection but are often too bulky and heavy for prolonged wear. Conversely, lightweight optical see-through AR glasses, which lack embedded computing and carry only limited sensors, serve primarily as displays: they offer reduced weight but lack advanced interaction capabilities. In this research, we use an Android mobile phone as the computing unit and present an interactive framework for AR glasses with limited sensors. The framework supports head motion estimation as well as hand gesture detection and tracking, providing a robust AR experience without the need for high-end hardware. It has been tested on lightweight optical see-through AR glasses equipped with only an Inertial Measurement Unit (IMU) and a single camera. Our solution offers a cost-effective and portable approach that enhances data visualization and virtual object manipulation.
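The abstract does not specify how head motion is estimated from the IMU alone. One common approach on sensor-limited hardware is a complementary filter, which blends fast-but-drifting gyroscope integration with noisy-but-drift-free accelerometer tilt. The sketch below is a minimal illustration under that assumption; the function and axis conventions are hypothetical, not the authors' implementation:

```python
import math

def complementary_filter(pitch, roll, gyro, accel, dt, alpha=0.98):
    """Fuse one IMU sample into head pitch/roll estimates (radians).

    gyro:  (gx, gy, gz) angular rates in rad/s
    accel: (ax, ay, az) accelerations in m/s^2
    Axis assignment is device-dependent; this mapping is an assumption.
    """
    # Integrate gyroscope rates: responsive short-term, drifts long-term.
    pitch_gyro = pitch + gyro[0] * dt
    roll_gyro = roll + gyro[1] * dt

    # Estimate absolute tilt from the gravity direction: drift-free but noisy.
    ax, ay, az = accel
    pitch_acc = math.atan2(ay, math.sqrt(ax * ax + az * az))
    roll_acc = math.atan2(-ax, az)

    # Blend the two: trust the gyro short-term, the accelerometer long-term.
    pitch = alpha * pitch_gyro + (1 - alpha) * pitch_acc
    roll = alpha * roll_gyro + (1 - alpha) * roll_acc
    return pitch, roll
```

In practice the filter would run per IMU sample on the phone, with `alpha` tuned to the sensor noise; yaw cannot be corrected this way without a magnetometer or camera input.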
Most previous target detection methods are based on the physical properties of visible-light polarization images and depend on the specific targets and backgrounds involved. This process is not only complicated but also vulnerable to environmental noise. In this research, we propose a multimodal fusion detection network built on a multimodal deep neural network architecture, which integrates the high-level semantic information of visible-light polarization images for crater detection. The network consists of base networks, a fusion network, and a detection network. Each base network outputs a feature map for its polarization image; the fusion network then combines these into a single fused feature map, which is fed into the detection network to detect targets in the image. To learn target characteristics effectively and improve detection accuracy, we select the base network by comparing VGG and ResNet architectures and adopt a model-parameter pretraining strategy. Experimental results demonstrate that the simulated-crater detection performance of the proposed method is superior to that of traditional and single-modal methods, indicating that the extracted polarization characteristics are beneficial to target detection.
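The abstract names the three sub-networks but not their internals. The PyTorch sketch below shows one plausible arrangement consistent with that description; the ResNet-18 backbone, the 1x1 fusion convolution, the toy detection head, and all parameter choices are illustrative assumptions rather than the paper's architecture:

```python
import torch
import torch.nn as nn
import torchvision.models as models

class PolarizationFusionDetector(nn.Module):
    """Sketch of the three-part design: per-modality base networks,
    a fusion network, and a detection network."""

    def __init__(self, num_modalities=4, feat_channels=512):
        super().__init__()
        # One pretrained backbone per polarization channel (the paper
        # compares VGG and ResNet and pretrains model parameters; a
        # ResNet-18 feature extractor is assumed here for brevity).
        self.bases = nn.ModuleList([
            nn.Sequential(
                *list(models.resnet18(weights="DEFAULT").children())[:-2]
            )
            for _ in range(num_modalities)
        ])
        # Fusion network: concatenate per-modality feature maps along the
        # channel axis and reduce to one fused feature map with a 1x1 conv.
        self.fuse = nn.Conv2d(num_modalities * feat_channels,
                              feat_channels, kernel_size=1)
        # Placeholder detection head (the paper's head is not specified):
        # e.g. 4 box offsets + 1 objectness score per spatial cell.
        self.detect = nn.Conv2d(feat_channels, 5, kernel_size=1)

    def forward(self, xs):
        # xs: list of (N, 3, H, W) tensors, one per polarization modality.
        feats = [base(x) for base, x in zip(self.bases, xs)]
        fused = self.fuse(torch.cat(feats, dim=1))
        return self.detect(fused)
```

A 1x1 convolution is only one simple fusion choice; attention-weighted or element-wise fusion would slot into `self.fuse` without changing the overall base/fusion/detection structure the abstract describes.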