High-accuracy and high-speed 3D sensing technology plays an essential role in VR eye tracking, as it builds a bridge between the user and virtual worlds. In VR eye tracking, fringe projection profilometry (FPP) can avoid dependence on scene textures and provide accurate results in near-eye scenarios; however, phase-shifting based FPP faces challenges such as motion artifacts and may not meet the low-latency requirements of eye tracking tasks. On the other hand, Fourier transform profilometry can achieve single-shot 3D sensing, but its results are strongly affected by texture variations on the eye. As a solution to these challenges, researchers have explored deep learning-based single-shot fringe projection 3D sensing techniques. However, building a training dataset is expensive, and without abundant data the model generalizes poorly. In this paper, we build a virtual fringe projection system, along with photorealistic face and eye models, to synthesize large amounts of training data, thereby reducing the cost of data collection and enhancing the generalization ability of the convolutional neural network (CNN). The training data synthesizer uses physically based rendering (PBR) and achieves high photorealism; we demonstrate that PBR can simulate the complex double refraction of structured light caused by the cornea. To train the CNN, we adopt a transfer learning strategy in which the CNN is first trained on PBR-generated data and then fine-tuned on real data. We tested the CNN on real data, and the predicted results demonstrate that the synthesized data enhances the performance of the model, achieving approximately 3.722° gaze accuracy and 0.5363 mm pupil position error on an unseen participant.
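The two-stage transfer learning schedule described above — pretrain on abundant synthetic (PBR-rendered) data, then fine-tune on scarce real captures — can be sketched with a toy regression model. This is an illustrative sketch only, not the authors' network or data: the model, dataset generator, and learning rates are hypothetical placeholders, and a tiny linear model stands in for the CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, noise, w_true):
    # Toy stand-in for (fringe image -> depth/gaze) training pairs.
    X = rng.normal(size=(n, 3))
    y = X @ w_true + noise * rng.normal(size=n)
    return X, y

def train(X, y, w, lr, epochs):
    # Plain gradient descent on mean-squared error.
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Ground-truth mapping of the (hypothetical) real domain.
w_real = np.array([1.0, -2.0, 0.5])

# Stage 1: pretrain on a large synthetic dataset whose distribution is
# close to, but slightly offset from, the real domain (domain gap).
X_syn, y_syn = make_data(5000, noise=0.05, w_true=w_real + 0.3)
w_pre = train(X_syn, y_syn, w=np.zeros(3), lr=0.1, epochs=200)

# Stage 2: fine-tune the pretrained weights on a small real dataset
# with a lower learning rate, closing the remaining domain gap.
X_real, y_real = make_data(50, noise=0.05, w_true=w_real)
w_ft = train(X_real, y_real, w=w_pre, lr=0.02, epochs=100)

err_ft = np.linalg.norm(w_ft - w_real)
print(f"fine-tuned parameter error: {err_ft:.4f}")
```

The point of the sketch is the schedule, not the model: stage 1 absorbs the bulk of the learning from cheap synthetic data, so stage 2 only needs a small real dataset to correct the synthetic-to-real domain gap.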
Near-eye display performance is usually summarized with a few simple metrics inherited from the display industry, such as field of view, resolution, brightness, size, and weight. In practice, near-eye displays often suffer from image artifacts not captured by these traditional metrics. This work defines several immersive near-eye display metrics, including gaze resolution, pupil swim, image contrast, and stray light. We discuss these metrics and their trade-offs through a review of a few families of viewing optics. Fresnel lenses are used in most commercial virtual reality near-eye displays, in part due to their light weight, low volume, and acceptable pupil swim performance. However, Fresnel lenses can suffer from significant stray light artifacts. We share our measurements of several lenses and demonstrate ways to improve performance. Smooth refractive lens systems offer lower stray light, but usually at the cost of much larger size and weight to achieve the same pupil swim performance. This can be addressed by using a curved image plane, but doing so requires new display technology. Polarization-based pancake optics is promising and can provide excellent image resolution and pupil swim performance in an attractive form factor. This approach, however, generally results in low light efficiency and poor image contrast due to severe ghosting. We discuss some of the main limitations of that technology.