Meta shows AI-based AR for VR headsets
Image: Reality Labs Research
Good VR passthrough is a major technical challenge. Reality Labs researchers present a new AI-based solution.
Headsets like Varjo XR-3, Quest Pro, Lynx R-1 and Apple’s next device will be the best way to experience augmented reality for years to come.
Unlike traditional AR headsets with transparent optics like Hololens 2 and Magic Leap 2, which project AR elements directly into the eye via a waveguide display, the aforementioned headsets film the physical environment with outward-facing cameras, then display it on opaque screens. There the image can be extended with AR elements as desired.
Artificial view synthesis: a big problem
This technology, called passthrough AR, has great advantages over conventional AR optics, but also its own problems. The challenge is to reconstruct the physical environment from sensor data as if the person wearing the headset saw it with their own eyes. It is an extremely difficult task.
Resolution, color fidelity, depth representation and perspective: all of these should match the natural visual impression and update with as little latency as possible when the head moves.
Perspective in particular poses great challenges for the technology, as the positions of the cameras do not correspond exactly to the positions of the eyes. This offset in perspective can lead to discomfort and visual artifacts. The artifacts arise because the sides of a nearby object or of the hands may be hidden in the camera’s view even though they would be visible from the eyes’ natural viewpoint.
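To get a feel for why nearby objects suffer most from the camera-to-eye offset, here is a minimal sketch using the standard pinhole-camera parallax relation (shift = focal length x offset / depth). The 5 cm sideways offset and the 600-pixel focal length are illustrative assumptions, not figures from Meta's prototype, and real headsets also place the cameras in front of the eyes, which this sketch ignores.

```python
def pixel_shift(depth_m, offset_m=0.05, focal_px=600.0):
    """Apparent image shift in pixels of a point at depth_m when the
    viewpoint moves sideways by offset_m (simple pinhole model).
    offset_m and focal_px are assumed, illustrative values."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * offset_m / depth_m


# A hand at 30 cm, a desk at 1 m, a wall at 5 m.
for depth in (0.3, 1.0, 5.0):
    print(f"{depth} m -> {pixel_shift(depth):.1f} px")
```

Under these assumptions the hand shifts by roughly 100 pixels while the wall shifts by only about 6, which is why errors show up first on objects held close to the face.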
The Meta Quest 2 shows the current state of passthrough technology available to consumers: the display of the environment is black and white, grainy and distorted, especially for objects held close to the face.
The devices mentioned above will soon provide better results or already do so, as their sensors are optimized for passthrough. However, you should not expect a perfect picture of the physical environment, as many basic issues (see above) have yet to be resolved.
AI reconstruction yields high-quality results
In August, Meta researchers unveiled several technical innovations for virtual reality at Siggraph 2022, including a new passthrough method that reconstructs visual perspective using artificial neural networks. The technique is called NeuralPassthrough.
“We introduce NeuralPassthrough to take advantage of recent advances in deep learning, by solving passthrough as an image-based neural rendering problem. Specifically, we jointly apply stereo depth estimation and image reconstruction networks to produce the eye-point-of-view images via an end-to-end approach, suitable for today’s desktop-connected VR headsets and their stringent real-time requirements,” the research paper states.
The AI algorithm estimates the depth of the room and the objects in it, then reconstructs an artificial viewpoint that matches that of the eyes. The model was trained with synthetic datasets: image sequences showing 80 spatial scenes from different viewing angles. The resulting artificial neural network is flexible and can be applied to different cameras and eye distances.
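The general idea of moving an image to a new viewpoint once depth is known can be illustrated with a classical depth-based forward warp on a single image row. This is a simplified sketch and not the NeuralPassthrough networks: depth is handed in rather than estimated, and the function name, the sideways offset and the toy focal length are all assumptions made for illustration.

```python
import math


def reproject_scanline(colors, depths, offset_m, focal_px):
    """Forward-warp one image row to a viewpoint shifted sideways by
    offset_m. Returns the warped row; None marks target pixels that no
    source pixel reached (disocclusion holes)."""
    width = len(colors)
    out = [None] * width
    zbuf = [math.inf] * width  # nearest depth written to each target pixel
    for x, (color, depth) in enumerate(zip(colors, depths)):
        shift = round(focal_px * offset_m / depth)  # nearer pixels move more
        tx = x - shift
        if 0 <= tx < width and depth < zbuf[tx]:
            zbuf[tx] = depth  # keep only the closest surface
            out[tx] = color
    return out


# Toy row: dark background at 8 m with a bright object at 0.5 m in the middle.
colors = [0.2] * 5 + [1.0] * 3 + [0.2] * 4
depths = [8.0] * 5 + [0.5] * 3 + [8.0] * 4
row = reproject_scanline(colors, depths, offset_m=0.05, focal_px=40.0)
print(row)
```

In this toy row the bright object slides four pixels to the left and leaves three None entries where it used to be. Those holes are the parts of the background the camera never saw, and filling them plausibly is the job a learned reconstruction step would take over.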
Perfect Passthrough AR still has a “very long way” ahead of it
The results are promising. Compared to Meta Quest 2 and other passthrough methods, NeuralPassthrough provides better image quality and meets the requirements for perspective-correct stereoscopic view synthesis, as shown in the following video.
However, the technique has certain limitations. For example, the quality of the result depends strongly on the precision of the AI depth estimation. Depth sensors could improve the result in the future. Another challenge is view-dependent reflections on objects, which the AI model cannot reconstruct. This in turn leads to artifacts.
Another problem is computing power: the prototype built specifically for research purposes (see cover photo) is driven by a desktop computer with an Intel Xeon W-2155 and two Nvidia Titan V cards, one high-end graphics card per eye.
The result is a passthrough image with a resolution of 1280 x 720 pixels and a latency of 32 milliseconds. The resolution is too low and the latency too high for high-quality passthrough.
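A rough back-of-the-envelope calculation shows what 32 milliseconds means in practice. The 90 Hz refresh rate and the 100 degrees per second head turn below are illustrative assumptions, not values from the research paper.

```python
def latency_in_frames(latency_ms, refresh_hz):
    """How many display frames elapse during the given latency."""
    return latency_ms * refresh_hz / 1000.0


def angular_lag_deg(latency_ms, head_speed_deg_s):
    """How far the head has turned before the image catches up."""
    return head_speed_deg_s * latency_ms / 1000.0


print(latency_in_frames(32, 90))   # frames of delay at an assumed 90 Hz
print(angular_lag_deg(32, 100))    # degrees of lag at an assumed 100 deg/s
print(1280 * 720)                  # pixels in the passthrough image
```

Under these assumptions the image trails the head by almost three display frames, or a little over three degrees during a moderate head turn.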
“To deliver a compelling VR passthrough, the field will need to make significant progress in image quality (i.e. removing noticeable distortion and disocclusion artifacts), while meeting the strict real-time, stereoscopic and wide field-of-view requirements. Tackling the additional constraint of mobile processors for portable computing devices means there really is a long way to go,” the scientists write.
Find all our AI news on THE DECODER.