Crowned as one of the hottest technologies of the decade, it’s no secret that augmented reality is disrupting countless verticals: online shopping, gaming, education, and health tech, to name a few. As widespread as it might be, if you’re here you’re most likely trying to make sense of how augmented reality works, what exactly it is, or even how you can create augmented reality apps and experiences.
Well, you’re in the right place.
You’ve most likely heard whispers of augmented (AR), mixed (MR), or extended reality (XR) in your boardroom, office space, or even your LinkedIn community, and naturally you want to know more. For the most part, it might seem like a futuristic technology straight out of science fiction.
But here’s the thing: if you’ve used a smartphone in the past 2 years, chances are that you’ve already interacted with an AR experience.
So what exactly is Augmented Reality?
By definition, augmented reality lets you place and interact with digital objects or layers in your real, physical environment. These digital objects could be images, 3D graphics, text, or any other kind of data.
For a long time, this kind of superimposition of digital objects was only possible with AR-specific devices, most popularly AR glasses.
However, the technology has advanced by leaps and bounds since then and is now accessible through AR-compatible smartphones. In fact, as of 2020, 3.4 billion smartphones are AR-compatible straight out of the box, thanks to the introduction of Google’s ARCore and Apple’s ARKit. Yes, that’s close to half of the world’s population!
To clearly understand how augmented reality works, let’s consider a straightforward example: imagine that you’re using AR to try on a shoe, to check its size, fit, and overall look and feel on your own feet (like the video below).
So how is the magic happening? How does augmented reality actually work? Here’s how:
To start off, your smartphone or tablet camera first scans what it sees in your physical environment. In the shoe example, that’s your feet and immediate surroundings. This is where computer vision kicks in to ‘process’ this raw image and identify exactly what is in the frame.
In simple words, computer vision recognises the 3D world from the 2D frame.
When it comes to enabling an augmented reality experience, two key components come into play:
i) Semantics: Semantics is responsible for the object recognition part of computer vision. Essentially, it helps the smartphone understand exactly what is in the camera feed: is it a foot, an empty space, a face, a car, and so on? Depending on the use case, semantics relies on deep learning algorithms, which are trained on huge sets of training data.
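To make the semantics step a little more concrete, here is a minimal, purely illustrative sketch of the very last stage of object recognition: turning the raw scores a trained network produces for one frame into a label. The labels, scores, and confidence threshold below are all made up for illustration; real AR frameworks run far larger detection models on-device.

```python
import math

# Hypothetical labels a recognition model might be trained on (illustrative only)
LABELS = ["foot", "face", "car", "empty_space"]

def softmax(scores):
    """Convert raw network scores (logits) into probabilities that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, threshold=0.5):
    """Return the most likely label, or None if the model isn't confident enough."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best] if probs[best] >= threshold else None

# A frame whose scores strongly favour "foot"
print(classify([4.0, 0.5, 0.2, 0.1]))  # foot
```

The threshold matters in practice: rather than guess, an AR app typically waits until the model is confident before it starts placing content.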
ii) Projective Geometry and Depth Mapping: While the semantic side of computer vision identifies what is in front of the camera, projective geometry works out where exactly it is in the real, physical world. This includes the distance (or depth), angle, and real-time orientation of the object in front.
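The geometry step can be illustrated with the classic pinhole camera model: given a pixel in the 2D frame and a depth estimate, you can recover a 3D position in front of the camera. The focal lengths and principal point below are made-up example values, not real device calibration data.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift a 2D pixel (u, v) with a known depth into a 3D point
    (X, Y, Z) in camera coordinates, using the pinhole camera model.
    fx, fy are focal lengths in pixels; (cx, cy) is the principal point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Pixel at the centre of a 640x480 frame, estimated to be 0.5 m away
# (focal lengths and principal point are illustrative values)
point = back_project(320, 240, 0.5, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(point)  # (0.0, 0.0, 0.5)
```

This is why depth matters so much: without it, a pixel only defines a ray, not a point, and the app can’t know how far away to anchor the digital object.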
Without this, you might still be able to overlay digital content, but it won’t be placed where you want it. So in the try-on footwear example, you might be able to place a shoe, but it won’t be anchored to your feet.
Here it’s important to realise that all of this has to work in real-time for the AR experience to be realistic and accurate. So when a user is moving the camera around, computer vision is responsible for processing and understanding what is in front at every split second.
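A hypothetical per-frame loop makes this real-time requirement concrete: both the semantics step and the geometry step are re-run on every single frame, so the overlay keeps up as the camera moves. All the function names here are illustrative stand-ins, not a real AR API.

```python
def run_ar_pipeline(frames, detect, estimate_pose, render):
    """Toy per-frame AR loop: every frame is re-processed so the
    digital overlay tracks the camera as it moves."""
    rendered = []
    for frame in frames:
        objects = detect(frame)        # semantics: what is in view?
        pose = estimate_pose(frame)    # geometry: where is it?
        rendered.append(render(frame, objects, pose))
    return rendered

# Toy stand-ins for the real modules (illustrative only)
frames = ["frame0", "frame1", "frame2"]
out = run_ar_pipeline(
    frames,
    detect=lambda f: ["foot"],
    estimate_pose=lambda f: (0.0, 0.0, 0.5),
    render=lambda f, o, p: f"{f}:{o[0]}@{p[2]}m",
)
print(out)  # ['frame0:foot@0.5m', 'frame1:foot@0.5m', 'frame2:foot@0.5m']
```

On a real device, this whole loop has to finish in roughly 16–33 milliseconds per frame to keep the experience smooth.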
Once the computer vision module processes and recognises your real 3D environment, it triggers the rendering module. The rendering module is, simply put, responsible for projecting or superimposing the digital object in a realistic, life-like manner.
Again, depending on the use case, an AR pipeline is built beforehand. This pipeline consists of realistic 3D models, which are necessary for almost all AR-based try-ons and visualisations, barring navigation or information-sharing use cases.
Going back to the shoe example: once computer vision identifies your feet and immediate surroundings, the 3D models are triggered to replace the original frame of just the foot.
Combined with light adaptation, real-time shadows and high-resolution textures, this whole process is called AR rendering.
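To illustrate just the anchoring part of rendering, here is a toy sketch: the vertices of a 3D model are translated to the anchor point that the geometry step computed. A real rendering engine would also handle rotation, scale, occlusion, lighting, and shadows; this shows only the simplest piece of the idea.

```python
def anchor_model(vertices, anchor):
    """Translate a 3D model's vertices so the model sits at the anchor
    point computed by the geometry step. This is a pure translation;
    real AR engines also apply rotation, scale, and lighting."""
    ax, ay, az = anchor
    return [(x + ax, y + ay, z + az) for (x, y, z) in vertices]

# A tiny stand-in "shoe" model around the origin, anchored 0.5 m
# in front of the camera (all coordinates are illustrative)
shoe = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.05, 0.0)]
placed = anchor_model(shoe, (0.0, 0.0, 0.5))
print(placed[0])  # (0.0, 0.0, 0.5)
```

Because the anchor is recomputed every frame, the model stays locked to your feet even as you move the camera around.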
That’s a wrap on how augmented reality works. We hope now you have a better understanding of the technology and the key components that make it possible.
As an upcoming technology, the future of augmented reality looks as promising as ever. The immersive nature of the technology makes it even more exciting, especially as more practical, problem-solving use cases pop up.
In the next part of the guide, we will jump into how you can easily create augmented reality content, experiences, and apps. Navigate to it here → How To Create Augmented Reality.