The Future of Audio and Video Calling with IVAS

Remember the old advertisement in which an actor is shown speaking to a co-actress over a video call? It was one of the early advertisements for video calling in India by a telecom service provider. The advertisement was shot to give the audience the feeling that the actor was reaching toward a window (with markings similar to those of a phone), with the co-actress on the other side. Right until the end, the video makes the audience feel that the actors are talking to each other across that window. The ad ends with the actor being called away by his teammates, and only then is it revealed that he was on a video call on his phone.

Comparing that advertisement with the way video calls actually worked at the time reveals a lot of differences. The advertisement made the audience feel that the two were right across from each other and could experience each other's surroundings in an “immersive” manner, but the reality was far from it.

“Immersive” is the key word. So, in terms of the calling experience, what would immersive mean?

On a normal audio call, the people talking to each other typically get compressed audio in which the different elements of the sound are collapsed into a single monophonic channel. This means the listener hears the voice but may not feel the surroundings or make out the other sounds around the speaker.
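To make that loss concrete, here is a minimal sketch in Python. It is not the IVAS codec or any part of it; it uses a simple constant-power pan law as a stand-in for "direction", and the function names (pan_stereo, downmix_mono, level_difference_db) are made up for this illustration. It places the same sound to the left and to the right of a listener using two channels, then folds each into mono and shows that the two results can no longer be told apart.

```python
import math


def pan_stereo(sample, azimuth_deg):
    """Place a mono sample in a two-channel field with a constant-power pan.

    Assumption for this sketch: -45 is hard left, 0 is centre, +45 is hard right.
    """
    theta = math.radians(azimuth_deg + 45.0)  # map -45..+45 onto 0..90 degrees
    return sample * math.cos(theta), sample * math.sin(theta)


def downmix_mono(left, right):
    """Fold two channels into one by averaging them."""
    return 0.5 * (left + right)


def level_difference_db(left, right):
    """Left-versus-right level difference in dB (positive means left is louder)."""
    eps = 1e-12  # avoids log(0) when a channel is silent
    return 20.0 * math.log10((abs(left) + eps) / (abs(right) + eps))


if __name__ == "__main__":
    for azimuth in (-30.0, 30.0):
        left, right = pan_stereo(1.0, azimuth)
        print(
            f"azimuth {azimuth:+.0f}: "
            f"L/R difference {level_difference_db(left, right):+.1f} dB, "
            f"mono value {downmix_mono(left, right):.4f}"
        )
```

With two channels, the left and right levels differ depending on where the sound sits, and that difference is one of the cues a listener uses to judge direction. After the mono downmix, the mirrored left and right positions produce the same value, so that cue is gone.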

An immersive experience would be more realistic in terms of both the surroundings and the quality of the call. The sound would be three-dimensional, giving the callers the feeling that they are talking to each other in the same physical space.

This is achieved with spatial audio, which makes sound seem to come from different directions, as if separate channels were processing or delivering it. This experience is delivered by IVAS (Immersive Voice and Audio Services).

“It is a new voice and audio codec standardized by 3GPP. It is part of 3GPP Rel. 18. IVAS is the first 3GPP standard for transmitting conversational stereo and immersive voice and audio. The IVAS codec enables live immersive audio for any device form factor, bringing people together for real-life interaction with accurate and immersive three-dimensional rendering of captured sound. Nokia is participating in IVAS standardization in 3GPP and one of its most active contributors and proponents”

The experience will go beyond phone calls. It can change how group calls and live events feel. It may turn out to be a great experience, or it could end in commotion (for group calls or live events), since sound is captured from multiple sources to create the real-life feeling of being in the same physical location.

Enabling this experience for everyone would need collaboration among different players, such as operators, chip manufacturers, and handset makers, to ensure that spatial communication is supported and available to all.

There is still some time before this becomes available to the masses, and getting there depends on support from all of these players.

Author Details

Mohammad Athar Jamal

Athar is an Enterprise and Cloud Solution Architect at Infosys. He works on Digital Transformation for different clients and enhances the Digital Experience for enterprises. He architects microservices, UI/Mobile applications, and Enterprise solutions by leveraging cloud services and designing cloud-native applications. His work includes providing leadership, strategy, and technical consultation.
