The Future of Mobile Communication: IVAS Audio Calls
Nokia is researching the future of mobile communications and how IVAS audio calls are transforming this field.
Verbal communication has long been the main form of interaction among people, and telephony has facilitated this connection for over a century. Over the years, phone calls have evolved from analog to digital and from landlines to mobile, and audio quality has improved significantly. One major advance was still missing, however: transmitting fully authentic, immersive sound in real time.
The recent introduction of the IVAS codec (Immersive Voice and Audio Services), standardized by 3GPP in Release 18 in June 2024, marks an important milestone in audio technology. Unlike traditional monophonic voice calls, IVAS can transmit three-dimensional, immersive audio, providing a richer and more realistic communication experience. This is achieved through new audio formats optimized for conversational spatial audio. One example is the MASA format (Metadata-Assisted Spatial Audio), which describes a spatial sound scene using only two audio channels plus metadata. Spatial audio calls let users hear sound as if they were present at the scene, and can support features such as head tracking.
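To make the idea of "two audio channels plus metadata" more concrete, the following Python sketch models one frame of a MASA-like stream. It is an illustration only: the class names, field names, band count, and frame size are assumptions chosen for the example and are not the layout defined in the 3GPP specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BandMetadata:
    """Spatial description of one frequency band (hypothetical fields)."""
    azimuth_deg: float      # horizontal direction of arrival
    elevation_deg: float    # vertical direction of arrival
    direct_to_total: float  # 0.0 = fully diffuse, 1.0 = fully directional


@dataclass
class SpatialFrame:
    """One frame: two transport channels plus per-band metadata."""
    left: List[float]
    right: List[float]
    bands: List[BandMetadata]

    def validate(self) -> None:
        if len(self.left) != len(self.right):
            raise ValueError("transport channels must have equal length")
        for band in self.bands:
            if not 0.0 <= band.direct_to_total <= 1.0:
                raise ValueError("direct_to_total must lie in [0, 1]")


def dominant_direction(frame: SpatialFrame) -> BandMetadata:
    """Return the metadata of the most directional band."""
    return max(frame.bands, key=lambda b: b.direct_to_total)


frame = SpatialFrame(
    left=[0.0] * 960,   # assumed frame size: 20 ms at 48 kHz
    right=[0.0] * 960,
    bands=[
        BandMetadata(azimuth_deg=30.0, elevation_deg=0.0, direct_to_total=0.9),
        BandMetadata(azimuth_deg=-110.0, elevation_deg=10.0, direct_to_total=0.2),
    ],
)
frame.validate()
print(dominant_direction(frame).azimuth_deg)
```

The point of the sketch is the split between a small, fixed number of audio channels and a compact side description of where the sound comes from, which is what makes such a format attractive for size-constrained devices.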
3D audio calling represents a significant technological leap for telecommunications, but it also brings new challenges in creating an authentic, immersive experience. Transmitting spatial audio, in which sounds are perceived as coming from different directions, is notably more complex in mobile environments than in controlled settings such as cinemas or video games. Achieving this experience means addressing technical issues such as real-time spatial sound processing and hardware limitations.
One of the most significant challenges for effective spatial communication is noise reduction, which is essential for keeping speech clear in noisy environments. Traditional noise reduction techniques typically filter only continuous, steady sounds and are not effective in all contexts. Recent advances in machine learning-based noise reduction make it possible to adjust the level of noise reduction intelligently to suit the environment.
Immersive audio systems also face the challenge of acoustic echo, where microphones pick up sound from nearby loudspeakers and generate unwanted feedback. This has been addressed by developing a machine learning-based echo cancellation system that improves audio quality in real-time voice applications.
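For readers unfamiliar with echo cancellation, the basic principle can be sketched with a classical adaptive filter. The example below uses a normalized least-mean-squares (NLMS) filter, which is a conventional signal-processing technique and not the machine learning-based system described above. The echo path, signal, and parameter values are made up for the illustration, and the microphone signal contains only echo, with no near-end speech.

```python
import random


def nlms_echo_canceller(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo from the microphone signal."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what remains after removing the estimated echo
        power = sum(h * h for h in history) + eps
        step = mu * error / power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

# Hypothetical echo path: the loudspeaker signal reaches the microphone
# attenuated and delayed by one and two samples.
echo_path = [0.0, 0.6, 0.3, 0.0]
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]

residual, weights = nlms_echo_canceller(far_end, mic)
echo_energy = sum(s * s for s in mic[-500:])
residual_energy = sum(s * s for s in residual[-500:])
print(residual_energy < echo_energy)
```

The filter learns how the loudspeaker signal appears at the microphone and subtracts that estimate. Real rooms, double-talk, and non-linear loudspeakers make the problem much harder than this toy case, which is one motivation for learned approaches.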
The IVAS codec has recently been adopted by 3GPP as a new voice standard. It was developed through the collaboration of 13 companies and builds on the well-known EVS codec, which ensures compatibility with existing voice services. Among its main innovations is the MASA format, designed for size-constrained devices such as smartphones. IVAS also includes a renderer that supports binaural audio with head tracking as well as multi-speaker playback.
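The role of head tracking in binaural rendering can be illustrated with a small calculation: when the listener turns their head, the renderer shifts each source direction the opposite way so that the source appears to stay fixed in the room. The sketch below handles only rotation around the vertical axis (yaw) and assumes angles in degrees that increase toward the listener's left. The function names are hypothetical and are not part of the IVAS renderer.

```python
def wrap_degrees(angle):
    """Wrap an angle to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def head_relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Direction of a room-fixed source as heard by a rotated head."""
    return wrap_degrees(source_azimuth_deg - head_yaw_deg)


# A talker 30 degrees to the left ends up straight ahead once the
# listener has turned 30 degrees to the left.
print(head_relative_azimuth(30.0, 30.0))

# A talker straight ahead is heard 90 degrees to the right after the
# listener turns 90 degrees to the left.
print(head_relative_azimuth(0.0, 90.0))
```

A complete renderer would also track pitch and roll and would then apply direction-dependent filtering for each ear; this sketch covers only the angle compensation step.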
The arrival of immersive 3D audio changes the sound experience for consumers, businesses, and industries. For end users, it enriches personal interaction by allowing the sounds of their surroundings to be shared, whether live or recorded. In business environments, 3D voice calls can improve customer experience and team collaboration. In industrial settings, audio analysis can support automated processes that improve operational efficiency.
As the use of mobile networks evolves, service providers will need to offer scalable solutions that optimize performance under varying bandwidth conditions. The IVAS codec supports a wide range of bit rates, so immersive audio quality can be maintained across diverse network conditions. Looking ahead, the way people communicate verbally is expected to keep evolving, with semi-synchronous messaging applications and greater use of group calls. As extended reality devices and services grow, immersion will become a distinguishing feature of communication.
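As a rough illustration of how a service might exploit a wide range of bit rates, the sketch below picks the highest rate that fits within the currently available bandwidth. The ladder values, the headroom factor, and the function name are assumptions for the example. They are not taken from the IVAS specification, and real rate adaptation also considers packet loss, delay, and the audio format in use.

```python
from bisect import bisect_right

# Hypothetical bit rate ladder in kbit/s, sorted ascending (illustration only).
BITRATE_LADDER_KBPS = [16, 24, 32, 48, 64, 96, 128]


def select_bitrate(available_kbps, ladder=BITRATE_LADDER_KBPS, headroom=0.8):
    """Pick the highest rate that fits within a fraction of the bandwidth."""
    budget = available_kbps * headroom
    index = bisect_right(ladder, budget)
    if index == 0:
        return ladder[0]  # nothing fits: fall back to the lowest rate
    return ladder[index - 1]


for bandwidth in (10, 100, 1000):
    print(bandwidth, select_bitrate(bandwidth))
```

Leaving some headroom below the measured bandwidth is a common design choice, since it reduces how often the sender has to step down when conditions fluctuate.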