3D audio is suddenly a hot topic. Apple recently released a firmware update for AirPods Pro that enables what Apple calls spatial audio. Think first of surround sound, the virtual audio experience of listening to a band or an orchestra in which you hear instruments and singers to the left, right, front and back, as if you were in the audience. That works only if you’re not moving, and only for sounds around you, not above or below. 3D audio goes further in two ways. First, it adds up and down to the supported directions, so you can hear a helicopter flying overhead, all emulated through your wireless earbuds with no need to be surrounded by speakers. Second, motion sensors track your movement: as you turn your head, look up or down, or walk forward, what you hear adjusts correspondingly. Neither you nor the sound source has to be fixed. True virtual audio reality … if you’re an Apple user. But we want 3D audio for everyone, whether they’re using an Android platform, a head-mounted display, AR glasses, anything.
Immersive audio methods
One supporting technology is Ambisonics, a full-sphere surround sound format that dates back to the 1970s. In transmission, audio is represented in a speaker-independent format, which gives the final renderer a lot of flexibility to decode for the actual playback setup, here the listener’s earbuds. Ambisonics has been adopted by YouTube and Oculus VR as a standard format for their 360-degree videos. Another option is object-based audio, such as Dolby Atmos, in which the mix is adjusted on the fly to the available speakers, in this case the earbuds, again to emulate a 3D experience. And of course there are still the more traditional and less immersive options, stereo and surround sound.
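To make "speaker-independent" concrete, here is a minimal sketch of how a single mono sample could be encoded into first-order Ambisonics. It assumes the AmbiX convention (ACN channel order W, Y, Z, X with SN3D normalization), azimuth measured counterclockwise from straight ahead and elevation measured upward from the horizontal plane. The function name `encode_foa` is illustrative only and is not taken from any product mentioned in this article.

```python
import math

def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order Ambisonics.

    Assumed convention: AmbiX (ACN order W, Y, Z, X; SN3D normalization).
    Azimuth is counterclockwise from the front, elevation is upward.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)    # left/right component
    z = sample * math.sin(el)                   # up/down component
    x = sample * math.cos(az) * math.cos(el)    # front/back component
    return (w, y, z, x)

# A source directly overhead, like the helicopter example, lands in W and Z only.
overhead = encode_foa(1.0, 0.0, 90.0)
```

Nothing in the four channels refers to a speaker layout; the decoder decides later how to map them to earbuds or loudspeakers.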
3D audio market
Market estimates are difficult to find, but those available suggest a CAGR of 16-17% over the next seven years. Consider two facts together: AirPods are already Apple’s fastest growing business, and Apple pushed out spatial audio quickly. That suggests Apple sees a real opportunity to innovate and lead in this space. Game studios in particular should appreciate the differentiating advantage of combining VR audio with VR imaging. Once that trend starts, fear of missing out will quickly push the rest of the industry to follow. The same thing could happen for music and multimedia content. Multiple reviewers have said that true 3D audio combined with VR/AR makes the experience much more believable and immersive than conventional audio does.
3D processing in the device or the earbuds?
The Apple solution runs on your iPhone. IMU-based head tracking data is read from the AirPods Pro earbuds to get head pose and movement, but the processing is done in the phone. That perhaps makes software updates easier, but it can introduce undesirable latency compared to a system in which 3D processing is handled directly in each earbud.
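As an illustration of what head-tracked processing involves, the sketch below counter-rotates a first-order Ambisonics frame by the head yaw reported by a motion sensor, so that sources stay fixed in the room as the head turns. It is an assumption-laden example, not a description of Apple's or anyone else's implementation: it assumes AmbiX channel order (W, Y, Z, X), handles yaw only, and the name `compensate_head_yaw` is hypothetical.

```python
import math

def compensate_head_yaw(frame, head_yaw_deg):
    """Rotate a first-order Ambisonics frame to cancel a head turn.

    frame: (w, y, z, x) in assumed AmbiX order.
    head_yaw_deg: head rotation, counterclockwise (to the left) positive.
    A source at world azimuth phi appears at phi - yaw relative to the head.
    """
    w, y, z, x = frame
    t = math.radians(head_yaw_deg)
    x_rot = x * math.cos(t) + y * math.sin(t)
    y_rot = y * math.cos(t) - x * math.sin(t)
    # W (omnidirectional) and Z (vertical) are unaffected by a pure yaw.
    return (w, y_rot, z, x_rot)

# A source straight ahead; the listener turns 90 degrees to the left,
# so the source should now be heard on the right (negative Y).
turned = compensate_head_yaw((1.0, 0.0, 0.0, 1.0), 90.0)
```

This rotation has to be recomputed every time new sensor data arrives, which is why the path from motion sensor to renderer matters for latency.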
CEVA and VisiSonics partnership
CEVA has partnered with VisiSonics, which provides embedded software called RealSpace-3D. This supports higher-order Ambisonics, object-based audio, surround sound (5.1 and 7.1), and simple stereo, covering all the bases. That’s important because content producers are likely to use different methods for different purposes. The software can be personalized to an individual user for maximum effect by calibrating a head-related transfer function (HRTF) that takes account of different body sizes and ear shapes. Optimizing this transfer function sharpens the position of the listener in the audio space.
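To see why body size matters, consider just one ingredient of an HRTF, the interaural time difference (ITD): the delay between a sound reaching the near ear and the far ear. The sketch below uses the classic Woodworth spherical-head approximation, ITD = (r / c)(θ + sin θ), as a simplified stand-in. It is not how RealSpace-3D calibrates an HRTF, it ignores ear shape entirely, and the default head radius and the function name are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature

def woodworth_itd(azimuth_deg, head_radius_m=0.0875):
    """Approximate interaural time difference in seconds.

    Woodworth spherical-head model, valid for azimuths from 0 (front)
    to 90 degrees (directly to one side). head_radius_m is an assumed
    typical value; a personalized system would measure or fit it.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("model is only valid for 0..90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

# The same source direction produces a longer delay for a larger head.
itd_small = woodworth_itd(90.0, head_radius_m=0.080)
itd_large = woodworth_itd(90.0, head_radius_m=0.095)
```

Even in this reduced model, a fixed "average" head gives the wrong timing cue for listeners at either end of the range, which is the motivation for per-user calibration.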
The VisiSonics RealSpace-3D software can run on CEVA-X2, CEVA-BX1, and CEVA-BX2 audio DSPs, taking motion data from our MotionEngine solution (software and/or hardware as preferred). All of this runs together in an integrated system which can sit in an earbud, connecting over Bluetooth to your device. That puts all the functionality directly in each earbud, avoiding the added latency of a round trip to the phone for 3D processing. Call us to learn how you can take advantage of this exciting new trend.
Published on Embedded Computing Design.