About a month ago, Audience flew me out to their office in California to talk about a number of things. First, they offered the chance to check out the anechoic rooms and listening booths used for testing and tuning smartphones equipped with Audience voice processors. Second, to compare notes about testing and evaluating voice quality on mobile devices, both against the testing I’ve done and against the testing done by wireless operators. Third, to take a look at their newest voice processor, the eS515. We’ve been covering noise rejection performance and voice quality on smartphones and tablets in our mobile device reviews, and Audience makes a line of standalone voice processors that work to improve voice quality at both the origin and the endpoint of a mobile call.

Audience’s newest processor, the eS515, moves from being just a standalone voice processor to a combination voice processor and audio codec. There’s a corresponding change in naming from voice processor to sound processor, as the eS515 takes the place of both a standalone codec and the slot otherwise taken by a standalone voice processor. This move changes Audience’s lineup from a solution which requires an additional package (codec and voice processor) to one that includes all the functions of a normal audio codec in addition to the Audience noise processing infrastructure. The move allows direct access to all of the audio rails, in addition to likely being an easier sell to OEMs looking for a simple standalone solution. The eS515 includes a 1.13 watt class-D speaker driver and a 30 mW class-G headphone driver.

The other big news about the eS515 is the inclusion of some new and improved audio processing features, along with others that move beyond just an emphasis on removing noise from calls and for ASR (automatic speech recognition, or voice to text).

For a while we’ve seen smartphones shipping with two microphones, and recently some smartphones with three microphones, including the iPhone 5 and a number of prior Motorola phones. Until now, however, these implementations have used the microphones in pairs, selecting combinations of two microphones to use at a time. Audience designs with three microphones likewise used the microphones in pairs, selecting primary and secondary microphones depending on the phone’s position. The eS515 is now the first to include a three-microphone processing algorithm, should an OEM choose to include a third microphone.

New features for the eS515 also include de-reverb for rooms with heavy reverb, improved wideband processing for VoIP calls (up to 24 kHz, well beyond the 8 kHz for wideband cellular voice calls; 16 kHz is also supported), further improved ASR processing, and finally features for reducing noise when recording videos.

This last feature is similar to what Motorola shipped on a number of former phones that leveraged the three-microphone setup, with different audio scenes such as narration mode, wind reduction, and so on. The eS515 includes an interview mode called “Audio Zoom” which looks for a voice source behind and in front of the device and rejects noise elsewhere. Audience envisions a camera UI similar to Motorola’s with different audio scenes for users to choose from when recording videos.

I recorded a short video of “Audio Zoom” being demonstrated on an eS515 simulator.
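As a rough illustration of the front-and-back pickup that Audio Zoom is described as providing, here is a minimal Python sketch of a textbook two-microphone differential (figure-eight) pair. Audience's actual algorithm is not described here, so this is not it; the function names, the 48 kHz sample rate, the 1 kHz test tone, and the two-sample spacing between microphones are all assumptions chosen for illustration.

```python
import math

RATE_HZ = 48_000        # assumed sample rate
SPACING_SAMPLES = 2.0   # assumed front-to-back mic spacing, as samples of delay


def tone(n, freq_hz, delay_samples=0.0):
    """Sine tone of n samples, delayed by delay_samples (may be fractional)."""
    return [math.sin(2 * math.pi * freq_hz * (i - delay_samples) / RATE_HZ)
            for i in range(n)]


def mic_pair(angle_deg, n=480, freq_hz=1000.0):
    """Simulate front and back mic signals for a far-field source.

    angle_deg is measured from the front-to-back axis: 0 is directly in
    front, 180 is directly behind, 90 is off to the side.
    """
    # Arrival-time offset of each mic relative to the midpoint of the pair.
    half = 0.5 * SPACING_SAMPLES * math.cos(math.radians(angle_deg))
    front = tone(n, freq_hz, delay_samples=-half)  # hears a front source earlier
    back = tone(n, freq_hz, delay_samples=+half)   # hears a front source later
    return front, back


def dipole(front, back):
    """Subtract the two mics: sound arriving at both at once cancels."""
    return [f - b for f, b in zip(front, back)]


def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def pickup(angle_deg):
    """Output level of the differential pair for a source at angle_deg."""
    return rms(dipole(*mic_pair(angle_deg)))
```

Calling `pickup(0)` and `pickup(180)` gives the same non-zero level, while `pickup(90)` is effectively zero: sound from the sides reaches both microphones at the same instant and cancels. A shipping product would need much more than this, for example equalization and handling of reverberation.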

After getting a design win, Audience works with handset and tablet makers to do final tuning on their devices, both after the final industrial design is finished and sometimes on the acoustic design before it is finalized. Part of this requires using special calibrated rooms to characterize the frequency response and directionality of devices. In addition, Audience needs a testing methodology for benchmarking its own projects.

I got to peek into Audience’s anechoic chamber and an ETSI room as defined by EG 202 396–1 for noise suppression testing. Inside each room is a HATS (Head And Torso Simulator), which is instrumented with microphones for testing phones and tablets, and a controllable testing apparatus for holding the device under test and moving it through various positions.

The ETSI EG 202 396–1 specification defines a simple room layout with four speakers and a subwoofer playing distractor music around the caller. I plan to move our own smartphone call testing to a similar setup as well.
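One simple way to put a number on noise suppression in a room like this is the change in signal-to-noise ratio between the unprocessed and processed signals. The sketch below is a generic calculation, not a procedure taken from the ETSI document, and it assumes the speech and noise components can be captured separately, which is an idealization. All function names are hypothetical.

```python
import math


def mean_power(samples):
    """Average power of a block of samples."""
    return sum(v * v for v in samples) / len(samples)


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels from separate speech and noise blocks."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def snr_improvement_db(speech_in, noise_in, speech_out, noise_out):
    """SNR after processing minus SNR before processing, in decibels."""
    return snr_db(speech_out, noise_out) - snr_db(speech_in, noise_in)
```

For instance, if processing leaves the speech untouched and cuts the noise amplitude to one tenth, this reports an improvement of about 20 dB. The figure says nothing about how much the speech itself was distorted, which is why listening tests and perceptual metrics are used alongside it.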

eS515 Details and Press Release

  • Gnarr - Monday, January 7, 2013

    I love when people turn technology on.
  • name99 - Monday, January 7, 2013

    What are the numbers of significance in judging these sorts of products?

    Obviously they work, in some sense --- we've all had enough experience with noise-reduction in modern phones and BT headsets to know that they are better than they were.

    But, to take an obvious example, Apple went with their own custom cell for this task in the iPhone5. What can we know about this? Obviously it may just have been about cost and control, but assuming technology factored in, how does one measure the quality of noise reduction, the quality of "wide-band assist", to make a judgement that "our Apple cell can do these better (or as well, but at lower power) than your IC"?

    I phrase this in terms of Apple, but the point is more general. For example, I believe that Qualcomm has some of these features on some of its ICs, and the same question arises --- why should I choose this chip over what Qualcomm already provides?
  • Paulman - Monday, January 7, 2013

    For a second, I was wondering, "How did you find the time to write this up and post it in the middle of CES?!" Then I realized it was probably on auto-post :P

    Cool stuff! I wonder what upcoming phones we can expect to see this first in. Do you think Apple will return to Audience for their next iPhone? (and actually enable it?)
