P3610-2MIC Voice Interface

Params

Recommended Main Control    XVF3610-QF60B-C
Number of Microphones       2
Microphone Type             PDM microphone
Output Interface            USB / I2S
Sampling Rate               16 kHz / 48 kHz
DSP                         Built-in firmware algorithms (AEC, NC, BF, AGC)
Open Source Status          Not open source, configurable
UAC Protocol                UAC 1.0

App

The P3610-2MIC is suited to rapid deployment in voice interaction (ASR) and voice call (Communication) scenarios. Recommended product types include, but are not limited to:

  • Smart TVs
  • Smart TV Boxes
  • Smart Healthcare
  • Smart Home
  • Smart Conferencing

Purchase

Support and Purchase

Evaluation Boards

Name                     Function                                 Status       Purchase
P3610-2MIC Eval Board    P3610-2MIC Voice Interface Eval Board    Available    Taobao Link

Supporting Chips

Name                Function                        Status       Purchase
3SM222KMB1HA-022    Bottom Pickup PDM Microphone    Available    Taobao Link
ES7243              Cost-effective ADC              Available    Taobao Link

Solution Background

Obtaining a clean human voice in noisy environments (such as kitchens, living rooms, and gyms) is a challenge for smart devices like TVs, set-top boxes, and sound bars that support voice interaction and communication. In real life, several kinds of noise prevent these devices from capturing the voice effectively:

  • Sound played by the device itself, such as music from a TV, set-top box, or sound bar
  • Steady-state and non-steady-state diffuse noise in the environment, such as the background noise of a fan or air conditioner
  • Point noise sources in the environment, such as sound coming from a TV at a fixed position

In addition, excessively loud playback from sound bars or TVs makes it difficult to capture the speaker's voice accurately.

A high-performance voice interface solution is crucial in such devices. Besides handling interfering noise, it can provide long-distance pickup and voice interruption (barge-in). A front-end voice solution of this kind outputs clean human voice for voice interaction (ASR) and conference calls (Communication).

Solution Summary

The P3610-2MIC is a compact dual-microphone voice interface whose signal processing serves speech recognition and conference calls simultaneously, with a dedicated output channel for each task. It is built on the XMOS XVF3610 controller and combines three noise-cancellation algorithms for clear voice in noisy settings with automatic gain control for effective pickup within 5 meters. Far-field pickup and barge-in support help maintain conference call quality. It suits a range of applications, including conference systems, TVs, smart speakers, and robots.

With its USB Type-C port, the P3610-2MIC works across different platforms, and its small footprint makes it easy to integrate into small and medium-sized devices.

The two output channels are:

  • Speech Recognition (ASR, Automatic Speech Recognition): Intended for cloud speech recognition engines. ASR front-end processing mainly aims to raise the recognition rate, so the processed spectrum is kept as full as possible and audio distortion is minimized, while the human voice is enhanced and background and other noise is suppressed.
  • Conference Calls (Comms, Communication and Calling): Intended for conference calls between users. Comms front-end processing mainly aims to improve voice clarity and applies strong background noise suppression. The resulting spectrum is comparatively clean but more distorted than the ASR output, so this output is not recommended for speech recognition.

The P3610-2MIC voice interface solution handles the three types of noise described above and provides two types of front-end directional voice output, covering the needs of a wide range of scenarios and devices.

The XVF3610 main control chip inside the P3610-2MIC voice interface integrates a USB 2.0 PHY, so processed voice signals can be sent to the smart device (Host) over USB using the UAC 1.0 protocol. The USB interface also supports several HID report types, such as keyboard, telephony, and consumer. On standard Android and Linux devices, this allows the voice interface to serve as a human-machine interface as well.
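
As a rough illustration of how a host might consume the two outputs, the sketch below splits a buffer of interleaved 16-bit little-endian stereo PCM, as a host would typically capture from a UAC 1.0 device, into separate ASR and Comms sample lists. The function name `split_channels`, the 16-bit sample format, and the mapping of ASR to channel 0 and Comms to channel 1 are assumptions made for illustration and are not taken from the P3610-2MIC documentation; check the firmware configuration for the actual channel order.

```python
import struct


def split_channels(pcm_bytes, asr_channel=0, comms_channel=1, num_channels=2):
    """Split interleaved 16-bit little-endian PCM into (asr, comms) sample lists.

    The channel indices are assumptions; swap them if the device is
    configured differently.
    """
    frame_size = 2 * num_channels  # 2 bytes per 16-bit sample
    if len(pcm_bytes) % frame_size:
        raise ValueError("buffer does not contain a whole number of frames")
    sample_count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % sample_count, pcm_bytes)
    asr = list(samples[asr_channel::num_channels])
    comms = list(samples[comms_channel::num_channels])
    return asr, comms
```

The ASR list would then go to the speech recognition engine and the Comms list to the call pipeline, in line with the intended use of each output described above.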

[Figure: XVF3610 application]

Algorithm Block Diagram

[Figure: XMOS 2-MIC algorithm block diagram]

Description of algorithm modules:

  • AEC Echo Cancellation: Removes the sound played by the device itself, enabling voice interruption (barge-in) and improving the SNR
  • IC Noise Source Elimination: Scans the acoustic environment around the device and removes point noise sources in the room
  • NS Noise Suppression: Suppresses background noise, including diffuse and reflected noise
  • ADE Automatic Delay Estimation: Dynamically adjusts the audio reference signals for smooth, real-time voice interruption
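
To make the AEC step more concrete, here is a minimal sketch of a textbook normalized LMS (NLMS) adaptive filter. It estimates the echo of the reference (playback) signal contained in the microphone signal and subtracts it. This is a generic teaching example and not the XVF3610's firmware algorithm, which is not open source; the function name `nlms_echo_cancel` and the parameters `taps`, `mu`, and `eps` are hypothetical.

```python
def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-6):
    """Return the microphone signal with an adaptive echo estimate removed.

    mic: microphone samples (near-end voice plus echo of ref)
    ref: reference samples (what the device is playing)
    """
    weights = [0.0] * taps   # adaptive estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, ref):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        err = d - echo_estimate
        # Normalize the step by the reference energy so that adaptation
        # speed does not depend on playback volume.
        energy = sum(h * h for h in history) + eps
        step = mu * err / energy
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(err)
    return residual
```

When `mic` contains only a delayed, attenuated copy of `ref`, the residual decays toward zero as the filter converges. With a near-end talker present, the residual is what remains for the later IC and NS stages.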

In the IC noise source elimination stage, the P3610-2MIC removes point noise sources in the environment. In the automatic delay estimation stage, it adapts to changes in the AEC reference signal, which makes it possible to use external speakers with smart devices. The P3610-2MIC is particularly optimized for ASR front-end processing: it significantly improves the speech recognition rate and the barge-in (interrupt) success rate, and reduces the tuning workload for major speech recognition engines.
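
The idea behind delay estimation can be sketched with a brute-force cross-correlation search: find the lag at which the microphone signal best matches the reference. This is only an illustration of the general technique under simplifying assumptions (a single fixed delay, no near-end speech); it is not the method used in the XVF3610 firmware, and the name `estimate_delay` and its parameters are hypothetical.

```python
def estimate_delay(ref, mic, max_lag):
    """Return the lag in samples (0..max_lag) where mic best matches ref."""
    best_lag = 0
    best_score = float("-inf")
    for lag in range(max_lag + 1):
        # Correlate ref with mic shifted by `lag` samples.
        n = min(len(ref), len(mic) - lag)
        score = sum(ref[i] * mic[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

An echo canceller could use such an estimate to delay the reference before filtering, so that a short adaptive filter still covers the echo path when an external speaker adds latency.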

The audio output after this processing chain is shown below:

[Figure: XVF3610 recording sample]

Hardware Block Diagram

The XVF3610 main control chip of the P3610-2MIC comes in a QFN-60 package and was released in 2021. It provides two sets of plug-and-play standard firmware: one for integration with the mainboard over I2S and one for connection to the mainboard over USB. The hardware block diagram is as follows:

[Figure: P3610-2MIC block diagram]

Where:

  • Two PDM digital microphones connect directly to the XVF3610 main control chip
  • An external QSPI flash stores the XVF3610 firmware
  • The XVF3610 connects to the host over I2S or USB for audio signal transmission

In a real-world application, the XVF3610 is integrated on a set-top box motherboard and connects to the set-top box main control over I2S, as shown below:

[Figure: XVF3610 hardware architecture]

The XVF3610 can obtain its AEC reference signal in either of the following ways:

  • Over USB: the host provides the AEC reference signal to the XVF3610 through the USB UAC interface
  • Over I2S: the host provides the reference signal directly to the XVF3610. Alternatively, an external ADC (ES7243) can be added so that an analog signal serves as the AEC reference.

Solution Features

Main Control Chip

  • XVF3610-QF60B-C, no software development required
  • QFN-60 package
  • 300 mW power consumption

Audio Interface

  • 16 kHz / 48 kHz audio sampling rate
  • USB Audio Class 1.0 (UAC 1.0)
  • I2S master/slave
  • 2 PDM digital microphones
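
For a sense of scale, the short calculation below estimates the average audio payload per 1 ms USB frame at each sampling rate. It assumes 2 channels of 16-bit samples and a full-speed USB link with 1 ms frames, as is typical for UAC 1.0; the channel count and bit depth are assumptions, not values stated for the P3610-2MIC. The helper name `bytes_per_usb_frame` is hypothetical.

```python
# USB 2.0 full-speed limit for a single isochronous packet, in bytes.
FULL_SPEED_ISO_MAX_BYTES = 1023


def bytes_per_usb_frame(sample_rate_hz, channels=2, bytes_per_sample=2,
                        frames_per_second=1000):
    """Average audio payload carried in one 1 ms USB frame for one stream."""
    return sample_rate_hz * channels * bytes_per_sample / frames_per_second


for rate in (16000, 48000):
    payload = bytes_per_usb_frame(rate)
    fits = payload <= FULL_SPEED_ISO_MAX_BYTES
    print(rate, "Hz:", payload, "bytes per frame, fits in one packet:", fits)
```

Under these assumptions both rates sit well below the per-packet limit, which is consistent with the device offering either rate over UAC 1.0.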

Audio Algorithms

  • Stereo AEC
  • IC Interference Noise Elimination
  • NS Noise Suppression
  • AGC Automatic Gain Control
  • ADEC Automatic Delay Estimation

Application Scenarios

The P3610-2MIC is a cost-effective voice interface solution that smart devices can use for both voice interaction (ASR) and conference calls (Communication). It has been specially optimized for ASR front-end processing, significantly improving interruption handling and speech recognition performance. Given these features, we recommend it for the following or similar smart devices:

[Figure: application scenarios]
