Forum Discussion

AravindManohar83
Copper Contributor
Aug 02, 2024

Speech Recognition for Alphanumeric

Hi,

 

I am using Azure Communication Services with Azure Cognitive Services to handle voice call scenarios (STT and TTS). One of our customer use cases requires alphanumeric input in a workflow. The Azure Speech recognizer performs well for numbers and other patterns. However, when the user spells out the individual letters of an alphanumeric value, the recognition success rate is very low.

 

For example, the product ID pattern is like "P-43246". In most cases, "P" is recognized as "D", "B", or "3".

 

I have tested this on both mobile phone networks and VoIP. The success rate is significantly lower on mobile networks.

 

Are there any settings available to improve the recognition success rate?


Azure Services used:
ACS Phone Number
Azure Cognitive Services
Event Grid Subscriptions

Thanks,

Aravind

1 Reply

  • Yeah, this is a common issue with Azure Speech-to-Text, especially when users speak individual letters over the phone. Letters like “P”, “B”, and “D” often get confused because they sound similar, and it’s worse on mobile networks due to audio compression and background noise.

    One way to improve accuracy is by using a Custom Speech model in Azure Speech Studio. You can train it with your specific product ID patterns or common phrases your users say, so it learns what to expect.

    Also, asking users to say letters using the phonetic alphabet, like “P as in Papa,” really helps. On your end, you can map those phonetic words back to actual characters.

    If your product IDs follow a fixed format, it’s also a good idea to post-process the transcript with a regex to correct common recognition errors, for example replacing a misheard "D" with "P" when only the corrected value matches a valid ID format.

    Lastly, for important inputs, offer a DTMF fallback (pressing keys on the phone) to ensure the user can complete the task accurately. VoIP will typically give better accuracy than mobile, so the network quality definitely makes a difference.
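
To make the phonetic-alphabet mapping and the format-based clean-up from the reply concrete, here is a minimal Python sketch. It works on the recognized text only and does not call any Azure API. Everything in it is an assumption for illustration: the function names are hypothetical, the ID format (the letter "P", a hyphen, five digits) is guessed from the single "P-43246" example, and the confusion set is just the "D", "B", and "3" reported in the question.

```python
import re

# NATO spelling-alphabet words mapped to letters (common spelling variants included).
NATO = {
    "alfa": "A", "alpha": "A", "bravo": "B", "charlie": "C", "delta": "D",
    "echo": "E", "foxtrot": "F", "golf": "G", "hotel": "H", "india": "I",
    "juliett": "J", "juliet": "J", "kilo": "K", "lima": "L", "mike": "M",
    "november": "N", "oscar": "O", "papa": "P", "quebec": "Q", "romeo": "R",
    "sierra": "S", "tango": "T", "uniform": "U", "victor": "V",
    "whiskey": "W", "xray": "X", "yankee": "Y", "zulu": "Z",
}
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

# ASSUMPTION: format inferred from the "P-43246" example in the question.
ID_PATTERN = re.compile(r"P-\d{5}")
# ASSUMPTION: characters the question reports being heard in place of "P".
P_CONFUSIONS = {"D", "B", "3"}


def transcript_to_chars(transcript: str) -> str:
    """Collapse a spoken-style transcript into a bare character string."""
    # Words become one token each; every digit becomes its own token.
    tokens = re.findall(r"[a-z]+|\d", transcript.lower())
    chars = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # "X as in Word": trust the spelling word over the letter heard before it.
        if tok == "as" and tokens[i + 1:i + 2] == ["in"] and i + 2 < len(tokens):
            word = tokens[i + 2]
            letter = NATO.get(word, word[0].upper())
            if chars:
                chars[-1] = letter
            else:
                chars.append(letter)
            i += 3
            continue
        if tok in NATO:
            chars.append(NATO[tok])
        elif tok in DIGIT_WORDS:
            chars.append(DIGIT_WORDS[tok])
        elif tok.isdigit() or len(tok) == 1:
            chars.append(tok.upper())
        # Any other word is treated as filler and ignored.
        i += 1
    return "".join(chars)


def normalize_product_id(transcript: str):
    """Return a valid product ID, or None if the transcript cannot be repaired."""
    chars = transcript_to_chars(transcript)
    if len(chars) != 6:
        return None
    head, digits = chars[0], chars[1:]
    # Safe only because the assumed format allows a single prefix letter.
    if head in P_CONFUSIONS:
        head = "P"
    candidate = f"{head}-{digits}"
    return candidate if ID_PATTERN.fullmatch(candidate) else None
```

With these assumptions, `normalize_product_id("D 43246")` and `normalize_product_id("P as in Papa 4 3 2 4 6")` both return `"P-43246"`, while a transcript with too few characters returns `None`, which is the point where re-prompting or the DTMF fallback would take over. If several prefix letters are valid in your real IDs, the blind "D" to "P" substitution becomes ambiguous and it is safer to read the result back to the caller for confirmation.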
