Question on Speech to Text feature

Dear sirs,

I am testing the "Bing Speech To Text API" on audio files that contain real conversations between a call-center agent and a customer who calls the call center to get answers to his questions. Each recording therefore has two people talking, and sometimes includes long periods of silence while the customer is waiting for an answer from support. The recordings are 5 to 10 minutes long.

My questions are:

What is the best approach to transcribe audio like this to text using Microsoft Cognitive Services?

What APIs do I have to use, besides Bing Speech To Text?

Do I have to cut or convert the audio files before sending them to Bing Speech To Text?

I am asking because the Bing Speech To Text API is returning text that is very different from the audio content. It is impossible to use or understand. But, of course, I may be making some mistake.

Please, could you explain to me the best strategy to work with audio files like this?

I would be very grateful for any help.

Best regards,

1 Reply

This forum is for questions about Microsoft Stream. We don't have expertise on Bing APIs or Cognitive Services APIs.

Maybe Stack Overflow would be a better place to ask your questions:

https://stackoverflow.com/questions/tagged/microsoft-cognitive