Batch speech questions

Question

For batch speech:
- Automatic punctuation: this seems to be overly "greedy" about adding sentence breaks. Are there any refinement options here?
- How can I tell which speech models accept human-labeled transcripts AND audio?
- What is the recommended number of hours of audio to include? I've seen conflicting amounts (20 to 1000 hours).

mkcmichael · Answer

@HeikoRa I am confused because there is no indication here of which models support audio. From trial and error, I found that 20201019 does accept audio and 20200715 does not.

heikora · Answer

There is currently no ability to adjust how the automatic punctuation works. 
Languages that specify "Acoustic Model" on the Language Support page (https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support) use the audio you provide for adaptation. The maximum amount of data is about 20 hours. I would suggest providing at least a few hours of audio data for adaptation.
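
To check a local dataset against the figures above before uploading, a rough sketch like the following may help. It only sums the duration of PCM WAV files using Python's standard library and does not call any Speech service API. The 20-hour ceiling comes from the answer above; the 2-hour floor is just an assumed stand-in for "a few hours", and the function names are made up for illustration.

```python
import wave
from pathlib import Path

MAX_HOURS = 20.0  # "about 20 hours", as stated in the answer above
MIN_HOURS = 2.0   # assumption: placeholder for "at least a few hours"


def wav_duration_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_hours(folder):
    """Sum the duration of every .wav file under folder, in hours."""
    seconds = sum(wav_duration_seconds(p) for p in Path(folder).rglob("*.wav"))
    return seconds / 3600.0


def check_dataset_size(hours, min_hours=MIN_HOURS, max_hours=MAX_HOURS):
    """Classify a dataset size against the suggested range."""
    if hours < min_hours:
        return "too little"
    if hours > max_hours:
        return "over the limit"
    return "ok"
```

For example, check_dataset_size(total_hours("my_audio")) would report where a folder named my_audio (a hypothetical path) falls relative to that range. Compressed formats such as MP3 are not handled by the wave module and would need a different reader.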

nathan hess · Answer

@HeikoRa, I've run into the same issue before as well, and it is time-consuming to approach this by trial and error, especially depending on the number of baseline models you have.

heikora · Answer

@mkcmichael, sorry for the confusion. I appreciate your feedback and we will look into making this clearer in the future.
