Forum Discussion
JoergZ
Nov 01, 2023
Copper Contributor
speech service not reliable
I use a Python script from Microsoft to upload a *.wav file and get the recognized text as the result. Although the access key and region are correct, the server cancels the request very often, and the cancellation can take up to 10 minutes to appear. That is not acceptable for a usable speech-to-text service. When the service works, it is quite fast and the results are excellent. Here is my script:
import azure.cognitiveservices.speech as speechsdk
# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and region.
speech_key, service_region = "My_real_key", "germanywestcentral"
speech_config=speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
speech_config.set_property(speechsdk.PropertyId.Speech_LogFilename, "/home/jz/azure/speech.log")
# Creates an audio configuration that points to an audio file.
# Replace with your own audio filename.
audio_filename = "/home/jz/test3.wav"
audio_input = speechsdk.audio.AudioConfig(filename=audio_filename)
# Creates a recognizer with the given settings
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, language="de-DE", audio_config=audio_input)
print("Recognizing first result...")
# Starts speech recognition, and returns after a single utterance is recognized. The end of a
# single utterance is determined by listening for silence at the end or until a maximum of 15
# seconds of audio is processed. The task returns the recognition text as result.
# Note: Since recognize_once() returns only a single utterance, it is suitable only for single
# shot recognition like command or query.
# For long-running multi-utterance recognition, use start_continuous_recognition() instead.
result = speech_recognizer.recognize_once()
print(result)
# Checks result.
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized by Azure: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))
Here is an example of the output that appeared after roughly 10 minutes:
Recognizing first result...
SpeechRecognitionResult(result_id=7fda87ec70d24b31809d911333595391, text="", reason=ResultReason.Canceled)
After a while, the same request was successful:
Recognizing first result...
SpeechRecognitionResult(result_id=1a329282904b443ea4a14ee2a6ef4b28, text="Spiele am Radio im Büro, Kanal 3.", reason=ResultReason.RecognizedSpeech)
How can I ensure that the service really works, or set a timeout that cancels the query, e.g. if there is no response after 10 seconds?
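For the timeout part of the question, one possible client-side workaround is sketched below. It is not an SDK feature: it runs a blocking call in a daemon thread using only the Python standard library and stops waiting after a deadline. The names call_with_timeout and simulated_hang are hypothetical helpers made up for this illustration. Note that this only stops the script from waiting; the abandoned call keeps running in the background and the request to the service is not cancelled.

```python
import threading
import time


def call_with_timeout(fn, timeout_s, *args, **kwargs):
    """Run fn(*args, **kwargs) in a daemon thread and wait at most timeout_s.

    Returns (True, result) if fn finished in time, (False, None) otherwise.
    An exception raised by fn is re-raised in the calling thread.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = fn(*args, **kwargs)
        except Exception as exc:  # hand the error over to the caller
            outcome["error"] = exc

    # daemon=True so an abandoned call does not keep the interpreter alive
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        # Deadline passed; the worker is still running in the background.
        return False, None
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["value"]


def simulated_hang(seconds):
    """Hypothetical stand-in for a slow call; not part of any SDK."""
    time.sleep(seconds)
    return "done"


# Intended use with the recognizer from the script above (not executed here):
#   finished, result = call_with_timeout(speech_recognizer.recognize_once, 10)
#   if not finished:
#       print("No response after 10 seconds, giving up.")
```

Because the worker thread cannot be killed from outside, it would probably be safest to create a fresh recognizer for any retry instead of reusing the one whose call was abandoned.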