Forum Discussion

JoergZ
Copper Contributor
Nov 01, 2023

speech service not reliable

I use a Python script from Microsoft to upload a *.wav file and get the recognized text back. Although the access key and region are correct, the server cancels the request very often, and the cancellation can take up to 10 minutes to appear. That is not acceptable for a usable speech-to-text service. When the service works, it is quite fast and the results are excellent. Here is my script:

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and region.
speech_key, service_region = "My_real_key", "germanywestcentral"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
speech_config.set_property(speechsdk.PropertyId.Speech_LogFilename, "/home/jz/azure/speech.log")

# Creates an audio configuration that points to an audio file.
# Replace with your own audio filename.
audio_filename = "/home/jz/test3.wav"
audio_input = speechsdk.audio.AudioConfig(filename=audio_filename)

# Creates a recognizer with the given settings
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, language="de-DE", audio_config=audio_input)

print("Recognizing first result...")

# Starts speech recognition, and returns after a single utterance is recognized. The end of a
# single utterance is determined by listening for silence at the end or until a maximum of 15
# seconds of audio is processed.  The task returns the recognition text as result. 
# Note: Since recognize_once() returns only a single utterance, it is suitable only for single
# shot recognition like command or query. 
# For long-running multi-utterance recognition, use start_continuous_recognition() instead.

result = speech_recognizer.recognize_once()
print(result)

# Checks result.
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized by Azure: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

And here is an example of the output in the error case, which appeared after roughly 10 minutes:

Recognizing first result...
SpeechRecognitionResult(result_id=7fda87ec70d24b31809d911333595391, text="", reason=ResultReason.Canceled)

After a while, the same request was successful:

Recognizing first result...
SpeechRecognitionResult(result_id=1a329282904b443ea4a14ee2a6ef4b28, text="Spiele am Radio im Büro, Kanal 3.", reason=ResultReason.RecognizedSpeech)

How can I ensure that the service really works, or set a timeout that cancels the query, e.g. if there is no response after 10 seconds?
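
To make the timeout idea concrete, here is a minimal sketch of a client-side deadline that uses only the Python standard library. The names `call_with_timeout` and `CallTimedOut` are hypothetical helpers invented for this illustration; they are not part of the Speech SDK. The sketch only stops waiting after the deadline. It does not cancel the request inside the SDK, so the abandoned call keeps running in a background daemon thread until it returns on its own or the program exits.

```python
import threading


class CallTimedOut(Exception):
    """Raised when the wrapped call does not finish within the deadline."""


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread and wait at most timeout_s seconds.

    Hypothetical helper, not an SDK function. On timeout the worker thread is
    abandoned, not cancelled.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args, **kwargs)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    # daemon=True so a hung call cannot keep the interpreter alive at exit
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        raise CallTimedOut("no result after {} seconds".format(timeout_s))
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

With the script above, the blocking call could then be wrapped as `result = call_with_timeout(speech_recognizer.recognize_once, 10)`, catching `CallTimedOut` to report the failure or retry. Whether a retry succeeds still depends on the service itself.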
