SOLVED

Web Speech API support ?


Currently Edge (81.0.381.0) is behaving strangely when using the Web Speech API for speech recognition. The interface exists, but it does not work AND does not throw any error.

Here is a quick code example:

 

var rec = new window.webkitSpeechRecognition(); 
rec.onresult = console.log; 
rec.onerror = console.error;
rec.start();
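
For anyone who has to cope with this silent failure in the meantime, a watchdog timer is one possible workaround. The sketch below is only an illustration: `getRecognitionCtor`, `startWithWatchdog` and the 5-second default are hypothetical names and values, not part of the Web Speech API, and it assumes that a working implementation fires `start` (or `error`) reasonably soon after `start()` is called. A pending microphone permission prompt can delay `start`, so the timeout should be generous.

```javascript
// Hypothetical helper: resolve the constructor without assuming a prefix.
function getRecognitionCtor(scope) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// Hypothetical helper: start recognition and reject if neither a "start"
// nor an "error" event arrives within timeoutMs (assumed default: 5000 ms).
function startWithWatchdog(Ctor, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    if (typeof Ctor !== "function") {
      reject(new Error("SpeechRecognition is not available"));
      return;
    }
    const rec = new Ctor();
    let settled = false;

    // Settle the promise exactly once and cancel the watchdog.
    function finish(settle, value) {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      settle(value);
    }

    const timer = setTimeout(() => {
      finish(reject, new Error("No recognition event within " + timeoutMs + " ms"));
      try { rec.abort(); } catch (e) { /* ignore: nothing was running */ }
    }, timeoutMs);

    rec.onstart = () => finish(resolve, rec);
    rec.onerror = (event) => finish(reject, new Error("Recognition error: " + event.error));

    try {
      rec.start();
    } catch (err) {
      finish(reject, err);
    }
  });
}
```

In a page this could be called as `startWithWatchdog(getRecognitionCtor(window))`, with the rejection used to hide the microphone button or fall back to text input. Passing the constructor in as a parameter is a design choice that keeps the helper testable outside a browser.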

 

Are there any plans to either:

a) Fully implement the speech recognition part or ...

b) ... throw proper errors?

 

Since Microsoft is already using speech recognition in Windows, has the technology for offline and online ASR, and has the cloud architecture, it should be no big deal to fully support it, right? :lol:

 

It would be really great to finally get a 2nd browser that actually has a working implementation of the API (Google has been doing it for 6 years now, since Chrome 33).

50 Replies

An update on the current state:

- Flags are not required anymore

- Canary works on the first try, a follow-up call won't work anymore, and the 3rd call will crash the browser, always! This can be easily reproduced using google.com for testing or my own test page

- Dev is stuck in "no internet connection" on google.com and will crash the browser occasionally

 

I reported this about 2-3 weeks ago and it has been unchanged ever since.

So, to sum up, the current state is worse than ever before due to constant crashes 😐

 

@MissyQ Do you have any news from the dev team about it? Are they aware of these issues? Can they reproduce it?

After some more testing, it seems Edge's SpeechRecognition API is still not usable. Here are some notes:
 
1) On google.com test (google voice search)
 
System: Linux Mint 20.0, Edge Dev 90.0.803.0 (Official build) dev (64-bit)
1st load - says not available
2nd load - crashes after clicking the voice search
 
 
2) With LipSurf Chrome Extension:
 
System: Windows 10
 
LipSurf partly works on the Canary build, but confidence is "0" for all results when it should be a decimal between 0 and 1; see below:

 

SpeechRecognitionEvent
  bubbles: false
  cancelBubble: false
  cancelable: false
  composed: false
  currentTarget: SpeechRecognition {grammars: SpeechGrammarList, lang: "en-US", continuous: true, interimResults: true, maxAlternatives: 1, …}
  defaultPrevented: false
  emma: null
  eventPhase: 0
  interpretation: null
  isTrusted: true
  path: []
  resultIndex: 2
  results: SpeechRecognitionResultList
    0: SpeechRecognitionResult
      0: SpeechRecognitionAlternative {transcript: "Scroll down.", confidence: 0}
      isFinal: true
      length: 1
      __proto__: SpeechRecognitionResult
    1: SpeechRecognitionResult
      0: SpeechRecognitionAlternative {transcript: "Scroll up.", confidence: 0}
      isFinal: true
      length: 1
      __proto__: SpeechRecognitionResult
    2: SpeechRecognitionResult
      0: SpeechRecognitionAlternative {transcript: "Scroll down.", confidence: 0}
      isFinal: true
      length: 1
      __proto__: SpeechRecognitionResult
    length: 3
    __proto__: SpeechRecognitionResultList
  returnValue: true
  srcElement: SpeechRecognition {grammars: SpeechGrammarList, lang: "en-US", continuous: true, interimResults: true, maxAlternatives: 1, …}
  target: SpeechRecognition {grammars: SpeechGrammarList, lang: "en-US", continuous: true, interimResults: true, maxAlternatives: 1, …}
  timeStamp: 32099.350000000413
  type: "result"
  __proto__: SpeechRecognitionEvent
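
Code that filters results by a confidence threshold would drop every one of these results. One way to tolerate this, sketched below, is to treat a confidence of exactly 0 as "not reported" instead of "certainly wrong". The function name `collectFinalAlternatives`, the `confidenceKnown` flag and the 0.5 threshold are hypothetical choices for illustration, and the zero-means-unknown rule is an assumption based only on the dump above.

```javascript
// Hypothetical helper: collect the top alternative of each new final result.
// A confidence of exactly 0 is assumed to mean "not reported by the engine".
function collectFinalAlternatives(event, minConfidence = 0.5) {
  const accepted = [];
  // resultIndex is the lowest index in results that changed with this event.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (!result.isFinal) continue; // skip interim hypotheses
    const best = result[0];
    const confidenceKnown = best.confidence > 0;
    if (!confidenceKnown || best.confidence >= minConfidence) {
      accepted.push({
        transcript: best.transcript,
        confidence: best.confidence,
        confidenceKnown: confidenceKnown,
      });
    }
  }
  return accepted;
}
```

It would typically be wired up as `recognition.onresult = (event) => handle(collectFinalAlternatives(event));`, where `handle` is whatever the app does with accepted transcripts.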

@mikohead 

 

I am very disappointed... 

If it doesn't work why is the interface sitting there pretending it does, and making it really hard to code web apps that use speech recognition? 

 

I don't buy that the developers forgot to hook it up. It's not really easy to implement speech recognition in a synchronous way for mass users in different languages. It's got to be a massive project with timelines and project management and schedules.

 

 


@Poodll_Guy wrote:

I don't buy that the developers forgot to hook it up.


Well I'd rather say they forgot to disable the API when they switched to the Chromium based Edge 😉

The API has been in Chromium for ages, and Google simply adds the server URL and authentication (I guess) when they build Chrome. Firefox handles this better: they have full API support as well, but since they don't have a server they disable it (it once worked in Firefox Nightly when they licensed a Google server ^^).

 


@Poodll_Guy wrote:

It's not really easy to implement speech recognition in a synchronous way for mass users in different languages. It's got to be a massive project with timelines and project management and schedules.


True, that's why Firefox doesn't have it. Microsoft built the infrastructure for Cortana and has proven that it can work (in Edge Dev with flags, for example) but for some reason doesn't go mainstream. Apple has the infrastructure built for Siri and has now enabled the Web Speech API in Safari.

@florianSB 

 

Well I'd rather say they forgot to disable the API when they switched to the Chromium based Edge 😉


This is correct. It is also the case with the Brave browser, which behaves identically. I had not realised the Safari Web Speech API was now working. That is very good news. Thanks for sharing.

I've just tested the Web Speech API again in the current release version of Edge (normal, no-flags, non-dev, non-canary) ... and it seems to work well now :-). I'm cautiously optimistic that we actually have 3 browsers now (Chrome, Edge, Safari) with full support ^^. Took them only 9 years 😛

@florianSB 

 

Did you try restarting the recognizer? Every time I try to restart it using `recognition.start()` Edge freezes for ~8s. 
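
For reference, a restart pattern that avoids calling `start()` on a recognizer that is still running (which the spec says throws an `InvalidStateError`) is to abort first and start again from the `end` event. The sketch below is a guess at a safer pattern, not a confirmed fix for the freeze: `restartRecognition` is a hypothetical helper, and the `isRunning` flag is assumed to be tracked by the caller, for example in its own `onstart` and `onend` handlers.

```javascript
// Hypothetical helper: restart a recognizer only after it has fully ended.
// isRunning must be tracked by the caller; the API exposes no such property.
function restartRecognition(recognition, isRunning) {
  return new Promise((resolve, reject) => {
    const startAgain = () => {
      try {
        recognition.start();
        resolve();
      } catch (err) {
        reject(err);
      }
    };

    if (!isRunning) {
      startAgain();
      return;
    }

    // Temporarily wrap the existing end handler, then restore it.
    const previousOnEnd = recognition.onend;
    recognition.onend = (event) => {
      recognition.onend = previousOnEnd;
      if (typeof previousOnEnd === "function") {
        previousOnEnd.call(recognition, event);
      }
      startAgain();
    };

    // abort() ends the session without delivering further results.
    recognition.abort();
  });
}
```

Using `abort()` instead of `stop()` here is a design choice: `stop()` asks the service to return a final result for the audio captured so far, which is usually not wanted when the goal is simply a fresh session.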

It seems to work fine for me. You can test it here: https://sepia-framework.github.io/app/index.html
(simply skip the login to get to demo mode and press the microphone below). It's an older version of my app that wasn't working for a while in Edge because of all the initial quirks in Edge's Web Speech API, but it has fixed itself recently and now works with almost the same performance as in Chrome ^^.
@florianSB
I tried your test application the day after your post and it indeed worked, but the following day it stopped working (same as my own tests), producing "Error: 'E0? - network'". In the Edge release notes I see that Microsoft says that support for speech recognition was added for google.com and similar web sites. I wonder if speech recognition worked for all web sites for a couple of days (as an oversight) and was then shut down? It would be good to understand what Microsoft's intentions are here.
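
Since the failure now surfaces as a `network` error, it can at least be handled explicitly. The sketch below maps the error codes defined for `SpeechRecognitionErrorEvent.error` to a retry decision. The function name `describeRecognitionError`, the hint texts and the retry policy are hypothetical; in particular, retrying on `network` is only an assumption, and given the behaviour described above the number of retries should be capped.

```javascript
// Hypothetical policy table keyed by the spec's error codes.
const RECOGNITION_ERRORS = {
  "no-speech": { retry: true, hint: "No speech was detected." },
  "aborted": { retry: false, hint: "Recognition was aborted." },
  "audio-capture": { retry: false, hint: "Audio capture failed." },
  "network": { retry: true, hint: "The recognition service could not be reached." },
  "not-allowed": { retry: false, hint: "Microphone or recognition use was not allowed." },
  "service-not-allowed": { retry: false, hint: "The browser does not allow the recognition service." },
  "bad-grammar": { retry: false, hint: "A speech grammar was invalid." },
  "language-not-supported": { retry: false, hint: "The requested language is not supported." },
};

// Hypothetical helper: turn an error code into { code, retry, hint }.
function describeRecognitionError(code) {
  const known = Object.prototype.hasOwnProperty.call(RECOGNITION_ERRORS, code);
  const entry = known
    ? RECOGNITION_ERRORS[code]
    : { retry: false, hint: "Unknown error code." };
  return { code: code, retry: entry.retry, hint: entry.hint };
}
```

A handler could then look like `recognition.onerror = (event) => { const info = describeRecognitionError(event.error); /* show info.hint, retry if info.retry */ };`.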

Hi @mikohead
Did you find a solution for the freeze? I have the same issue 😕
It seems to come from the onerror event, but I'm not sure.

 

Regards