Security and privacy Live Captions Teams Meeting


Dear Microsoft,


In our company, we would like to enable live captions for our users now that the Dutch language has become available. However, from a security and privacy perspective, I need more information on where the audio gets converted from speech to text (where does the data go?). I would also like to know who has access to this audio and the resulting text while the data is in transit. I would imagine that the audio gets processed somewhere in Azure, and that Microsoft engineers may have access to it.


Do you have more information on the encryption of speech to text (specifically for live captions in Teams meetings), where this gets processed (Europe, US, etc.), and who has access from Microsoft's side? This would help me reassure our security officers and enable the feature for our users.


Thank you in advance for your help!


3 Replies
Hello, if you are only thinking about opting in to live captions and not live transcripts, you're pretty much good to go, as live captions aren't saved anywhere (unlike live transcripts, which are).

See more info here, scroll down to FAQ
Hi Christian,

Thanks for your reply! That already helps a lot in the discussion with our security officers. I am wondering if there is any more information on end-to-end encryption of data in transit and at rest, and on where this data is actually processed. Microsoft indicates that it is processed in the geographical region of your organization. However, I cannot find out specifically in which countries or data centers the speech-to-text activity occurs.

Also, if I search for Microsoft ASR or automatic speech recognition, I cannot find more information. I did, however, find more info on Azure Cognitive Services. Do you know if that info is applicable to live captions (in other words, are Azure Cognitive Services used for live captions in Teams)?

Thanks again for your help!
Hello, it's being processed where you have Azure (look at the subscription/tenant details) and perhaps here too. As far as I know, it's simply processed by the service, and all Microsoft services are heavily encrypted; in addition, no one has access to this ASR information.

I am not sure, so I'm not going to pretend I know. But ASR is automatic speech recognition, and speech to text is part of the Speech service, which in turn is part of Azure Cognitive Services. Perhaps you should try the official channels here, such as an advisory ticket with Microsoft.

I've had the same discussions you're having, and that's why that organization decided to use live captions in its environment and not live transcription (as the latter is saved).