Transcribe audio to text from blob storage without writing any code using Power Automate
Published Mar 26 2023 08:35 PM
Microsoft

We are happy to introduce the Power Automate flow template "Transcribe audio files to text from Azure Blob", which automatically transcribes audio files stored in Azure Blob storage and saves the transcribed text back to Blob storage. Because it is built on Azure AI Speech batch transcription, it supports more than 100 languages and dialects with best-in-class transcription accuracy.

 

We have created a step-by-step tutorial below to assist you in getting started with the Power Automate Flow template.

 

Step-by-step tutorial

 

Prerequisites

 

Step 1: Set up Azure Speech service Key and Region

  • After your Speech resource is deployed, go to the Azure portal -> Go to resource -> Keys and Endpoint to view and manage keys. You will need the Speech resource key and region later for the connector setup.


For more information about Cognitive Services resources, see Get the keys for your resource.
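For readers curious about what the connector does with these two values, here is a minimal Python sketch of how a Speech resource key and region are typically combined when calling the Speech to text REST API directly. The API version (`v3.1`), the helper names, and the sample region are assumptions for illustration only; the template itself requires no code.

```python
def transcriptions_endpoint(region: str, api_version: str = "v3.1") -> str:
    """Build the batch transcription endpoint URL for a Speech resource region.

    The api_version default is an assumption; check which version your resource uses.
    """
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    return (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/speechtotext/{api_version}/transcriptions"
    )


def auth_headers(speech_key: str) -> dict:
    """Request headers for key-based authentication against the Speech resource."""
    return {
        "Ocp-Apim-Subscription-Key": speech_key,
        "Content-Type": "application/json",
    }


# "eastus" is a placeholder region, not a value from this tutorial.
print(transcriptions_endpoint("eastus"))
```

In the flow, the connector handles this for you once the key and region are entered in the connection dialog.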

 

Step 2: Set up Azure Storage account and Blob container

  • After your Azure Storage resource is deployed, go to the Azure portal -> Go to resource -> Access Keys to view and manage keys. You will need the storage account name and key later for the connector setup.


  • You also need to create a new container, or use an existing one, to store your audio files. The container name will also be required later for the connector setup. Here we created a container named “mypowercontainer” as an example.
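If you want to sanity-check these values before entering them in the connector, the following Python sketch shows how an account name, key, and container name fit together. It assumes the public Azure cloud endpoint suffix (`core.windows.net`); the helper names and the account name `mystorageaccount` are hypothetical, and the regular expression encodes the documented container naming rules (3 to 63 characters, lowercase letters, digits, and single hyphens, starting and ending with a letter or digit).

```python
import re
from urllib.parse import quote

# Container naming rules: 3-63 chars, lowercase letters, digits and hyphens,
# must start and end with a letter or digit, no consecutive hyphens.
_CONTAINER_NAME = re.compile(r"(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_container_name(name: str) -> bool:
    """Return True if the name satisfies the container naming rules above."""
    return _CONTAINER_NAME.fullmatch(name) is not None


def blob_connection_string(account_name: str, account_key: str) -> str:
    """Access-key connection string (public Azure cloud suffix assumed)."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};AccountKey={account_key};"
        "EndpointSuffix=core.windows.net"
    )


def blob_url(account_name: str, container: str, blob_name: str) -> str:
    """URL of a blob; spaces and other special characters are percent-encoded."""
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container}/{quote(blob_name)}"
    )


# "mystorageaccount" is a hypothetical account name used only for illustration.
print(is_valid_container_name("mypowercontainer"))
print(blob_url("mystorageaccount", "mypowercontainer", "my audio.wav"))
```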


Step 3: Create a Power Automate flow from Template

  • Sign in to the Power Automate portal. From the left-side menu, select My flows. Then select Automated cloud flow > Start from a template.


  • Set up the connection for the Azure Blob Storage connector. You can select an existing connection or add a new one. Here we add a new connection as an example. The Authentication type dropdown offers three authentication types, including Access Key and Azure AD Integrated. Here we use “Access Key”, which requires the storage account name and key from Step 2.


If you want to use another authentication type for the connection, learn more from Azure Blob Storage - Connectors | Microsoft Learn.

 

  • Set up the connection for the Azure Batch Speech-to-text connector. You can select an existing connection or add a new one. Here we add a new connection as an example. The Authentication type dropdown offers two authentication types: Access Key and Azure AD Integrated. Here we use “Access Key”, which requires the key and region from Step 1.


If you want to use Azure AD for the connection, learn more from Authentication in Azure Cognitive Services - Azure Cognitive Services | Microsoft Learn.

 

  • You should then be able to open the template and edit on top of it. The original template includes the actions “When a blob is added or modified” and “Check audio format and transcribe into text”, along with several conditions and variables.


  • In the “When a blob is added or modified” action, enter the container name from Step 2.


  • In the “Input locale” variable, enter the locale that matches the language spoken in your audio (here we use en-us as an example). Learn more about Speech service supported languages and locales here.


Alternatively, in the “Create transcription” action, you can enable automatic Language Identification as an advanced option. It identifies the languages spoken in the audio by comparing them against a list of supported languages. Learn more about Language Identification in Speech Service.


You can also specify more advanced settings for the batch transcription service (such as word-level timestamps, the number of audio channels, and the profanity filter) to suit your needs.
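To make these options more concrete, here is a hedged Python sketch of what an equivalent batch transcription request body might look like if you called the Speech to text REST API yourself. The property names follow the v3.1 REST API as we understand it; how the connector maps its fields onto them is an assumption, and the function name, display name, and content URL are hypothetical.

```python
import json

# Profanity filter modes accepted by the batch transcription API.
PROFANITY_MODES = {"None", "Removed", "Tags", "Masked"}


def build_transcription_request(
    content_url: str,
    locale: str = "en-US",
    candidate_locales=None,
    word_level_timestamps: bool = False,
    profanity_filter_mode: str = "Masked",
    channels=None,
    display_name: str = "Transcription from Blob storage",
) -> dict:
    """Assemble a request body covering the settings discussed in this step."""
    if profanity_filter_mode not in PROFANITY_MODES:
        raise ValueError(f"unsupported profanity filter mode: {profanity_filter_mode}")

    properties = {
        "wordLevelTimestampsEnabled": word_level_timestamps,
        "profanityFilterMode": profanity_filter_mode,
    }
    if channels is not None:
        # Audio channels to transcribe, for example [0, 1] for stereo.
        properties["channels"] = list(channels)
    if candidate_locales:
        # Language Identification: the service picks among these candidates.
        properties["languageIdentification"] = {
            "candidateLocales": list(candidate_locales)
        }

    return {
        "contentUrls": [content_url],
        "locale": locale,  # still sent alongside the candidate locales
        "displayName": display_name,
        "properties": properties,
    }


# Hypothetical audio URL; a real request needs a URL the service can read.
body = build_transcription_request(
    "https://example.invalid/mypowercontainer/my%20audio.wav",
    candidate_locales=["en-US", "de-DE"],
    word_level_timestamps=True,
)
print(json.dumps(body, indent=2))
```

Leaving `candidate_locales` empty corresponds to the fixed-locale setup described above; supplying it corresponds to enabling Language Identification.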

 

The rest is configured automatically. You can now save the flow and start using it once it has been saved successfully.

 

Step 4: Run and test your Flow

  • Now let’s quickly test your automation flow. Go back to the storage container in the Azure portal and upload an audio file. The Speech service supports the .wav, .ogg, and .mp3 audio formats. Here we upload a file named “my audio.wav” as an example.


  • Wait a few seconds. If everything goes well, you should see two folders (trans and log) created under the same container. The trans folder contains the recognized plain-text file as well as the detailed JSON output.
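If you want to double-check a test run outside the flow, the Python sketch below mirrors the two ideas in this step: checking that an uploaded file has one of the supported extensions, and pulling the display text out of a detailed JSON result. It assumes the JSON follows the batch transcription result shape with a `combinedRecognizedPhrases` list; the function names and the sample payload are made up for illustration and are not taken from the template.

```python
import json
from pathlib import PurePosixPath

# Audio formats named in this step.
SUPPORTED_EXTENSIONS = {".wav", ".ogg", ".mp3"}


def is_supported_audio(blob_name: str) -> bool:
    """Return True if the blob name ends with a supported audio extension."""
    return PurePosixPath(blob_name).suffix.lower() in SUPPORTED_EXTENSIONS


def extract_display_text(result_json: str) -> str:
    """Join the display text of each combined phrase (one entry per channel)."""
    result = json.loads(result_json)
    phrases = result.get("combinedRecognizedPhrases", [])
    return "\n".join(phrase.get("display", "") for phrase in phrases)


# Made-up sample payload in the assumed result shape.
sample = json.dumps(
    {"combinedRecognizedPhrases": [{"channel": 0, "display": "Hello world."}]}
)

print(is_supported_audio("my audio.wav"))  # True
print(is_supported_audio("notes.txt"))     # False
print(extract_display_text(sample))        # Hello world.
```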


Enjoy your automation with Azure Speech service!

 

For more information

Last update: Mar 26 2023 08:58 PM