Microsoft Foundry Blog

Transcribe audio to text from blob storage without writing any code using Power Automate

Microsoft

Mar 26, 2023

We are happy to introduce the Power Automate flow template "Transcribe audio files to text from Azure Blob", which automatically transcribes audio files stored in Azure Blob Storage and saves the transcribed text back to Blob Storage. By leveraging Azure AI Speech batch transcription, it supports more than 100 languages and dialects with best-in-class transcription accuracy.

We have created a step-by-step tutorial below to assist you in getting started with the Power Automate Flow template.

Step-by-step tutorial

Prerequisites

A Power Automate account is required. If you don’t have one, sign up and sign in to Power Automate.
A valid Speech or Cognitive Services multi-service resource is required. If you don’t have one, sign up for Speech or create a Cognitive Services resource in the Azure portal.
A valid Azure Blob Storage account is required. If you don’t have one, create a storage account for Azure Storage.

Step 1: Set up Azure Speech service Key and Region

After your Speech resource is deployed, you can go to Azure portal -> Go to resource -> Keys and Endpoint to view and manage keys. The Speech resource key and region will be required later for the Connector setup.

For more information about Cognitive Services resources, see Get the keys for your resource.
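To see why both the key and the region are needed, here is a minimal sketch of how the two values are typically combined when calling the Speech batch transcription REST API directly. The connector handles this for you. The sketch assumes the v3.1 endpoint shape, and the function name `build_speech_request` is hypothetical. Nothing is sent over the network.

```python
def build_speech_request(region: str, key: str, api_version: str = "v3.1") -> dict:
    """Return the URL and headers for a batch transcription call (nothing is sent)."""
    region = region.strip().lower()
    if not region or not key:
        raise ValueError("both region and key are required")
    return {
        # The region selects the regional endpoint of the Speech resource.
        "url": (
            f"https://{region}.api.cognitive.microsoft.com"
            f"/speechtotext/{api_version}/transcriptions"
        ),
        # The resource key authenticates the call.
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    }


if __name__ == "__main__":
    # "eastus" and the key below are placeholders, not real values.
    request = build_speech_request("eastus", "<your-speech-key>")
    print(request["url"])
```

A key only works against the region its resource was deployed in, which is why the connector asks for both values.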

Step 2: Set up Azure Storage account and Blob container

After your Azure Storage resource is deployed, you can go to Azure portal -> Go to resource -> Access Keys to view and manage keys. The storage account name and key will be required later for the Connector setup.

You also need to create a new container or use an existing one to store your audio files. The container name will be required later for the Connector setup as well. Here we created a container named “mypowercontainer” as an example.
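As a rough illustration of how the account name, key, and container name fit together, the sketch below builds the standard Azure Storage connection string and the public URL form of a blob. The account name `mystorageacct` and both helper names are hypothetical, and the connector builds the equivalent values internally.

```python
from urllib.parse import quote


def connection_string(account: str, key: str) -> str:
    """Build a standard Azure Storage connection string from name and key."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account};AccountKey={key};"
        "EndpointSuffix=core.windows.net"
    )


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS URL of a blob; spaces in the blob name are percent-encoded."""
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


if __name__ == "__main__":
    # "mystorageacct" is a placeholder account name.
    print(blob_url("mystorageacct", "mypowercontainer", "my audio.wav"))
```

Blob names may contain spaces, so encoding the name avoids broken links when the URL is passed to another service.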

Step 3: Create a Power Automate flow from Template

Sign in to the Power Automate portal. From the left side menu, select My flows. Then select Automated cloud flow > Start from a template.

Search by the keyword “speech blob” and select the template “Transcribe audio files to text from Azure Blob”.

Set up the connection for the Azure Blob Storage connector. You can select an existing connection or add a new one. Here we will add a new connection as an example. The Authentication type dropdown lists three supported authentication types, including Access Key and Azure AD Integrated. Here we are using “Access Key” for authentication, so you need to enter the storage account name and key from Step 2.

If you want to use another authentication type for the connection, learn more from Azure Blob Storage - Connectors | Microsoft Learn.

Set up the connection for the Azure Batch Speech-to-text connector. You can select an existing connection or add a new one. Here we will add a new connection as an example. The Authentication type dropdown lists two supported authentication types: Access Key and Azure AD Integrated. Here we are using “Access Key” for authentication, so you need to enter the key and region from Step 1.

If you want to use Azure AD for the connection, learn more from Authentication in Azure Cognitive Services - Azure Cognitive Services | Microsoft Learn.

You should now be able to use the template and edit on top of it. The original template includes the actions “When a blob is added or modified” and “Check audio format and transcribe into text”, along with several conditions and variables.

In the “When a blob is added or modified” action, enter the container name from Step 2.

In the “Input locale” variable, enter the locale that matches the language spoken in your audio (here we are using en-us as an example). Learn more about Speech service supported languages and locales here.

Alternatively, in the “Create transcription” action, you can enable automatic Language Identification as an advanced option. It identifies the languages spoken in the audio by comparing them against a list of supported languages. Learn more about Language Identification in Speech Service.

You can also specify more advanced settings for the batch transcription service (word-level timestamps, number of audio channels, profanity filter, etc.) as needed.

The rest is configured automatically. Save the flow, and you can start using it as soon as it has been saved successfully.
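For readers curious about what the locale, Language Identification, and advanced settings in this step correspond to, the sketch below builds a request body in the shape used by the Speech batch transcription REST API (assumed here to be v3.1). The function name, the display name, and the example URL are hypothetical. How the template maps its fields to this body is an assumption, not something taken from the template itself.

```python
import json

# Profanity filter values accepted by the batch transcription API.
PROFANITY_MODES = {"None", "Removed", "Tags", "Masked"}


def build_transcription_body(
    content_url: str,
    locale: str = "en-US",
    candidate_locales=None,
    word_timestamps: bool = False,
    profanity: str = "Masked",
    channels=None,
) -> dict:
    """Assemble a batch transcription request body (nothing is sent)."""
    if profanity not in PROFANITY_MODES:
        raise ValueError(f"profanity must be one of {sorted(PROFANITY_MODES)}")
    properties = {
        "wordLevelTimestampsEnabled": word_timestamps,
        "profanityFilterMode": profanity,
    }
    if channels is not None:
        # Audio channels to transcribe, for example [0, 1] for stereo.
        properties["channels"] = list(channels)
    if candidate_locales:
        # Language Identification picks the best match from this list.
        properties["languageIdentification"] = {
            "candidateLocales": list(candidate_locales)
        }
    return {
        "displayName": "Transcription from blob",  # hypothetical label
        "locale": locale,
        "contentUrls": [content_url],
        "properties": properties,
    }


if __name__ == "__main__":
    body = build_transcription_body(
        "https://example.invalid/mypowercontainer/sample.wav",  # placeholder URL
        candidate_locales=["en-US", "de-DE"],
        word_timestamps=True,
    )
    print(json.dumps(body, indent=2))
```

The optional properties are added only when they are set, so a default call stays close to the minimal request.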

Step 4: Run and test your Flow

Now let’s quickly test your automation flow. Go back to the Storage container in the Azure portal and upload an audio file. The Speech service supports the .wav, .ogg, and .mp3 audio formats. Here we upload a file named “my audio.wav” as an example.
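If you upload files from a script, a small pre-check on the file extension can save a failed run. This is only a sketch of the idea behind a format check and is not the template's own logic. The helper name `is_supported_audio` is hypothetical.

```python
from pathlib import PurePosixPath

# Extensions listed above as supported by the Speech service.
SUPPORTED_EXTENSIONS = {".wav", ".ogg", ".mp3"}


def is_supported_audio(blob_name: str) -> bool:
    """Return True when the blob name ends in a supported audio extension."""
    return PurePosixPath(blob_name).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["my audio.wav", "clips/interview.MP3", "notes.txt"]:
        print(name, is_supported_audio(name))
```

The check looks at the extension only. It does not inspect the audio encoding inside the file.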

Wait a few seconds. If everything goes well, you should see two folders (trans, log) created under the same container. The trans folder contains the recognized plain text file as well as the detailed JSON output.
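If you want to post-process the detailed JSON output, the sketch below pulls the display text out of a result file. It assumes the file follows the batch transcription result layout, with a `combinedRecognizedPhrases` array holding one entry per audio channel. Whether the flow writes exactly this layout is an assumption, and the sample content is made up for illustration.

```python
import json


def extract_display_text(result_json: str) -> str:
    """Join the display text of every channel, ordered by channel number."""
    result = json.loads(result_json)
    phrases = result.get("combinedRecognizedPhrases", [])
    ordered = sorted(phrases, key=lambda phrase: phrase.get("channel", 0))
    return "\n".join(phrase.get("display", "") for phrase in ordered)


# Made-up sample in the assumed layout, not real service output.
sample = json.dumps(
    {
        "source": "my audio.wav",
        "combinedRecognizedPhrases": [
            {"channel": 0, "lexical": "hello world", "display": "Hello world."}
        ],
    }
)

if __name__ == "__main__":
    print(extract_display_text(sample))
```

A file without the expected array yields an empty string rather than an error, which keeps a batch job running when one result is unusual.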

Enjoy your automation with the Azure Speech service.

Updated Mar 26, 2023

Version 3.0

ArcherZ
