Enhancing Accessibility: Language and Audio Document Translation Solution Architecture

SAVITAMITTAL
Microsoft
May 16, 2023

This solution architecture proposal outlines how to effectively utilize OpenAI's language model alongside Azure Cognitive Services to create a user-friendly and inclusive solution for document translation. By leveraging OpenAI's advanced language capabilities and integrating them with Azure Cognitive Services, we can accommodate diverse language preferences and provide audio translations, thereby meeting accessibility standards and reaching a global audience. This solution aims to enhance accessibility, ensure inclusivity, and gain valuable insights through the combined power of OpenAI and Azure Cognitive Services.

Dataflow

Here is the process:

  1. Ingest: PDF documents, text files, and images can be ingested from multiple sources, such as Azure Blob storage, Outlook, OneDrive, SharePoint, or a third-party vendor.

  2. Move: A Power Automate flow is triggered and moves the file to Azure Blob storage. A blob trigger then picks up the original file and calls an Azure Function.

  3. Extract Text and Translate: The Azure Function calls the Azure Computer Vision Read API to read multi-page PDF documents in natural reading order, extract text from images, and produce text with lines and spacing preserved, which is then stored in Azure Blob storage. Azure Translator then translates the text file and stores the result in a blob container. Azure Speech generates a WAV or MP3 file from both the original-language and the translated text files; the audio is also stored in a blob container.

  4. Notify: A Power Automate flow is triggered, moves the output file back to the original source location, and notifies users in Outlook and Microsoft Teams with the output audio file.
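
The processing in step 3 can be sketched as a small orchestration function. This is a minimal illustration, not the implementation behind this architecture: the stage callables (`extract_text`, `translate`, `synthesize`), the `store` dictionary that stands in for blob containers, and the output naming scheme are all assumptions made for the example. In a real deployment, each callable would wrap the corresponding Azure service call inside the Azure Function.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, Dict, List


@dataclass
class PipelineOutputs:
    """Names of the artifacts written for one source document."""
    text_blob: str
    translated_blob: str
    audio_blobs: List[str]


def process_document(
    blob_name: str,
    data: bytes,
    target_lang: str,
    extract_text: Callable[[bytes], str],     # hypothetical wrapper around the Read API
    translate: Callable[[str, str], str],     # hypothetical wrapper around Azure Translator
    synthesize: Callable[[str, str], bytes],  # hypothetical wrapper around Azure Speech
    store: Dict[str, bytes],                  # stand-in for blob containers
    source_lang: str = "en",
    audio_ext: str = "mp3",
) -> PipelineOutputs:
    """Run extract -> translate -> synthesize and store every intermediate result."""
    stem = PurePosixPath(blob_name).stem

    # Extract text from the PDF or image and keep it as a text file.
    text = extract_text(data)
    text_blob = f"extracted/{stem}.{source_lang}.txt"
    store[text_blob] = text.encode("utf-8")

    # Translate the extracted text and store the translated file.
    translated = translate(text, target_lang)
    translated_blob = f"translated/{stem}.{target_lang}.txt"
    store[translated_blob] = translated.encode("utf-8")

    # Generate audio for both the original and the translated text.
    audio_blobs = []
    for lang, content in ((source_lang, text), (target_lang, translated)):
        name = f"audio/{stem}.{lang}.{audio_ext}"
        store[name] = synthesize(content, lang)
        audio_blobs.append(name)

    return PipelineOutputs(text_blob, translated_blob, audio_blobs)
```

Passing the stages in as callables keeps the flow testable without Azure credentials, and makes it straightforward to swap one service for another without touching the orchestration.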

[Figure: architecture without OpenAI]

[Figure: architecture with OpenAI]

For the OpenAI variant, refer to: Transform your business with automated insights & optimized workflows using Azure OpenAI GPT-3 - Microsoft Community Hub

Alternatives

This architecture uses Azure Blob storage as the default file store throughout the process. However, you can also use alternative storage such as SharePoint, Azure Data Lake Storage (ADLS), or third-party storage. For processing a high volume of documents, consider using Azure Logic Apps instead of Power Automate. Azure Logic Apps can help you avoid exceeding consumption limits within your tenant and can be more cost-effective at that scale. To learn more, refer to the Azure Logic Apps documentation.

Components

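These are the key technologies used in this solution, as named in the dataflow above: Azure Blob storage, Power Automate, Azure Functions, the Azure Computer Vision Read API, Azure Translator, Azure Speech, and, in the OpenAI variant, Azure OpenAI.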

Scenario details

This solution uses multiple Azure Cognitive Services to automate the business process of translating PDF documents and creating audio files in WAV or MP3 format, for accessibility and for a global audience. It streamlines the translation process and makes content more accessible to people who speak different languages or have different accessibility needs.
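
To make the translation and audio steps more concrete, the sketch below builds (but does not send) the HTTP requests for the Translator v3.0 REST API and the Speech text-to-speech REST API. Treat it as an illustration under assumptions: the helper names, the example voice, and the two output formats chosen for WAV and MP3 are picked for this example, and the services may require additional headers or settings, so check the current service documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from xml.sax.saxutils import escape

TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

# Assumed mapping from file type to a Speech output format identifier.
AUDIO_FORMATS = {
    "wav": "riff-16khz-16bit-mono-pcm",
    "mp3": "audio-16khz-32kbitrate-mono-mp3",
}


def build_translate_request(text, to_lang, key, region):
    """Return (url, headers, body) for a Translator v3.0 text translation call."""
    url = f"{TRANSLATOR_ENDPOINT}?{urlencode({'api-version': '3.0', 'to': to_lang})}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    # The request body is a JSON array of objects with a "Text" property.
    body = json.dumps([{"Text": text}]).encode("utf-8")
    return url, headers, body


def build_ssml(text, locale, voice):
    """Wrap plain text in SSML, escaping XML special characters."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{locale}">'
        f'<voice name="{voice}">{escape(text)}</voice></speak>'
    )


def build_speech_request(text, locale, voice, key, region, audio="mp3"):
    """Return (url, headers, body) for a text-to-speech REST call."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": AUDIO_FORMATS[audio],
    }
    return url, headers, build_ssml(text, locale, voice).encode("utf-8")


if __name__ == "__main__":
    # Placeholder key and region; the voice name is only an example.
    url, headers, body = build_speech_request(
        "Bonjour", "fr-FR", "fr-FR-DeniseNeural", "<key>", "westeurope", audio="wav"
    )
    print(url)
    print(headers["X-Microsoft-OutputFormat"])
```

Separating request construction from sending keeps the example runnable offline; the returned tuples can be passed to any HTTP client from within the Azure Function.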

Potential use cases

With a cloud-based solution that provides translation services on demand, organizations can reach a wider audience without language being an obstacle. This helps ensure that services are accessible to people of all cultures, languages, locations, and abilities.

In addition, by embracing digital transformation, organizations can improve their efficiency, reduce costs, and enhance the overall customer experience. Digital transformation involves adopting new technologies and processes to streamline operations and provide a more seamless experience for customers.

This solution is particularly relevant to industries with a large customer or client base, such as e-commerce, tourism, hospitality, healthcare, and government services.

Updated May 16, 2023
Version 3.0