Multimodality using Azure AI Content Understanding
Azure AI Foundry Blog

Introducing Azure AI Content Understanding for Beginners

jfilcik
Microsoft
May 13, 2025

A 3-Part Video Series to Get Started

Enterprises face several challenges when processing multimodal data and extracting insights from it: managing diverse data formats, ensuring data quality, and keeping workflows efficient. Making the extracted insights accurate and usable often requires advanced AI techniques, while inefficient handling of large data volumes increases costs and delays results.

Azure AI Content Understanding addresses these pain points with a unified solution that transforms unstructured data into actionable insights, improves data accuracy with schema extraction and confidence scoring, and integrates with the Azure ecosystem to improve efficiency and reduce costs. Content Understanding makes it easy to extract custom, task-specific output without advanced GenAI skills. It offers a quick path to scale for retrieval-augmented generation (RAG) grounded in multimodal data, or for transactional content processing in agent workflows and process automation.

We are excited to announce a new video series to help you get started with Azure AI Content Understanding and extract task-specific output for your business. Whether you're looking for a well-rounded overview, want to discover how to build a RAG index over video content, or want to learn how to build a post-call analytics workflow, this series has something for everyone.

What is Azure AI Content Understanding?

Azure AI Content Understanding is a new Azure AI service designed to process content of any type, including documents, images, videos, audio, and text, and transform it into a user-defined output schema. This streamlined process lets developers reason over large amounts of unstructured data and accelerates time-to-value by generating output that can be easily integrated into agentic, automation, and analytical workflows.
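
To make the idea of a user-defined output schema with confidence scores more concrete, here is a minimal Python sketch. It does not call the Content Understanding service. The schema layout, the field names (`customer_name`, `sentiment`, `call_summary`), the shape of `sample_result`, and the 0.8 threshold are all hypothetical, chosen only to illustrate how schema-shaped output might be consumed downstream.

```python
from dataclasses import dataclass

# Hypothetical schema: field name -> expected Python type.
# This is an illustration, not the service's actual schema format.
SCHEMA = {
    "customer_name": str,
    "sentiment": str,
    "call_summary": str,
}


@dataclass
class Field:
    value: object
    confidence: float  # assumed to lie in [0, 1]


def split_by_confidence(result, schema, threshold=0.8):
    """Split schema fields into (accepted, needs_review) dictionaries."""
    accepted, needs_review = {}, {}
    for name, expected_type in schema.items():
        field = result.get(name)
        if field is None:
            # The schema asked for this field but the result lacks it.
            needs_review[name] = None
        elif not isinstance(field.value, expected_type):
            # Wrong type: never accept automatically.
            needs_review[name] = field.value
        elif field.confidence >= threshold:
            accepted[name] = field.value
        else:
            needs_review[name] = field.value
    return accepted, needs_review


# Hypothetical extraction result for one call recording.
sample_result = {
    "customer_name": Field("Contoso Ltd.", 0.97),
    "sentiment": Field("negative", 0.62),
}

accepted, needs_review = split_by_confidence(sample_result, SCHEMA)
print("accepted:", accepted)
print("needs review:", needs_review)
```

In this sketch, high-confidence fields flow straight into automation, while missing or low-confidence fields are set aside for a human to check. Where to set the threshold depends on how costly a wrong value is for the workflow in question.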

Video Series Highlights

1. Azure AI Content Understanding: How to Get Started - Vinod Kurpad, Principal GPM, AI Services, shows how you can process content of any modality (audio, video, documents, and text) in a unified workflow in Azure AI Foundry using Azure AI Content Understanding. It's simple, intuitive, and doesn't require any GenAI skills.

Azure AI Content Understanding Getting Started with Vinod Kurpad

2. Post-call Analytics Using Azure AI Content Understanding - Jan Goergen, Senior Program Manager, AI Services, shows how to process any number of video or audio call recordings quickly in Azure AI Foundry using the Post-Call Analytics template powered by Content Understanding. The video also introduces the broader concept of templates, illustrating how you can embed Content Understanding into reusable templates that you can build, deploy, and share across projects.

Post-call analytics using Azure AI Content Understanding

3. RAG on Video Using Azure AI Content Understanding - Joe Filcik, Principal Product Manager, AI Services, shows how you can process videos and ground them on your data with multimodal retrieval-augmented generation (RAG) to derive insights that would otherwise take much longer to find. Joe demonstrates how this can be achieved using a single Azure AI Content Understanding API in Azure AI Foundry.

RAG on video using Azure AI Content Understanding
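
The retrieval step of a video RAG workflow can be sketched in a few lines of Python. This is a toy stand-in, not the approach shown in the video: the `segments` list imitates time-stamped descriptions that a content-extraction step might produce, and bag-of-words cosine similarity replaces the embedding model and search index a real system would use. None of these names come from the Content Understanding API.

```python
import math
import re
from collections import Counter

# Hypothetical segment descriptions: (start_seconds, end_seconds, text).
segments = [
    (0, 30, "The presenter introduces the quarterly sales results"),
    (30, 75, "A chart shows revenue growth in the European market"),
    (75, 120, "Questions from the audience about hiring plans"),
]


def tokenize(text):
    """Lowercase the text and count its alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, segments, top_k=1):
    """Return the top_k segments most similar to the query."""
    q = tokenize(query)
    ranked = sorted(
        segments, key=lambda seg: cosine(q, tokenize(seg[2])), reverse=True
    )
    return ranked[:top_k]


def build_prompt(query, hits):
    """Assemble a grounded prompt from the retrieved segments."""
    context = "\n".join(f"[{start}s-{end}s] {text}" for start, end, text in hits)
    return (
        "Answer using only this video context:\n"
        f"{context}\n\nQuestion: {query}"
    )


query = "What happened to revenue in the European market?"
hits = retrieve(query, segments)
prompt = build_prompt(query, hits)
print(prompt)
```

The prompt built here would then be passed to a language model. Keeping the time stamps in the context lets the answer point back to the moment in the video that supports it.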

Why Azure AI Content Understanding?

The Azure AI Content Understanding service is ideal for enterprises and developers who need to process large amounts of multimodal content, such as call center recordings and videos for training and compliance, without requiring GenAI skills such as prompt engineering and model selection.

Enjoy the video series and start exploring the possibilities with Azure AI Content Understanding.

For additional resources:

Feedback? Contact us at cu_contact@microsoft.com

Updated May 13, 2025
Version 2.0