Azure Arc Blog

Ignite 2025: Preview - Intelligent Real-Time Video Insights and Agents with Azure AI Video Indexer

MaayanYedidia
Nov 18, 2025

Azure AI Video Indexer, enabled by Azure Arc, is launching the preview of its live video analysis feature set, enabling organizations to extract real-time, actionable insights directly where their data lives - with the low latency and data privacy that come from running locally, and the centralized operational management provided through Azure Arc. It can run on Azure Local or any Arc-connected Kubernetes infrastructure. This service extends Azure AI Video Indexer enabled by Azure Arc, which became generally available last year, and now adds real-time agentic video intelligence for tasks such as safety monitoring, anomaly detection, queue tracking, and store management, supported by pre-built agents.

Live video analysis meets industry needs for immediate operational intelligence: in manufacturing, it boosts safety and quality control; in retail, it improves customer experience and store operations; and in public safety environments, it enhances situational awareness and response. The solution helps organizations act quickly and efficiently across these sectors.

 

 Key Benefits:  
  • Real-time Insights: Instantly extract actionable information from live video feeds to support operational decisions and enhance situational awareness. 
  • Edge-native: Keep video and insights on-premises for privacy compliance and ultra-low-latency response. 
  • Agentic Intelligence: Engage with AI agents built to meet distinct business requirements across industries, including retail, manufacturing, and public safety. 
  • Custom AI Models: Define detection logic using natural language to monitor specific objects or conditions - no technical expertise required. 
  • Conversational Chat: Use natural language queries to interact with AI agents and obtain clear, context-relevant answers. 
  • Video Summarization: Automatically generate end-of-shift summaries with focused insights. 
  • Scalability and Flexibility: Seamlessly manage multiple locations and extend capabilities with Azure Arc, ensuring consistent performance and centralized control. 
  • Proactive Operations: Automate notifications and reporting to empower teams to respond quickly to anomalies, optimize flows, and continuously improve outcomes. 

 

Edge-Native Deployment: Real-Time, Privacy-First Architecture 

With the growing volume of video content generated outside traditional cloud environments, organizations are increasingly adopting solutions that process video data directly at the source—whether in retail environments, manufacturing sites, or public spaces. Azure AI live video analysis meets this need by operating entirely on-premises, ensuring ultra-low latency and complete data privacy, while still benefiting from centralized management and scalability via Azure Arc. 

Built on the Arc-enabled Video Indexer foundation, this solution now integrates agentic intelligence powered by Foundry Local Gen AI models, which use advanced VLMs, SLMs, and LLMs to deliver real-time insights that accelerate decision-making. Importantly, all video data remains local—processed and analyzed at its point of origin—whether on Azure Local or any Arc-connected Kubernetes deployment. This guarantees that no sensitive video data is transmitted to the cloud, ensuring the highest level of privacy and compliance. 

By combining edge-native performance with cloud-grade control, live video analysis empowers organizations to be responsive and efficient without compromising data security. It complements other AI-on-the-Edge capabilities like Edge RAG, which merges retrieval-augmented generation with an edge-native architecture. This is a modern, privacy-first approach to video intelligence—built for the edge, managed by Azure. 
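Azure AI Video Indexer enabled by Arc is typically installed onto an Arc-connected cluster as a cluster extension. As a minimal sketch of what that could look like from Python, the snippet below shells out to the Azure CLI's real `az k8s-extension create` command; however, the resource names and the extension type shown are placeholder assumptions, so consult the official documentation for the exact values required for the live video analysis preview.

```python
# Minimal sketch: install an Arc cluster extension from Python by invoking the Azure CLI.
# Assumes the Azure CLI is installed and signed in (az login) and that the target
# Kubernetes cluster is already connected to Azure Arc.
import subprocess

RESOURCE_GROUP = "my-arc-rg"                # placeholder resource group
CLUSTER_NAME = "my-arc-cluster"             # placeholder Arc-connected cluster name
EXTENSION_NAME = "videoindexer"             # placeholder extension instance name
EXTENSION_TYPE = "Microsoft.videoindexer"   # assumption: check docs for the exact extension type

def install_video_indexer_extension() -> None:
    """Create the Video Indexer cluster extension on an Arc-connected cluster."""
    cmd = [
        "az", "k8s-extension", "create",
        "--name", EXTENSION_NAME,
        "--extension-type", EXTENSION_TYPE,
        "--cluster-type", "connectedClusters",
        "--cluster-name", CLUSTER_NAME,
        "--resource-group", RESOURCE_GROUP,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    install_video_indexer_extension()
```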

 

Agentic Intelligence: Specialized Agents for Targeted Insights 

Live video analysis is powered by a modular, agent-based architecture. Each agent is designed to perform a specific task, and a conversational interface - the AI Video Assistant - intelligently routes operational or investigative queries to the appropriate agent (a simplified sketch of this routing pattern follows the agent descriptions below). 

RetailOps Agent monitors the physical condition of retail spaces, identifying messy shelves, missing stock, and safety hazards like spilled liquids. It helps ensure stores remain safe, clean, and well-stocked throughout the day. 

Customer Service Agent enhances customer experience by detecting forgotten personal items (e.g., wallets, phones), measuring wait times, and flagging accessibility issues such as products placed too high. It supports staff in maintaining a responsive and inclusive environment. 

Sales Recommendations Agent analyzes customer engagement with products and correlates it with real-time sales data. It identifies items that attract attention but don’t convert into purchases, offering actionable insights on placement, pricing, and visual appeal to boost performance. 

Security Agent is responsible for safeguarding individuals by proactively detecting potential hazards and unsafe conditions. This includes identifying signs of smoke or fire, detecting falls, and recognizing risky activities such as employees in construction areas not following safety protocols. 

Event Investigation Agent supports incident analysis by mapping interactions between people and objects over time. It helps reconstruct event sequences and understand causality. For example, operators can ask: “Show me what happened before this machine stopped working”.  
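To make the routing idea concrete, here is a deliberately simplified Python sketch of the pattern described above: a conversational front end that classifies an incoming question and hands it to the matching specialized agent. It is purely illustrative - the agent names mirror those in this post, but the class names, interfaces, and keyword-based classification are stand-ins, not the product's actual implementation.

```python
# Illustrative sketch of an agent-routing pattern (not the actual product implementation).
from typing import Protocol

class Agent(Protocol):
    def answer(self, question: str) -> str: ...

class RetailOpsAgent:
    def answer(self, question: str) -> str:
        return "Checking shelves, stock levels, and floor hazards..."

class SecurityAgent:
    def answer(self, question: str) -> str:
        return "Scanning for smoke, fire, falls, and unsafe behavior..."

class EventInvestigationAgent:
    def answer(self, question: str) -> str:
        return "Reconstructing the sequence of events around the incident..."

class VideoAssistantRouter:
    """Routes a natural-language question to the most relevant agent.

    A real system would use an LLM or intent classifier; simple keyword
    matching stands in for that here."""
    def __init__(self) -> None:
        self.routes: dict[str, Agent] = {
            "shelf": RetailOpsAgent(),
            "spill": RetailOpsAgent(),
            "smoke": SecurityAgent(),
            "fall": SecurityAgent(),
            "before": EventInvestigationAgent(),
        }
        self.default: Agent = EventInvestigationAgent()

    def ask(self, question: str) -> str:
        lowered = question.lower()
        for keyword, agent in self.routes.items():
            if keyword in lowered:
                return agent.answer(question)
        return self.default.answer(question)

router = VideoAssistantRouter()
print(router.ask("Show me what happened before this machine stopped working"))
```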

 

Image 1: Safety agent detects a hazard

 

Custom AI Models: Tailored to Your Environment 

Live video analysis enables users to easily define custom models using natural language, making it easier than ever to tailor video intelligence to specific operational needs. Instead of relying solely on predefined detections, users describe what they want to monitor - whether it’s “safety equipment,” “liquid spilled on the floor,” or “people in a store struggling to reach high objects on shelves” - and the system configures a model to detect those specific objects or conditions in real time. Users can build their own models without knowing how to code or being familiar with data science or machine learning algorithms.

To enhance the model’s ability to recognize specific events, situations, or objects, users can upload supporting images - such as pictures that illustrate what they want the model to detect. They can also provide negative examples to help the system distinguish between similar scenarios, such as telling the difference between a clean floor and a dirty one. 

Whether the goal is to ensure safety compliance in manufacturing or monitor congestion in a retail environment, this flexibility ensures that the insights generated are both relevant and actionable. 

All of this is available through both the API and the portal, making it easy to define, deploy, and manage custom insights across live video streams - without requiring technical or machine learning experience. 
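As a rough sketch of what defining a custom detection through an API could look like, the Python snippet below posts a natural-language prompt plus positive and negative example images to a hypothetical endpoint on a local deployment. The URL path, payload field names, and authentication header are illustrative assumptions, not the documented API surface - refer to the official API reference for the real contract.

```python
# Hypothetical sketch: define a custom detection model from a natural-language prompt.
# Endpoint path, payload fields, and auth header are assumptions for illustration only.
import base64
import requests

BASE_URL = "https://video-indexer.local.example"   # placeholder edge endpoint
API_TOKEN = "<access-token>"                       # placeholder credential

def encode_image(path: str) -> str:
    """Read an example image and base64-encode it for the request payload."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

payload = {
    "name": "spill-detector",
    "prompt": "liquid spilled on the floor",
    "positiveExamples": [encode_image("examples/spill_1.jpg")],
    "negativeExamples": [encode_image("examples/clean_floor_1.jpg")],
}

response = requests.post(
    f"{BASE_URL}/customModels",
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print("Created custom model:", response.json())
```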

Image 2: Custom AI insight detects an overhead item in the store

 

 
Conversational Interaction: Natural Language Access to Investigative Video Intelligence

Azure AI Video Indexer’s chat-based interface enables teams to investigate live video data using natural language. Through the AI Video Assistant, our new chat interface, users can ask operational or security-related questions and receive clear, context-aware responses from specialized agents - without needing to sift through hours of footage or write complex queries. 

In retail, store managers may inquire: “Is the current staffing level sufficient for operational needs?” 

Within manufacturing, safety officers might pose questions such as: “Are there personnel present who are not equipped with mandated safety gear?” 

For security and public safety, analysts can review footage by asking: “Provide an account of the events leading up to the vehicle accident.” 

This conversational approach enhances organizational interaction with video resources by converting footage into actionable operational intelligence and supporting faster, well-informed decision-making.
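For illustration, a question like the ones above could be submitted programmatically to the AI Video Assistant along the lines of the sketch below. The endpoint, request shape, and response fields are hypothetical placeholders standing in for whatever the preview API actually exposes.

```python
# Hypothetical sketch: ask the AI Video Assistant a natural-language question.
# The endpoint and payload shape are placeholders, not the documented API.
import requests

BASE_URL = "https://video-indexer.local.example"   # placeholder edge endpoint
API_TOKEN = "<access-token>"                       # placeholder credential

question = "Are there personnel present who are not equipped with mandated safety gear?"

response = requests.post(
    f"{BASE_URL}/assistant/chat",
    json={"question": question, "streamId": "factory-floor-cam-03"},  # placeholder stream id
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print(response.json().get("answer", ""))
```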

Image 3: Chat interaction with the AI Video Assistant

 

Summary 

Azure AI Video Indexer’s live video analysis capabilities, now in public preview, bring real-time video intelligence to the edge by combining local processing with the power of the Azure ecosystem. Built on Azure Arc and supported by Azure Local and Foundry Gen AI models, the solution enables organizations to analyze live video streams directly where data is generated - ensuring low latency, privacy, and centralized control. With agentic orchestration, custom AI models, and conversational interaction, it empowers industries like retail, manufacturing, and public safety to act faster, stay compliant, and improve operational outcomes at scale. 

Try it today and unlock the power of live video analysis at the edge. To get started, sign up at Application for Azure AI Video Indexer Enabled by Arc – Live Video Analysis.

  

Learn more about Azure AI Video Indexer 

 

 
