face api
8 Topics

From Extraction to Insight: Evolving Azure AI Content Understanding with Reasoning and Enrichment
First introduced in public preview last year, Azure AI Content Understanding enables you to convert unstructured content (documents, audio, video, text, and images) into structured data. The service is designed to support consistent, high-quality output, directed improvements, built-in enrichment, and robust pre-processing to accelerate workflows and reduce cost.

A New Chapter in Content Understanding

Since our launch, we’ve seen customers pushing the boundaries beyond simple data extraction, building agentic solutions that fully automate decisions. This requires more than just extracting fields. For example, a healthcare insurance provider’s decision to pay a claim requires cross-checking against insurance policies, applicable contracts, and the patient’s medical history and prescription data points. To do this, a system needs the ability to interpret information in context and to perform more complex enrichment and analysis across various data sources. Beyond field extraction, this requires a custom-designed workflow that leverages reasoning.

In response to this demand, Content Understanding now introduces Pro mode, which enables enhanced reasoning, validation, and information aggregation capabilities. These updates allow the service to aggregate and compare results across sources, enrich extracted data with context, and deliver decisions as output. While Standard mode continues to offer reliable and scalable field extraction, Pro mode extends the service to support more complex content interpretation scenarios, enabling workflows that reflect the way people naturally reason over data. With this update, Content Understanding now solves a much larger component of your data processing workflows, offering new ways to automate, streamline, and enhance decision-making based on unstructured information.

Key Benefits of Pro Mode

Packed with cutting-edge reasoning capabilities, Pro mode revolutionizes document analysis.
Multi-Content Input: Process and aggregate information across multiple content files in a single request. Pro mode can build a unified schema from distributed data sources, enabling richer insight across documents.
Multi-Step Reasoning: Go beyond basic extraction with a process that supports reasoning, linking, validation, and enrichment.
Knowledge Base Integration: Seamlessly integrate with organizational knowledge bases and domain-specific datasets to enhance field inference. This allows the service to reason over the context of your business when generating outputs.

When to Use Pro Mode

Pro mode, currently limited to documents, is designed for scenarios where content understanding needs to go beyond surface-level extraction. It is ideal for use cases that traditionally require postprocessing, human review, and decision-making based on multiple data points and contextual references. Pro mode enables intelligent processing that not only extracts data, but also validates, links, and enriches it. This is especially impactful when extracted information must be cross-referenced with external datasets or internal knowledge sources to ensure accuracy, consistency, and contextual depth. Examples include:

Invoice processing that reconciles against purchase orders and contract terms
Healthcare claims validation using patient records and prescription history
Legal document review where clauses reference related agreements or precedents
Manufacturing spec checks against internal design standards and safety guidelines

By automating much of the reasoning, Pro mode lets you focus on higher-value tasks. It helps reduce manual effort, minimize errors, and accelerate time to insight, unlocking new potential for downstream applications, including those that emulate higher-order decision-making.
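To make the first example above more concrete, here is a minimal sketch of the kind of cross-document check that traditionally lives in hand-written postprocessing: reconciling an extracted invoice against a purchase order. It is an illustration only and does not call Content Understanding or show how Pro mode works internally. The `ExtractedInvoice` and `PurchaseOrder` shapes, the `reconcile` function, and the tolerance value are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ExtractedInvoice:
    # Hypothetical shape for fields extracted from an invoice document.
    invoice_id: str
    po_number: str
    total: Decimal


@dataclass(frozen=True)
class PurchaseOrder:
    # Hypothetical shape for a purchase order held in a reference dataset.
    po_number: str
    approved_total: Decimal


def reconcile(invoice, purchase_orders, tolerance=Decimal("0.01")):
    """Return a list of issues; an empty list means the invoice reconciles."""
    po = purchase_orders.get(invoice.po_number)
    if po is None:
        return [f"no purchase order found for {invoice.po_number}"]
    issues = []
    # Flag the invoice when its total drifts from the approved amount.
    if abs(invoice.total - po.approved_total) > tolerance:
        issues.append(
            f"total {invoice.total} differs from approved {po.approved_total}"
        )
    return issues
```

A caller would pass a dictionary of purchase orders keyed by PO number and route any invoice with a non-empty issue list to human review. Checks like this are the sort of manual rule-writing that the scenarios above are meant to reduce.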
Simplified Pricing Model

We are introducing a simplified pricing structure that significantly reduces costs across all content modalities compared to previous versions, making enterprise-scale deployment more affordable and predictable.

Expanded Feature Coverage

We are also extending capabilities across various content types:

Structured Document Outputs: Improved handling of tables spanning multiple pages, recognition of selection marks, and support for additional file types like .docx, .xlsx, .pptx, .msg, .eml, .rtf, .html, .md, and .xml.
Classifier API: Automatically categorize/split and route documents to appropriate processing pipelines.
Video Analysis: Extract data across an entire video or break a video into chapters automatically. Enrich metadata with face identification and descriptions that include facial images.
Face API Preview: Detect, recognize, and enroll faces, enabling richer user-aware applications.

Check out the details about each of these capabilities here - What's New for Content Understanding.

Let's hear it from our customers

Customers all over the globe are using Content Understanding for its powerful one-stop solution capabilities by leveraging advanced modes of reasoning, grounding, and confidence scores across diverse content types.

ASC: AI-based analytics in ASC’s Recording Insights platform allows customers to move to 100% compliance review coverage of conversations across multiple channels. ASC’s integration of Content Understanding replaces a previously complex setup, where multiple separate AI services had to be manually connected, with a single multimodal solution that delivers transcription, summarization, sentiment analysis, and data extraction in one streamlined interface. This shift not only simplifies implementation and accelerates time-to-value but has also received positive customer feedback for its powerful features and the quick, hands-on support from Microsoft product teams.
“With the integration of Content Understanding into the ASC Recording Insights platform, ASC was able to reduce R&D effort by 30% and achieve 5 times faster results than before. This helps ASC drive customer satisfaction and stay ahead of competition.” - Tobias Fengler, Chief Engineering Officer, ASC

To learn more about ASC’s integration, check out From Complexity to Simplicity: The ASC and Azure AI Partnership.

Ramp: Ramp, the all-in-one financial operations platform, is exploring how Azure AI Content Understanding can help transform receipts, bills, and multi-line invoices into structured data automatically. Ramp is leveraging the pre-built invoice template and experimenting with custom extraction capabilities across various document types. These experiments are helping Ramp evaluate how to further reduce manual entry and enhance the real-time logic that powers approvals, policy checks, and reconciliation.

“Content Understanding gives us a single API to parse every receipt and statement we see, then lets our own AI reason over that data in real time. It's an efficient path from image to fully reconciled expense.” - Rahul S, Head of AI, Ramp

MediaKind: MK.IO’s cloud-native video platform, available on Azure Marketplace, now integrates Azure AI Content Understanding to make it easy for developers to personalize streaming experiences. With just a few lines of code, you can turn full game footage into real-time, fan-specific highlight reels using AI-driven metadata like player actions, commentary, and key moments.

“Azure AI Content Understanding gives us a new level of control and flexibility, letting us generate insights instantly, personalize streams automatically, and unlock new ways to engage and monetize. It’s video, reimagined.” - Erik Ramberg, VP, MediaKind

Catch the full story from MediaKind in our breakout session at Build 2025 on May 18: My Game, My Way, where we walk you through the creation of personalized highlight reels in real time.
You’ll never look at your TV in the same way again.

Getting Started

For more details about the latest from Content Understanding, check out Reasoning on multimodal content for efficient agentic AI app building, Wednesday, May 21 at 2 PM PST
Build your own Content Understanding solution in the Azure AI Foundry. Pro mode will be available in the Foundry starting June 1st, 2025
Refer to our documentation and sample code on Content Understanding
Explore the video series on getting started with Content Understanding

Arizona Department of Transportation Innovates with Azure AI Vision
The Arizona Department of Transportation (ADOT) is committed to providing safe and efficient transportation services to the residents of Arizona. With a focus on innovation and customer service, ADOT’s Motor Vehicle Division (MVD) continually seeks new ways to enhance its services and improve the overall experience for its residents.

The challenge

ADOT MVD faced a tough challenge: ensuring the security and authenticity of transactions, especially those involving sensitive information. Every day, the department needs to verify thousands of customers seeking to use its online services to perform activities like updating customer information including addresses, renewing vehicle registrations, ordering replacement driver licenses, and ordering driver and vehicle records. Traditional methods of identity verification, such as manual checks and physical presence, were not only time-consuming and error-prone, but also gave the department no confidence that it was dealing with the right customer in remote interactions, such as those on its web portal. With high daily demand and stringent security requirements, the department recognized the need to enhance its digital presence and improve customer engagement. Facial verification technology has long been used to verify a user's identity for on-device and online account login because of its convenience and efficiency. However, challenges are increasing as malicious actors persist in their attempts to manipulate and deceive the system through various spoofing techniques.

The solution

To address these challenges, ADOT turned to Azure AI Vision Face API (also known as Azure Face Service) with Liveness Detection. This technology leverages advanced machine learning algorithms to verify the identity of individuals in real time. The Liveness Detection feature aims to verify that the system engages with a physically present, living individual during the verification process.
This is achieved by differentiating between a real (live) and a fake (spoof) representation, which may include photographs, videos, masks, or other means of mimicking a real person. By using facial verification and liveness detection, the system can determine whether the person in front of the camera is a live human being and not a photograph or a video. This cutting-edge technology has transformed the way the department operates, making it more efficient, secure, and reliable.

Implementation and collaboration

The department worked closely with Microsoft's team to ensure a seamless integration of the technology. "We were extremely excited to partner with Microsoft to use their passive liveness verification and facial verification all in one step," said Grant Hawkes, a contracted partner with the department’s Motor Vehicle Modernization (MvM) Project and its Lead Foundation Architect. "The Microsoft engineers were super receptive and super helpful. They would actually tweak the software a little bit for our use case, making our lives much easier. We have this wonderful working relationship with Microsoft, and they were extremely open with us, extremely receptive to ideas and whatever else it took. And we've only seen the ease of use get better and better and better."

Key benefits

ADOT MVD has realized numerous benefits from the adoption of Azure AI Vision face liveness and verification functionality:

Enhanced security: The technology has helped to reduce the risk of identity theft and fraud by enabling the verification of identities in real time, so the department can ensure that only authorized individuals can access sensitive information and complete transactions.
Improved efficiency: By streamlining the verification process, the time required for identity checks has been reduced. In addition, the department is now able to offer some services online that previously could only be done in an office, such as driver license renewals and title transfers.
Accessibility: The technology has made it easier for individuals with disabilities and the elderly to complete transactions, as they no longer have to make their way to an office for certain services. In this way, it's more inclusive and user-friendly.
Cost-effective: The Azure AI Vision face technology works seamlessly across different devices, including laptops and smartphones, without requiring expensive hardware, and fits into ADOT’s existing budget.

Verifying mobile driver's licenses (mDLs) is one of the most significant applications of this technology. Arizona was one of the first states to offer ISO 18013-5 compliant mDLs, allowing residents to store their driver's licenses on their mobile devices, making it more convenient and secure. Another notable application is electronic transfer of vehicle titles. Residents can now transfer vehicle titles electronically, eliminating the need for physical presence and paperwork. This will make the process much easier for citizens, while also making it more efficient and secure, reducing the risk of fraud.

On-demand authentication

ADOT MVD has also developed an innovative solution called on-demand authentication (ODA). This allows residents to verify their identity remotely using their mobile devices. When a resident calls ADOT MVD’s call center, they receive a text message with a link to verify their identity. The system uses Azure AI Vision to perform facial verification and liveness detection, ensuring that the person on the other end of the call is who they claim to be. "This technology has been key in mitigating fraud by increasing our confidence that we're working with the right person," said Grant Hawkes. "The whole process takes maybe a few seconds and is user-friendly for both the call center representative and the customer."

Future plans

The success of Azure AI Vision has prompted ADOT to explore further applications, and other state agencies are now looking at adopting the technology as well. "We see this growing and growing," said Grant Hawkes. "We're working to roll this technology out to more and more departments within the state as part of a unified identity solution. We see the value in this technology and what can be done with it."

ADOT’s adoption of Azure AI Vision Face liveness and verification functionality has transformed the way the department operates. By enhancing security, improving efficiency, and making services more accessible, the technology has brought significant benefits to both the department and the residents of Arizona. As the department continues to innovate and expand the use of this technology, it sets a benchmark for other states and organizations to follow.

Our commitment to Trustworthy AI

Organizations across industries are leveraging Azure AI and Copilot capabilities to drive growth, increase productivity, and create value-added experiences. We’re committed to helping organizations use and build AI that is trustworthy, meaning it is secure, private, and safe. We bring best practices and learnings from decades of researching and building AI products at scale to provide industry-leading commitments and capabilities that span our three pillars of security, privacy, and safety. Trustworthy AI is only possible when you combine our commitments, such as our Secure Future Initiative and our Responsible AI principles, with our product capabilities to unlock AI transformation with confidence.

Get started:

Learn more about Azure AI Vision.
Learn more about Face Liveness Detection, a milestone in identity verification.
See how face detection works. Try it now.
Read about Enhancing Azure AI Vision Face API with Liveness Detection.
Learn how Microsoft empowers responsible AI practices.

Real Time, Real You: Announcing General Availability of Face Liveness Detection
A Milestone in Identity Verification

We are excited to announce the general availability of our face liveness detection features, a key milestone in making identity verification both seamless and secure. As deepfake technology and sophisticated spoofing attacks continue to evolve, organizations need solutions that can verify the authenticity of an individual in real time. During the preview, we listened to customer feedback, expanded capabilities, and made significant improvements to ensure that liveness detection works across three platforms and for common use cases.

What’s New Since the Preview?

During the preview, we introduced several features that laid the foundation for secure and seamless identity verification, including an active challenge in the JavaScript library. Building on that foundation, there are improvements across the board. Here’s what’s new:

Feature Parity Across Platforms: Liveness detection’s active challenge is now available on both Android and iOS platforms, achieving full feature parity across all supported devices. This allows a consistent and seamless experience for both developers and end users on all three supported platforms.
Easy integration: The liveness detection client SDK now requires only a single function call to start the entire flow, making it easier for developers to integrate. The SDK also includes an integrated UI flow to simplify implementation, allowing a seamless developer experience across platforms.
Runtime environment safety: The liveness detection client SDK now includes an integrated safety check for untrustworthy runtime environments on both iOS and Android devices.
Accuracy and Usability Improvements: We’ve delivered numerous bug fixes and enhancements to improve detection accuracy and user experience across all supported platforms. Our solution is now faster, more intuitive, and more resilient against even the most advanced spoofing techniques.
These advancements help businesses integrate liveness detection with confidence, providing both security and convenience.

Security in Focus: Microsoft’s Commitment to Innovation

As identity verification threats continue to evolve, general availability is only the start of the journey. Microsoft is dedicated to advancing our face liveness detection technology to address evolving security challenges:

Continuous Support and Innovation: Our team is actively monitoring emerging spoofing techniques. With ongoing updates and enhancements, we ensure that our liveness detection solution adapts to new challenges. Learn more about liveness detection updates.
Security and Privacy by Design: Microsoft’s principles of security and privacy are built into every step. We provide robust support to assist customers in integrating and maintaining these solutions effectively. We process the data securely, respecting user privacy and complying with global regulations. By collaborating closely with our customers, we ensure that together, we build solutions that are not only innovative but also secure. Learn more about shared responsibility in liveness solutions.

We provide reliable, long-term solutions to help organizations stay ahead of threats.

Get Started Today

We’re excited for customers to experience the benefits of real-time liveness detection. Whether you’re safeguarding financial transactions, streamlining digital onboarding, or enabling secure logins, our solution can strengthen your security.

Explore: Learn more about integrating liveness detection into your applications with this tutorial.
Try it Out: Liveness detection is available to experience in Vision Studio.
Build with Confidence: Empower your organization with secure, real-time identity verification.
Try our sample code to see how easy it is to get started: Azure-Samples/azure-ai-vision-sdk

A Step Toward a Safer Future

With a focus on real-time, reliable identity verification, we’re making identity verification smarter, faster, and safer. As we continue to improve and evolve this solution, our goal remains the same: to protect identities, build trust, and verify that the person behind the screen is really you. Start building with liveness detection today and join us on this journey toward a more secure digital world.

Announcing Azure AI Content Understanding: Transforming Multimodal Data into Insights
Solve Common GenAI Challenges with Content Understanding

As enterprises leverage foundation models to extract insights from multimodal data and develop agentic workflows for automation, it's common to encounter issues like inconsistent output quality, ineffective pre-processing, and difficulties in scaling out the solution. Organizations often find that to handle multiple types of data, the effort is fragmented by modality, increasing the complexity of getting started. Azure AI Content Understanding is designed to eliminate these barriers, accelerating success in Generative AI workflows.

Handling Diverse Data Formats: A unified service for ingesting and transforming data of different modalities lets businesses extract insights from documents, images, videos, and audio seamlessly and simultaneously, streamlining workflows for enterprises.
Improving Output Data Accuracy: Deriving high-quality output for their use cases requires practitioners to ensure the underlying AI is customized to their needs. Using advanced AI techniques like intent clarification, and a strongly typed schema, Content Understanding can effectively parse large files to extract values accurately.
Reducing Costs and Accelerating Time-to-Value: Using confidence scores to trigger human review only when needed minimizes the total cost of processing the content. Integrating the different modalities into a unified workflow and grounding the content when applicable allows for faster reviews.

Core Features and Advantages

Azure AI Content Understanding offers a range of innovative capabilities that improve efficiency, accuracy, and scalability, enabling businesses to unlock deeper value from their content and deliver a superior experience to their end users.

Multimodal Data Ingestion and Content Extraction: The service ingests a variety of data types such as documents, images, audio, and video, transforming them into a structured format that can be easily processed and analyzed.
It instantly extracts core content from your data including transcriptions, text, faces, and more.
Data Enrichment: Content Understanding offers additional features that enhance content extraction results, such as layout elements, barcodes, and figures in documents, speaker recognition and diarization in audio, and more.
Schema Inferencing: The service offers a set of prebuilt schemas and allows you to build and customize your own to extract exactly what you need from your data. Schemas allow you to extract a variety of results, generating task-specific representations like captions, transcripts, summaries, thumbnails, and highlights. This output can be consumed by downstream applications for advanced reasoning and automation.
Post Processing: Enhances service capabilities with generative AI tools that ensure the accuracy and usability of extracted information. This includes providing confidence scores for minimal human intervention and enabling continuous improvement through user feedback.

Transformative Applications Across Industries

Azure AI Content Understanding is ideal for a wide range of use cases and industries, as it is fully customizable and allows for the input of data from multiple modalities. Here are just a few examples of scenarios Content Understanding is powering today:

Post call analytics: Customers utilize Azure AI Content Understanding to extract analytics on call center or recorded meeting data, allowing you to aggregate data on the sentiment, speakers, and content discussed, including specific names, companies, user data, and more.
Media asset management and content creation assistance: Extract key features from images and videos to better manage media assets and enable search on your data for entities like brands, setting, key products, people, and more.
Insurance claims: Analyze and process insurance claims and other low-latency batch processing scenarios to automate previously time-intensive processes.
Highlight video reel generation: With Content Understanding, you can automatically identify key moments in a video to extract highlights and summarize the full content. For example, automatically generate a first draft of highlight reels from conferences, seminars, or corporate events by identifying key moments and significant announcements.
Retrieval Augmented Generation (RAG): Ingest and enrich content of any modality to effectively find answers to common questions in scenarios like customer service agents, or power content search scenarios across all types of data.

Customer Success with Content Understanding

Customers all over the world are already finding unique and powerful ways to accelerate their inferencing and unlock insights on their data by leveraging the multimodal capabilities of Content Understanding. Here are a few examples of how customers are unlocking greater value from their data:

Philips: Philips Speech Processing Solutions (SPS) is a global leader in dictation and speech-to-text solutions, offering innovative hardware and software products that enhance productivity and efficiency for professionals worldwide. Content Understanding enables Philips to power their speech-to-result solution, allowing customers to use voice to generate accurate, ready-to-use documentation.

“With Azure AI Content Understanding, we're taking Philips SpeechLive, our speech-to-result solution, to a whole new level. Imagine speaking, and getting fully generated, accurate documents, ready to use right away, thanks to powerful AI speech analytics that work seamlessly with all the relevant data sources.” - Thomas Wagner, CTO, Philips Dictation Services

WPP: WPP, one of the world’s largest advertising and marketing services providers, is revolutionizing website experiences using Azure AI Content Understanding.
SJR, a content tech firm within WPP, is leveraging this technology for SJR Generative Experience Manager (GXM), which extracts data from all types of media on a company's website (including text, audio, video, PDFs, and images) to deliver intelligent, interactive, and personalized web experiences, with the support of WPP's AI technology company, Satalia. This enables them to convert static websites into dynamic, conversational interfaces, unlocking information buried deep within websites and presenting it as if spoken by the company's most knowledgeable salesperson. Through this innovation, WPP's SJR is enhancing customer engagement and driving conversion for their clients.

ASC: ASC Technologies is a global leader in providing software and cloud solutions for omni-channel recording, quality management, and analytics, catering to industries such as contact centers, financial services, and public safety organizations. ASC utilizes Content Understanding to enhance their compliance analytics solution, streamlining processes and improving efficiency.

“ASC expects to significantly reduce the time-to-market for its compliance analytics solutions. By integrating all the required capture modalities into one request, instead of customizing and maintaining various APIs and formats, we can cover a wide range of use cases in a much shorter time.” - Tobias Fengler, Chief Engineering Officer

Numonix: Numonix AI specializes in capturing, analyzing, and managing customer interactions across various communication channels, helping organizations enhance customer experiences and ensure regulatory compliance. They are leveraging Content Understanding to capture insights from recorded call data from both audio and video to transcribe, analyze, and summarize the contents of calls and meetings, allowing them to ensure compliance across all conversations.
“Leveraging Azure AI Content Understanding across multiple modalities has allowed us to supercharge the value of the recorded data Numonix captures on behalf of our customers. Enabling smarter communication compliance and security in the financial industry to fully automating quality management in the world’s largest call centers.” - Evan Kahan, CTO & CPO, Numonix

IPV Curator: A leader in media asset management solutions, IPV is leveraging Content Understanding to improve their metadata extraction capabilities to produce stronger industry-specific metadata, advanced action and event analysis, and align video segmentation to specific shots in videos. IPV’s clients are now able to accelerate their video production, reduce editing time, and access their content more quickly and easily.

To learn more about how Content Understanding empowers video scenarios as well as how our customers such as IPV are using the service to power their unique media applications, check out Transforming Video Content into Business Value.

Robust Security and Compliance

Built using Azure’s industry-leading enterprise security, data privacy, and Responsible AI guidelines, Azure AI Content Understanding ensures that your data is handled with the utmost care and compliance and generates responses that align with Microsoft’s principles for responsible use of AI. We are excited to see how Azure AI Content Understanding will empower organizations to unlock their data's full potential, driving efficiency and innovation across various industries. Stay tuned as we continue to develop and enhance this groundbreaking service.

Getting Started

If you are at Microsoft Ignite 2024 or are watching online, check out this breakout session on Content Understanding.
Learn more about the new Azure AI Content Understanding service here.
Build your own Content Understanding solution in the Azure AI Foundry.
For all documentation on Content Understanding, please refer to this page.

Announcing Face API Service SDKs for Liveness
The Azure AI Vision Face team updated the public preview of Liveness Detection at //build in May. You can read more about it here: You Are Real: More Secure Identity Verification - Microsoft Community Hub. We are now happy to announce the availability of client libraries for building an app server for Liveness Detection in the four core Azure languages:

.NET (C#): https://www.nuget.org/packages/Azure.AI.Vision.Face/
Java: https://central.sonatype.com/artifact/com.azure/azure-ai-vision-face
Python: https://pypi.org/project/azure-ai-vision-face/
JavaScript: https://www.npmjs.com/package/@azure-rest/ai-vision-face

The liveness solution integration consists of two separate components: a frontend mobile/web application and an app server/orchestrator. With the library, you can now easily build the app server needed for a complete liveness solution in your preferred programming language. This includes operations for creating a liveness session and for querying the liveness detection result and audit logs. You can learn more from this tutorial.

You Are Real: More Secure Identity Verification
Explore the critical role of liveness detection in safeguarding identity verification in a digital era where identity fraud risks are at an all-time high. Discover how this technology could reshape security across industries, from banking and healthcare to remote work and beyond, ensuring authenticity in every digital interaction. Learn about the latest advancements and real-world applications that make liveness detection a cornerstone of modern cybersecurity practices.

Are You Alive: Enhancing Azure AI Vision Face API with Liveness Detection
We are excited to announce the public preview of Liveness Detection, an addition to the existing Azure AI Face API service. Facial recognition technology has long been used to verify a user's identity for on-device and online account login because of its convenience and efficiency. However, the system encounters escalating challenges as malicious actors persist in their attempts to manipulate and deceive it through various spoofing techniques. This issue is expected to intensify with the emergence of generative AI tools such as DALL-E and ChatGPT. Many online services, e.g., LinkedIn and Microsoft Entra, now support “Verified Identities” that attest to there being a real human behind the identity.
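For readers curious about what the server side of a liveness check involves, the following is a minimal sketch of the "create a liveness session" step that the SDK announcement earlier on this page assigns to the app server. It uses only the Python standard library rather than the `azure-ai-vision-face` package, and it builds the HTTP request without sending it. The route in `SESSIONS_PATH` and the JSON field names reflect the preview-era REST layout as best understood here and should be treated as assumptions to verify against the current Face API reference. The function names and the example endpoint are hypothetical.

```python
import json
import urllib.request

# Assumption: preview-era route; the version segment and path may differ in
# the current Face API reference, so confirm before relying on it.
SESSIONS_PATH = "/face/v1.1-preview.1/detectLiveness/singleModal/sessions"


def build_create_session_request(endpoint, api_key, device_correlation_id,
                                 sessions_path=SESSIONS_PATH):
    """Build (but do not send) the request that asks for a new liveness session."""
    body = {
        # Assumed field names for the session request payload.
        "livenessOperationMode": "Passive",
        "deviceCorrelationId": device_correlation_id,
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + sessions_path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_session(request):
    """Send the request from the app server and return the decoded JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

In this split, the app server keeps the subscription key and calls `create_session`, while the frontend application only receives whatever short-lived session details the service returns. That matches the two-component design (frontend plus app server/orchestrator) described in the SDK post.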