We’ve had an amazing week at Microsoft Ignite! With over 80 product and feature announcements, including the launch of Azure AI Foundry (formerly Azure AI Studio), we’re excited to share the latest responsible AI updates with you.
Enhanced Model Benchmarking Experience
We’ve enhanced the model benchmarking experience in Azure AI Foundry, adding new performance metrics (e.g., latency, estimated cost, and throughput) alongside generation quality metrics. This allows users to compare base models across diverse criteria and better understand potential trade-offs.
You can also now evaluate and compare base models using your own private data. This capability simplifies the model selection process by allowing organizations to compare how different models behave in real-world settings and assess which models align best with their unique requirements.
New Risk & Safety Evaluations for Image & Multimodal Content
New risk and safety evaluations for image and multimodal content provide an out-of-the-box way to assess the frequency and severity of harmful content in generative AI interactions containing imagery. These evaluations can help inform targeted mitigations and demonstrate production-readiness.
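For illustration, here is a minimal sketch of running one of these safety evaluators over a conversation that contains an image, using the azure-ai-evaluation Python package; the project details and image URL are placeholders, not real values:

```python
import os

from azure.ai.evaluation import ViolenceEvaluator
from azure.identity import DefaultAzureCredential

# Placeholder project details; replace with your own Azure AI project.
azure_ai_project = {
    "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
    "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
    "project_name": os.environ["AZURE_AI_PROJECT_NAME"],
}

violence_eval = ViolenceEvaluator(
    credential=DefaultAzureCredential(), azure_ai_project=azure_ai_project
)

# A multimodal conversation: the user message carries an image URL.
conversation = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": "A city skyline at dusk."}],
        },
    ]
}

# The evaluator returns a severity label, score, and reasoning for the interaction.
result = violence_eval(conversation=conversation)
print(result)
```

The same conversation format works with the other risk and safety evaluators, so you can assess several harm categories over the same multimodal data.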
Azure AI Foundry Evaluations SDK & UI
The Azure AI Foundry Evaluations SDK and UI are now both generally available. Azure AI Foundry supports a code-first Python experience through the Azure AI Evaluation SDK, which has reached version 1.0.0 for local and cloud evaluations, as well as a visualization experience and a low-code wizard for submitting evaluations via the Azure AI Foundry portal.
AI-assisted quality and NLP/ML-based evaluators are now all generally available, with improved human-alignment scores and chain-of-thought reasoning that adds transparency and explainability to evaluation scores. We’ve also made significant improvements to our generation quality measurements, making them more accurate and reliable so that evaluation results better reflect the quality of model outputs.
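For example, a local evaluation run with the 1.0.0 SDK might look like the following sketch; the data file, endpoint, and deployment name are placeholders:

```python
import os

from azure.ai.evaluation import (
    CoherenceEvaluator,
    FluencyEvaluator,
    GroundednessEvaluator,
    evaluate,
)

# Model configuration for the AI-assisted (LLM-judged) evaluators;
# the endpoint and deployment values here are placeholders.
model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": "gpt-4o",
}

# Run several evaluators over a JSONL dataset (one record per line,
# with fields such as query, response, and context).
result = evaluate(
    data="eval_data.jsonl",
    evaluators={
        "groundedness": GroundednessEvaluator(model_config),
        "coherence": CoherenceEvaluator(model_config),
        "fluency": FluencyEvaluator(model_config),
    },
    output_path="./eval_results.json",
)

# Aggregate metrics across all rows in the dataset.
print(result["metrics"])
```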
We have a new set of samples available for you to get started!
GitHub Actions for GenAI Evaluations
GitHub Actions for GenAI evaluations let developers run automated evaluations of their models and applications directly within their coding environment, for faster experimentation and iteration. These actions integrate seamlessly into existing CI/CD workflows in GitHub: you can run automated evaluations after each commit, using the Azure AI Foundry SDK to assess your applications for metrics such as groundedness, coherence, and fluency.
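For illustration, here is a hypothetical Python script that a workflow step could invoke after each commit to gate merges on evaluation results; the file names, threshold, and environment variables are assumptions for the sketch, not part of the official actions:

```python
# ci_eval_gate.py -- illustrative script a GitHub Actions step might run
# (e.g. `python ci_eval_gate.py` after checkout). Thresholds and paths
# are assumptions chosen for this example.
import os
import sys

from azure.ai.evaluation import CoherenceEvaluator, FluencyEvaluator, evaluate

model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-4o"),
}

# Evaluate a fixed set of test cases checked into the repository.
result = evaluate(
    data="tests/eval_cases.jsonl",
    evaluators={
        "coherence": CoherenceEvaluator(model_config),
        "fluency": FluencyEvaluator(model_config),
    },
)

# Fail the workflow run if any average score regresses below a chosen bar
# (evaluator scores are on a 1-5 scale).
failures = {
    name: score
    for name, score in result["metrics"].items()
    if isinstance(score, (int, float)) and score < 3.5
}
if failures:
    print(f"Evaluation gate failed: {failures}")
    sys.exit(1)
print("Evaluation gate passed:", result["metrics"])
```

A non-zero exit code marks the check as failed in GitHub, so quality regressions surface directly on the pull request.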
Cloud Evaluations
We now offer cloud-based evaluations to help you scale assessments on large production datasets. You can remotely initiate a job with our Azure AI Evaluation service, which will run the evaluations and return the results once they are complete. This experience is available with the Azure AI Foundry Python SDK.
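As a rough sketch, assuming the preview azure-ai-projects package (exact class and method names may differ between preview releases; see the SDK samples for the authoritative version), submitting a remote evaluation job could look like this:

```python
# Rough sketch using the preview azure-ai-projects package; names and
# signatures are assumptions based on preview releases.
import os

from azure.ai.evaluation import F1ScoreEvaluator
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import Dataset, Evaluation, EvaluatorConfiguration
from azure.identity import DefaultAzureCredential

project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ["PROJECT_CONNECTION_STRING"],  # placeholder
)

# Upload the dataset the service should evaluate.
data_id, _ = project_client.upload_file("./eval_data.jsonl")

evaluation = Evaluation(
    display_name="Remote evaluation",
    data=Dataset(id=data_id),
    evaluators={"f1_score": EvaluatorConfiguration(id=F1ScoreEvaluator.id)},
)

# Submit the job; the service runs it and stores results in the project.
created = project_client.evaluations.create(evaluation=evaluation)
print(created.id, created.status)

# Poll later for completion; results appear in the Azure AI Foundry portal.
finished = project_client.evaluations.get(created.id)
print(finished.status)
```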
AI Report
Developers will soon be able to compile key project details, including business use case, potential risks, model card, endpoint configuration, content safety settings, and evaluation results, into a unified AI report. These reports can be published to the Azure AI Foundry portal's management center for business leaders to track, review, and assess. Additionally, users can export AI reports in PDF and SPDX 3.0 AI BOM formats for integration into GRC workflows. These reports help determine whether an AI model or application is ready for production as part of AI impact assessments.
To request access to the private preview of AI reports, please complete the Interest Form.
Looking Ahead
Over the next few weeks, we’ll be updating our existing responsible AI Learn Modules to reflect recent product announcements. In the interim, we encourage you to review the following resources to familiarize yourself with all our responsible AI updates:
- Read the Ignite announcement blog: New evaluation tools for multimodal apps, benchmarking, CI/CD integration and more | Microsoft Community Hub
- Review the Microsoft Learn documentation on how to evaluate with the Azure AI Evaluation SDK (Python reference docs) and the Azure AI Foundry UI
- Watch the recording of the Microsoft Ignite session 'Trustworthy AI: Advanced AI risk and evaluation', which highlights these capabilities and more: https://ignite.microsoft.com/en-US/sessions/BRK113?source=sessions