We’ve had an amazing week at Microsoft Ignite! With over 80 product and feature announcements, including the launch of Azure AI Foundry (formerly Azure AI Studio), we’re excited to share the latest responsible AI updates with you.
Enhanced Model Benchmarking Experience
We’ve enhanced the model benchmarking experience in Azure AI Foundry, adding new performance metrics (e.g., latency, estimated cost, and throughput) alongside generation quality metrics. This allows users to compare base models across diverse criteria and better understand potential trade-offs.
You can also now evaluate and compare base models using your own private data. This capability simplifies the model selection process by allowing organizations to compare how different models behave in real-world settings and assess which models align best with their unique requirements.
New Risk & Safety Evaluations for Image & Multimodal Content
New risk and safety evaluations for image and multimodal content provide an out-of-the-box way to assess the frequency and severity of harmful content in generative AI interactions containing imagery. These evaluations can help inform targeted mitigations and demonstrate production-readiness.
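For illustration, here is a minimal sketch of running one of these safety evaluators over a conversation that contains an image, using the azure-ai-evaluation Python package; the project details and image URL are placeholders, not real values:

```python
import os

from azure.ai.evaluation import ViolenceEvaluator
from azure.identity import DefaultAzureCredential

# Placeholder project details; replace with your own Azure AI project.
azure_ai_project = {
    "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
    "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
    "project_name": os.environ["AZURE_AI_PROJECT_NAME"],
}

violence_eval = ViolenceEvaluator(
    credential=DefaultAzureCredential(), azure_ai_project=azure_ai_project
)

# A multimodal conversation: the user message carries an image URL.
conversation = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this picture?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": "A city skyline at dusk."}],
        },
    ]
}

# The evaluator returns a severity label, score, and reasoning for the interaction.
result = violence_eval(conversation=conversation)
print(result)
```

The same conversation format works with the other risk and safety evaluators, so you can assess several harm categories over the same multimodal data.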
Azure AI Foundry Evaluations SDK & UI
The Azure AI Foundry Evaluations SDK and UI are now both generally available. Azure AI Foundry supports a code-first Python experience through the Azure AI Evaluation SDK, which has reached version 1.0.0 for local and cloud evaluations, as well as a visualization experience and a low-code wizard for submitting evaluations via the Azure AI Foundry portal.
AI-assisted quality and NLP/ML-based evaluators are now all generally available, with improved human-alignment scores and chain-of-thought reasoning that adds transparency and explainability to evaluation scores. We’ve also made significant improvements to our generation quality measurements, making them more accurate and reliable so that evaluation results better reflect the quality of model outputs.
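For example, a local evaluation run with the 1.0.0 SDK might look like the following sketch; the data file, endpoint, and deployment name are placeholders:

```python
import os

from azure.ai.evaluation import (
    CoherenceEvaluator,
    FluencyEvaluator,
    GroundednessEvaluator,
    evaluate,
)

# Model configuration for the AI-assisted (LLM-judged) evaluators;
# the endpoint and deployment values here are placeholders.
model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": "gpt-4o",
}

# Run several evaluators over a JSONL dataset (one record per line,
# with fields such as query, response, and context).
result = evaluate(
    data="eval_data.jsonl",
    evaluators={
        "groundedness": GroundednessEvaluator(model_config),
        "coherence": CoherenceEvaluator(model_config),
        "fluency": FluencyEvaluator(model_config),
    },
    output_path="./eval_results.json",
)

# Aggregate metrics across all rows in the dataset.
print(result["metrics"])
```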
We have a new set of samples available for you to get started!
GitHub Actions for GenAI Evaluations
GitHub Actions for GenAI evaluations let developers run automated evaluations of their models and applications directly within their coding environment, for faster experimentation and iteration. These actions integrate seamlessly into existing CI/CD workflows in GitHub: you can run automated evaluations after each commit, using the Azure AI Foundry SDK to assess your applications for metrics such as groundedness, coherence, and fluency.
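For illustration, here is a hypothetical Python script that a workflow step could invoke after each commit to gate merges on evaluation results; the file names, threshold, and environment variables are assumptions for the sketch, not part of the official actions:

```python
# ci_eval_gate.py -- illustrative script a GitHub Actions step might run
# (e.g. `python ci_eval_gate.py` after checkout). Thresholds and paths
# are assumptions chosen for this example.
import os
import sys

from azure.ai.evaluation import CoherenceEvaluator, FluencyEvaluator, evaluate

model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT", "gpt-4o"),
}

# Evaluate a fixed set of test cases checked into the repository.
result = evaluate(
    data="tests/eval_cases.jsonl",
    evaluators={
        "coherence": CoherenceEvaluator(model_config),
        "fluency": FluencyEvaluator(model_config),
    },
)

# Fail the workflow run if any average score regresses below a chosen bar
# (evaluator scores are on a 1-5 scale).
failures = {
    name: score
    for name, score in result["metrics"].items()
    if isinstance(score, (int, float)) and score < 3.5
}
if failures:
    print(f"Evaluation gate failed: {failures}")
    sys.exit(1)
print("Evaluation gate passed:", result["metrics"])
```

A non-zero exit code marks the check as failed in GitHub, so quality regressions surface directly on the pull request.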
Cloud Evaluations
We now offer cloud-based evaluations to help you scale assessments on large production datasets. You can remotely initiate a job with our Azure AI Evaluation service, which will run the evaluations and return the results once they are complete. This experience is available with the Azure AI Foundry Python SDK.
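As a rough sketch, assuming the preview azure-ai-projects package (exact class and method names may differ between preview releases; see the SDK samples for the authoritative version), submitting a remote evaluation job could look like this:

```python
# Rough sketch using the preview azure-ai-projects package; names and
# signatures are assumptions based on preview releases.
import os

from azure.ai.evaluation import F1ScoreEvaluator
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import Dataset, Evaluation, EvaluatorConfiguration
from azure.identity import DefaultAzureCredential

project_client = AIProjectClient.from_connection_string(
    credential=DefaultAzureCredential(),
    conn_str=os.environ["PROJECT_CONNECTION_STRING"],  # placeholder
)

# Upload the dataset the service should evaluate.
data_id, _ = project_client.upload_file("./eval_data.jsonl")

evaluation = Evaluation(
    display_name="Remote evaluation",
    data=Dataset(id=data_id),
    evaluators={"f1_score": EvaluatorConfiguration(id=F1ScoreEvaluator.id)},
)

# Submit the job; the service runs it and stores results in the project.
created = project_client.evaluations.create(evaluation=evaluation)
print(created.id, created.status)

# Poll later for completion; results appear in the Azure AI Foundry portal.
finished = project_client.evaluations.get(created.id)
print(finished.status)
```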
AI Report
Developers will soon be able to compile key project details, including business use case, potential risks, model card, endpoint configuration, content safety settings, and evaluation results, into a unified AI report. These reports can be published to the Azure AI Foundry portal's management center for business leaders to track, review, and assess. Additionally, users can export AI reports in PDF and SPDX 3.0 AI BOM formats for integration into GRC workflows. These reports help determine whether an AI model or application is ready for production as part of AI impact assessments.
To request access to the private preview of AI reports, please complete the Interest Form.
Looking Ahead
Over the next few weeks, we’ll be updating our existing responsible AI Learn Modules to reflect recent product announcements. In the interim, we encourage you to review the following resources to familiarize yourself with all our responsible AI updates:
- Read the Ignite announcement blog: New evaluation tools for multimodal apps, benchmarking, CI/CD integration and more | Microsoft Community Hub
- Review the Microsoft Learn documentation on how to evaluate with the Azure AI Evaluation SDK (Python reference docs) and the Azure AI Foundry UI
- Watch the recording of the Microsoft Ignite session 'Trustworthy AI: Advanced AI risk and evaluation', which highlights these capabilities and more: https://ignite.microsoft.com/en-US/sessions/BRK113?source=sessions