Forum Discussion
stevedorward
May 31, 2024Copper Contributor
Responsible AI
Hi, listed to the recent Responsible AI session at Build. Is it an oversight on MS part that the responsible AI dashboard and score is only in ML Studio and not Open AI studio? What responsible AI tools are available in Open AI studio
Content Safety Studio really interested in picking the right tool.
many thank steve
stevedorward thanks for your question and yes! Azure AI Studio provides many tools to help you mitigate, measure, and manage risks specific to generative AI applications. For example, Azure AI Content Safety is a tool that can detect and filter problematic inputs and outputs for your GenAI application (eg. hallucinations, prompt injection attacks). You can experiment with different configurations and run manual and/or automated evaluations for your app to compare which configuration results in outputs most aligned to your goals.
ICYMI we had two breakout sessions at Build that go into more detail than what was covered in Scott Guthre’s keynote:
Operationalize AI responsibly with Azure AI Studio (microsoft.com)
Safeguard your copilot with Azure AI (microsoft.com)
To your question on RAI dashboard: that is a useful tool for assessing and debugging traditional ML models. We've found LLMs require different techniques. Within Azure AI Studio, we offer evaluations to help you compare/select the right foundation models for your app and continually assess (and improve) your app for quality and safety before deploying to production. We support a range of pre-built evaluation metrics for quality (eg. F1 score, Groundedness, fluency) and safety (eg. jailbreak defect rate, hateful and unfair content, etc.), and customers can also define their own metrics.Documentation and blogs on LLM evaluations, specifically:
Announced at Build: Evaluation and Monitoring for LLMs (microsoft.com)
Announced in March: Introducing AI-assisted safety evaluations in Azure AI Studio (microsoft.com)
LLM evaluation concept docs: Evaluation of generative AI applications with Azure AI Studio - Azure AI Studio | Microsoft LearnLLM evaluation metrics: Evaluation and monitoring metrics for generative AI - Azure AI Studio | Microsoft Learn
Run evaluations with prompt flow SDK: Evaluate with the prompt flow SDK - Azure AI Studio | Microsoft Learn
Run evaluations with Azure AI Studio UI: How to evaluate generative AI apps with Azure AI Studio - Azure AI Studio | Microsoft Learn
If you’re using Azure OpenAI, please also check out these resources:
Azure OpenAI Service content filtering - Azure OpenAI | Microsoft Learn
System message framework and template recommendations for Large Language Models(LLMs) - Azure OpenAI Service | Microsoft Learn
How to use Risks & Safety monitoring in Azure OpenAI Studio - Azure OpenAI Service | Microsoft Learn
- mmonsma
Microsoft
stevedorward thanks for your question and yes! Azure AI Studio provides many tools to help you mitigate, measure, and manage risks specific to generative AI applications. For example, Azure AI Content Safety is a tool that can detect and filter problematic inputs and outputs for your GenAI application (eg. hallucinations, prompt injection attacks). You can experiment with different configurations and run manual and/or automated evaluations for your app to compare which configuration results in outputs most aligned to your goals.
ICYMI we had two breakout sessions at Build that go into more detail than what was covered in Scott Guthre’s keynote:
Operationalize AI responsibly with Azure AI Studio (microsoft.com)
Safeguard your copilot with Azure AI (microsoft.com)
To your question on RAI dashboard: that is a useful tool for assessing and debugging traditional ML models. We've found LLMs require different techniques. Within Azure AI Studio, we offer evaluations to help you compare/select the right foundation models for your app and continually assess (and improve) your app for quality and safety before deploying to production. We support a range of pre-built evaluation metrics for quality (eg. F1 score, Groundedness, fluency) and safety (eg. jailbreak defect rate, hateful and unfair content, etc.), and customers can also define their own metrics.Documentation and blogs on LLM evaluations, specifically:
Announced at Build: Evaluation and Monitoring for LLMs (microsoft.com)
Announced in March: Introducing AI-assisted safety evaluations in Azure AI Studio (microsoft.com)
LLM evaluation concept docs: Evaluation of generative AI applications with Azure AI Studio - Azure AI Studio | Microsoft LearnLLM evaluation metrics: Evaluation and monitoring metrics for generative AI - Azure AI Studio | Microsoft Learn
Run evaluations with prompt flow SDK: Evaluate with the prompt flow SDK - Azure AI Studio | Microsoft Learn
Run evaluations with Azure AI Studio UI: How to evaluate generative AI apps with Azure AI Studio - Azure AI Studio | Microsoft Learn
If you’re using Azure OpenAI, please also check out these resources:
Azure OpenAI Service content filtering - Azure OpenAI | Microsoft Learn
System message framework and template recommendations for Large Language Models(LLMs) - Azure OpenAI Service | Microsoft Learn
How to use Risks & Safety monitoring in Azure OpenAI Studio - Azure OpenAI Service | Microsoft Learn