generative ai
Demystifying Gen AI Models - Transformers Architecture: 'Attention Is All You Need'
The Transformer architecture demonstrated that carefully designed attention mechanisms — without the need for sequential recurrence — could model language and sequences more effectively and efficiently.

1. Transformers Replace Recurrence: Traditional models such as RNNs and LSTMs processed data sequentially. Transformers use self-attention mechanisms to process all tokens simultaneously, enabling parallelisation, faster training, and better handling of long-range dependencies.

2. Self-Attention is Central: Each token considers (attends to) all other tokens to gather contextual information. Attention scores are calculated between every pair of input tokens, capturing relationships irrespective of their position.

3. Multi-Head Attention Enhances Learning: Rather than relying on a single attention mechanism, the model utilises multiple attention heads. Each head independently learns different aspects of relationships (such as syntax or meaning). The outputs from all heads are then combined to produce richer representations.

4. Positional Encoding Introduced: As there is no recurrence, positional information must be introduced manually. Positional encodings (using sine and cosine functions of varying frequencies) are added to the input embeddings to maintain the order of the sequence.

5. Encoder-Decoder Structure: The model is composed of two main parts. Encoder: a stack of layers that processes the input sequence. Decoder: a stack of layers that generates the output, one token at a time (whilst attending to the encoder outputs).

6. Layer Composition: Each encoder and decoder layer includes multi-head self-attention, a feed-forward neural network (applied to each token independently), and residual connections with layer normalisation to stabilise training.

7. Scaled Dot-Product Attention: Attention scores are calculated using dot products between queries and keys, scaled by the square root of the key dimension to prevent excessively large values, before being passed through a softmax.
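Points 4 and 7 above can be made concrete with a short NumPy sketch. This is illustrative only, with toy dimensions and no learned projection matrices — not a full Transformer layer:

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal encodings (point 4): sine on even dimensions, cosine on odd."""
    pos = np.arange(seq_len)[:, None]                   # token positions (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]                # dimension pairs (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)   # sinusoids of varying frequency
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V (point 7)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # scaled pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the keys
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
# Add positional encodings to the token embeddings so order is preserved.
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
# Self-attention: queries, keys, and values all come from the same sequence.
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (5, 16)
```

In the real architecture, Q, K, and V are separate learned linear projections of x, and several such attention "heads" run in parallel before their outputs are concatenated (point 3).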
8. Simpler, Yet More Powerful: Despite removing recurrence, the Transformer outperformed more complex architectures such as stacked LSTMs on translation tasks (for instance, English-German). Training is considerably quicker (thanks to parallelism), particularly on long sequences.

9. Key Achievement: Transformers became the state-of-the-art model for many natural language processing tasks — paving the way for later innovations such as BERT, GPT, T5, and others.

The latest breakthrough in generative AI models is owed to the development of the Transformer architecture. Transformers were introduced in the 'Attention Is All You Need' paper by Vaswani et al. in 2017.

Introducing Evaluation API on Azure OpenAI Service
We are excited to announce the new Evaluations (Evals) API in Azure OpenAI Service! The Evaluation API lets users test and improve model outputs directly through API calls, making it simple and customizable for developers to programmatically assess model quality and performance in their development workflows.

Unlocking Document Intelligence: Mistral OCR Now Available in Azure AI Foundry
Every organization has a treasure trove of information—buried not in databases, but in documents. From scanned contracts and handwritten forms to research papers and regulatory filings, this knowledge often sits locked in static formats, invisible to modern AI systems. Imagine if we could teach machines not just to read, but to truly understand the structure and nuance of these documents. What if equations, images, tables, and multilingual text could be seamlessly extracted, indexed, and acted upon—at scale? That future is here. Today we are announcing the launch of Mistral OCR in the Azure AI Foundry model catalog—a state-of-the-art Optical Character Recognition (OCR) model that brings intelligent document understanding to a whole new level. Designed for speed, precision, and multilingual versatility, Mistral OCR unlocks the potential of unstructured content with unmatched performance.

From Patient Charts to Investment Reports—Built for Every Industry

Mistral OCR’s ability to extract structure from complex documents makes it transformative across a range of verticals:

Healthcare: Hospitals and health systems can digitize clinical notes, lab results, and patient intake forms, transforming scanned content into structured data for downstream AI applications—improving care coordination, automation, and insights.

Finance & Insurance: From loan applications and KYC documents to claims forms and regulatory disclosures, Mistral OCR helps financial institutions process sensitive documents faster, more accurately, and with multilingual support—ensuring compliance and improving operational efficiency.

Education & Research: Academic institutions and research teams can turn PDFs of scientific papers, course materials, and diagrams into AI-readable formats. Mistral OCR’s support for equations, charts, and LaTeX-style formatting makes it ideal for scientific knowledge extraction.
Legal & Government: With its multilingual and high-fidelity OCR capabilities, legal teams and public agencies can digitize contracts, historical records, and filings—accelerating review workflows, preserving archival materials, and enabling transparent governance.

Key Highlights of Mistral OCR

According to Mistral, their OCR model stands apart due to the following:

State-of-the-Art Document Understanding: Mistral OCR excels in parsing complex, multimodal documents—extracting tables, math, and figures with markdown-style clarity. It goes beyond recognition to deliver understanding.

Multilingual by Design: With support for dozens of languages and scripts, Mistral OCR achieves 99%+ fuzzy match scores in benchmark testing. Whether you’re working in Hindi, Arabic, French, or Chinese—this model adapts seamlessly.

Fastest in Its Class: Process up to 2,000 pages per minute on a single node. This speed makes it ideal for enterprise document pipelines and real-time applications.

Doc-as-Prompt + Structured Output: Turn documents into intelligent prompts—then extract structured, JSON-formatted output for downstream use in agents, workflows, or analytics engines.

Why use Mistral OCR on Azure AI Foundry?

Mistral OCR is now available as a serverless API through Models as a Service (MaaS) in Azure AI Foundry. This enables enterprise-scale workloads with ease:

Network Isolation for Inferencing: Protect your data from public network access.
Expanded Regional Availability: Access from multiple regions.
Data Privacy and Security: Robust measures to ensure data protection.
Quick Endpoint Provisioning: Set up an OCR endpoint in Azure AI Foundry in seconds.
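As a rough sketch of the doc-as-prompt pattern against such a serverless endpoint: the route, header names, model identifier, and payload fields below are illustrative assumptions — use the exact URL, key, and request schema shown on your deployment page in the Foundry portal.

```python
import json

def build_ocr_request(endpoint: str, api_key: str, document_url: str) -> dict:
    """Assemble a hypothetical OCR inference request (sketch only).

    The '/v1/ocr' route, bearer auth, and the 'model'/'document' fields are
    assumptions for illustration, not a documented contract.
    """
    return {
        "url": f"{endpoint}/v1/ocr",                       # assumed route
        "headers": {
            "Authorization": f"Bearer {api_key}",          # assumed auth scheme
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": "mistral-ocr",                        # assumed model name
            "document": {
                "type": "document_url",
                "document_url": document_url,              # the doc becomes the prompt
            },
        }),
    }

req = build_ocr_request("https://example.inference.ai.azure.com",
                        "<API_KEY>",
                        "https://example.com/contract.pdf")
# Send with any HTTP client, e.g.:
#   requests.post(req["url"], headers=req["headers"], data=req["body"])
# and parse the structured JSON response for downstream agents or workflows.
print(req["url"])
```

Building the request as data first keeps the sketch independent of any particular HTTP client and makes the payload easy to log and inspect.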
Azure AI ensures seamless integration, enhanced security, and rapid deployment for your AI needs.

How to deploy the Mistral OCR model in Azure AI Foundry?

Prerequisites:
If you don’t have an Azure subscription, get one here: https://azure.microsoft.com/en-us/pricing/purchase-options/pay-as-you-go
Familiarize yourself with the Azure AI Model Catalog.
Create an Azure AI Foundry hub and project. Make sure you pick East US, West US 3, South Central US, West US, North Central US, East US 2, or Sweden Central as the Azure region for the hub.

Create a deployment to obtain the inference API and key:
Open the model card in the model catalog on Azure AI Foundry.
Click on Deploy and select the Pay-as-you-go option.
Subscribe to the Marketplace offer and deploy. You can also review the API pricing at this step.
You should land on the deployment page, which shows you the API and key, in less than a minute.

These steps are outlined in detail in the product documentation.

From Documents to Decisions

The ability to extract meaning from documents—accurately, at scale, and across languages—is no longer a bottleneck. With Mistral OCR now available in Azure AI Foundry, organizations can move beyond basic text extraction to unlock true document intelligence. This isn’t just about reading documents. It’s about transforming how we interact with the knowledge they contain. Try it. Build with it. And see what becomes possible when documents speak your language.

Integrate Custom Azure AI Agents with Copilot Studio and M365 Copilot
In today's fast-paced digital world, integrating custom agents with Copilot Studio and M365 Copilot can significantly enhance your company's digital presence and extend your Copilot platform to your enterprise applications and data. This blog will guide you through the steps of bringing a custom Azure AI Agent Service agent, hosted in an Azure Function App, into a Copilot Studio solution and publishing it to M365 and Teams applications.

When Might This Be Necessary: Integrating custom agents with Copilot Studio and M365 Copilot is necessary when you want to automate tasks, streamline processes, and provide a better experience for your end users. This integration is particularly useful for organizations looking to streamline their AI platform, extend out-of-the-box functionality, and leverage existing enterprise data and applications to optimize their operations. Custom agents built on Azure allow you to achieve greater customization and flexibility than using Copilot Studio agents alone.

What You Will Need: To get started, you will need the following:
Azure AI Foundry
Azure OpenAI Service
Copilot Studio Developer License
Microsoft Teams Enterprise License
M365 Copilot License

Steps to Integrate Custom Agents:

Create a Project in Azure AI Foundry: Navigate to Azure AI Foundry and create a project. Select 'Agents' from the 'Build and Customize' menu pane on the left side of the screen and click the blue button to create a new agent.

Customize Your Agent: Your agent will automatically be assigned an Agent ID. Give your agent a name, assign the model your agent will use, and customize your agent with instructions. Then add your knowledge source: you can connect to Azure AI Search, load files directly to your agent, link to Microsoft Fabric, or connect to third-party sources like Tripadvisor.
In our example, we are only testing the Copilot integration steps of the AI agent, so we did not build out the additional options of providing grounding knowledge or function calling here.

Test Your Agent: Once you have created your agent, test it in the playground. If you are happy with it, you are ready to call the agent from an Azure Function.

Create and Publish an Azure Function: Use the sample function code from the GitHub repository (azure-ai-foundry-agent/function_app.py at main · azure-data-ai-hub/azure-ai-foundry-agent) to call the Azure AI project and agent, then publish your Azure Function to make it available for integration.

Connect Your AI Agent to Your Function: Update the "AIProjectConnString" value to include your project connection string from the project overview page in the AI Foundry.

Role-Based Access Controls: We have to add a role for the Function App on the Azure OpenAI service (see Role-based access control for Azure OpenAI - Azure AI services | Microsoft Learn):
Enable managed identity on the Function App.
Grant the "Cognitive Services OpenAI Contributor" role to the Function App's system-assigned managed identity in the Azure OpenAI resource.
Grant the "Azure AI Developer" role to the Function App's system-assigned managed identity in the Azure AI Project resource from the AI Foundry.

Build a Flow in Power Platform: Move into the Power Platform (https://make.powerapps.com) to build a flow that connects your Copilot Studio solution to your Azure Function App. When creating a new flow, select 'Build an instant cloud flow' and trigger the flow using 'Run a flow from Copilot'. Add an HTTP action to call the Function using its URL, passing the message prompt from the end user. The output of your function is plain text, so you can pass the response from your Azure AI agent directly to your Copilot Studio solution.

Create Your Copilot Studio Agent: Navigate to Microsoft Copilot Studio and select 'Agents', then 'New Agent'.
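The HTTP action in the flow passes the end user's prompt to the Function, which must return plain text. A rough, illustrative sketch of that contract follows — this is not the sample repository's code, and the 'prompt' field name and helper functions are assumptions to match whatever payload your flow actually sends:

```python
import json

def extract_prompt(request_body: str) -> str:
    """Pull the user's message out of the HTTP action's JSON payload.

    The 'prompt' key is an assumption for illustration -- use the field
    name your Power Automate flow sends.
    """
    payload = json.loads(request_body)
    return payload.get("prompt", "")

def handle_request(request_body: str) -> str:
    """Prompt in, plain text out: the shape Copilot Studio expects."""
    prompt = extract_prompt(request_body)
    # In the real Function this is where you would call your Azure AI Foundry
    # agent (authenticated via the Function App's managed identity and the
    # AIProjectConnString) and collect the run's final message.
    agent_reply = f"(agent response to: {prompt})"  # placeholder for the agent call
    return agent_reply  # plain text body returned to the flow

print(handle_request(json.dumps({"prompt": "What is our travel policy?"})))
```

Because the response is plain text rather than structured JSON, the flow can hand it straight to the Copilot Studio topic with no parsing step.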
Now select the ‘Create’ button at the top of the screen. From the top menu, navigate to ‘Topics’ and then ‘System’, and open the ‘Conversation boosting’ topic. When you first open the Conversation boosting topic, you will see a template of connected nodes. Delete all but the initial ‘Trigger’ node. Now we will rebuild the conversation boosting topic to call the flow you built in the previous step. Select ‘Add an Action’ and then select the option for an existing Power Automate flow. Pass the response from your custom agent to the end user and end the current topic. When the action menu pops up, you should see the option to run the flow you created. Here, mine does not have a very unique name, but you can see my flow ‘Run a flow from Copilot’ as a Basic actions menu item. Now complete building out the conversation boosting topic.

Make Agent Available in M365 Copilot: Navigate to the ‘Channels’ menu and select ‘Teams + Microsoft 365’. Be sure to select the box to ‘Make agent available in M365 Copilot’. Save and re-publish your Copilot agent. It may take up to 24 hours for the agent to appear in the M365 Teams agents list. Once it has loaded, select the ‘Get Agents’ option from the side menu of Copilot and pin your Copilot Studio agent to your featured agents list. Now you can chat with your custom Azure AI agent directly from M365 Copilot!

Conclusion: By following these steps, you can successfully integrate custom Azure AI agents with Copilot Studio and M365 Copilot, enhancing the utility of your existing platform and improving operational efficiency. This integration allows you to automate tasks, streamline processes, and provide a better experience for your end users. Give it a try! Curious how to bring custom models from your AI Foundry into your Copilot Studio solutions? Check out this blog.

The Future of AI: Computer Use Agents Have Arrived
Discover the groundbreaking advancements in AI with Computer Use Agents (CUAs). In this blog, Marco Casalaina shares how to use the Responses API from Azure OpenAI Service, showcasing how CUAs can launch apps, navigate websites, and reason through tasks. Learn how CUAs utilize multimodal models for computer vision and AI frameworks to enhance automation. Explore the differences between CUAs and traditional Robotic Process Automation (RPA), and understand how CUAs can complement RPA systems. Dive into the future of automation and see how CUAs are set to revolutionize the way we interact with technology.

The Future of AI: Harnessing AI for E-commerce - personalized shopping agents
Explore the development of personalized shopping agents that enhance user experience by providing tailored product recommendations based on uploaded images. Leveraging Azure AI Foundry, these agents analyze images for apparel recognition and generate intelligent product recommendations, creating a seamless and intuitive shopping experience for retail customers.

Build an AI Powered Image App – Microsoft Learn Challenge
A new Microsoft Learn challenge module just dropped and it’s a perfect bite-sized project to get a taste for the latest AI technology and how you can start using it in fun and fast ways. In this module, Challenge project - Add image analysis and generation capabilities to your application, you will combine different AI image technology while deploying it to a web app, resulting in a great demo project to show off your skills in an image analysing and image generating app.

The Future of AI: Unleashing the Potential of AI Translation
The Co-op Translator automates the translation of markdown files and text within images using Azure AI Foundry. This open-source tool leverages advanced Large Language Model (LLM) technology through Azure OpenAI Services and Azure AI Vision to provide high-quality translations. Designed to break language barriers, the Co-op Translator features an easy-to-use command line interface and Python package, making technical content globally accessible with minimal manual effort.