azure ai
280 TopicsGemma 4 now available in Microsoft Foundry
Experimenting with open-source models has become a core part of how innovative AI teams stay competitive: experimenting with the latest architectures and often fine-tuning on proprietary data to achieve lower latencies and cost. Today, we’re happy to announce that the Gemma 4 family, Google DeepMind’s newest model family, is now available in Microsoft Foundry via the Hugging Face collection. Azure customers can now discover, evaluate, and deploy Gemma 4 inside their Azure environment with the same policies they rely on for every other workload. Foundry is the only hyperscaler platform where developers can access OpenAI, Anthropic, Gemma, and over 11,000+ models under a single control plane. Through our close collaboration with Hugging Face, Gemma 4 joining that collection continues Microsoft’s push to bring customers the widest selection of models from any cloud – and fits in line with our enhanced investments in open-source development. Frontier Intelligence, open-source weights Released by Google DeepMind on April 2, 2026, Gemma 4 is built from the same research foundation as Gemini 3 and packaged as open weights under an Apache 2.0 license. Key capabilities across the Gemma 4 family: Native multimodal: Text + image + video inputs across all sizes; analyze video by processing sequences of frames; audio input on edge models (E2B, E4B) Enhanced reasoning & coding capabilities: Multi-step planning, deep logic, and improvements in math and instruction-following enabling autonomous agents Trained for global deployment: Pretrained on 140+ languages with support for 35+ languages out of the box Long context: Context windows of up to 128K tokens (E2B/E4B) and 256K tokens (26B A4B/31B) allow developers to reason across extensive codebases, lengthy documents, or multi-session histories Why choose Foundry? Foundry is built to give developers breadth -- access to models from major model providers, open and proprietary, under one roof. Stay within Azure to work leading models. When you deploy through Foundry, models run inside your Azure environment and are subject to the same network policies, identity controls, and audit processes your organization already has in place. Managed online endpoints handle serving, scaling, and monitoring without manually setting up and managing the underlying infrastructure. Serverless deployment with Azure Container Apps allows developers to deploy and run containerized applications while reducing infrastructure management and saving costs. Gated model access integrates directly with Hugging Face user tokens, so models that require license acceptance stay compliant can be accessed without manual approvals. Foundry Local lets you run optimized Hugging Face models directly on your own hardware using the same model catalog and SDK patterns as your cloud deployments. Read the documentation here: https://aka.ms/foundrylocal and https://aka.ms/HF/foundrylocal Microsoft’s approach to Responsible AI is grounded in our AI principles of fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability. Microsoft Foundry provides governance controls, monitoring, and evaluation capabilities to help organizations deploy new models responsibly in production environments. What are teams building with Gemma 4 in Foundry Gemma 4’s combination of multimodal input, agentic function calling, and long context offers a wide range of production use cases: Document intelligence: Processing PDFs, charts, invoices, and complex tables using native vision capabilities Multilingual enterprise apps: 140+ natively trained languages — ideal for multinational customer support, content platforms as well as language learning tools for grammar correction and writing practice Long-context analytics: Reasoning across entire codebases, legal documents, or multi-session conversation histories Getting started Try Gemma 4 in Microsoft Foundry today. New models from Hugging Face continue to roll out to Foundry on a regular basis through our ongoing collaboration. If there's a model you want to see added, let us know here. Stay connected to our developer community on Discord and stay up to date on what is new in Foundry through the Model Mondays series.202Views1like0CommentsNow in Foundry: Microsoft Harrier and NVIDIA EGM-8B
This week's Model Mondays edition highlights three models that share a common thread: each achieves results comparable to larger leading models, as a result of targeted training strategies rather than scale. Microsoft Research's harrier-oss-v1-0.6b from achieves state-of-the-art results on the Multilingual MTEB v2 embedding benchmark at 0.6B parameters through contrastive learning and knowledge distillation. NVIDIA's EGM-8B scores 91.4 average IoU on the RefCOCO visual grounding benchmark by training a small Vision Language Model (VLM) with reinforcement learning to match the output quality of much larger models. Together they represent a practical argument for efficiency-first model development: the gap between small and large models continues to narrow when training methodology is the focus rather than parameter count alone. Models of the week Microsoft Research: harrier-oss-v1-0.6b Model Specs Parameters / size: 0.6B Context length: 32,768 tokens Primary task: Text embeddings (retrieval, semantic similarity, classification, clustering, reranking) Why it's interesting State-of-the-art on Multilingual MTEB v2 from Microsoft Research: harrier-oss-v1-0.6b is a new embedding model released by Microsoft Research, achieving a 69.0 score on the Multilingual MTEB v2 (Massive Text Embedding Benchmark) leaderboard—placing it at the top of its size class at release. It is part of the harrier-oss family spanning harrier-oss-v1-270m (66.5 MTEB v2), harrier-oss-v1-0.6b (69.0), and harrier-oss-v1-27b (74.3), with the 0.6B variant further trained with knowledge distillation from the larger family members. Benchmarks: Multilingual MTEB v2 Leaderboard. Decoder-only architecture with task-instruction queries: Unlike most embedding models that use encoder-only transformers, harrier-oss-v1-0.6b uses a decoder-only architecture with last-token pooling and L2 normalization. Queries are prefixed with a one-sentence task instruction (e.g., "Instruct: Retrieve relevant passages that answer the query\nQuery: ...") while documents are encoded without instructions—allowing the same deployed model to be specialized for retrieval, classification, or similarity tasks through the prompt alone. Broad task coverage across six embedding scenarios: The model is trained and evaluated on retrieval, clustering, semantic similarity, classification, bitext mining, and reranking—making it suitable as a general embedding backbone for multi-task pipelines rather than a single-use retrieval model. One endpoint, consistent embeddings across the stack. 100+ language support: Trained on a large-scale mixture of multilingual data covering Arabic, Chinese, Japanese, Korean, and 100+ additional languages, with strong cross-lingual transfer for tasks that span language boundaries. Try it Use Case Prompt Pattern Multilingual semantic search Prepend task instruction to query; encode documents without instruction; rank by cosine similarity Cross-lingual document clustering Embed documents across languages; apply clustering to group semantically related content Text classification with embeddings Encode labeled examples + new text; classify by nearest-neighbor similarity in embedding space Bitext mining Encode parallel corpora in source and target languages; align segments by embedding similarity Sample prompt for a global enterprise knowledge base deployment: You are building a multilingual internal knowledge base for a global professional services firm. Using the harrier-oss-v1-0.6b endpoint deployed in Microsoft Foundry, encode all internal documents—policy guides, project case studies, and technical documentation—across English, French, German, and Japanese. At query time, prepend the task instruction to each employee query: "Instruct: Retrieve relevant internal documents that answer the employee's question\nQuery: {question}". Retrieve the top-5 most similar documents by cosine similarity and pass them to a language model with the instruction: "Using only the provided documents, answer the question and cite the source document title for each claim. If no document addresses the question, say so." NVIDIA: EGM-8B Model Specs Parameters / size: ~8.8B Context length: 262,144 tokens Primary task: Image-text-to-text (visual grounding) Why it's interesting Preforms well on visual grounding compared to larger models even at its small size: EGM-8B achieves 91.4 average Intersection over Union (IoU) on the RefCOCO benchmark—the standard measure of how accurately a model localizes a described region within an image. Compared to its base model Qwen3-VL-8B-Thinking (87.8 IoU), EGM-8B achieves a +3.6 IoU gain through targeted Reinforcement Learning (RL) fine-tuning. Benchmarks: EGM Project Page. 5.9x faster than larger models at inference: EGM-8B achieves 737ms average latency. The research demonstrates that test-time compute can be scaled horizontally across small models—generating many medium-quality responses and selecting the best—rather than relying on a single expensive forward pass through a large model. Two-stage training: EGM-8B is trained first with Supervised Fine-Tuning (SFT) on detailed chain-of-thought reasoning traces generated by a proprietary VLM, then refined with Group Relative Policy Optimization (GRPO) using a reward function combining IoU accuracy and task success. The intermediate SFT checkpoint is available as nvidia/EGM-8B-SFT for developers who want to experiment with the intermediate stage. Addresses a root cause of small model grounding errors: The EGM research identifies that 62.8% of small model errors on visual grounding stem from complex multi-relational descriptions—where a model must reason about spatial relationships, attributes, and context simultaneously. By focusing test-time compute on reasoning through these complex prompts, EGM-8B closes the gap without increasing the underlying model size. Try it Use Case Prompt Pattern Object localization Submit image + natural language description; receive bounding box coordinates Document region extraction Provide scanned document image + field description; extract specific regions Visual quality control Submit product image + defect description; localize defect region for downstream classification Retail shelf analysis Provide shelf image + product description; return location of specified SKU Sample prompt for a retail and logistics deployment: You are building a visual inspection system for a logistics warehouse. Using the EGM-8B endpoint deployed in Microsoft Foundry, submit each incoming package scan image along with a natural language grounding query describing the region of interest: "Please provide the bounding box coordinate of the region this sentence describes: {description}". For example: "the label on the upper-left side of the box", "the barcode on the bottom face", or "the damaged corner on the right side". Use the returned bounding box coordinates to route each package to the appropriate inspection station based on the identified region. Getting started You can deploy open-source Hugging Face models directly in Microsoft Foundry by browsing the Hugging Face collection in the Foundry model catalog and deploying to managed endpoints in just a few clicks. You can also start from the Hugging Face Hub. First, select any supported model and then choose "Deploy on Microsoft Foundry", which brings you straight into Azure with secure, scalable inference already configured. Learn how to discover models and deploy them using Microsoft Foundry documentation: Follow along the Model Mondays series and access the GitHub to stay up to date on the latest Read Hugging Face on Azure docs Learn about one-click deployments from the Hugging Face Hub on Microsoft Foundry Explore models in Microsoft Foundry209Views0likes0CommentsPower Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration
Introduction Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn your plain Open WebUI into a full-on voice assistant. In this post, you’ll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech transforms into text and your app’s responses leap off the screen in a human-like voice. By the end of this guide, you’ll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let’s dive in. Why Azure AI Speech? We use Azure AI Speech service in Open Web UI to enable voice interactions directly within web applications. This allows users to: Speak commands or input instead of typing, making the interface more accessible and user-friendly. Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio. Provide a more natural and hands-free experience especially on devices like smartphones or tablets. In short, integrating Azure AI Speech service into Open Web UI helps make web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven’t hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Open WebUI deployed already. Learn More about OpenWeb UI here. Deploy Azure AI Speech service in Azure. Navigate to the Azure Portal and search for Azure AI Speech on the Azure portal search bar. Create a new Speech Service by filling up the fields in the resource creation page. Click on “Create” to finalize the setup. After the resource has been deployed, click on “View resource” button and you should be redirected to the Azure AI Speech service page. The page should display the API Keys and Endpoints for Azure AI Speech services, which you can use in Open Web UI. Settings things up in Open Web UI Speech to Text settings (STT) Head to the Open Web UI Admin page > Settings > Audio. Paste the API Key obtained from the Azure AI Speech service page into the API key field below. Unless you use different Azure Region, or want to change the default configurations for the STT settings, leave all settings to blank. Text to Speech settings (TTS) Now, let's proceed with configuring the TTS Settings on OpenWeb UI by toggling the TTS Engine to Azure AI Speech option. Again, paste the API Key obtained from Azure AI Speech service page and leave all settings to blank. You can change the TTS Voice from the dropdown selection in the TTS settings as depicted in the image below: Click Save to reflect the change. Expected Result Now, let’s test if everything works well. Open a new chat / temporary chat on Open Web UI and click on the Call / Record button. The STT Engine (Azure AI Speech) should identify your voice and provide a response based on the voice input. To test the TTS feature, click on the Read Aloud (Speaker Icon) under any response from Open Web UI. The TTS Engine should reflect Azure AI Speech service! Conclusion And that’s a wrap! You’ve just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure’s neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine. From here, it’s all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.2KViews3likes2CommentsVector Drift in Azure AI Search: Three Hidden Reasons Your RAG Accuracy Degrades After Deployment
What Is Vector Drift? Vector drift occurs when embeddings stored in a vector index no longer accurately represent the semantic intent of incoming queries. Because vector similarity search depends on relative semantic positioning, even small changes in models, data distribution, or preprocessing logic can significantly affect retrieval quality over time. Unlike schema drift or data corruption, vector drift is subtle: The system continues to function Queries return results But relevance steadily declines Cause 1: Embedding Model Version Mismatch What Happens Documents are indexed using one embedding model, while query embeddings are generated using another. This typically happens due to: Model upgrades Shared Azure OpenAI resources across teams Inconsistent configuration between environments Why This Matters Embeddings generated by different models: Exist in different vector spaces Are not mathematically comparable Produce misleading similarity scores As a result, documents that were previously relevant may no longer rank correctly. Recommended Practice A single vector index should be bound to one embedding model and one dimension size for its entire lifecycle. If the embedding model changes, the index must be fully re-embedded and rebuilt. Cause 2: Incremental Content Updates Without Re-Embedding What Happens New documents are continuously added to the index, while existing embeddings remain unchanged. Over time, new content introduces: Updated terminology Policy changes New product or domain concepts Because semantic meaning is relative, the vector space shifts—but older vectors do not. Observable Impact Recently indexed documents dominate retrieval results Older but still valid content becomes harder to retrieve Recall degrades without obvious system errors Practical Guidance Treat embeddings as living assets, not static artifacts: Schedule periodic re-embedding for stable corpora Re-embed high-impact or frequently accessed documents Trigger re-embedding when domain vocabulary changes meaningfully Declining similarity scores or reduced citation coverage are often early signals of drift. Cause 3: Inconsistent Chunking Strategies What Happens Chunk size, overlap, or parsing logic is adjusted over time, but previously indexed content is not updated. The index ends up containing chunks created using different strategies. Why This Causes Drift Different chunking strategies produce: Different semantic density Different contextual boundaries Different retrieval behavior This inconsistency reduces ranking stability and makes retrieval outcomes unpredictable. Governance Recommendation Chunking strategy should be treated as part of the index contract: Use one chunking strategy per index Store chunk metadata (for example, chunk_version) Rebuild the index when chunking logic changes Design Principles Versioned embedding deployments Scheduled or event-driven re-embedding pipelines Standardized chunking strategy Retrieval quality observability Prompt and response evaluation Key Takeaways Vector drift is an architectural concern, not a service defect It emerges from model changes, evolving data, and preprocessing inconsistencies Long-lived RAG systems require embedding lifecycle management Azure AI Search provides the controls needed to mitigate drift effectively Conclusion Vector drift is an expected characteristic of production RAG systems. Teams that proactively manage embedding models, chunking strategies, and retrieval observability can maintain reliable relevance as their data and usage evolve. Recognizing and addressing vector drift is essential to building and operating robust AI solutions on Azure. Further Reading The following Microsoft resources provide additional guidance on vector search, embeddings, and production-grade RAG architectures on Azure. Azure AI Search – Vector Search Overview - https://learn.microsoft.com/azure/search/vector-search-overview Azure OpenAI – Embeddings Concepts - https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/embeddings?view=foundry-classic&tabs=csharp Retrieval-Augmented Generation (RAG) Pattern on Azure - https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview?tabs=videos Azure Monitor – Observability Overview - https://learn.microsoft.com/azure/azure-monitor/overview446Views2likes1CommentAzure IoT Operations 2603 is now available: Powering the next era of Physical AI
Industrial AI is entering a new phase. For years, AI innovation has largely lived in dashboards, analytics, and digital decision support. Today, that intelligence is moving into the real world, onto factory floors, oil fields, and production lines, where AI systems don’t just analyze data, but sense, reason, and act in physical environments. This shift is increasingly described as Physical AI: intelligence that operates reliably where safety, latency, and real‑world constraints matter most. With the Azure IoT Operations 2603 (v1.3.38) release, Microsoft is delivering one of its most significant updates to date, strengthening the platform foundation required to build, deploy, and operate Physical AI systems at industrial scale. Why Physical AI needs a new kind of platform Physical AI systems are fundamentally different from digital‑only AI. They require: Real‑time, low‑latency decision‑making at the edge Tight integration across devices, assets, and OT systems End‑to‑end observability, health, and lifecycle management Secure cloud‑to‑edge control planes with governance built in Industry leaders and researchers increasingly agree that success in Physical AI depends less on isolated models, and more on software platforms that orchestrate data, assets, actions, and AI workloads across the physical world. Azure IoT Operations was built for exactly this challenge. What’s new in Azure IoT Operations 2603 The 2603 release delivers major advancements across data pipelines, connectivity, reliability, and operational control, enabling customers to move faster from experimentation to production‑grade Physical AI. Cloud‑to‑edge management actions Cloud‑to‑edge management actions enable teams to securely execute control and configuration operations on on‑premises assets, such as invoking methods, writing values, or adjusting settings, using Azure Resource Manager and Event Grid–based MQTT messaging. This capability extends the Azure control plane beyond the cloud, allowing intent, policy, and actions to be delivered reliably to physical systems while remaining decoupled from protocol and device specifics. For Physical AI, this closes the loop between perception and action: insights and decisions derived from models can be translated into governed, auditable changes in the physical world, even when assets operate in distributed or intermittently connected environments. Built‑in RBAC, managed identity, and activity logs ensure every action is authorized, traceable, and compliant, preserving safety, accountability, and human oversight as intelligence increasingly moves from observation to autonomous execution at the edge. No‑code dataflow graphs Azure IoT Operations makes it easier to build real‑time data pipelines at the edge without writing custom code. No‑code data flow graphs let teams design visual processing pipelines using built‑in transforms, with improved reliability, validation, and observability. Visual Editor – Build multi-stage data processing systems in the Operations Experience canvas. Drag and connect sources, transforms, and destinations visually. Configure map rules, filter conditions, and window durations inline. Deploy directly from the browser or define in Bicep/YAML for GitOps. Composable Transforms, Any Order – Chain map, filter, branch, concatenate, and window transforms in any sequence. Branch splits messages down parallel paths based on conditions. Concatenate merges them back. Route messages to different MQTT topics based on content. No fixed pipeline shape. Expressions, Enrichment, and Aggregation – Unit conversions, math, string operations, regex, conditionals, and last-known-value lookups, all built into the expression language. Enrich messages with external data from a state store. Aggregate high-frequency sensor data over tumbling time windows to compute averages, min/max, and counts. Open and Extensible – Connect to MQTT, Kafka, and OpenTelemetry (OTel) endpoints with built-in security through Azure Key Vault and managed identities. Need logic beyond what no-code covers? Drop a custom Wasm module (even embed and run ONNX AI ML models) into the middle of any graph alongside built-in transforms. You're never locked into declarative configuration. Together, these capabilities allow teams to move from raw telemetry to actionable signals directly at the edge without custom code or fragile glue logic. Expanded, production‑ready connectivity The MQTT connector enables customers to onboard MQTT devices as assets and route data to downstream workloads using familiar MQTT topics, with the flexibility to support unified namespace (UNS) patterns when desired. By leveraging MQTT’s lightweight publish/subscribe model, teams can simplify connectivity and share data across consumers without tight coupling between producers and applications. This is especially important for Physical AI, where intelligent systems must continuously sense state changes in the physical world and react quickly based on a consistent, authoritative operational context rather than fragmented data pipelines. Alongside MQTT, Azure IoT Operations continues to deliver broad, industrial‑grade connectivity across OPC UA, ONVIF, Media, REST/HTTP, and other connectors, with improved asset discovery, payload transformation, and lifecycle stability, providing the dependable connectivity layer Physical AI systems rely on to understand and respond to real‑world conditions. Unified health and observability Physical AI systems must be trustworthy. Azure IoT Operations 2603 introduces unified health status reporting across brokers, dataflows, assets, connectors, and endpoints, using consistent states and surfaced through both Kubernetes and Azure Resource Manager. This enables operators to see—not guess—when systems are ready to act in the physical world. Optional OPC UA connector deployment Azure IoT Operations 2603 introduces optional OPC UA connector deployment, reinforcing a design goal to keep deployments as streamlined as possible for scenarios that don’t require OPC UA from day one. The OPC UA connector is a discrete, native component of Azure IoT Operations that can be included during initial instance creation or added later as needs evolve, allowing teams to avoid unnecessary footprint and complexity in MQTT‑only or non‑OPC deployments. This reflects the broader architectural principle behind Azure IoT Operations: a platform built for composability and decomposability, where capabilities are assembled based on scenario requirements rather than assumed defaults, supporting faster onboarding, lower resource consumption, and cleaner production rollouts without limiting future expansion. Broker reliability and platform hardening The 2603 release significantly improves broker reliability through graceful upgrades, idempotent replication, persistence correctness, and backpressure isolation—capabilities essential for always‑on Physical AI systems operating in production environments. Physical AI in action: What customers are achieving today Azure IoT Operations is already powering real‑world Physical AI across industries, helping customers move beyond pilots to repeatable, scalable execution. Procter & Gamble Consumer goods leader P&G continually looks for ways to drive manufacturing efficiency and improve overall equipment effectiveness—a KPI encompassing availability, performance, and quality that’s tracked in P&G facilities around the world. P&G deployed Azure IoT Operations, enabled by Azure Arc, to capture real-time data from equipment at the edge, analyze it in the cloud, and deploy predictive models that enhance manufacturing efficiency and reduce unplanned downtime. Using Azure IoT Operations and Azure Arc, P&G is extrapolating insights and correlating them across plants to improve efficiency, reduce loss, and continue to drive global manufacturing technology forward. More info. Husqvarna Husqvarna Group faced increasing pressure to modernize its fragmented global infrastructure, gain real-time operational insights, and improve efficiency across its supply chain to stay competitive in a rapidly evolving digital and manufacturing landscape. Husqvarna Group implemented a suite of Microsoft Azure solutions—including Azure Arc, Azure IoT Operations, and Azure OpenAI—to unify cloud and on-premises systems, enable real-time data insights, and drive innovation across global manufacturing operations. With Azure, Husqvarna Group achieved 98% faster data deployment and 50% lower infrastructure imaging costs, while improving productivity, reducing downtime, and enabling real-time insights across a growing network of smart, connected factories. More info. Chevron With its Facilities and Operations of the Future initiative, Chevron is reimagining the monitoring of its physical operations to support remote and autonomous operations through enhanced capabilities and real-time access to data. Chevron adopted Microsoft Azure IoT Operations, enabled by Azure Arc, to manage and analyze data locally at remote facilities at the edge, while still maintaining a centralized, cloud-based management plane. Real-time insights enhance worker safety while lowering operational costs, empowering staff to focus on complex, higher-value tasks rather than routine inspections. More info. A platform purpose‑built for Physical AI Across manufacturing, energy, and infrastructure, the message is clear: the next wave of AI value will be created where digital intelligence meets the physical world. Azure IoT Operations 2603 strengthens Microsoft’s commitment to that future—providing the secure, observable, cloud‑connected edge platform required to build Physical AI systems that are not only intelligent, but dependable. Get started To explore the full Azure IoT Operations 2603 release, review the public documentation and release notes, and start building Physical AI solutions that operate and scale confidently in the real world.284Views2likes0CommentsNow in Foundry: Cohere Transcribe, Nanbeige 4.1-3B, and Octen Embedding
This week's Model Mondays edition spans three distinct layers of the AI application stack: Cohere's cohere-transcribe, a 2B Automatic Speech Recognition (ASR) model that ranks first on the Open ASR Leaderboard across 14 languages; Nanbeige's Nanbeige4.1-3B, a compact 3B reasoning model that outperforms models ten times its size on coding, math, and deep-search benchmarks; and Octen's Octen-Embedding-0.6B, a lightweight text embedding model that achieves strong retrieval scores across 100+ languages and industry-specific domains. Together, these three models illustrate how developers can build full AI pipelines—from audio ingestion to language reasoning to semantic retrieval—entirely with open-source models deployed through Microsoft Foundry. Each operates in a different modality and fills a distinct architectural role, making this week's selection especially well-suited for teams assembling production-grade systems across speech, text, and search. Models of the week Cohere's cohere-transcribe-03-2026 Model Specs Parameters / size: 2B Primary task: Automatic Speech Recognition (audio-to-text) Why it's interesting Top-ranked on the Open ASR Leaderboard: cohere-transcribe-03-2026 achieves a 5.42% average Word Error Rate (WER) across 8 English benchmark datasets as of March 26, 2026—placing it first among open models. It reaches 1.25% WER on LibriSpeech Clean and 8.15% on AMI (meeting transcription), demonstrating consistent accuracy across both clean speech and real-world, multi-speaker environments. Benchmarks: Open ASR Leaderboard. 14 languages with a dedicated encoder-decoder architecture: The model uses a large Conformer encoder for acoustic representation extraction paired with a lightweight Transformer decoder for token generation, trained from scratch on 14 languages covering European, East Asian (Chinese Mandarin, Japanese, Korean, Vietnamese), and Arabic. Unlike general-purpose models adapted for ASR, this dedicated architecture makes it efficient without sacrificing accuracy. Long-form audio with automatic chunking: Audio longer than 35 seconds is automatically split into overlapping chunks and reassembled into a coherent transcript—no manual preprocessing required. Batched inference, punctuation control, and per-language configuration are all supported through the standard API. Try it Click on the window above, upload an audio file, and watch how quickly the model transcribes it for you. Or click the link to experiment with the Cohere Transcribe Space and record audio directly from your device. Use Case Prompt Pattern Meeting transcription Submit recorded audio with language tag; retrieve timestamped transcript per speaker turn Call center quality review Batch-process customer call recordings, extract transcript, pass to classification model Medical documentation Transcribe clinical encounters; feed transcript into summarization or structured note pipeline Multilingual content indexing Process podcasts or video audio in any of 14 supported languages; store as searchable text Sample prompt for a legal services deployment: You are building a contract negotiation assistant. A client submits a recorded audio of a 45-minute supplier negotiation call. Using the cohere-transcribe-03-2026 endpoint deployed in Microsoft Foundry, transcribe the call with punctuation enabled for the English audio. Once the transcript is available, pass it to a downstream language model with the following instruction: "Identify all pricing commitments, delivery deadlines, and liability clauses mentioned in this negotiation transcript. For each, note the speaker's position (client or supplier) and flag any terms that appear ambiguous or require legal review." Nanbeige's Nanbeige4.1-3B Model Specs Parameters / size: 3B Context length: 131,072 tokens Primary task: Text generation (reasoning, coding, tool use, deep search) Why it's interesting Reasoning performance that exceeds its size class: Nanbeige4.1-3B scores 76.9 on LiveCodeBench-V6, these results suggest that targeted post-training using Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) on a focused dataset can yield improvements that scale-based approaches cannot replicate at equivalent parameter counts. Read the technical report: https://huggingface.co/papers/2602.13367. Strong preference alignment at the 3B scale: On Arena-Hard-v2, Nanbeige4.1-3B scores 73.2, compared to 56.0 for Qwen3-32B and 60.2 for Qwen3-30B-A3B—both significantly larger models. This indicates that the model's outputs consistently match human preference for response quality and helpfulness, not just accuracy on structured tasks. Deep-search capability previously absent from small general models: On xBench-DeepSearch-2505, Nanbeige4.1-3B scores 75—matching search-specialized small agents. The model can sustain complex agentic tasks involving more than 500 sequential tool invocations, a capability gap that previously required either specialized search agents or significantly larger models. Native tool-use support: The model's chat template and generation pipeline natively support tool call formatting, making it straightforward to connect to external APIs and build multi-step agentic workflows without additional scaffolding. Try it Use Case Prompt Pattern Code review and fix Provide failing test + stack trace; ask model to diagnose root cause and write corrected implementation Competition-style math Submit problem as structured prompt; use temperature 0.6, top-p 0.95 for consistent reasoning steps Agentic task execution Provide tool definitions as JSON + goal; let model plan and execute tool calls sequentially Long-document Q&A Pass full document (up to 131K tokens) with targeted factual questions; extract structured answers Sample prompt for a software engineering deployment: You are automating pull request review for a backend engineering team. Using the Nanbeige4.1-3B endpoint deployed in Microsoft Foundry, provide the model with a unified diff of a proposed code change and the following system instruction: "You are a senior software engineer reviewing a pull request. For each modified function: (1) summarize what the change does, (2) identify any edge cases that are not handled, (3) flag any security or performance regressions relative to the original, and (4) suggest a specific improvement if one is warranted. Format your output as a structured list per function." Octen's Octen-Embedding-0.6B Model Specs Parameters / size: 0.6B Context length: 32,768 tokens Primary task: Text embeddings (semantic search, retrieval, similarity) Why it's interesting Retrieval performance above larger proprietary models at 0.6B: On the RTEB (Retrieval Text Embedding Benchmark) public leaderboard, Octen-Embedding-0.6B achieves a mean task score of 0.7241—above voyage-3.5 (0.7139), Cohere-embed-v4.0 (0.6534), and text-embedding-3-large (0.6110), despite being a fraction of their parameter count. The model is fine-tuned from Qwen3-Embedding-0.6B via Low-Rank Adaptation (LoRA), demonstrating that targeted fine-tuning on retrieval-specific data can close the gap with larger embedding models. Vertical domain coverage across legal, finance, healthcare, and code: Octen-Embedding-0.6B was trained with explicit coverage of domain-specific retrieval scenarios—legal document matching, financial report Q&A, clinical dialogue retrieval, and code search including SQL. This makes it suitable for regulated-industry applications where generic embedding models tend to underperform on specialized terminology. 32,768-token context for long-document retrieval: The extended context window supports encoding entire legal contracts, earnings reports, or clinical case notes as single embeddings—removing the need to chunk long documents and re-aggregate scores at query time, which can introduce ranking errors. 100+ language support with cross-lingual retrieval: The model handles multilingual and cross-lingual retrieval natively, with strong coverage across languages including English, Chinese, and other major languages via its Qwen3-based architecture—practical for global enterprise applications that span multiple languages. Use Case Prompt Pattern Semantic search Encode user query and document corpus; rank documents by cosine similarity to query embedding Legal precedent retrieval Embed case briefs and query with legal question; retrieve most semantically relevant precedents Cross-lingual document search Encode multilingual document set; submit query in any supported language for cross-lingual retrieval Financial Q&A pipeline Embed earnings reports or filings; retrieve relevant passages to ground downstream language model responses Sample prompt for a global enterprise knowledge base deployment: You are building a clinical decision support tool. Using the Octen-Embedding-0.6B endpoint deployed in Microsoft Foundry, embed a corpus of 10,000 clinical case notes at ingestion time and store the resulting 1024-dimensional vectors in a vector database. At query time, encode an incoming patient presentation summary and retrieve the 5 most semantically similar historical cases. Pass the retrieved cases and the current presentation to a language model with the following instruction: "Based on these five similar cases and their documented outcomes, summarize the most common treatment approaches and flag any cases where the outcome differed significantly from the initial prognosis." Getting started You can deploy open-source Hugging Face models directly in Microsoft Foundry by browsing the Hugging Face collection in the Foundry model catalog and deploying to managed endpoints in just a few clicks. You can also start from the Hugging Face Hub. First, select any supported model and then choose "Deploy on Microsoft Foundry", which brings you straight into Azure with secure, scalable inference already configured. Learn how to discover models and deploy them using Microsoft Foundry documentation: Follow along the Model Mondays series and access the GitHub to stay up to date on the latest Read Hugging Face on Azure docs Learn about one-click deployments from the Hugging Face Hub on Microsoft Foundry Explore models in Microsoft Foundry346Views1like0CommentsTurn Enterprise Knowledge into Answers with Copilot Studio and Azure AI Search
From the Field: Why This Integration Works As an experienced AI Cloud Solution Architect working in Greater China Region (GCR), I’ve seen one emerging pattern that delivers quick wins for some of my customers: combining Microsoft Copilot Studio with an existing Azure AI Search index. Teams choose this approach because it delivers two outcomes immediately: business users get grounded, reliable answers, and enterprises avoid re-building pipelines or re-platforming knowledge stores. This guide shows exactly how to connect Copilot Studio to an Azure AI Search index that is already live, so your copilot can answer confidently using your enterprise documents. What We Assume Is Already Ready To stay focused on the integration step, we assume: You have an Azure AI Search service deployed You have an index containing vectorized content (manuals, PDFs, policies, FAQs) Your platform/data team already handled ingestion, embeddings, and indexing In short, your Azure AI Search endpoint and admin key are ready, and the index already contains chunked content with embeddings. Step 1 - Collect Your Azure AI Search Connection Details From the Azure AI Search resource: Endpoint URL Azure AI Search → Overview → Url: https://<your-search-service>.search.windows.net Admin Key Azure AI Search → Keys Use either the primary or secondary key. Governance tip: For production, rotate keys regularly and use managed identities when possible. Step 2 - Add Azure AI Search as Knowledge Inside Copilot Studio Open your Copilot Studio agent Go to the Knowledge tab Select Add knowledge, choose Azure AI Search Provide: Endpoint URL Admin key Create or select the connection Choose your existing index from the dropdown Select Add to agent Step 3 - Test a Grounded Response Open the Test copilot pane and ask a question your indexed content can answer, such as: “What are the different licensing options available for Power Platform?” Verify that: The Activity Map shows Azure AI Search being invoked The answer reflects the correct document in your index Citations or references appear where applicable Conclusion Business value: You can activate grounded, explainable answers in Copilot Studio immediately by reusing your existing Azure AI Search index - no re-platforming, no new pipelines. Team model: Data/Platform teams own ingestion, enrichment, and vectorization. Business teams build and refine the copilot experience in Copilot Studio. Scale and governance: All components stay inside Azure, with enterprise-grade security, RBAC, and operational monitoring, while enabling low‑code agility for makers. For the full end-to-end lab (storage setup, embeddings, index creation), see: 🔗 https://github.com/Azure/Copilot-Studio-and-Azure (Lab 1.4). Acknowledgements This tutorial builds on foundational work by my EMEA colleague Pablo Carceller, whose GitHub repo on Copilot Studio and Azure has helped teams worldwide accelerate real customer implementations. 👉 GitHub - Copilot Studio and Azure: https://github.com/Azure/Copilot-Studio-and-Azure I would also like to thank the broader Cloud Accelerate Factory GCR team for their contributions, insights, and active collaboration in validating this pattern across customer engagements. Special appreciation to our AI Architects Dr. Longyu Qi, Jian (Jason) Shao, Lei (Leo) Ma, and Ethan Tseng, as well as our PM partners Yunxi (Rayne) Jin and Emma Wang, whose feedback and field experiences helped shape and refine this guide. Image credits: demo visuals adapted from materials by Pablo Carceller (GitHub Lab 1.4).303Views0likes0CommentsMicrosoft Foundry: Unlock Adaptive, Personalized Agents with User-Scoped Persistent Memory
From Knowledgeable to Personalized: Why Memory Matters Most AI agents today are knowledgeable — they ground responses in enterprise data sources and rely on short‑term, session‑based memory to maintain conversational coherence. This works well within a single interaction. But once the session ends, the context disappears. The agent starts fresh, unable to recall prior interactions, user preferences, or previously established context. In reality, enterprise users don’t interact with agents exclusively in one‑off sessions. Conversations can span days, weeks, evolving across multiple interactions rather than isolated sessions. Without a way to persist and safely reuse relevant context across interactions, AI agents remain efficient in the short term be being stateful within a session, but lose continuity over time due to their statelessness across sessions. Bridging this gap between short-term efficiency and long‑term adaptation exposes a deeper challenge. Persisting memory across sessions is not just a technical decision; in enterprise environments, it introduces legitimate concerns around privacy, data isolation, governance, and compliance — especially when multiple users interact with the same agent. What seems like an obvious next step quickly becomes a complex architectural problem, requiring organizations to balance the ability for agents to learn and adapt over time with the need to preserve trust, enforce isolation boundaries, and meet enterprise compliance requirements. In this post, I’ll walk through a practical design pattern for user‑scoped persistent memory, including a reference architecture and a deployable sample implementation that demonstrates how to apply this pattern in a real enterprise setting while preserving isolation, governance, and compliance. The Challenge of Persistent Memory in Enterprise AI Agents Extending memory beyond a single session seems like a natural way to make AI agents more adaptive. Retaining relevant context over time — such as preferences, prior decisions, or recurring patterns — would allow an agent to progressively tailor its behavior to each user, moving from simple responsiveness toward genuine adaptation. In enterprise environments, however, persistence introduces a different class of risk. Storing and reusing user context across interactions raises questions of privacy, data isolation, governance, and compliance — particularly when multiple users interact with shared systems. Without clear ownership and isolation boundaries, naïvely persisted memory can lead to cross‑user data leakage, policy violations, or unclear retention guarantees. As a result, many systems default to ephemeral, session‑only memory. This approach prioritizes safety and simplicity — but does so at the cost of long‑term personalization and continuity. The challenge, then, is not whether agents should remember, but how memory can be introduced without violating enterprise trust boundaries. Persistent Memory: Trade‑offs Between Abstraction and Control As AI agents evolve toward more adaptive behavior, several approaches to agent memory are emerging across the ecosystem. Each reflects a different set of trade-offs between abstraction, flexibility, and control — making it useful to briefly acknowledge these patterns before introducing the design presented here. Microsoft Foundry Agent Service includes a built‑in memory capability (currently in Preview) that enables agents to retain context beyond a single interaction. This approach integrates tightly with the Foundry runtime and abstracts much of the underlying memory management, making it well suited for scenarios that align closely with the managed agent lifecycle. Another notable approach combines Mem0 with Azure AI Search, where memory entries are stored and retrieved through vector search. In this model, memory is treated as an embedding‑centric store that emphasizes semantic recall and relevance. Mem0 is intentionally opinionated, defining how memory is structured, summarized, and retrieved to optimize for ease of use and rapid iteration. Both approaches represent meaningful progress. At the same time, some enterprises require an approach where user memory is explicitly owned, scoped, and governed within their existing data architecture — rather than implicitly managed by an agent framework or memory library. These requirements often stem from stricter expectations around data isolation, compliance, and long‑term control. User-Scoped Persistent Memory with Azure Cosmos DB The solution presented in this post provides a practical reference implementation for organizations that require explicit control over how user memory is stored, scoped, and governed. Rather than embedding long‑term memory implicitly within the agent runtime, this design models memory as a first‑class system component built on Azure Cosmos DB. At a high level, the architecture introduces user‑scoped persistent memory: a durable memory layer in which each user’s context is isolated and managed independently. Persistent memory is stored in Azure Cosmos DB containers partitioned by user identity and consists of curated, long‑lived signals — such as preferences, recurring intent, or summarized outcomes from prior interactions — rather than raw conversational transcripts. This keeps memory intentional, auditable, and easy to evolve over time. Short‑term, in‑session conversation state remains managed by Microsoft Foundry on the server side through its built‑in conversation and thread model. By separating ephemeral session context from durable user memory, the system preserves conversational coherence while avoiding uncontrolled accumulation of long‑term state within the agent runtime. This design enables continuity and personalization across sessions while deliberately avoiding the risks associated with shared or global memory models, including cross‑user data leakage, unclear ownership, and unintended reuse of context. Azure Cosmos DB provides enterprises with direct control over memory isolation, data residency, retention policies, and operational characteristics such as consistency, availability, and scale. In this architecture, knowledge grounding and memory serve complementary roles. Knowledge grounding ensures correctness by anchoring responses in trusted enterprise data sources. User‑scoped persistent memory ensures relevance by tailoring interactions to the individual user over time. Together, they enable trustworthy, adaptive AI agents that improve with use — without compromising enterprise boundaries. Architecture Components and Responsibilities Identity and User Scoping Microsoft Entra ID (App Registrations) — provides the frontend a client ID and tenant ID so the Microsoft Authentication Library (MSAL) can authenticate users via browser redirect. The oid (Object ID) claim from the ID token is used as the user identifier throughout the system. Agent Runtime and Orchestration Microsoft Foundry — serves as the unified AI platform for hosting models, managing agents, and maintaining conversation state. Foundry manages in‑session and thread‑level memory on the server side, preserving conversational continuity while keeping ephemeral context separate from long‑term user memory. Backend Agent Service — implements the AI agent using Microsoft Foundry’s agent and conversation APIs. The agent is responsible for reasoning, tool‑calling decisions, and response generation, delegating memory and search operations to external MCP servers. Memory and Knowledge Services MCP‑Memory — MCP server that hosts tools for extracting structured memory signals from conversations, generating embeddings, and persisting user‑scoped memories. Memories are written to and retrieved from Azure Cosmos DB, enforcing strict per‑user isolation. MCP‑Search — MCP server exposing tools for querying enterprise knowledge sources via Azure AI Search. This separation ensures that knowledge grounding and memory retrieval remain distinct concerns. Azure Cosmos DB for NoSQL — provides the durable, serverless document store for user‑scoped persistent memory. Memory containers are partitioned by user ID, enabling isolation, auditable access, configurable retention policies, and predictable scalability. Vector search is used to support semantic recall over stored memory entries. Azure AI Search — supplies hybrid retrieval (keyword and vector) with semantic reranking over the enterprise knowledge index. An integrated vectorizer backed by an embedding model is used for query‑time vectorization. Models text‑embedding‑3‑large — used for generating vector embeddings for both user‑scoped memories and enterprise knowledge search. gpt‑5‑mini — used for lightweight analysis tasks, such as extracting structured memory facts from conversational context. gpt‑5.1 — powers the AI agent, handling multi‑turn conversations, tool invocation, and response synthesis. Application and Hosting Infrastructure Frontend Web Application — a React‑based web UI that handles user authentication and presents a conversational chat interface. Azure Container Apps Environment — provides a shared execution environment for all services, including networking, scaling, and observability. Azure Container Apps — hosts the frontend, backend agent service, and MCP servers as independently scalable containers. Azure Container Registry — stores container images for all application components. Try It Yourself Demonstration of user‑scoped persistent memory across sessions. To make these concepts concrete, I’ve published a working reference implementation that demonstrates the architecture and patterns described above. The complete solution is available in the Agent-Memory GitHub repository. The repository README includes prerequisites, environment setup notes, and configuration details. Start by cloning the repository and moving into the project directory: git clone https://github.com/mardianto-msft/azure-agent-memory.git cd azure-agent-memory Next, sign in to Azure using the Azure CLI: az login Then authenticate the Azure Developer CLI: azd auth login Once authenticated, deploy the solution: azd up After deployment is complete, sign in using the provided demo users and interact with the agent across multiple sessions. Each user’s preferences and prior context are retained independently, the interaction continues seamlessly after signing out and returning later, and user context remains fully isolated with no cross‑identity leakage. The solution also includes a knowledge index initialized with selected Microsoft Outlook Help documentation, which the agent uses for knowledge grounding. This index can be easily replaced or extended with your own publicly accessible URLs to adapt the solution to different domains. Looking Ahead: Personalized Memory as a Foundation for Adaptive Agents As enterprise AI agents evolve, many teams are looking beyond larger models and improved retrieval toward human‑centered personalization at scale — building agents that adapt to individual users while operating within clearly defined trust boundaries. User‑scoped persistent memory enables this shift. By treating memory as a first‑class, user‑owned component, agents can maintain continuity across sessions while preserving isolation, governance, and compliance. Personalization becomes an intentional design choice, aligning with Microsoft’s human‑centered approach to AI, where users retain control over how systems adapt to them. This solution demonstrates how knowledge grounding and personalized memory serve complementary roles. Knowledge grounding ensures correctness by anchoring responses in trusted enterprise data. Personalized memory ensures relevance by tailoring interactions to the individual user. Together, they enable context‑aware, adaptive, and personalized agents — without compromising enterprise trust. Finally, this solution is intentionally presented as a reference design pattern, not a prescriptive architecture. It offers a practical starting point for enterprises designing adaptive, personalized agents, illustrating how user‑scoped memory can be modeled, governed, and integrated as a foundational capability for scalable enterprise AI.455Views1like1CommentIntroducing OpenAI’s GPT-image-1.5 in Microsoft Foundry
Developers building with visual AI can often run into the same frustrations: images that drift from the prompt, inconsistent object placement, text that renders unpredictably, and editing workflows that break when iterating on a single asset. That’s why we are excited to announce OpenAI's GPT Image 1.5 is now generally available in Microsoft Foundry. This model can bring sharper image fidelity, stronger prompt alignment, and faster image generation that supports iterative workflows. Starting today, customers can request access to the model and start building in the Foundry platform. Meet GPT Image 1.5 AI driven image generation began with early models like OpenAI's DALL-E, which introduced the ability to transform text prompts into visuals. Since then, image generation models have been evolving to enhance multimodal AI across industries. GPT Image 1.5 represents continuous improvement in enterprise-grade image generation. Building on the success of GPT Image 1 and GPT Image 1 mini, these enhanced models introduce advanced capabilities that cater to both creative and operational needs. The new image model offer: Text-to-image: Stronger instruction following and highly precise editing. Image-to-image: Transform existing images to iteratively refine specific regions Improved visual fidelity: More detailed scenes and realistic rendering. Accelerated creation times: Up to 4x faster generation speed. Enterprise integration: Deploy and scale securely in Microsoft Foundry. GPT Image 1.5 delivers stronger image preservation and editing capabilities, maintaining critical details like facial likeness, lighting, composition, and color tone across iterative changes. You’ll see more consistent preservation of branded logos and key visuals, making it especially powerful for marketing, brand design, and ecommerce workflows—from graphics and logo creation to generating full product catalogs (variants, environments, and angles) from a single source image. Benchmarks Based on an internal Microsoft dataset, GPT Image 1.5 performs higher than other image generation models in prompt alignment and infographics tasks. It focuses on making clear, strong edits – performing best on single-turn modification, delivering the higher visual quality in both single and multi-turn settings. The following results were found across image generation and editing: Text to image Prompt alignment Diagram / Flowchart GPT Image 1.5 91.2% 96.9% GPT Image 1 87.3% 90.0% Qwen Image 83.9% 33.9% Nano Banana Pro 87.9% 95.3% Image editing Evaluation Aspect Modification Preservation Visual Quality Face Preservation Metrics BinaryEval SC (semantic) DINO (Visual) BinaryEval AuraFace Single-turn GPT image 1 99.2% 51.0% 0.14 79.5% 0.30 Qwen image 81.9% 63.9% 0.44 76.0% 0.85 GPT Image 1.5 100% 56.77% 0.14 89.96% 0.39 Multi-turn GPT Image 1 93.5% 54.7% 0.10 82.8% 0.24 Qwen image 77.3% 68.2% 0.43 77.6% 0.63 GPT image 1.5 92.49% 60.55% 0.15 89.46% 0.28 Using GPT Image 1.5 across industries Whether you’re creating immersive visuals for campaigns, accelerating UI and product design, or producing assets for interactive learning GPT Image 1.5 gives modern enterprises the flexibility and scalability they need. Image models can allow teams to drive deeper engagement through compelling visuals, speed up design cycles for apps, websites, and marketing initiatives, and support inclusivity by generating accessible, high‑quality content for diverse audiences. Watch how Foundry enables developers to iterate with multimodal AI across Black Forest Labs, OpenAI, and more: Microsoft Foundry empowers organizations to deploy these capabilities at scale, integrating image generation seamlessly into enterprise workflows. Explore the use of AI image generation here across industries like: Retail: Generate product imagery for catalogs, e-commerce listings, and personalized shopping experiences. Marketing: Create campaign visuals and social media graphics. Education: Develop interactive learning materials or visual aids. Entertainment: Edit storyboards, character designs, and dynamic scenes for films and games. UI/UX: Accelerate design workflows for apps and websites. Microsoft Foundry provides security and compliance with built-in content safety filters, role-based access, network isolation, and Azure Monitor logging. Integrated governance via Azure Policy, Purview, and Sentinel gives teams real-time visibility and control, so privacy and safety are embedded in every deployment. Learn more about responsible AI at Microsoft. Pricing Model Pricing (per 1M tokens) - Global GPT-image-1.5 Input Tokens: $8 Cached Input Tokens: $2 Output Tokens: $32 Cost efficiency improves as well: image inputs and outputs are now cheaper compared to GPT Image 1, enabling organizations to generate and iterate on more creative assets within the same budget. For detailed pricing, refer here. Getting started Learn more about image generation, explore code samples, and read about responsible AI protections here. Try GPT Image 1.5 in Microsoft Foundry and start building multimodal experiences today. Whether you’re designing educational materials, crafting visual narratives, or accelerating UI workflows, these models deliver the flexibility and performance your organization needs.8.7KViews2likes1CommentSecurity: An essential part of your Azure and AI journey
Robust security is crucial at every stage of your cloud and AI journey. Take a closer look at the importance of embedding security from the initial consideration of Azure to the ongoing management of workloads and AI applications. Join Uli Homann and David Blank-Edelman as they share insights on how to make Zero Trust actionable as you leverage Microsoft’s platform, products and tools to achieve secure and successful Azure and AI adoption. Learn from the leaders about how to build security into your next Azure or AI project. This session is part of the Azure security and AI adoption Tech Accelerator. Increase your skills by checking out more sessions to help you plan, build, manage and optimize your Azure deployments and AI projects with a security-first mindset. Learn more with Microsoft Secure! Find solutions your organization can use to protect your data, defend against cyberthreats, and stay compliant.1.1KViews2likes1Comment