application
110 TopicsWAR, Azure Advisor, and Us (Azure Arch Diagram Builder): Three Ways to Score an Azure Architecture
Author: Arturo Quiroga, Azure AI services Engineer - Senior Partner Solutions Architect — Microsoft A few days ago I published From Prompt to Production: Building Azure Architecture Diagrams with AI, introducing the open-source Azure Architecture Diagram Builder. One feature got more follow-up questions than any other: the Well-Architected Framework (WAF) validation. Architects from partners and customers — many of whom already use Azure Advisor and the Well-Architected Review — wanted to know exactly what scoring algorithm we use, how it compares to Microsoft's official tools, and whether they should be using all three. This post is that answer. It's a deep dive into how design-time WAF validation works, how Microsoft's two official WAF assessment algorithms work, and where each fits in the architecture lifecycle. TL;DR. Microsoft ships two WAF assessment vehicles — the Well-Architected Review (questionnaire, scored from human answers) and the Azure Advisor score (healthy-resources-÷-applicable-resources weighted per subcategory, with Defender Secure Score for Security and cost-weighted math for Cost). Both require either a human filling in a form or live Azure telemetry. Our app runs at design time on a diagram, before anything is deployed, using a hybrid pipeline: a deterministic rule pre-scan followed by an LLM refinement pass. Same five WAF pillars, different lifecycle stage. Complementary, not competitive. Why design-time validation matters Every cost overrun, reliability gap, and security incident I've ever debugged was cheaper to fix on a whiteboard than in production. Yet most WAF tooling assumes the architecture already exists — either because there are deployed resources to scan (Advisor) or because someone has built enough of it to answer 60 specific questions about it (WAR). That leaves a gap. Between "rough sketch" and "deployed resource group" there is no algorithmic WAF feedback loop. That's the gap the Diagram Builder fills. Microsoft's two official WAF assessment algorithms Before describing our approach, it's worth being precise about what Microsoft already ships, because the term "WAF assessment algorithm" can mean either of two very different things. 1. Azure Well-Architected Review (WAR) — questionnaire-based The Well-Architected Review is a free self-assessment hosted on Microsoft Learn. Aspect Detail Input Human answers to ~60 questions mapped to the WAF pillar checklists Workload variants Core WAR, plus AI/ML, IoT, SAP on Azure, Azure Stack Hub, SaaS, Mission Critical Scoring Derived from the answers — each "no" or unanswered question subtracts from the pillar score Output Per-pillar maturity score + prioritized recommendations + optional Advisor integration Improvement tracking "Milestones" (point-in-time snapshots) When to use Periodic deep reviews; greenfield design baselining; brownfield audits WAR is human-driven. The algorithm is essentially "how many of the recommended practices have you confirmed you do?" — which is exactly the right algorithm when the assessor is the workload team itself. 2. Azure Advisor Score — telemetry-based The Advisor score is the closest thing Microsoft ships to a real, deterministic WAF algorithm. It runs continuously over your deployed Azure resources. The math: Pillar-specific overrides: Security uses Microsoft Defender for Cloud's Secure Score model. Cost weights by retail $ cost of healthy resources, plus age-of-recommendation weighting; postponed/dismissed items are removed from the denominator. Reliability / Performance / Operational Excellence use the healthy-resources ratio above. Key terms: Healthy resource — a deployed resource with no open Advisor recommendation against it for that pillar. Total applicable — resources Advisor was able to evaluate (excludes dismissed/snoozed). Advisor is the right tool once you're in production. It cannot help you before deployment, because there is nothing to count as "healthy" or "applicable." The missing stage: design time Here's the lifecycle, with each tool's domain shaded: Design / Diagram — Diagram Builder validation runs here. Operate / Observe — Azure Advisor runs here continuously. Periodic Review — WAR runs here, typically quarterly or at major milestones. These three stages are sequential and complementary. Our app does not replace Advisor or WAR — it adds a feedback loop earlier in the lifecycle, where corrections are cheapest. How design-time validation works in the Azure Architecture Diagram Builder The validator is a two-phase hybrid pipeline: deterministic local rules first, then LLM refinement. The full source lives in three files: src/services/architectureValidator.ts — orchestrator and prompt src/services/wafPatternDetector.ts — topology + service rule engine src/data/wafRules.ts — the rule knowledge base Phase 1 — Deterministic rule pre-scan (~1 ms, no LLM) When you click Validate Architecture, the validator runs a fully client-side rule engine against the diagram's services, connections, and groups. There are two kinds of rules: Architecture-pattern rules These fire when a topology anti-pattern is detected: Pattern Detection trigger single-region No global LB (Traffic Manager / Front Door) with ≥3 services single-database Exactly one database service, no replication signal no-cache Compute + database present, no Redis/CDN no-monitoring No Azure Monitor / App Insights / Log Analytics no-identity No Microsoft Entra ID no-waf Public web tier without WAF / Front Door / App Gateway direct-db-access An edge from a frontend service directly into a database no-key-vault 4+ services and no Key Vault no-backup Database present, no Azure Backup / Recovery Services no-api-gateway 2+ compute services and no APIM / App Gateway / Front Door Service-specific rules Every service in the in the generated Azure Architecture diagram is matched against SERVICE_SPECIFIC_RULES by normalized type — App Service, Functions, AKS, Cosmos DB, SQL Database, Storage, Key Vault, and 22 more. The knowledge base at a glance Metric Count Total rules 73 Architecture-pattern rules 10 Service-specific rules 63 Distinct Azure services covered 29 Rules tagged Reliability 18 Rules tagged Security 34 Rules tagged Cost Optimization 5 Rules tagged Operational Excellence 7 Rules tagged Performance Efficiency 9 The preliminary score Each finding has a severity, and severity drives a fixed point deduction from a starting score of 100: Severity Deduction critical −12 high −7 medium −3 low −1 Result is floored at 10 (so even a deliberately bad architecture scores at least 10) and ceilinged at 95 (no findings ≠ perfect — there's always something the model might still catch). This is the deterministic baseline before the LLM ever sees the architecture, and it's what makes the pipeline reproducible. Phase 2 — LLM contextual refinement The pre-scan output, the topology, and the optional natural-language description are folded into a focused prompt sent to one of seven Azure OpenAI models (GPT-5.1 through 5.4, GPT-5.x Codex variants, DeepSeek V3.2 Speciale, Grok 4.1 Fast). The system prompt gives the model explicit scoring guardrails: Score based on what IS present, not what COULD be added. A well-connected architecture with appropriate services should score 60–80. Score below 50 only for critical gaps (no auth, no monitoring, single points of failure). Findings are improvement suggestions, not reasons to penalize the score severely. The model returns strict JSON: { "overallScore": 0-100, "summary": "2–3 sentence assessment", "pillars": [ { "pillar": "Reliability | Security | Cost Optimization | Operational Excellence | Performance Efficiency", "score": 0-100, "findings": [ { "severity": "critical | high | medium | low", "category": "...", "issue": "...", "recommendation": "...", "resources": ["service-name-1", "service-name-2"], "source": "rule-based | ai-analysis" } ] } ], "quickWins": [ /* same shape as findings */ ] } Two things to call out: Every finding is tagged rule-based or ai-analysis . That tag is the credibility lever. You can always see what the deterministic engine produced versus what the model contributed on top. If you don't trust the AI layer, you can ignore it entirely — the rule layer still stands. The LLM is given pattern hints, not the entire rule catalog. The prompt stays small and focused, which is roughly 3–5× faster and cheaper than asking the LLM to do everything from scratch. What the user sees On every run the modal reports: Overall WAF score (0–100) Per-pillar score × 5 (0–100 each) Severity breakdown — counts of critical / high / medium / low across all findings Quick wins — high-impact, low-effort items the model surfaces separately Hybrid metadata — local findings count, patterns detected, KB rules used, preliminary score, local elapsed ms AI metrics — model used, reasoning effort, prompt/completion/total tokens, elapsed time App Insights telemetry — an Architecture_Validated event with model, overall score, finding count, elapsed time Worked example Take this prompt, which I've used in demos with partners: "A multi-region web application: Azure Front Door in front of two App Service instances in West US 2 and East US 2, both reading from an Azure SQL Database with geo-replication, with Application Insights for telemetry. No Entra ID, no Key Vault." After generation, Validate Architecture runs: Phase 1 — pre-scan (deterministic), ~1 ms Patterns detected: no-identity , no-key-vault Findings produced: 8 (1 critical, 1 high, 3 medium, 3 low) Preliminary score: 100 − 12 − 7 − (3×3) − (1×3) = 69 Phase 2 — LLM refinement, ~6–9 s depending on model The model accepts the two pattern hints, validates them in context, and adds three more findings of its own: Finding Source Pillar Severity No Microsoft Entra ID for authentication rule-based Security critical No Key Vault for secret management rule-based Security high App Service slots not used for safe deploys ai-analysis Operational Excellence medium SQL DB geo-replication present but RTO/RPO not documented ai-analysis Reliability medium No CDN for static assets behind Front Door ai-analysis Performance Efficiency low Final scores returned by the model: Pillar Score Reliability 78 Security 52 Cost Optimization 80 Operational Excellence 70 Performance Efficiency 75 Overall 71 The Security score is the lowest because two of the highest-severity findings landed there — exactly what a human reviewer would flag first. Multi-model comparison Because the deterministic floor is identical across runs, the Validation Comparison view becomes a fair shootout of what each LLM adds on top of the same baseline. The same diagram is scored by all seven models, and the UI surfaces: Overall score per model Per-pillar score per model Severity-count deltas Number of ai-analysis findings each model contributed Quick wins each model identified This is genuinely useful for two reasons. First, it shows that LLM scores vary — typically by ±5–10 points on the same architecture — which is exactly why we publish the rule-based vs ai-analysis tag. Second, it lets architects pick the model whose review style matches their own. How we align with Microsoft's algorithms Alignment point What it means Same five pillars Identical names and scope to the official WAF Same source material Rules derived from WAF docs and Azure Architecture Center service guides Severity-graded findings Map conceptually to Advisor's high/medium/low impact recommendations Per-pillar + overall scoring Mirrors WAR/Advisor output shape, so the results feel familiar Where we deliberately differ — and why Concern Microsoft Diagram Builder Why we differ Needs deployed resources Advisor: yes No — works on a diagram We're a design-time tool; the architecture doesn't exist yet Needs human Q&A WAR: yes No — derived from the diagram One-click validation inside the design flow Healthy/Applicable ratio Advisor: yes No No resource-health signal exists pre-deployment Subcategory fixed weights Advisor: yes No explicit weights Severity is the de-facto weight (12/7/3/1) Defender Secure Score for Security Advisor: yes No Defender requires deployed resources Cost-weighted scoring Advisor: yes No (separate Cost Estimation feature) Cost is a separate pipeline in our app AI/LLM refinement Neither Yes Catches context-specific issues a static catalog misses, and explains findings in natural language Multi-model comparison Neither Yes Lets architects see scoring variance across models Honest limitations I'd rather you hear these from me than discover them in production: LLM scores drift. ±5–10 points across models on the same diagram is normal. Treat the score as directional, the findings as actionable. The rule-based tag is your anchor. No live telemetry. We can't know if your App Service is actually using availability zones — only that you have App Service in the diagram. Advisor will tell you the truth post-deployment. Generic ruleset. No specialized workload branches yet (AI/ML, IoT, SAP, SaaS). WAR has those. No milestone tracking. Each validation run is independent. Compare runs manually using the Validation Comparison view. Rule coverage is finite. 29 services and 73 rules is a strong start but not exhaustive — the LLM layer exists in part to compensate for that gap. How to use all three together A lifecycle that actually works: Design — Use the Diagram Builder to sketch the architecture and validate at design time. Iterate until the per-pillar scores look reasonable and the critical/high findings are addressed. Deploy — Generate Bicep from the diagram, deploy, and let Azure Advisor start scoring real resources. Operate — Use Azure Advisor continuously. Use Defender Secure Score for security posture. Periodic review — Run a Core WAR every quarter or at major milestones to capture the things only humans know (business context, tradeoffs, planned debt). None of these three replace the others. They cover different stages of the same loop. What's next A few things on the roadmap I'd love feedback on: Milestone tracking so design-time scores can be compared over time the way WAR milestones work. Workload-specific rulesets mirroring WAR's branches — starting with AI/ML. Direct Advisor handoff — once a diagram is deployed, surface the corresponding Advisor recommendations in the same UI to close the loop. Try it, fork it, tell me where it's wrong Live app: https://aka.ms/diagram-builder Source: github.com/Arturo-Quiroga-MSFT/azure-architecture-diagram-builder Useful references: Azure Well-Architected Framework pillars Azure Well-Architected Review tool Azure Advisor score — calculation Use Azure WAF assessments (Advisor) Complete an Azure Well-Architected Review assessment If you're a partner or customer architect who's already living in Advisor and WAR, I'd genuinely value your reaction — does the design-time stage feel like a real gap to you, or are you already covering it some other way? Open an issue on the repo or reply on LinkedIn. Posted on the Azure Architecture Blog · Comments and issues welcome on the repo.47Views0likes0CommentsFrom Prompt to Production: Building Azure Architecture Diagrams with AI
Author: Arturo Quiroga, Senior Partner Solutions Architect — Microsoft Cloud architects spend significant time translating ideas into architecture diagrams. They toggle between Visio, draw.io, pricing calculators, and documentation. According to the 2024 Stack Overflow Developer Survey, 61% of developers spend more than 30 minutes a day searching for answers or solutions, time lost to context-switching rather than design. What if you could describe your architecture in plain English and get a diagram, cost estimate, and deployment guide in minutes? The Challenge: Fragmented Architecture Workflows Designing Azure architectures today typically involves multiple disconnected steps: Sketch the architecture in a diagramming tool Look up official Azure icons and drag them into place Research pricing across regions using the Azure Pricing Calculator Validate the design against the Well-Architected Framework (WAF) Write deployment documentation and Infrastructure as Code templates Compare alternative designs manually Each step lives in a different tool, and keeping them in sync as designs evolve is costly. The Azure Architecture Diagram Builder brings these workflows together in a single browser-based experience. How It Works Describe your architecture in natural language, for example "A HIPAA-compliant healthcare platform with FHIR APIs, event-driven processing, and multi-region disaster recovery", and the AI generates a diagram with grouped services, data flow connections, and logical organization. Figure 1. Enter a natural-language prompt describing your architecture. Curated example prompts help you get started, and you can optionally upload an existing diagram for the AI to analyze. The tool uses Azure OpenAI to power generation across multiple models, enabling you to choose the model that best fits your scenario — from fast iterations to deeper reasoning. Key Features AI-Powered Architecture Generation Describe what you need in plain English, and the AI creates an architecture diagram with: 714 official Azure service icons across 29 categories Smart grouping: services are logically organized (Frontend, Backend, Data, Security) Data flow connections: labeled edges showing how data moves through the system 13 curated example prompts: from simple web apps to complex enterprise scenarios like Zero Trust networks, Industrial IoT with 5,000+ sensors, and global multiplayer gaming backends Figure 2. A generated industrial IoT architecture. Top: the clean diagram view as initially produced. Bottom: the same diagram with per-service monthly cost overlays toggled on, plus a running subscription total in the toolbar. Architecture Image Import Already have an architecture on a whiteboard or in a screenshot? Upload the image and let the AI analyze it, mapping services to official Azure icons and recreating the architecture as an editable, interactive diagram. Figure 3. Upload a photo of a whiteboard sketch (top-right reference panel) and the AI recreates it as an editable diagram with official Azure service icons and labeled data flow connections. ARM Template Import Import existing ARM templates to visualize your current infrastructure. The AI parses resource definitions and dependencies, groups related resources into logical layers, and produces a meaningful diagram of what you actually have deployed — a fast way to document an inherited environment or sanity-check a template before deployment. Figure 4. ARM template import in action. Top: the parser status banner while resources and dependencies are being analyzed. Bottom: the resulting diagram, with resources auto-grouped into logical layers (Web Tier, Data Layer, Container Platform, Observability & Logging) and a Generated from: ARM Template badge linking the diagram back to its source file. Well-Architected Framework Validation Validate your architecture against all five WAF pillars — Security, Reliability, Performance Efficiency, Cost Optimization, and Operational Excellence. The validator provides: An overall WAF score with pillar-level breakdowns Specific findings with severity levels Actionable recommendations you can select and apply Select the recommendations you agree with, and the AI regenerates an improved architecture incorporating those changes. Figure 5. WAF validation results showing the overall score, per-pillar breakdowns, and individual findings with severity badges. Tick the recommendations you want and the AI rebuilds the diagram with those changes applied. Multi-Model Comparison Run the same architecture prompt through multiple AI models side-by-side and compare: Architecture Comparison: service counts, connection counts, groups, token usage, and latency Validation Comparison: WAF scores across models, severity breakdowns, and finding counts Apply Winner: pick the best result and apply it to the canvas with one click Present Critique: a talking avatar narrates the AI-generated ranking with live closed captions Figure 6. Multi-model comparison. Top: select the models and reasoning effort, then enter the prompt. Bottom: side-by-side results across all selected models with service counts, latency, token usage, and Fastest / Cheapest / Most Thorough badges. Multi-Region Cost Estimation Get cost estimates from the Azure Retail Prices API across 8 Azure regions: East US 2, Australia East, Canada Central, Brazil South, Mexico Central, West Europe, Sweden Central, and Southeast Asia. Features include: Color-coded cost legend (green / yellow / red thresholds) SKU and tier information for each service Export options: CSV, JSON, plain-text summary, and an analysis report with top cost drivers, Reserved Instance flags, and a ranked multi-region comparison table Figure 7. The cost legend overlay shows per-service pricing with color-coded thresholds. The region selector in the toolbar lets you re-price the entire architecture in any of eight Azure regions. Deployment Guide Generation with Bicep Generate step-by-step deployment documentation including: Prerequisites and Azure resource requirements Step-by-step deployment instructions Bicep templates for each service (Infrastructure as Code) Post-deployment verification steps Security configuration recommendations Figure 8. Each generated Deployment Guide opens with the architecture name, an estimated deployment time, and a prerequisites checklist covering subscription roles, CLI versions, Microsoft Entra ID permissions, and region requirements, followed by numbered, copy-ready deployment steps. Figure 9. The Infrastructure as Code section produces a main.bicep orchestrator plus a per-service module (Log Analytics, Key Vault, Cosmos DB, SQL Database, Event Hubs, Azure Functions, and more). The Download All Templates button packages everything into a ready-to-deploy folder. Workflow Animation & Avatar Presenter Visualize how data flows through your architecture with step-by-step animations that highlight services on the canvas as each step plays. When the Azure Speech Service is configured, a photorealistic talking avatar can narrate the workflow or present model comparison results, with live word-by-word closed captions in a draggable, resizable panel. Figure 10. A workflow step is highlighted on the canvas as the Avatar Presenter narrates that step. Live word-by-word closed captions appear in a draggable, resizable panel, useful for accessibility and stakeholder demos. Export Options Figure 11. A single-slide PowerPoint export, available in dark or light theme, ready to drop straight into a stakeholder deck. Format Use Case PNG Documentation, presentations SVG Scalable vector graphics PPTX Single PowerPoint slide (dark or light theme) Draw.io Edit in diagrams.net JSON Backup, version control CSV / ZIP Cost analysis with multi-region comparison Highlights The Azure Architecture Diagram Builder unifies the architecture design lifecycle in a single tool: End-to-end workflow: from natural-language description to deployable Bicep templates without tool switching Official Azure icons: 714 icons across 29 categories, mapped directly from the Azure service catalog Live pricing: queries the Azure Retail Prices API at design time rather than relying on static estimates WAF-integrated validation: architectural best practices built into the design loop rather than applied after the fact Multi-model flexibility: choose the AI model that best suits each task, with fast models for iteration and reasoning models for complex designs Open source: the source code is available for customization and contribution One-Command Deploy with Azure Developer CLI The fastest way to get your own instance running is with azd : # Install azd (once) brew tap azure/azd && brew install azd # macOS winget install microsoft.azd # Windows # Clone, configure, and deploy git clone https://github.com/Arturo-Quiroga-MSFT/azure-architecture-diagram-builder cd azure-architecture-diagram-builder azd auth login azd env set AZURE_OPENAI_ENDPOINT "https://your-resource.openai.azure.com/" azd env set AZURE_OPENAI_API_KEY "your-key" azd up # Provisions infrastructure + builds + deploys (~8 min) azd up provisions the following via Bicep: Resource Purpose Azure Container Registry Stores the Docker image Azure Container Apps Runs the app (nginx + token server) Log Analytics + Application Insights Monitoring and telemetry Azure Speech (S0) Avatar Presenter (optional, keyless auth via managed identity) Try It Today The Azure Architecture Diagram Builder is available now: Live demo: https://aka.ms/diagram-builder Source code: GitHub repository Documentation: See the Getting Started Guide for detailed setup instructions We welcome feedback and contributions. Use the GitHub Issues page to report bugs, suggest features, or share your experience. Tags: artificial intelligence · application · apps & devops · well architected · infrastructure386Views1like0CommentsHow to Modernise a Microsoft Access Database (Forms + VBA) to Node.JS, OpenAPI and SQL Server
Microsoft Access has played a significant role in enterprise environments for over three decades. Released in November 1992, its flexibility and ease of use made it a popular choice for organizations of all sizes—from FTSE250 companies to startups and the public sector. The platform enables rapid development of graphical user interfaces (GUIs) paired with relational databases, allowing users to quickly create professional-looking applications. Developers, data architects, and power users have all leveraged Microsoft Access to address various enterprise challenges. Its integration with Microsoft Visual Basic for Applications (VBA), an object-based programming language, ensured that Access solutions often became central to business operations. Unsurprisingly, modernizing these applications is a common requirement in contemporary IT engagements as thse solutions lead to data fragmentation, lack of integration into master data systems, multiple copies of the same data replicated across each access database and so on. At first glance, upgrading a Microsoft Access application may seem simple, given its reliance on forms, VBA code, queries, and tables. However, substantial complexity often lurks beneath this straightforward exterior. Modernization efforts must consider whether to retain the familiar user interface to reduce staff retraining, how to accurately re-implement business logic, strategies for seamless data migration, and whether to introduce an API layer for data access. These factors can significantly increase the scope and effort required to deliver a modern equivalent, especially when dealing with numerous web forms, making manual rewrites a daunting task. This is where GitHub Copilot can have a transformative impact, dramatically reducing redevelopment time. By following a defined migration path, it is possible to deliver a modernized solution in as little as two weeks. In this blog post, I’ll walk you through each tier of the application and give you example prompts used at each stage. 🏛️Architecture Breakdown: The N-Tier Approach Breaking down the application architecture reveals a classic N-Tier structure, consisting of a presentation layer, business logic layer, data access layer, and data management layer. 💫First-Layer Migration: Migrating a Microsoft Access Database to SQL Server The migration process began with the database layer, which is typically the most straightforward to move from Access to another relational database management system (RDBMS). In this case, SQL Server was selected to leverage the SQL Server Migration Assistant (SSMA) for Microsoft Access—a free tool from Microsoft that streamlines database migration to SQL Server, Azure SQL Database, or Azure SQL Database Managed Instance (SQLMI). While GitHub Copilot could generate new database schemas and insert scripts, the availability of a specialized tool made the process more efficient. Using SSMA, the database was migrated to SQL Server with minimal effort. However, it is important to note that relationships in Microsoft Access may lack explicit names. In such cases, SSMA appends a GUID or uses one entirely to create unique foreign key names, which can result in confusing relationship names post-migration. Fortunately, GitHub Copilot can batch-rename these relationships in the generated SQL scripts, applying more meaningful naming conventions. By dropping and recreating the constraints, relationships become easier to understand and maintain. SSMA handles the bulk of the migration workload, allowing you to quickly obtain a fully functional SQL Server database containing all original data. In practice, renaming and recreating constraints often takes longer than the data migration itself. Prompt Used: # Context I want to refactor the #file:script.sql SQL script. Your task is to follow the below steps to analyse it and refactor it according to the specified rules. You are allowed to create / run any python scripts or terminal commands to assist in the analysis and refactoring process. # Analysis Phase Identify: Any warning comments Relations between tables Foreign key creation References to these foreign keys in 'MS_SSMA_SOURCE' metadata # Refactor Phase Refactor any SQL matching the following rules: - Create a new script file with the same name as the original but with a `.refactored.sql` extension - Rename any primary key constraints to follow the format PK_{table_name}_{column_name} - Rename any foreign key constraints like [TableName]${GUID} to FK_{child_table}_{parent_table} - Rename any indexes like [TableName]${GUID} to IDX_{table_name}_{column_name} - Ensure any updated foreign keys are updated elsewhere in the script - Identify which warnings flagged by the migration assistant need addressed # Summary Phase Create a summary file in markdown format with the following sections: - Summary of changes made - List of warnings addressed - List of foreign keys renamed - Any other relevant notes 🤖Bonus: Introduce Database Automation and Change Management As we now had a SQL database, we needed to consider how we would roll out changes to the database and we could introduce a formal tool to cater for this within the solution which was Liquibase. Prompt Used: # Context I want to refactor #file:db.changelog.xml. Your task is to follow the below steps to analyse it and refactor it according to the specified rules. You are allowed to create / run any python scripts or terminal commands to assist in the analysis and refactoring process. # Analysis Phase Analyse the generated changelog to identify the structure and content. Identify the tables, columns, data types, constraints, and relationships present in the database. Identify any default values, indexes, and foreign keys that need to be included in the changelog. Identify any vendor specific data types / fucntions that need to be converted to common Liquibase types. # Refactor Phase DO NOT modify the original #file:db.changelog.xml file in any way. Instead, create a new changelog file called `db.changelog-1-0.xml` to store the refactored changesets. The new file should follow the structure and conventions of Liquibase changelogs. You can fetch https://docs.liquibase.com/concepts/data-type-handling.html to get available Liquibase types and their mappings across RDBMS implementations. Copy the original changesets from the `db.changelog.xml` file into the new file Refactor the changesets according to the following rules: - The main changelog should only include child changelogs and not directly run migration operations - Child changelogs should follow the convention db.changelog-{version}.xml and start at 1-0 - Ensure data types are converted to common Liquibase data types. For example: - `nvarchar(max)` should be converted to `TEXT` - `datetime2` should be converted to `TIMESTAMP` - `bit` should be converted to `BOOLEAN` - Ensure any default values are retained but ensure that they are compatible with the liquibase data type for the column. - Use standard SQL functions like `CURRENT_TIMESTAMP` instead of vendor-specific functions. - Only use vendor specific data types or functions if they are necessary and cannot be converted to common Liquibase types. These must be documented in the changelog and summary. Ensure that the original changeset IDs are preserved for traceability. Ensure that the author of all changesets is "liquibase (generated)" # Validation Phase Validate the new changelog file against the original #file:db.changelog.xml to ensure that all changesets are correctly refactored and that the structure is maintained. Confirm no additional changesets are added that were not present in the original changelog. # Finalisation Phase Provide a summary of the changes made in the new changelog file. Document any vendor specific data types or functions that were used and why they could not be converted to common Liquibase types. Ensure the main changelog file (`db.changelog.xml`) is updated to include the new child changelog file (`db.changelog-1-0.xml`). 🤖Bonus: Synthetic Data Generation Since the legacy system lacked synthetic data for development or testing, GitHub Copilot was used to generate fake seed data. Care was taken to ensure all generated data was clearly fictional—using placeholders like ‘Fake Name’ and ‘Fake Town’—to avoid any confusion with real-world information. This step greatly improved the maintainability of the project, enabling developers to test features without handling sensitive or real data. 💫Second-Layer Migration: OpenAPI Specifications With data migration complete, the focus shifted to implementing an API-driven approach for data retrieval. Adopting modern standards, OpenAPI specifications were used to define new RESTful APIs for creating, reading, updating, and deleting data. Because these APIs mapped directly to underlying entities, GitHub Copilot efficiently generated the required endpoints and services in Node.js, utilizing a repository pattern. This approach not only provided robust APIs but also included comprehensive self-describing documentation, validation at the API boundary, automatic error handling, and safeguards against invalid data reaching business logic or database layers. 💫Third-Layer Migration: Business Logic The business logic, originally authored in VBA, was generally straightforward. GitHub Copilot translated this logic into its Node.js equivalent and created corresponding tests for each method. These tests were developed directly from the code, adding a layer of quality assurance that was absent in the original Access solution. The result was a set of domain services mirroring the functionality of their VBA predecessors, successfully completing the migration of the third layer. At this stage, the project had a new database, a fresh API tier, and updated business logic, all conforming to the latest organizational standards. The final major component was the user interface, an area where advances in GitHub Copilot’s capabilities became especially evident. 💫Fourth Layer: User Interface The modernization of the Access Forms user interface posed unique challenges. To minimize retraining requirements, the new system needed to retain as much of the original layout as possible, ensuring familiar placement of buttons, dropdowns, and other controls. At the same time, it was necessary to meet new accessibility standards and best practices. Some Access forms were complex, spanning multiple tabs and containing numerous controls. Manually describing each interface for redevelopment would have been time-consuming. Fortunately, newer versions of GitHub Copilot support image-based prompts, allowing screenshots of Access Forms to serve as context. Using these screenshots, Copilot generated Government Digital Service Views that closely mirrored the original application while incorporating required accessibility features, such as descriptive labels and field selectors. Although the automatically generated UI might not fully comply with all current accessibility standards, prompts referencing WCAG guidelines helped guide Copilot’s improvements. The generated interfaces provided a strong starting point for UX engineers to further refine accessibility and user experience to meet organizational requirements. 🤖Bonus: User Story Generation from the User Interface For organizations seeking a specification-driven development approach, GitHub Copilot can convert screenshots and business logic into user stories following the “As a … I want to … So that …” format. While not flawless, this capability is invaluable for systems lacking formal requirements, giving business analysts a foundation to build upon in future iterations. 🤖Bonus: Introducing MongoDB Towards the end of the modernization engagement, there was interest in demonstrating migration from SQL Server to MongoDB. GitHub Copilot can facilitate this migration, provided it is given adequate context. As with all NoSQL databases, the design should be based on application data access patterns—typically reading and writing related data together. Copilot’s ability to automate this process depends on a comprehensive understanding of the application’s data relationships and patterns. # Context The `<business_entity>` entity from the existing system needs to be added to the MongoDB schema. You have been provided with the following: - #file:documentation - System documentation to provide domain / business entity context - #file:db.changelog.xml - Liquibase changelog for SQL context - #file:mongo-erd.md - Contains the current Mongo schema Mermaid ERD. Create this if it does not exist. - #file:stories - Contains the user stories that will the system will be built around # Analysis Phase Analyse the available documentation and changelog to identify the structure, relationships, and business context of the `<business_entity>`. Identify: - All relevant data fields and attributes - Relationships with other entities - Any specific data types, constraints, or business rules Determine how this entity fits into the overall MongoDB schema: - Should it be a separate collection? - Should it be embedded in another document? - Should it be a reference to another collection for lookups or relationships? - Explore the benefit of denormalization for performance and business needs Consider the data access patterns and how this entity will be used in the application. # MongoDB Schema Design Using the analysis, suggest how the `<business_entity>` should be represented in MongoDB: - The name of the MongoDB collection that will represent this entity - List each field in the collection, its type, any constraints, and what it maps to in the original business context - For fields that are embedded, document the parent collection and how the fields are nested. Nested fields should follow the format `parentField->childField`. - For fields that are referenced, document the reference collection and how the lookup will be performed. - Provide any additional notes on indexing, performance considerations, or specific MongoDB features that should be used - Always use pascal case for collection names and camel case for field names # ERD Creation Create or update the Mermaid ERD in `mongo-erd.md` to include the results of your analysis. The ERD should reflect: - The new collection or embedded document structure - Any relationships with other collections/entities - The data types, constraints, and business rules that are relevant for MongoDB - Ensure the ERD is clear and follows best practices for MongoDB schema design Each entity in the ERD should have the following layout: **Entity Name**: The name of the MongoDB collection / schema **Fields**: A list of fields in the collection, including: - Field Name (in camel case) - Data Type (e.g., String, Number, Date, ObjectId) - Constraints (e.g. indexed, unique, not null, nullable) In this example, Liquibase was used as a changelog to supply the necessary context, detailing entities, columns, data types, and relationships. Based on this, Copilot could offer architectural recommendations for new document or collection types, including whether to embed documents or use separate collections with cache references for lookup data. Copilot can also generate an entity relationship diagram (ERD), allowing for review and validation before proceeding. From there, a new data access layer can be generated, configurable to switch between SQL Server and MongoDB as needed. While production environments typically standardize on a single database model, this demonstration showcased the speed and flexibility with which strategic architectural components can be introduced using GitHub Copilot. 👨💻Conclusion This modernization initiative demonstrated how strategic use of automation and best practices can transform legacy Microsoft Access solutions into scalable, maintainable architectures utilizing Node.js, SQL Server, MongoDB, and OpenAPI. By carefully planning each migration layer—from database and API specifications to business logic—the team preserved core functionality while introducing modern standards and enhanced capabilities. GitHub Copilot played a pivotal role, not only speeding up redevelopment but also improving code quality through automated documentation, test generation, and meaningful naming conventions. The result was a significant reduction in development time, with a robust, standards-compliant system delivered in just two weeks compared to an estimated six to eight months using traditional manual methods. This project serves as a blueprint for organizations seeking to modernize their Access-based applications, highlighting the efficiency gains and quality improvements that can be achieved by leveraging AI-powered tools and well-defined migration strategies. The approach ensures future scalability, easier maintenance, and alignment with contemporary enterprise requirements.1.3KViews1like2CommentsAzure Course Blueprints
Each Blueprint serves as a 1:1 visual representation of the official Microsoft instructor‑led course (ILT), ensuring full alignment with the learning path. This helps learners: see exactly how topics fit into the broader Azure landscape, map concepts interactively as they progress, and understand the “why” behind each module, not just the “what.” Formats Available: PDF · Visio · Excel · Video Every icon is clickable and links directly to the related Learn module. Layers and Cross‑Course Comparisons For expert‑level certifications like SC‑100 and AZ‑305, the Visio Template+ includes additional layers for each associate-level course. This allows trainers and students to compare certification paths at a glance: 🔐 Security Path SC‑100 side‑by‑side with SC‑200, SC‑300, AZ‑500 🏗️ Infrastructure & Dev Path AZ‑305 alongside AZ‑104, AZ‑204, AZ‑700, AZ‑140 This helps learners clearly identify: prerequisites, skill gaps, overlapping modules, progression paths toward expert roles. Because associate certifications (e.g., SC‑300 → SC‑100 or AZ‑104 → AZ‑305) are often prerequisites or recommended foundations, this comparison layer makes it easy to understand what additional knowledge is required as learners advance. Azure Course Blueprints + Demo Deploy Demos are essential for achieving end‑to‑end understanding of Azure. To reduce preparation overhead, we collaborated with Peter De Tender to align each Blueprint with the official Trainer Demo Deploy scenarios. With a single click, trainers can deploy the full environment and guide learners through practical, aligned demonstrations. https://aka.ms/DemoDeployPDF Benefits for Students 🎯 Defined Goals Learners clearly see the skills and services they are expected to master. 🔍 Focused Learning By spotlighting what truly matters, the Blueprint keeps learners oriented toward core learning objectives. 📈 Progress Tracking Students can easily identify what they’ve already mastered and where more study is needed. 📊 Slide Deck Topic Lists (Excel) A downloadable .xlsx file provides: a topic list for every module, links to Microsoft Learn, prerequisite dependencies. This file helps students build their own study plan while keeping all links organized. Download links Associate Level PDF - Demo Visio Contents AZ-104 Azure Administrator Associate R: 12/14/2023 U: 12/17/2025 Blueprint Demo Video Visio Excel AZ-204 Azure Developer Associate R: 11/05/2024 U: 12/17/2025 Blueprint Demo Visio Excel AZ-500 Azure Security Engineer Associate R: 01/09/2024 U: 10/10/2024 Blueprint Demo Visio+ Excel AZ-700 Azure Network Engineer Associate R: 01/25/2024 U: 12/17/2025 Blueprint Demo Visio Excel SC-200 Security Operations Analyst Associate R: 04/03/2025 U:04/09/2025 Blueprint Demo Visio Excel SC-300 Identity and Access Administrator Associate R: 10/10/2024 Blueprint Demo Excel Specialty PDF Visio AZ-140 Azure Virtual Desktop Specialty R: 01/03/2024 U: 12/17/2025 Blueprint Demo Visio Excel Expert level PDF Visio AZ-305 Designing Microsoft Azure Infrastructure Solutions R: 05/07/2024 U: 12/17/2025 Blueprint Demo Visio+ AZ-104 AZ-204 AZ-700 AZ-140 Excel SC-100 Microsoft Cybersecurity Architect R: 10/10/2024 U: 04/09/2025 Blueprint Demo Visio+ AZ-500 SC-300 SC-200 Excel Skill based Credentialing PDF AZ-1002 Configure secure access to your workloads using Azure virtual networking R: 05/27/2024 Blueprint Visio Excel AZ-1003 Secure storage for Azure Files and Azure Blob Storage R: 02/07/2024 U: 02/05/2024 Blueprint Excel Subscribe if you want to get notified of any update like new releases or updates. Author: Ilan Nyska, Microsoft Technical Trainer My email ilan.nyska@microsoft.com LinkedIn https://www.linkedin.com/in/ilan-nyska/ I’ve received so many kind messages, thank-you notes, and reshares — and I’m truly grateful. But here’s the reality: 💬 The only thing I can use internally to justify continuing this project is your engagement — through this survey https://lnkd.in/gnZ8v4i8 ___ Benefits for Trainers: Trainers can follow this plan to design a tailored diagram for their course, filled with notes. They can construct this comprehensive diagram during class on a whiteboard and continuously add to it in each session. This evolving visual aid can be shared with students to enhance their grasp of the subject matter. Explore Azure Course Blueprints! | Microsoft Community Hub Visio stencils Azure icons - Azure Architecture Center | Microsoft Learn ___ Are you curious how grounding Copilot in Azure Course Blueprints transforms your study journey into smarter, more visual experience: 🧭 Clickable guides that transform modules into intuitive roadmaps 🌐 Dynamic visual maps revealing how Azure services connect ⚖️ Side-by-side comparisons that clarify roles, services, and security models Whether you're a trainer, a student, or just certification-curious, Copilot becomes your shortcut to clarity, confidence, and mastery. Navigating Azure Certifications with Copilot and Azure Course Blueprints | Microsoft Community Hub36KViews15likes20CommentsTransforming Video Content into Structured SOPs Using Graph-based RAG
Introduction In today’s digital-first environments, a large portion of enterprise knowledge lives inside video content, training sessions, onboarding walkthroughs, and recorded operational procedures. While videos are great for learning, they are not ideal for quick reference, compliance, or repeatable processes. Converting that knowledge into structured documentation like Standard Operating Procedures (SOPs) is often manual and time-consuming. What if this process could be automated using AI? The Problem Transcripts alone don’t solve the problem. When videos are converted into text, the output typically lacks: Clear structure (sections, headings, hierarchy) Context (relationships between steps, tools, and roles) Completeness (definitions and dependencies spread across the content) This leads to a common challenge: Teams spend significant effort manually reading transcripts, interpreting context, and restructuring them into usable documentation. As seen in modern architecture challenges, manual and repetitive configurations don’t scale well and increase maintenance effort Enter Graph-based RAG (GraphRAG) GraphRAG extends traditional RAG by building a knowledge graph instead of treating content as disconnected chunks. What GraphRAG Does Extracts entities (tools, systems, roles, concepts) Maps relationships between them Groups related concepts into logical sections Preserves context across the entire document Architecture Overview Below is the high-level pipeline: Video → Transcription → Knowledge Graph → LLM Generation → Structured SOP Implementation Approach (Step-by-Step) Stage 1: Knowledge Graph Construction Convert video to transcript Split transcript into chunks Feed chunks into GraphRAG GraphRAG performs: Text Unit Extraction Entity Recognition Relationship Mapping Community Detection Result: A structured knowledge graph representation of the transcript Stage 2: Structure Extraction From the knowledge graph: Sequential Steps Preserve procedural flow from transcript order Logical Sections Derived using community detection Key Concepts Identified using graph centrality (importance via connections) This creates a framework for the SOP Stage 3: Intelligent Document Generation Using Azure OpenAI, each SOP section is generated: Section Generated From Title & Purpose High-level concepts Scope Entity boundaries Definitions Entity descriptions Responsibilities Role-based entities Procedures Sequential steps References Linked content The key advantage: LLM is grounded in graph structure not raw text Key Benefits Context Preservation - Relationships between concepts are maintained across sections. Comprehensive Coverage - Community detection ensures important topics are not missed. Reduced Hallucination - LLM generation is grounded in structured knowledge. Scalability- Works for: 30-minute tutorials, 3-hour training sessions and Enterprise knowledge bases Real-World Impact (Example) In enterprise scenarios like pharmaceutical SOP generation: Processing time: ~15–20 minutes for a multi-hour video Output quality: 8–10 structured SOP sections Consistency: Terminology and relationships preserved Coverage: Minimal missing topics Where This Approach Works Best Training videos → SOPs Meeting recordings → action summaries Technical demos → documentation Interview recordings → knowledge bases Tutorials → reference guides Key Takeaway This approach represents a shift from text processing → knowledge understanding. By combining: Knowledge graphs (structure) LLMs (language generation) We can transform raw, unstructured content into usable, enterprise-grade knowledge assets. Resources https://microsoft.github.io/graphrag/index/overview/ Final Thoughts Have you explored GraphRAG or similar approaches in your projects? What challenges did you face? How did you handle unstructured knowledge? Share your experiences — let’s learn together.220Views0likes0CommentsCentralizing Enterprise API Access for Agent-Based Architectures
Problem Statement When building AI agents or automation solutions, calling enterprise APIs directly often means configuring individual HTTP actions within each agent for every API. While this works for simple scenarios, it quickly becomes repetitive and difficult to manage as complexity grows. The challenge becomes more pronounced when a single business domain exposes multiple APIs, or when the same APIs are consumed by multiple agents. This leads to duplicated configurations, higher maintenance effort, inconsistent behavior, and increased governance and security risks. A more scalable approach is to centralize and reuse API access. By grouping APIs by business domain using an API management layer, shaping those APIs through a Model Context Protocol (MCP) server, and exposing the MCP server as a standardized tool or connector, agents can consume business capabilities in a consistent, reusable, and governable manner. This pattern not only reduces duplication and configuration overhead but also enables stronger versioning, security controls, observability, and domain‑driven ownership—making agent-based systems easier to scale and operate in enterprise environments. Designing Agent‑Ready APIs with Azure API Management, an MCP Server, and Copilot Studio As enterprises increasingly adopt AI‑powered assistants and Copilots, API design must evolve to meet the needs of intelligent agents. Traditional APIs—often designed for user interfaces or backend integrations—can expose excessive data, lack intent-level abstraction, and increase security risk when consumed directly by AI systems. This document outlines a practical, enterprise-‑ready approach to organize APIs in Azure API Management (APIM), introduce a Model Context Protocol (MCP) server to shape and control context, and integrate the solution with Microsoft Copilot Studio. The goal is to make APIs truly agent-‑ready: secure, scalable, reusable, and easy to govern. Architecture at a glance Back-end services expose domain APIs. Azure API Management (APIM) groups and governs those APIs (products, policies, authentication, throttling, versions). An MCP server calls APIM, orchestrates/filters responses, and returns concise, model-friendly outputs. Copilot Studio connects to the MCP server and invokes a small set of predictable operations to satisfy user intents. Why Traditional API Designs Fall Short for AI Agents Enterprise APIs have historically been built around CRUD operations and service-‑to-‑service integration patterns. While this works well for deterministic applications, AI agents work best with intent-driven operations and context-aware responses. When agents consume traditional APIs directly, common issues include: overly verbose payloads, multiple calls to satisfy a single user intent, and insufficient guardrails for read vs. write operations. The result can be unpredictable agent behavior that is difficult to test, validate, and govern. Structuring APIs Effectively in Azure API Management Azure API Management (APIM) is the control plane between enterprise systems and AI agents. A well-‑structured APIM instance improves security, discoverability, and governance through products, policies, subscriptions, and analytics. Key design principles for agent consumption Organize APIs by business capability (for example, Customer, Orders, Billing) rather than technical layers. Expose agent-facing APIs via dedicated APIM products to enable controlled access, throttling, versioning, and independent lifecycle management. Prefer read-only operations where possible; scope write operations narrowly and protect them with explicit checks, approvals, and least-privilege identities. Read‑only APIs should be prioritized, while action‑oriented APIs must be carefully scoped and gated. The Role of the MCP Server in Agent‑Based Architectures APIM provides governance and security, but agents also need an intent-level interface and model-friendly responses. A Model Context Protocol (MCP) server fills this gap by acting as a mediator between Copilot Studio and APIM-exposed APIs. Instead of exposing many back-end endpoints directly to the agent, the MCP server can: orchestrate multiple API calls, filter irrelevant fields, enforce business rules, enrich results with additional context, and emit concise, predictable JSON outputs. This makes agent behavior more reliable and easier to validate. Instead of exposing multiple backend APIs directly to the agent, the MCP server aggregates responses, filters irrelevant data, enriches results with business context, and formats responses into LLM‑friendly schemas. By introducing this abstraction layer, Copilot interactions become simpler, safer, and more deterministic. The agent interacts with a small number of well‑defined MCP operations that encapsulate enterprise logic without exposing internal complexity. Designing an Effective MCP Server An MCP server should have a focused responsibility: shaping context for AI models. It should not replace core back-end services; it should adapt enterprise capabilities for agent consumption. What MCP should do An MCP server should be designed with a clear and focused responsibility: shaping context for AI models. Its primary role is not to replace backend services, but to adapt enterprise data for intelligent consumption. MCP does not orchestrate enterprise workflows or apply business logic. It standardizes how agents discover and invoke external tools and APIs by exposing them through a structured protocol interface. Orchestration, intent resolution, and policy-driven execution are handled by the agent runtime or host framework. It is equally important to understand what does not belong in MCP. Complex transactional workflows, long‑running processes, and UI‑specific formatting should remain in backend systems. Keeping MCP lightweight ensures scalability and easier maintenance. Call APIM-managed APIs and orchestrate multi-step retrieval when needed. Apply security checks and business rules consistently. Filter and minimize payloads (return only fields needed for the intent). Normalize and reshape responses into stable, predictable JSON schemas. Handle errors and edge cases with safe, descriptive messages. What MCP should not do Avoid implementing complex transactional workflows, long-running processes, or UI-specific formatting in MCP. Keep it lightweight so it remains scalable, testable, and easy to maintain. Step by step guide 1) Create an MCP server in Azure API Management (APIM) Open the Azure portal (portal.azure.com). Go to your API Management instance. In the left navigation, expand APIs. Create (or select) an API group for the business domain you want to expose (for example, Orders or Customers). Add the relevant APIs/operations to that API group. Create or select an APIM product dedicated for agent usage, and ensure the product requires a subscription (subscription key). Create an MCP server in APIM and map it to the API (or API group) you want to expose as MCP operations. In the MCP server settings, ensure Subscription key required is enabled. From the product’s Subscriptions page, copy the subscription key you will use in Copilot Studio. Screenshot placeholders: APIM API group, product configuration, MCP server mapping, subscription settings, subscription key location. * Note: Using an API Management subscription key to access MCP operations is one supported way to authenticate and consume enterprise APIs. However, this approach is best suited for initial setups, demos, or scenarios where key-based access is explicitly required. For production‑grade enterprise solutions, Microsoft recommends using managed identity–based access control. Managed identities for Azure resources eliminate the need to manage secrets such as subscription keys or client secrets, integrate natively with Microsoft Entra ID, and support fine‑grained role‑based access control (RBAC). This approach improves security posture while significantly reducing operational and governance overhead for agent and service‑to‑service integrations. Wherever possible, agents and MCP servers should authenticate using managed identities to ensure secure, scalable, and compliant access to enterprise APIs. 2) Create a Copilot Studio agent and connect to the APIM MCP server using a subscription key Copilot Studio natively supports Model Context Protocol (MCP) servers as tools. When an agent is connected to an MCP server, the tool metadata—including operation names, inputs, and outputs—is automatically discovered and kept in sync, reducing manual configuration and maintenance overhead. Sign in to Copilot Studio. Create a new agent and add clear instructions describing when to use the MCP tool and how to present results (for example, concise summaries plus key fields). Open Tools > Add tool > Model Context Protocol, then choose Create. Enter the MCP server details: Server endpoint URL: copy this from your MCP server in APIM. Authentication: select API Key. Header name: use the subscription key header required by your APIM configuration. Select Create new connection, paste the APIM subscription key, and save. Test the tool in the agent by prompting for a domain-specific task (for example, “Get order status for 12345”). Validate that responses are concise and that errors are handled safely. Screenshot placeholders: MCP tool creation screen, endpoint + auth configuration, connection creation, test prompt and response. Operational best practices and guardrails Least privilege by default: create separate APIM products and identities for agent scenarios; avoid broad access to internal APIs. Prefer intent-level operations: expose fewer, higher-level MCP operations instead of many low-level endpoints. Protect write operations: require explicit parameters, validation, and (when appropriate) approval flows; keep “read” and “write” tools separate. Stable schemas: return predictable JSON shapes and limit optional fields to reduce prompt brittleness. Observability: log MCP requests/responses (with sensitive fields redacted), monitor APIM analytics, and set alerts for failures and throttling. Versioning: version MCP operations and APIM APIs; deprecate safely. Security hygiene: treat subscription keys as secrets, rotate regularly, and avoid exposing them in prompts or logs. Summary As organizations scale agent‑based and Copilot‑driven solutions, directly exposing enterprise APIs to AI agents quickly becomes complex and risky. Centralizing API access through Azure API Management, shaping agent‑ready context via a Model Context Protocol (MCP) server, and consuming those capabilities through Copilot Studio establishes a clean and governable architecture. This pattern reduces duplication, enforces consistent security controls, and enables intent‑driven API consumption without exposing unnecessary backend complexity. By combining domain‑aligned API products, lightweight MCP operations, and least‑privilege identity‑based access, enterprises can confidently scale AI agents while maintaining strong governance, observability, and operational control. References Azure API Management (APIM) – Overview Azure API Management – Key Concepts Azure MCP Server Documentation (Model Context Protocol) Extend your agent with Model Context Protocol Managed identities for Azure resources – Overview429Views0likes0CommentsAdvancing to Agentic AI with Azure NetApp Files VS Code Extension v1.2.0
The Azure NetApp Files VS Code Extension v1.2.0 introduces a major leap toward agentic, AI‑informed cloud operations with the debut of the autonomous Volume Scanner. Moving beyond traditional assistive AI, this release enables intelligent infrastructure analysis that can detect configuration risks, recommend remediations, and execute approved changes under user governance. Complemented by an expanded natural language interface, developers can now manage, optimize, and troubleshoot Azure NetApp Files resources through conversational commands - from performance monitoring to cross‑region replication, backup orchestration, and ARM template generation. Version 1.2.0 establishes the foundation for a multi‑agent system built to reduce operational toil and accelerate a shift toward self-managing enterprise storage in the cloud.400Views0likes0CommentsDesigning Reliable Health Check Endpoints for IIS Behind Azure Application Gateway
Why Health Probes Matter in Azure Application Gateway Azure Application Gateway relies entirely on health probes to determine whether backend instances should receive traffic. If a probe: Receives a non‑200 response Times out Gets redirected Requires authentication …the backend is marked Unhealthy, and traffic is stopped—resulting in user-facing errors. A healthy IIS application does not automatically mean a healthy Application Gateway backend. Failure Flow: How a Misconfigured Health Probe Leads to 502 Errors One of the most confusing scenarios teams encounter is when the IIS application is running correctly, yet users intermittently receive 502 Bad Gateway errors. This typically happens when health probes fail, causing Azure Application Gateway to mark backend instances as Unhealthy and stop routing traffic to them. The following diagram illustrates this failure flow. Failure Flow Diagram (Probe Fails → Backend Unhealthy → 502) Key takeaway: Most 502 errors behind Azure Application Gateway are not application failures—they are health probe failures. What’s Happening Here? Azure Application Gateway periodically sends health probes to backend IIS instances. If the probe endpoint: o Redirects to /login o Requires authentication o Returns 401 / 403 / 302 o Times out the probe is considered failed. After consecutive failures, the backend instance is marked Unhealthy. Application Gateway stops forwarding traffic to unhealthy backends. If all backend instances are unhealthy, every client request results in a 502 Bad Gateway—even though IIS itself may still be running. This is why a dedicated, lightweight, unauthenticated health endpoint is critical for production stability. Common Health Probe Pitfalls with IIS Before designing a solution, let’s look at what commonly goes wrong. 1. Probing the Root Path (/) Many IIS applications: Redirect / → /login Require authentication Return 401 / 302 / 403 Application Gateway expects a clean 200 OK, not redirects or auth challenges. 2. Authentication-Enabled Endpoints Health probes do not support authentication headers. If your app enforces: Windows Authentication OAuth / JWT Client certificates …the probe will fail. 3. Slow or Heavy Endpoints Probing a controller that: Calls a database Performs startup checks Loads configuration can cause intermittent failures, especially under load. 4. Certificate and Host Header Mismatch TLS-enabled backends may fail probes due to: Missing Host header Incorrect SNI configuration Certificate CN mismatch Design Principles for a Reliable IIS Health Endpoint A good health check endpoint should be: Lightweight Anonymous Fast (< 100 ms) Always return HTTP 200 Independent of business logic Client Browser | | HTTPS (Public DNS) v +-------------------------------------------------+ | Azure Application Gateway (v2) | | - HTTPS Listener | | - SSL Certificate | | - Custom Health Probe (/health) | +-------------------------------------------------+ | | HTTPS (SNI + Host Header) v +-------------------------------------------------------------------+ | IIS Backend VM | | | | Site Bindings: | | - HTTPS : app.domain.com | | | | Endpoints: | | - /health (Anonymous, Static, 200 OK) | | - /login (Authenticated) | | | +-------------------------------------------------------------------+ Azure Application Gateway health probe architecture for IIS backends using a dedicated /health endpoint. Azure Application Gateway continuously probes a dedicated /health endpoint on each IIS backend instance. The health endpoint is designed to return a fast, unauthenticated 200 OK response, allowing Application Gateway to reliably determine backend health while keeping application endpoints secure. Step 1: Create a Dedicated Health Endpoint Recommended Path 1 /health This endpoint should: Bypass authentication Avoid redirects Avoid database calls Example: Simple IIS Health Page Create a static file: 1 C:\inetpub\wwwroot\website\health\index.html Static Fast Zero dependencies Step 2: Exclude the Health Endpoint from Authentication If your IIS site uses authentication, explicitly allow anonymous access to /health. web.config Example 1 <location path="health"> 2 <system.webServer> 3 <security> 4 <authentication> 5 <anonymousAuthentication enabled="true" /> 6 <windowsAuthentication enabled="false" /> 7 </authentication> 8 </security> 9 </system.webServer> 10 </location> ⚠️ This ensures probes succeed even if the rest of the site is secured. Step 3: Configure Azure Application Gateway Health Probe Recommended Probe Settings Setting Value Protocol HTTPS Path /health Interval 30 seconds Timeout 30 seconds Unhealthy threshold 3 Pick host name from backend Enabled Why “Pick host name from backend” matters This ensures: Correct Host header Proper certificate validation Avoids TLS handshake failures Step 4: Validate Health Probe Behavior From Application Gateway Navigate to Backend health Ensure status shows Healthy Confirm response code = 200 From the IIS VM 1 Invoke-WebRequest https://your-app-domain/health Expected: 1 StatusCode : 200 Troubleshooting Common Failures Probe shows Unhealthy but app works ✔ Check authentication rules ✔ Verify /health does not redirect ✔ Confirm HTTP 200 response TLS or certificate errors ✔ Ensure certificate CN matches backend domain ✔ Enable “Pick host name from backend” ✔ Validate certificate is bound in IIS Intermittent failures ✔ Reduce probe complexity ✔ Avoid DB or service calls ✔ Use static content Production Best Practices Use separate health endpoints per application Never reuse business endpoints for probes Monitor probe failures as early warning signs Test probes after every deployment Keep health endpoints simple and boring Final Thoughts A reliable health check endpoint is not optional when running IIS behind Azure Application Gateway—it is a core part of application availability. By designing a dedicated, authentication‑free, lightweight health endpoint, you can eliminate a large class of false outages and significantly improve platform stability. If you’re migrating IIS applications to Azure or troubleshooting unexplained Application Gateway failures, start with your health probe—it’s often the silent culprit.370Views0likes0CommentsAKS cluster with AGIC hits the Azure Application Gateway backend pool limit (100)
I’m writing this article to document a real-world scaling issue we hit while exposing many applications from an Azure Kubernetes Service (AKS) cluster using Application Gateway Ingress Controller (AGIC). The problem is easy to miss because Kubernetes resources keep applying successfully, but the underlying Azure Application Gateway has a hard platform limit of 100 backend pools—so once your deployment pattern requires the 101st pool, AGIC can’t reconcile the gateway configuration and traffic stops flowing for new apps. This post explains how the limit is triggered, how to reproduce and recognize it, and what practical mitigation paths exist as you grow. A real-world scalability limit, reproduction steps, and recommended mitigation options: AGIC typically creates one Application Gateway backend pool per Kubernetes Service referenced by an Ingress. Azure Application Gateway enforces a hard limit of 100 backend pools. When the 101st backend pool is required, Application Gateway rejects the update and AGIC fails reconciliation. Kubernetes resources appear created, but traffic does not flow due to the external platform limit. Gateway API–based application routing is the most scalable forward-looking solution. Architecture Overview The environment follows a Hub-and-Spoke network architecture, commonly used in enterprise Azure deployments to centralize shared services and isolate workloads. Hub Network Azure Firewall / Network security services VPN / ExpressRoute Gateways Private DNS Zones Shared monitoring and governance components Spoke Network Private Azure Kubernetes Service (AKS) cluster Azure Application Gateway with private frontend Application Gateway Ingress Controller (AGIC) Application workloads exposed via Kubernetes Services and Ingress Ingress Traffic Flow Client → Private Application Gateway → AGIC-managed routing → Kubernetes Service → Pod Application Deployment Model Each application followed a simple and repeatable Kubernetes pattern that ultimately triggered backend pool exhaustion. One Deployment per application One Service per application One Ingress per application Each Ingress referencing a unique Service Kubernetes Manifests Used Note: All Kubernetes manifests in this example are deployed into the demo namespace. Please ensure the namespace is created before applying the manifests. Deployment template apiVersion: apps/v1 kind: Deployment metadata: name: app-{{N}} namespace: demo spec: replicas: 1 selector: matchLabels: app: app-{{N}} template: metadata: labels: app: app-{{N}} spec: containers: - name: app image: hashicorp/http-echo:1.0 args: - "-text=Hello from app {{N}}" ports: - containerPort: 5678 Service template apiVersion: v1 kind: Service metadata: name: svc-{{N}} namespace: demo spec: selector: app: app-{{N}} ports: - port: 80 targetPort: 5678 Ingress template apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ing-{{N}} namespace: demo spec: ingressClassName: azure-application-gateway rules: - host: app{{N}}.example.internal http: paths: - path: / pathType: Prefix backend: service: name: svc-{{N}} port: number: 80 Reproducing the Backend Pool Limitation The issue was reproduced by deploying 101 applications using the same pattern. Each iteration resulted in AGIC attempting to create a new backend pool. for ($i = 1; $i -le 101; $i++) { (Get-Content deployment.yaml) -replace "{{N}}", $i | kubectl apply -f - (Get-Content service.yaml) -replace "{{N}}", $i | kubectl apply -f - (Get-Content ingress.yaml) -replace "{{N}}", $i | kubectl apply -f - } Observed AGIC Error Code="ApplicationGatewayBackendAddressPoolLimitReached" Message="The number of BackendAddressPools exceeds the maximum allowed value. The number of BackendAddressPools is 101 and the maximum allowed is 100. Root Cause Analysis Azure Application Gateway enforces a non-configurable maximum of 100 backend pools. AGIC creates backend pools based on Services referenced by Ingress resources, leading to exhaustion at scale. Available Options After Hitting the Limit Option 1: Azure Gateway Controller (AGC) AGC uses the Kubernetes Gateway API and avoids the legacy Ingress model. However, it currently supports only public frontends and does not support private frontends. Option 2: ingress-nginx via Application Routing This option is supported only until November 2026 and is not recommended due to deprecation and lack of long-term viability. Option 3: Application Routing with Gateway API (Preview) Gateway API–based application routing is the strategic long-term direction for AKS. Although currently in preview, it has been stable upstream for several years and is suitable for onboarding new applications with appropriate risk awareness. Like in the below screenshot, I am using two controllers. Reference Microsoft documents: Azure Kubernetes Service (AKS) Managed Gateway API Installation (preview) - Azure Kubernetes Service | Microsoft Learn Azure Kubernetes Service (AKS) application routing add-on with the Kubernetes Gateway API (preview) - Azure Kubernetes Service | Microsoft Learn Secure ingress traffic with the application routing Gateway API implementation Conclusion The 100-backend pool limitation is a hard Azure Application Gateway constraint. Teams using AGIC must plan for scale early by consolidating services or adopting Gateway API–based routing to avoid production onboarding blockers. Author: Kumar Shashi Kaushal(Sr. Digital Cloud solutions Architect)467Views0likes0Comments