Building Interactive Agent UIs with AG-UI and Microsoft Agent Framework
Introduction Picture this: You've built an AI agent that analyzes financial data. A user uploads a quarterly report and asks: "What are the top three expense categories?" Behind the scenes, your agent parses the spreadsheet, aggregates thousands of rows, and generates visualizations. All in 20 seconds. But the user? They see a loading spinner. Nothing else. No "reading file" message, no "analyzing data" indicator, no hint that progress is being made. They start wondering: Is it frozen? Should I refresh? The problem isn't the agent's capabilities - it's the communication gap between the agent running on the backend and the user interface. When agents perform multi-step reasoning, call external APIs, or execute complex tool chains, users deserve to see what's happening. They need streaming updates, intermediate results, and transparent progress indicators. Yet most agent frameworks force developers to choose between simple request/response patterns or building custom solutions to stream updates to their UIs. This is where AG-UI comes in. AG-UI is a fairly new event-based protocol that standardizes how agents communicate with user interfaces. Instead of every framework and development team inventing their own streaming solution, AG-UI provides a shared vocabulary of structured events that work consistently across different agent implementations. When an agent starts processing, calls a tool, generates text, or encounters an error, the UI receives explicit, typed events in real time. The beauty of AG-UI is its framework-agnostic design. While this blog post demonstrates integration with Microsoft Agent Framework (MAF), the same AG-UI protocol works with LangGraph, CrewAI, or any other compliant framework. Write your UI code once, and it works with any AG-UI-compliant backend. (Note: MAF supports both Python and .NET - this blog post focuses on the Python implementation.) TL;DR The Problem: Users don't get real-time updates while AI agents work behind the scenes - no progress indicators, no transparency into tool calls, and no insight into what's happening. The Solution: AG-UI is an open, event-based protocol that standardizes real-time communication between AI agents and user interfaces. Instead of each development team and framework inventing custom streaming solutions, AG-UI provides a shared vocabulary of structured events (like TOOL_CALL_START, TEXT_MESSAGE_CONTENT, RUN_FINISHED) that work across any compliant framework. Key Benefits: Framework-agnostic - Write UI code once, works with LangGraph, Microsoft Agent Framework, CrewAI, and more Real-time observability - See exactly what your agent is doing as it happens Server-Sent Events - Built on standard HTTP for universal compatibility Protocol-managed state - No manual conversation history tracking In This Post: You'll learn why AG-UI exists, how it works, and build a complete working application using Microsoft Agent Framework with Python - from server setup to client implementation. 
What You'll Learn This blog post walks through: Why AG-UI exists - how agent-UI communication has evolved and what problems current approaches couldn't solve How the protocol works - the key design choices that make AG-UI simple, reliable, and framework-agnostic Protocol architecture - the generic components and how AG-UI integrates with agent frameworks Building an AG-UI application - a complete working example using Microsoft Agent Framework with server, client, and step-by-step setup Understanding events - what happens under the hood when your agent runs and how to observe it Thinking in events - how building with AG-UI differs from traditional APIs, and what benefits this brings Making the right choice - when AG-UI is the right fit for your project and when alternatives might be better Estimated reading time: 15 minutes Who this is for: Developers building AI agents who want to provide real-time feedback to users, and teams evaluating standardized approaches to agent-UI communication To appreciate why AG-UI matters, we need to understand the journey that led to its creation. Let's trace how agent-UI communication has evolved through three distinct phases. The Evolution of Agent-UI Communication AI agents have become more capable over time. As they evolved, the way they communicated with user interfaces had to evolve as well. Here's how this evolution unfolded. Phase 1: Simple Request/Response In the early days of AI agent development, the interaction model was straightforward: send a question, wait for an answer, display the result. This synchronous approach mirrored traditional API calls and worked fine for simple scenarios. # Simple, but limiting response = agent.run("What's the weather in Paris?") display(response) # User waits... and waits... Works for: Quick queries that complete in seconds, simple Q&A interactions where immediate feedback and interactivity aren't critical. Breaks down: When agents need to call multiple tools, perform multi-step reasoning, or process complex queries that take 30+ seconds. Users see nothing but a loading spinner, with no insight into what's happening or whether the agent is making progress. This creates a poor user experience and makes it impossible to show intermediate results or allow user intervention. Recognizing these limitations, development teams began experimenting with more sophisticated approaches. Phase 2: Custom Streaming Solutions As agents became more sophisticated, teams recognized the need for incremental feedback and interactivity. Rather than waiting for the complete response, they implemented custom streaming solutions to show partial results as they became available. # Every team invents their own format for chunk in agent.stream("What's the weather?"): display(chunk) # But what about tool calls? Errors? Progress? This was a step forward for building interactive agent UIs, but each team solved the problem differently. Also, different frameworks had incompatible approaches - some streamed only text tokens, others sent structured JSON, and most provided no visibility into critical events like tool calls or errors. 
The problem:

- No standardization across frameworks - client code that works with LangGraph won't work with CrewAI, requiring separate implementations for each agent backend
- Each implementation handles tool calls differently - some send nothing during tool execution, others send unstructured messages
- Complex state management - clients must track conversation history, manage reconnections, and handle edge cases manually

The industry needed a better solution - a common protocol that could work across all frameworks while maintaining the benefits of streaming.

Phase 3: Standardized Protocol (AG-UI)

AG-UI emerged as a response to the fragmentation problem. Instead of each framework and development team inventing their own streaming solution, AG-UI provides a shared vocabulary of events that work consistently across different agent implementations.

# Standardized events everyone understands
async for event in agent.run_stream("What's the weather?"):
    if event.type == "TEXT_MESSAGE_CONTENT":
        display_text(event.delta)
    elif event.type == "TOOL_CALL_START":
        show_tool_indicator(event.tool_name)
    elif event.type == "TOOL_CALL_RESULT":
        show_tool_result(event.result)

The key difference is structured observability. Rather than guessing what the agent is doing from unstructured text, clients receive explicit events for every stage of execution: when the agent starts, when it generates text, when it calls a tool, when that tool completes, and when the entire run finishes.

What's different: A standardized vocabulary of event types, complete observability into agent execution, and framework-agnostic clients that work with any AG-UI-compliant backend. You write your UI code once, and it works whether the backend uses Microsoft Agent Framework, LangGraph, or any other framework that speaks AG-UI.

Now that we've seen why AG-UI emerged and what problems it solves, let's examine the specific design decisions that make the protocol work. These choices weren't arbitrary - each one addresses concrete challenges in building reliable, observable agent-UI communication.

The Design Decisions Behind AG-UI

Why Server-Sent Events (SSE)?

| Aspect | WebSockets | SSE (AG-UI) |
| Complexity | Bidirectional | Unidirectional (simpler) |
| Firewall/Proxy | Sometimes blocked | Standard HTTP |
| Reconnection | Manual implementation | Built-in browser support |
| Use case | Real-time games, chat | Agent responses (one-way) |

For agent interactions, you typically only need server→client communication, making SSE a simpler choice. SSE solves the transport problem - how events travel from server to client. But once connected, how does the protocol handle conversation state across multiple interactions?

Why Protocol-Managed Threads?

# Without protocol threads (client manages):
conversation_history = []
conversation_history.append({"role": "user", "content": message})
response = agent.complete(conversation_history)
conversation_history.append({"role": "assistant", "content": response})
# Complex, error-prone, doesn't work with multiple clients

# With AG-UI (protocol manages):
thread = agent.get_new_thread()           # Server creates and manages thread
agent.run_stream(message, thread=thread)  # Server maintains context
# Simple, reliable, shareable across clients

With transport and state management handled, the final piece is the actual messages flowing through the connection. What information should the protocol communicate, and how should it be structured?

Why Standardized Event Types?
Instead of parsing unstructured text, clients get typed events: RUN_STARTED - Agent begins (start loading UI) TEXT_MESSAGE_CONTENT - Text chunk (stream to user) TOOL_CALL_START - Tool invoked (show "searching...", "calculating...") TOOL_CALL_RESULT - Tool finished (show result, update UI) RUN_FINISHED - Complete (hide loading) This lets UIs react intelligently without custom parsing logic. Now that we understand the protocol's design choices, let's see how these pieces fit together in a complete system. Architecture Overview Here's how the components interact: The communication between these layers relies on a well-defined set of event types. Here are the core events that flow through the SSE connection: Core Event Types AG-UI provides a standardized set of event types to describe what's happening during an agent's execution: RUN_STARTED - agent begins execution TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END - streaming segments of text TOOL_CALL_START, TOOL_CALL_ARGS, TOOL_CALL_END, TOOL_CALL_RESULT - tool execution events RUN_FINISHED - agent has finished execution RUN_ERROR - error information This model lets the UI update as the agent runs, rather than waiting for the final response. The generic architecture above applies to any AG-UI implementation. Now let's see how this translates to Microsoft Agent Framework. AG-UI with Microsoft Agent Framework While AG-UI is framework-agnostic, this blog post demonstrates integration with Microsoft Agent Framework (MAF) using Python. MAF is available in both Python and .NET, giving you flexibility to build AG-UI applications in your preferred language. Understanding how MAF implements the protocol will help you build your own applications or work with other compliant frameworks. Integration Architecture The Microsoft Agent Framework integration involves several specialized layers that handle protocol translation and execution orchestration: Understanding each layer: FastAPI Endpoint - Handles HTTP requests and establishes SSE connections for streaming AgentFrameworkAgent - Protocol wrapper that translates between AG-UI events and Agent Framework operations Orchestrators - Manage execution flow, coordinate tool calling sequences, and handle state transitions ChatAgent - Your agent implementation with instructions, tools, and business logic ChatClient - Interface to the underlying language model (Azure OpenAI, OpenAI, or other providers) The good news? When you call add_agent_framework_fastapi_endpoint, all the middleware layers are configured automatically. You simply provide your ChatAgent, and the integration handles protocol translation, event streaming, and state management behind the scenes. Now that we understand both the protocol architecture and the Microsoft Agent Framework integration, let's build a working application. Hands-On: Building Your First AG-UI Application This section demonstrates how to build an AG-UI server and client using Microsoft Agent Framework and FastAPI. Prerequisites Before building your first AG-UI application, ensure you have: Python 3.10 or later installed Basic understanding of async/await patterns in Python Azure CLI installed and authenticated (az login) Azure OpenAI service endpoint and deployment configured (setup guide) Cognitive Services OpenAI Contributor role for your Azure OpenAI resource You'll also need to install the AG-UI integration package: pip install agent-framework-ag-ui --pre This automatically installs agent-framework-core, fastapi, and uvicorn as dependencies. 
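Before moving on to the server, it can help to confirm that the packages installed correctly and that your Azure OpenAI settings are visible to Python. The short check below is an optional, illustrative sketch (not part of the official samples); it assumes the same two environment variable names that the server code in the next section reads.

# check_setup.py - optional sanity check for the environment (illustrative sketch)
import importlib
import os

# Packages pulled in by `pip install agent-framework-ag-ui --pre`
for package in ("agent_framework", "agent_framework_ag_ui", "fastapi", "uvicorn"):
    importlib.import_module(package)  # raises ImportError if the install is incomplete
    print(f"ok: {package}")

# Environment variables the server in the next section expects
for name in ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_DEPLOYMENT_NAME"):
    print(f"{name}: {'set' if os.getenv(name) else 'MISSING'}")

If both checks pass, you are ready to build and run the server.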
With your environment configured, let's create the server that will host your agent and expose it via the AG-UI protocol.

Building the Server

Let's create a FastAPI server that hosts an AI agent and exposes it via AG-UI:

# server.py
import os
from typing import Annotated

from dotenv import load_dotenv
from fastapi import FastAPI
from pydantic import Field

from agent_framework import ChatAgent, ai_function
from agent_framework.azure import AzureOpenAIChatClient
from agent_framework_ag_ui import add_agent_framework_fastapi_endpoint
from azure.identity import DefaultAzureCredential

# Load environment variables from .env file
load_dotenv()

# Validate environment configuration
openai_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
model_deployment = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")

if not openai_endpoint:
    raise RuntimeError("Missing required environment variable: AZURE_OPENAI_ENDPOINT")
if not model_deployment:
    raise RuntimeError("Missing required environment variable: AZURE_OPENAI_DEPLOYMENT_NAME")

# Define tools the agent can use
@ai_function
def get_order_status(
    order_id: Annotated[str, Field(description="The order ID to look up (e.g., ORD-001)")]
) -> dict:
    """Look up the status of a customer order.

    Returns order status, tracking number, and estimated delivery date.
    """
    # Simulated order lookup
    orders = {
        "ORD-001": {"status": "shipped", "tracking": "1Z999AA1", "eta": "Jan 25, 2026"},
        "ORD-002": {"status": "processing", "tracking": None, "eta": "Jan 23, 2026"},
        "ORD-003": {"status": "delivered", "tracking": "1Z999AA3", "eta": "Delivered Jan 20"},
    }
    return orders.get(order_id, {"status": "not_found", "message": "Order not found"})

# Initialize Azure OpenAI client
chat_client = AzureOpenAIChatClient(
    credential=DefaultAzureCredential(),
    endpoint=openai_endpoint,
    deployment_name=model_deployment,
)

# Configure the agent with custom instructions and tools
agent = ChatAgent(
    name="CustomerSupportAgent",
    instructions="""You are a helpful customer support assistant.
You have access to a get_order_status tool that can look up order information.
IMPORTANT: When a user mentions an order ID (like ORD-001, ORD-002, etc.), you MUST call
the get_order_status tool to retrieve the actual order details. Do NOT make up or guess
order information.
After calling get_order_status, provide the actual results to the user in a friendly format.""",
    chat_client=chat_client,
    tools=[get_order_status],
)

# Initialize FastAPI application
app = FastAPI(
    title="AG-UI Customer Support Server",
    description="Interactive AI agent server using AG-UI protocol with tool calling"
)

# Mount the AG-UI endpoint
add_agent_framework_fastapi_endpoint(app, agent, path="/chat")

def main():
    """Entry point for the AG-UI server."""
    import uvicorn
    print("Starting AG-UI server on http://localhost:8000")
    uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")

# Run the application
if __name__ == "__main__":
    main()

What's happening here:

- We define a get_order_status tool with the ai_function decorator
- Annotated and Field provide parameter descriptions that help the agent understand when and how to use the tool
- We create an Azure OpenAI chat client with credential authentication
- The ChatAgent is configured with domain-specific instructions and the tools parameter
- add_agent_framework_fastapi_endpoint automatically handles SSE streaming and tool execution
- The server exposes the agent at the /chat endpoint

Note: This example uses Azure OpenAI, but AG-UI works with any chat model.
You can also integrate with Azure AI Foundry's model catalog or use other LLM providers. Tool calling is supported by most modern LLMs including GPT-4, GPT-4o, and Claude models.

To run this server:

# Set your Azure OpenAI credentials
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o"

# Start the server
python server.py

With your server running and exposing the AG-UI endpoint, the next step is building a client that can connect and consume the event stream.

Streaming Results to Clients

With the server running, clients can connect and stream events as the agent processes requests. Here's a Python client that demonstrates the streaming capabilities:

# client.py
import asyncio
import os

from dotenv import load_dotenv

from agent_framework import ChatAgent, FunctionCallContent, FunctionResultContent
from agent_framework_ag_ui import AGUIChatClient

# Load environment variables from .env file
load_dotenv()

async def interactive_chat():
    """Interactive chat session with streaming responses."""
    # Connect to the AG-UI server
    base_url = os.getenv("AGUI_SERVER_URL", "http://localhost:8000/chat")
    print(f"Connecting to: {base_url}\n")

    # Initialize the AG-UI client
    client = AGUIChatClient(endpoint=base_url)

    # Create a local agent representation
    agent = ChatAgent(chat_client=client)

    # Start a new conversation thread
    conversation_thread = agent.get_new_thread()

    print("Chat started! Type 'exit' or 'quit' to end the session.\n")

    try:
        while True:
            # Collect user input
            user_message = input("You: ")

            # Handle empty input
            if not user_message.strip():
                print("Please enter a message.\n")
                continue

            # Check for exit commands
            if user_message.lower() in ["exit", "quit", "bye"]:
                print("\nGoodbye!")
                break

            # Stream the agent's response
            print("Agent: ", end="", flush=True)

            # Track tool calls to avoid duplicate prints
            seen_tools = set()

            async for update in agent.run_stream(user_message, thread=conversation_thread):
                # Display text content
                if update.text:
                    print(update.text, end="", flush=True)

                # Display tool calls and results
                for content in update.contents:
                    if isinstance(content, FunctionCallContent):
                        # Only print each tool call once
                        if content.call_id not in seen_tools:
                            seen_tools.add(content.call_id)
                            print(f"\n[Calling tool: {content.name}]", flush=True)
                    elif isinstance(content, FunctionResultContent):
                        # Only print each result once
                        result_id = f"result_{content.call_id}"
                        if result_id not in seen_tools:
                            seen_tools.add(result_id)
                            result_text = content.result if isinstance(content.result, str) else str(content.result)
                            print(f"[Tool result: {result_text}]", flush=True)

            print("\n")  # New line after response completes

    except KeyboardInterrupt:
        print("\n\nChat interrupted by user.")
    except ConnectionError as e:
        print(f"\nConnection error: {e}")
        print("Make sure the server is running.")
    except Exception as e:
        print(f"\nUnexpected error: {e}")

def main():
    """Entry point for the AG-UI client."""
    asyncio.run(interactive_chat())

if __name__ == "__main__":
    main()

Key features:

- The client connects to the AG-UI endpoint using AGUIChatClient with the endpoint parameter
- run_stream() yields updates containing text and content as they arrive
- Tool calls are detected using FunctionCallContent and displayed with [Calling tool: ...]
- Tool results are detected using FunctionResultContent and displayed with [Tool result: ...]
- Deduplication logic (seen_tools set) prevents printing the same tool call multiple times as it streams
- Thread management maintains conversation context across messages
- Graceful error handling for connection issues

To use the client:

# Optional: specify custom server URL
export AGUI_SERVER_URL="http://localhost:8000/chat"

# Start the interactive chat
python client.py

Example Session:

Connecting to: http://localhost:8000/chat
Chat started! Type 'exit' or 'quit' to end the session.

You: What's the status of order ORD-001?
Agent: [Calling tool: get_order_status]
[Tool result: {"status": "shipped", "tracking": "1Z999AA1", "eta": "Jan 25, 2026"}]
Your order ORD-001 has been shipped!
- Tracking Number: 1Z999AA1
- Estimated Delivery Date: January 25, 2026
You can use the tracking number to monitor the delivery progress.

You: Can you check ORD-002?
Agent: [Calling tool: get_order_status]
[Tool result: {"status": "processing", "tracking": null, "eta": "Jan 23, 2026"}]
Your order ORD-002 is currently being processed.
- Status: Processing
- Estimated Delivery: January 23, 2026
Your order should ship soon, and you'll receive a tracking number once it's on the way.

You: exit
Goodbye!

The client we just built handles events at a high level, abstracting away the details. But what's actually flowing through that SSE connection? Let's peek under the hood.

Event Types You'll See

As the server streams back responses, clients receive a series of structured events. If you were to observe the raw SSE stream (e.g., using curl), you'd see events like:

curl -N http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"messages": [{"role": "user", "content": "What'\''s the status of order ORD-001?"}]}'

Sample event stream (with tool calling):

data: {"type":"RUN_STARTED","threadId":"eb4d9850-14ef-446c-af4b-23037acda9e8","runId":"chatcmpl-xyz"}
data: {"type":"TEXT_MESSAGE_START","messageId":"e8648880-a9ff-4178-a17d-4a6d3ec3d39c","role":"assistant"}
data: {"type":"TOOL_CALL_START","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","toolCallName":"get_order_status","parentMessageId":"e8648880-a9ff-4178-a17d-4a6d3ec3d39c"}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"{\""}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"order"}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"_id"}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"\":\""}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"ORD"}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"-"}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"001"}
data: {"type":"TOOL_CALL_ARGS","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","delta":"\"}"}
data: {"type":"TOOL_CALL_END","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y"}
data: {"type":"TOOL_CALL_RESULT","messageId":"f048cb0a-a049-4a51-9403-a05e4820438a","toolCallId":"call_GTWj2N3ZyYiiQIjg3fwmiQ8y","content":"{\"status\": \"shipped\", \"tracking\": \"1Z999AA1\", \"eta\": \"Jan 25, 2026\"}","role":"tool"}
data: {"type":"TEXT_MESSAGE_START","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","role":"assistant"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":"Your"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":" order"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":" ORD"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":"-"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":"001"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":" has"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":" been"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":" shipped"}
data: {"type":"TEXT_MESSAGE_CONTENT","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf","delta":"!"}
... (additional TEXT_MESSAGE_CONTENT events streaming the response) ...
data: {"type":"TEXT_MESSAGE_END","messageId":"8215fc88-8cb6-4ce4-8bdb-a8715dcd26cf"}
data: {"type":"RUN_FINISHED","threadId":"eb4d9850-14ef-446c-af4b-23037acda9e8","runId":"chatcmpl-xyz"}

Understanding the flow:

1. RUN_STARTED - Agent begins processing the request
2. TEXT_MESSAGE_START - First message starts (will contain tool calls)
3. TOOL_CALL_START - Agent invokes the get_order_status tool
4. Multiple TOOL_CALL_ARGS events - Arguments stream incrementally as JSON chunks ({"order_id":"ORD-001"})
5. TOOL_CALL_END - Tool invocation structure complete
6. TOOL_CALL_RESULT - Tool execution finished with result data
7. TEXT_MESSAGE_START - Second message starts (the final response)
8. Multiple TEXT_MESSAGE_CONTENT events - Response text streams word-by-word
9. TEXT_MESSAGE_END - Response message complete
10. RUN_FINISHED - Entire run completed successfully

This granular event model enables rich UI experiences - showing tool execution indicators ("Searching...", "Calculating..."), displaying intermediate results, and providing complete transparency into the agent's reasoning process.

Seeing the raw events helps, but truly working with AG-UI requires a shift in how you think about agent interactions. Let's explore this conceptual change.

The Mental Model Shift

Traditional API Thinking

# Imperative: Call and wait
response = agent.run("What's 2+2?")
print(response)  # "The answer is 4"

Mental model: Function call with return value

AG-UI Thinking

# Reactive: Subscribe to events
async for event in agent.run_stream("What's 2+2?"):
    match event.type:
        case "RUN_STARTED":
            show_loading()
        case "TEXT_MESSAGE_CONTENT":
            display_chunk(event.delta)
        case "RUN_FINISHED":
            hide_loading()

Mental model: Observable stream of events

This shift feels similar to:
- Moving from synchronous to async code
- Moving from REST to event-driven architecture
- Moving from polling to pub/sub

This mental shift isn't just philosophical - it unlocks concrete benefits that weren't possible with request/response patterns.

What You Gain

Observability

# You can SEE what the agent is doing
TOOL_CALL_START: "get_order_status"
TOOL_CALL_ARGS: {"order_id": "ORD-001"}
TOOL_CALL_RESULT: {"status": "shipped", "tracking": "1Z999AA1", "eta": "Jan 25, 2026"}
TEXT_MESSAGE_START: "Your order ORD-001 has been shipped..."

Interruptibility

# Future: Cancel long-running operations
async for event in agent.run_stream(query):
    if user_clicked_cancel:
        await agent.cancel(thread_id, run_id)
        break

Transparency

# Users see the reasoning process
"Looking up order ORD-001..."
"Order found: Status is 'shipped'"
"Retrieving tracking information..."
"Your order has been shipped with tracking number 1Z999AA1..."
To put these benefits in context, here's how AG-UI compares to traditional approaches across key dimensions:

AG-UI vs. Traditional Approaches

| Aspect | Traditional REST | Custom Streaming | AG-UI |
| Connection Model | Request/Response | Varies | Server-Sent Events |
| State Management | Manual | Manual | Protocol-managed |
| Tool Calling | Invisible | Custom format | Standardized events |
| Framework | Varies | Framework-locked | Framework-agnostic |
| Browser Support | Universal | Varies | Universal |
| Implementation | Simple | Complex | Moderate |
| Ecosystem | N/A | Isolated | Growing |

You've now seen AG-UI's design principles, implementation details, and conceptual foundations. But the most important question remains: should you actually use it?

Conclusion: Is AG-UI Right for Your Project?

AG-UI represents a shift toward standardized, observable agent interactions. Before adopting it, understand where the protocol stands and whether it fits your needs.

Protocol Maturity

The protocol is stable enough for production use but still evolving. Ready now: core specification stable, Microsoft Agent Framework integration available, FastAPI/Python implementation mature, basic streaming and threading work reliably.

Choose AG-UI If You
- Building new agent projects - No legacy API to maintain, want future compatibility with emerging ecosystem
- Need streaming observability - Multi-step workflows where users benefit from seeing each stage of execution
- Want framework flexibility - Same client code works with any AG-UI-compliant backend
- Comfortable with evolving standards - Can adapt to protocol changes as it matures

Stick with Alternatives If You
- Have working solutions - Custom streaming working well, migration cost not justified
- Need guaranteed stability - Mission-critical systems where breaking changes are unacceptable
- Build simple agents - Single-step request/response without tool calling or streaming needs
- Risk-averse environment - Large existing implementations where proven approaches are required

Beyond individual project decisions, it's worth considering AG-UI's role in the broader ecosystem.

The Bigger Picture

While this blog post focused on Microsoft Agent Framework, AG-UI's true power lies in its broader mission: creating a common language for agent-UI communication across the entire ecosystem. As more frameworks adopt it, the real value emerges: write your UI once, work with any compliant agent framework. Think of it like GraphQL for APIs or OpenAPI for REST - a standardization layer that benefits the entire ecosystem.

The protocol is young, but the problem it solves is real. Whether you adopt it now or wait for broader adoption, understanding AG-UI helps you make informed architectural decisions for your agent applications.

Ready to dive deeper? Here are the official resources to continue your AG-UI journey.
Resources AG-UI & Microsoft Agent Framework Getting Started with AG-UI (Microsoft Learn) - Official tutorial AG-UI Integration Overview - Architecture and concepts AG-UI Protocol Specification - Official protocol documentation Backend Tool Rendering - Adding function tools Security Considerations - Production security guidance Microsoft Agent Framework Documentation - Framework overview AG-UI Dojo Examples - Live demonstrations UI Components & Integration CopilotKit for Microsoft Agent Framework - React component library Community & Support Microsoft Q&A - Community support Agent Framework GitHub - Source code and issues Related Technologies Azure AI Foundry Documentation - Azure AI platform FastAPI Documentation - Web framework Server-Sent Events (SSE) Specification - Protocol standard This blog post introduces AG-UI with Microsoft Agent Framework, focusing on fundamental concepts and building your first interactive agent application.Supporting ChatGPT on PostgreSQL in Azure
Affan Dar, Vice President of Engineering, PostgreSQL at Microsoft
Adam Prout, Partner Architect, PostgreSQL at Microsoft
Panagiotis Antonopoulos, Distinguished Engineer, PostgreSQL at Microsoft

The OpenAI engineering team recently published a blog post describing how they scaled their databases by 10x over the past year, to support 800 million monthly users. To do so, OpenAI relied on Azure Database for PostgreSQL to support important services like ChatGPT and the Developer API. Collaborating with a customer experiencing rapid user growth has been a remarkable journey. One key observation is that PostgreSQL works out of the box even at very large scale. As many in the public domain have noted, ChatGPT grew to 800M+ users before OpenAI started moving new and shardable workloads to Azure Cosmos DB.

Nevertheless, supporting the growth of one of the largest Postgres deployments was a great learning experience for both of our teams. Our OpenAI friends did an incredible job at reacting fast and adjusting their systems to handle the growth. Similarly, the Postgres team at Azure worked to further tune the service to support the increasing OpenAI workload. The changes we made were not limited to OpenAI, hence all our Azure Database for PostgreSQL customers with demanding workloads have benefited. A few of the enhancements and the work that led to these are listed below.

Changing the network congestion protocol to reduce replication lag

Azure Database for PostgreSQL used the default CUBIC congestion control algorithm for replication traffic to replicas both within and outside the region. Leading up to one of the OpenAI launch events, we observed that several geo-distributed read replicas occasionally experienced replication lag. Replication from the primary server to the read replicas would typically operate without issues; however, at times, the replicas would unexpectedly begin falling behind the primary for reasons that were not immediately clear. This lag would not recover on its own and would grow to a point when, eventually, automation would restart the read replica. Once restarted, the read replica would once again catch up, only to repeat this cycle again within a day or less.

After an extensive debugging effort, we traced the root cause to how the TCP congestion control algorithm handled a higher rate of packet drops. These drops were largely a result of high point-to-point traffic between the primary server and its replicas, compounded by the existing TCP window settings. Packet drops across regions are not unexpected; however, the default congestion control algorithm (CUBIC) treats packet loss as a sign of congestion and does an aggressive backoff. In comparison, the Bottleneck Bandwidth and Round-trip propagation time (BBR) congestion control algorithm is less sensitive to packet drops. Switching to BBR, adding SKU-specific TCP window settings, and switching to the fair queuing network discipline (which can control pacing of outgoing packets at the hardware level) resolved this issue. We'll also note that one of our seasoned PostgreSQL committers provided invaluable insights during this process, helping us pinpoint the issue more effectively.

Scaling out with Read replicas

PostgreSQL primaries, if configured properly, work amazingly well in supporting a large number of read replicas. In fact, as noted in the OpenAI engineering blog, a single primary has been able to power around 50+ replicas across multiple regions. However, going beyond this increases the chance of impacting the primary.
For this reason, we added cascading replica support to scale out reads even further. But this brings in a number of additional failure modes that need to be handled. The system must carefully orchestrate repairs around lagging and failing intermediary nodes, safely repointing replicas to new intermediary nodes while performing catch up or rewind in a mission critical setup. Furthermore, disaster recovery (DR) scenarios can require a fast rebuild of a replica and, as data movement across regions is a costly and time-consuming operation, we developed the ability to create a geo replica from a snapshot of another replica in the same region. This feature avoids the traditional full data copy process, which may take hours or even days depending on the size of the data, by leveraging data for that cluster that already exists in that region. This feature will soon be available for all our customers as well.

Scaling out Writes

These improvements solved the read replica lag problems and read scale but did not help address the growing write scale for OpenAI. At some point, the balance tipped and it was obvious that the IOPS limits of a single PostgreSQL primary instance would not cut it anymore. As a result, OpenAI decided to move new and shardable workloads to Azure Cosmos DB, which is our default recommended NoSQL store for fully elastic workloads. However, some workloads, as noted in the OpenAI blog, are much harder to shard. While OpenAI is using Azure Database for PostgreSQL flexible server, several of the write scaling requirements that came up have been baked into our new Azure HorizonDB offering, which entered private preview in November 2025. Some of the architectural innovations are described in the following sections.

Azure HorizonDB scalability design

To better support more demanding workloads, Azure HorizonDB introduces a new storage layer for Postgres that delivers significant performance and reliability enhancements:

- More efficient read scale out. Postgres read replicas no longer need to maintain their own copy of the data. They can read pages from the single copy maintained by the storage layer.
- Lower latency Write-Ahead Logging (WAL) writes and higher throughput page reads via two purpose-built storage services designed for WAL storage and Page storage.
- Durability and high availability responsibilities are shifted from the Postgres primary to the storage layer, allowing Postgres to dedicate more resources to executing transactions and queries.
- Postgres failovers are faster and more reliable.

To understand how Azure HorizonDB delivers these capabilities, let's look at its high-level architecture as shown in Figure 1. It follows a log-centric storage model, where the PostgreSQL write-ahead log (WAL) is the sole mechanism used to durably persist changes to storage. PostgreSQL compute nodes never write data pages to storage directly in Azure HorizonDB. Instead, pages and other on-disk structures are treated as derived state and are reconstructed and updated from WAL records by the data storage fleet.

Azure HorizonDB storage uses two separate storage services for WAL and data pages. This separation allows each to be designed and optimized for the very different patterns of reads and writes PostgreSQL does against WAL files in contrast to data pages. The WAL server is optimized for very low latency writes to the tail of a sequential WAL stream, and the Page server is designed for random reads and writes across potentially many terabytes of pages.
These two separate services work together to enable Postgres to handle IO intensive OLTP workloads like OpenAI's. The WAL server can durably write a transaction across 3 availability zones using a single network hop. The typical PostgreSQL replication setup with a hot standby (Figure 2) requires 4 hops to do the same work. Each hop is a component that can potentially fail or slow down and delay a commit. The Azure HorizonDB page service can scale out page reads to many hundreds of thousands of IOPS for each Postgres instance. It does this by sharding the data in Postgres data files across a fleet of page servers. This spreads the reads across many high performance NVMe disks on each page server.

Figure 2 - WAL Writes in HorizonDB

Another key design principle for Azure HorizonDB was to move durability and high availability related work off PostgreSQL compute, allowing it to operate as a stateless compute engine for queries and transactions. This approach gives Postgres more CPU, disk and network to run your application's business logic. Table 1 summarizes the different tasks that community PostgreSQL has to do, which Azure HorizonDB moves to its storage layer. Work like dirty page writing and checkpointing are no longer done by a Postgres primary. The work for sending WAL files to read replicas is also moved off the primary and into the storage layer - having many read replicas puts no load on the Postgres primary in Azure HorizonDB. Backups are handled by Azure Storage via snapshots; Postgres isn't involved.

| Task | Resource Savings | Postgres Process Moved |
| WAL sending to Postgres replicas | Disk IO, Network IO | Walsender |
| WAL archiving to blob storage | Disk IO, Network IO | Archiver |
| WAL filtering | CPU, Network IO | Shared Storage Specific (*) |
| Dirty Page Writing | Disk IO | background writer |
| Checkpointing | Disk IO | checkpointer |
| PostgreSQL WAL recovery | Disk IO, CPU | startup recovering |
| PostgreSQL read replica redo | Disk IO, CPU | startup recovering |
| PostgreSQL read replica shared storage | Disk IO | background, checkpointer |
| Backups | Disk IO | pg_dump, pg_basebackup, pg_backup_start, pg_backup_stop |
| Full page writes | Disk IO | Backends doing WAL writing |
| Hot standby feedback | Vacuum accuracy | walreceiver |

Table 1 - Summary of work that the Azure HorizonDB storage layer takes over from PostgreSQL

The shared storage architecture of Azure HorizonDB is the fundamental building block for delivering exceptional read scalability and elasticity, which are critical for many workloads. Users can spin up read replicas instantly without requiring any data copies. Page Servers are able to scale and serve requests from all replicas without any additional storage costs. Since WAL replication is entirely handled by the storage service, the primary's performance is not impacted as the number of replicas changes. Each read replica can scale independently to serve different workloads, allowing for workload isolation.

Finally, this architecture allows Azure HorizonDB to substantially improve the overall experience around high availability (HA). HA replicas can now be added without any data copying or storage costs. Since the data is shared between the replicas and continuously updated by Page Servers, secondary replicas only replay a portion of the WAL and can easily keep up with the primary, reducing failover times. The shared storage also guarantees that there is a single source of truth and the old primary never diverges after a failover. This prevents the need for expensive reconciliation, using pg_rewind, or other techniques and further improves availability.
Azure HorizonDB was designed from the ground up with learnings from large scale customers, to meet the requirements of the most demanding workloads. The improved performance, scalability and availability of the Azure HorizonDB architecture make Azure a great destination for Postgres workloads.

Migrating your AWS offer to Microsoft Marketplace - AWS to Azure service comparisons
As an Independent Software Vendor (ISV), expanding your Marketplace offer's reach beyond AWS Marketplace by replicating to Microsoft Marketplace offers exciting opportunities to grow your customer base. With millions of customers across a global network of businesses and industries, Azure presents a thriving platform to enhance your app’s visibility and functionality. This post is part of a series on replicating apps from AWS to Azure. View all posts in this series. Boost your growth and access more customers by replicating your AWS app to Azure and selling through Microsoft Marketplace. This guide will compare commonly used AWS and Azure components, highlighting differences, to help you replicate your app quickly and easily to prepare it for publishing on Microsoft Marketplace. Future posts will dive deeper into each component area. To ensure a seamless app replication, start by reviewing the marketplace listing requirements. Understanding the key differences between AWS and Azure will help you transition and optimize performance on Azure while benefiting from its unique advantages. This guide will outline these differences, highlight similar services, and offer steps for a seamless replication or migration. You can also join ISV Success to get access to over $126K USD in cloud credits, AI services, developer tools, and 1:1 technical consults to help you replicate your app and publish to Marketplace. The benefits of replicating or migrating to Microsoft Marketplace Migrating to Marketplace unlocks a wealth of opportunities for ISVs. The Azure ecosystem offers several advantages, including: Global reach: Azure’s vast global network of data centers ensures high availability and low-latency access to your application for customers worldwide. Cost efficiency: Azure’s flexible pricing models and cost management tools allow ISVs to optimize their cloud spending. Scalability: With Azure’s powerful compute and storage options, you can scale your application effortlessly to accommodate growing demand. Security and compliance: Azure’s comprehensive security tools and certifications help you meet industry-specific compliance standards, ensuring that your application is secure and trusted. Meet where your customers are: Deploy into customer subscriptions, making your solution more integrated to customer workload. AWS vs. Azure AWS and Azure are the top cloud platforms with diverse services for developers and businesses. Below, we will highlight key areas where AWS and Azure differ—and how to leverage Azure services—when moving your Marketplace offer from AWS to Microsoft Marketplace. Microsoft Marketplace capabilities In Azure, ISVs can leverage metered billing to charge customers based on actual usage, similar to AWS's pay-as-you-go model. This flexible pricing model is ideal for SaaS solutions. Partner Center offers tools for setting pricing models, tracking usage, and adjusting billing. It also provides anomaly detection to help partners identify unexpected usage and ensure transparent billing. When creating SaaS offers in Marketplace, ISVs can define plans with various pricing strategies, such as usage-based or flat-rate billing. These plans, or SKUs, can be customized through free trials, BYOL (Bring Your Own License), or vCPU-based pricing for virtual machines. Both Azure and AWS allow flexible, metered billing based on usage. Azure also provides the ability to set customer discounts or negotiated pricing. 
Using Partner Center, you can configure and manage these offerings, providing flexibility for customers and partners to scale as needed. Like AWS Control Tower, Azure Lighthouse enables service providers to manage multiple customer Azure environments securely and at scale, offering enhanced visibility, control, and automation. For usage-based monthly billing, you can choose from predefined or custom pricing options (using metered billing APIs). Predefined options like per core, per node, or per pod let Microsoft bill customers based on hourly usage, billing them monthly. Learn more about usage-based pricing here: Setting Plan Pricing. Mapping AWS services to Azure services Your Marketplace offer may use multiple AWS services, and you can build the same offer using Azure services. However, this requires careful mapping to ensure your application functions seamlessly in the Azure environment. Here’s a quick overview of how popular AWS services map to Azure:: Networking: AWS VPC → Azure Virtual Networks (VNets) Compute Services: AWS EC2 → Azure Virtual Machines (VMs), Azure App Services (for web apps) Storage: Amazon S3 → Azure Blob Storage, Azure Data Lake Storage (for big data) Identity Management: AWS IAM → Entra ID Containers: EKS and Elastic Beanstalk → AKS and Azure App Services Serverless: AWS Lambda → Azure Functions Databases: Amazon RDS → Azure SQL Database, Azure Cosmos DB (for NoSQL) Azure for AWS professionals provides you with a more comprehensive mapping of different services. Let's take a deeper look into each of these areas. Cloud architecture and networking One of the primary differences between AWS and Azure lies in their cloud architecture and networking models. AWS uses Virtual Private Clouds (VPCs) to create isolated networks, while Azure employs Virtual Networks (VNets). Both services perform similar functions, but they have different terminologies and setups. For instance, in Azure, you'll be working with VNet Peering, Network Security Groups (NSGs), and Azure VPNs for secure networking. The goal is to map your AWS VPC setup to Azure VNets with ease. AWS needs a Nat Gateway for egress access whereas Azure does not need a Nat Gateway for default egress. AWS Subnets are pinned to Availability Zones (AZs) whereas Azure Subnets span across the AZs. Compute services: EC2 vs. Virtual Machines (VMs) AWS EC2 instances are one of the most widely used compute services, allowing you to run applications on virtual servers. In Azure, the equivalent service is Azure Virtual Machines (VMs). While both offer scalable compute resources, the key differences are in the range of VM sizes, configurations, and the management interface. When migrating from AWS EC2 to Azure VMs, it's important to assess the appropriate Azure VM sizes and configurations that match the performance of your EC2 instances. Additionally, Azure VMs support Azure Resource Manager (ARM) templates, which provide more automation for resource management. For those who have utilized EC2's Auto Scaling feature, Azure provides similar functionality through Azure Scale Sets. Storage: S3 vs. Blob Storage For object storage, AWS uses Amazon S3, while Azure uses Azure Blob Storage. Both services serve the same purpose — storing large amounts of unstructured data — but the underlying configurations, security features, and cost structures differ. While migrating from S3 to Blob Storage, it’s important to review your storage needs and adjust your application accordingly. 
Azure Blob Storage offers Cool and Archive tiers, which can be a great way to optimize storage costs for infrequently accessed data, and Azure's data redundancy options ensure high availability and durability. The Azure Storage Explorer tool also makes it easier for ISVs to manage their data after migration. Identity and Access Management (IAM) & billing: IAM vs. Entra ID IAM services on AWS and Azure differ in how they manage roles and permissions. AWS uses IAM for users, roles, and policies, while Azure uses Entra ID for IAM across cloud services. AWS organizes accounts through AWS Organizations, with IAM used for role-based access control (RBAC) and policies for service access. Azure’s structure involves Subscriptions and Management Groups, with Entra ID managing identity and access. Azure uses RBAC to assign roles at various levels (Subscription, Resource Group, Resource) and Azure Policies for governance and compliance. Azure Entra ID integrates with Microsoft services, like Office 365, SharePoint, and Teams, supporting identity federation, multi-factor authentication, and RBAC for granular permissions. It enhances governance and security across platforms. Azure handles billing management via subscriptions providing access to resources and can be reassigned to new owners. It offers three classic subscription administrator roles for resource access and management for billing and resource access. Container management: Elastic Beanstalk vs. Azure App Services and EKS vs. AKS For containerized applications, AWS offers Elastic Beanstalk for easy application deployment and management. Azure’s equivalent services include Azure App Services for simple web application hosting and Azure Kubernetes Service (AKS) for container orchestration. While Azure App Services is more suitable for traditional web applications, AKS provides a robust and scalable solution for microservices and containerized applications, similar to AWS’s Elastic Kubernetes Service (EKS). ISVs who are accustomed to Elastic Beanstalk for deploying containerized applications will find Azure App Services or AKS a seamless alternative, with Azure offering rich integrations with DevOps pipelines, CI/CD workflows, and container registries. Serverless: AWS Lambda vs. Azure Functions Both AWS and Azure support serverless computing, which allows developers to run code without managing servers. AWS offers Lambda, while Azure offers Azure Functions. Both services allow you to trigger code in response to events, such as file uploads or API calls. The key difference is that Azure Functions integrates deeply with other Azure services, such as Azure Logic Apps and Azure Event Grid. If your application leverages AWS Lambda, you will find that Azure Functions can serve as an excellent equivalent. Azure also provides Durable Functions, which extend Azure Functions for stateful workflows. Migrating from AWS Lambda to Azure Functions typically requires mapping your event-driven functions and configuring their triggers in the Azure ecosystem. Databases: RDS vs. Azure SQL and Cosmos DB When it comes to databases, AWS offers Amazon RDS for relational databases, and Amazon DynamoDB for NoSQL. Azure provides several alternatives, including Azure SQL Database for relational storage and Azure Cosmos DB for NoSQL storage. Both platforms support database scalability, automated backups, and high availability. If you are using Amazon RDS with services like MySQL or PostgreSQL, you can migrate to Azure Database for MySQL or Azure Database for PostgreSQL. 
Similarly, if you are using AWS DynamoDB, Azure’s Cosmos DB offers a global, scalable NoSQL database with low-latency access. Messaging: AWS SQS vs. Azure Service Bus Messaging services are crucial when your application handles high-throughput, asynchronous communication between different components. AWS offers Simple Queue Service (SQS) for messaging and SNS for pub/sub notifications while Azure offers Azure Service Bus and Azure Event Grid. Azure Service Bus provides similar functionality to SQS but offers additional capabilities like advanced message routing, dead-lettering, and sessions for handling ordered messages. If your application relies on a queuing mechanism for inter-service communication, you’ll want to map AWS SQS to Azure Service Bus. For event-driven architectures, Azure Event Grid can connect different services and trigger actions across Azure services. Security: Protecting your application on Azure When migrating from AWS to Azure, security is paramount. Both platforms offer strong frameworks to protect data, apps, and infrastructure. Azure provides a suite of integrated security services to maintain high security while enabling cloud scalability. AWS offers AWS Shield and WAF for DDoS and web application firewalls, while Azure offers Azure DDoS Protection and Azure Firewall for similar threat prevention. Azure Security Center monitors your security posture, and Azure Sentinel provides cloud-native SIEM (Security Information and Event Management) for threat detection and response. Microsoft Defender for Identity and Azure Entra ID Identity Protection integrate with Entra ID, ensuring your app security is tightly linked to user identity and governance. Compliance: Meeting regulatory standards on Azure Ensuring compliance with industry standards and regulations is crucial for many ISVs. Azure provides a robust compliance framework that aligns with global standards to meet the most stringent requirements. Whether your application deals with sensitive data or operates in highly regulated industries, Azure’s comprehensive compliance offerings can help you achieve the necessary certifications. Azure complies with key standards such as: GDPR HIPAA SOC 1, 2, and 3 ISO 27001 and other ISO standards FedRAMP Azure provides tools like Azure Policy for governance and Azure Blueprints for complex regulatory requirements. It offers a similar set of compliance certifications to AWS, with a stronger integration into Microsoft enterprise tools, easing compliance for businesses in regulated sectors. For apps handling sensitive data, use Azure Security and Compliance Blueprint to ensure regulatory adherence. Azure’s Compliance Manager helps track and manage compliance, simplifying the process of meeting industry standards. 
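To make the SQS-to-Service Bus mapping described above more concrete, here is a minimal, hedged sketch of sending a single message on each platform. It assumes the boto3, azure-servicebus, and azure-identity SDKs, and uses placeholder queue names and namespaces that you would replace with your own; it is an illustration of the mapping, not a drop-in migration.

# messaging_sketch.py - illustrative SQS vs. Service Bus comparison
import boto3
from azure.identity import DefaultAzureCredential
from azure.servicebus import ServiceBusClient, ServiceBusMessage

# AWS: send a message to an SQS queue (queue URL is a placeholder)
sqs = boto3.client("sqs")
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/orders",  # placeholder
    MessageBody='{"orderId": "ORD-001"}',
)

# Azure: send the same payload to a Service Bus queue (namespace and queue are placeholders)
credential = DefaultAzureCredential()  # Entra ID identity instead of long-lived keys
with ServiceBusClient("your-namespace.servicebus.windows.net", credential) as client:
    with client.get_queue_sender(queue_name="orders") as sender:
        sender.send_messages(ServiceBusMessage('{"orderId": "ORD-001"}'))

The shape of the code is similar on both sides; the main shift is from AWS access keys and queue URLs to Entra ID credentials and a Service Bus namespace.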
Key resources

- SaaS Workloads - Microsoft Azure Well-Architected Framework | Microsoft Learn
- Metered billing for SaaS offers in Partner Center
- Create plans for a SaaS offer in Azure Marketplace
- Metered billing with Azure Managed Applications
- Set plan pricing and availability for an Azure Container offer in Microsoft commercial marketplace - Marketplace publisher
- Configure pricing and availability for a virtual machine offer in Partner Center - Marketplace publisher
- Overview - CSP marketplace - Partner Center
- Azure for AWS professionals - Azure Architecture Center
- Azure networking documentation
- Microsoft Entra ID documentation - Microsoft Entra ID
- Azure security documentation
- Azure compliance documentation
- Azure Storage Documentation Hub
- Microsoft Azure container services documentation
- Azure serverless - Azure Logic Apps
- Migration examples
- Get over $126K USD in benefits and technical consultations to help you replicate and publish your app with ISV Success
- Maximize your momentum with step-by-step guidance to publish and grow your app with App Advisor

Migrating your AWS offer to Microsoft Marketplace - Storage services
For software development companies looking to expand or replicate their marketplace offerings from AWS to Microsoft Azure, one of the most critical steps in replicating your solution is selecting the right Azure storage services. While both AWS and Azure provide robust cloud storage options, their architecture, service availability, and design approaches vary. To deliver reliable performance, scale globally, and meet operational requirements, it's essential to understand how Azure storage works - and how it compares to AWS - before you replicate your app.

Broaden your customer base and enhance your app's exposure by bringing your AWS-based solution to Azure and listing it on Microsoft Marketplace. This guide will walk you through how Azure storage services compare to those on AWS - spotlighting important differences in architecture, scalability, and feature sets - so you can make confident choices when replicating your app's storage layer to Azure.

This post is part of a series on replicating apps from AWS to Azure. View all posts in this series.

AWS to Azure storage mapping

When replicating your app from AWS to Azure, start by mapping your existing storage services to the closest Azure equivalents. Both clouds offer robust object, file, and block storage, but they differ in architecture, features, and integration points. Choosing the right Azure service helps keep your app performant, secure, and manageable - and aligns with Microsoft Marketplace requirements for an Azure-native deployment.

| AWS Service | Azure Equivalent | Recommended use cases & key differences |
| Amazon S3 | Azure Blob Storage (enable ADLS Gen2 for hierarchical namespace + POSIX ACLs) | Object storage with strong consistency and tiering (Hot/Cool/Archive). Blob is part of an Azure Storage account; ADLS Gen2 unlocks data-lake/analytics features. |
| Amazon EFS | Azure Files (SMB/NFS) | General-purpose shared file systems and lift-and-shift app shares. Azure Files supports full-featured SMB and fully POSIX-compatible NFS shared filesystems on Linux. |
| Amazon FSx for Windows File Server | Azure Files (SMB) | Windows workloads that need full NTFS semantics, ACLs, and directory integration. Use Premium for low-latency shares. |
| Amazon FSx for NetApp ONTAP | Azure NetApp Files | Enterprise file storage with predictable throughput/latency, multiprotocol (SMB/NFS), and advanced data management. |
| Amazon EBS | Azure Managed Disks (Premium SSD v2 or Ultra Disk for top performance) | Low-latency block storage for VMs/DBs with provisioned IOPS/MBps; choose Premium SSD v2/Ultra for tighter SLOs. |
| Local NVMe on EKS | Azure Container Storage | Extreme performance for Kubernetes workloads with a familiar cloud-native developer experience. |
| Many EBS volumes (fleet scale) | Azure Elastic SAN (VMs & AKS only) | Pooled, large-scale block for Azure VMs via iSCSI or AKS via Azure Container Storage; simplifies fleet provisioning and management. |

Tip: Some AWS services map to multiple Azure options. For example, EFS → Azure Files for straightforward SMB/NFS shares, or → Azure NetApp Files when you need stricter latency SLOs and multiprotocol at scale.

Match your use case

After mapping AWS services to Azure equivalents, the next step is selecting the right service for your workload. Start by considering the access pattern - object, file, or block - and then factor in performance, protocol, and scale.

- Object storage & analytics: Use Azure Blob Storage for unstructured data like images, logs, and backups. If you need hierarchical namespace and POSIX ACLs, enable Azure Data Lake Storage Gen2 on top of Blob.
Match your use case
After mapping AWS services to Azure equivalents, the next step is selecting the right service for your workload. Start by considering the access pattern (object, file, or block), and then factor in performance, protocol, and scale.

Object storage & analytics: Use Azure Blob Storage for unstructured data like images, logs, and backups. If you need hierarchical namespace and POSIX ACLs, enable Azure Data Lake Storage Gen2 on top of Blob.
General file sharing / SMB apps: Choose Azure Files (SMB) for lift‑and‑shift scenarios and Windows workloads. Integrate with Entra ID for NTFS ACL parity, and select the Premium tier for low‑latency performance.
NFS or multiprotocol file workloads: Start with Azure Files (NFS) for basic needs, or move to Azure NetApp Files for predictable throughput, multiprotocol support, and enterprise‑grade SLAs.
High‑performance POSIX workloads: For HPC or analytics pipelines requiring massive throughput, use Azure Managed Lustre.
Persistent storage for containers: Azure’s CSI drivers bring Kubernetes support for most Azure disk, file, and blob offerings. Azure Container Storage adds Kubernetes support for disk backends the Azure Disks CSI driver doesn’t cover, such as local NVMe.
Block storage for VMs and databases: Use Azure Managed Disks for most scenarios, with Premium SSD v2 or Ultra Disk for provisioned IOPS and sub‑millisecond latency. For large fleets or shared performance pools, choose Azure Elastic SAN (VMs & AKS only).

Quick tip: Start simple—Blob for object, Azure Files for SMB and NFS, Managed Disks for block—and scale up to NetApp Files, Elastic SAN, or Managed Lustre when performance or compliance demands it.

Factor in security and compliance
Encryption: Confirm default encryption meets your compliance requirements; enable customer‑managed keys (CMK) if needed.
Access control: Apply Azure RBAC for role‑based permissions and ACLs for granular control at the container or file share level.
Network isolation: Use Private Endpoints to keep traffic off the public internet and connect storage to your VNet.
Identity integration: Prefer Managed Identities or SAS tokens over account keys for secure access.
Compliance checks: Verify your chosen service meets certifications like GDPR, HIPAA, or industry‑specific standards.

Optimize for cost
Tiering: Use Hot, Cool, and Archive tiers in Blob Storage based on access frequency; apply Premium tiers only where low latency is critical.
Lifecycle management: Automate data movement and deletion with lifecycle policies to avoid paying for stale data.
Reserved capacity: Commit to 1–3 years of capacity for predictable workloads to unlock discounts.
Right‑sizing: Choose the smallest disk, volume, or file share that meets your needs; scale up only when required.
Monitoring: Set up cost alerts and review usage regularly to catch anomalies early; use Azure Cost Management for insights.
Avoid hidden costs: Co‑locate compute and storage to prevent cross‑region egress charges.

Data migration from AWS to Azure
Migrating your data from AWS to Azure is a key step in replicating your app’s storage layer for Marketplace. The goal is a one‑time transfer—after migration, your app runs fully on Azure.
Azure Storage Mover: A managed service that automates and orchestrates large‑scale data transfers from AWS S3, EFS, or on‑premises sources to Azure Blob Storage, Azure Files, or Azure NetApp Files. Ideal for bulk migrations with minimal downtime.
AzCopy: A command‑line tool for fast, reliable copying of data from AWS S3 to Azure Blob Storage. Great for smaller datasets or scripted migrations (see the sketch after this list).
Azure Data Factory: Built‑in connectors to move data from AWS storage services to Azure, with options for scheduling and transformation.
Azure Data Box: For very large datasets, provides a physical device to securely transfer data from AWS to Azure offline.
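As a rough illustration of the AzCopy path mentioned above, the PowerShell sketch below drives a one-time S3-to-Blob copy. It assumes azcopy v10+ is installed and on the PATH, that the AWS keys shown are placeholders, and that the destination URL carries a SAS token with write permission; verify the current AzCopy S3 syntax against the official docs before relying on it.

# Assumptions: azcopy v10+ on PATH; placeholder bucket, account, container, and SAS values.
# AzCopy reads AWS credentials from environment variables when the source is S3.
$env:AWS_ACCESS_KEY_ID     = "<aws-access-key-id>"
$env:AWS_SECRET_ACCESS_KEY = "<aws-secret-access-key>"

$source      = "https://s3.amazonaws.com/<bucket-name>"
$destination = "https://<storage-account>.blob.core.windows.net/migrated-objects?<sas-token>"

# Recursive copy of the whole bucket into the target container.
azcopy copy $source $destination --recursive

# Spot-check the result: list the blobs now present in the destination container.
azcopy list $destination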
Final readiness before marketplace listing
Validate performance under load: Benchmark with real data and confirm your chosen SKUs deliver the IOPS and latency your app needs.
Lock down security: Ensure RBAC roles are applied correctly, Private Endpoints are in place, and encryption meets compliance requirements.
Control costs: Verify lifecycle policies, reserved capacity, and cost alerts are active to prevent surprises.
Enable monitoring: Set up dashboards and alerts for throughput, latency, and capacity so you can catch issues before customers do (see the monitoring sketch at the end of this section).

Key Resources
SaaS Workloads - Microsoft Azure Well-Architected Framework | Microsoft Learn
Metered billing for SaaS offers in Partner Center
Create plans for a SaaS offer in Microsoft Marketplace
Get over $126K USD in benefits and technical consultations to help you replicate and publish your app with ISV Success
Maximize your momentum with step-by-step guidance to publish and grow your app with App Advisor
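To go with the "Enable monitoring" item in the readiness list above, here is a small PowerShell sketch that pulls an account-level latency metric for a storage account so it can feed a dashboard or alert. The resource group, account name, and metric name are assumptions; confirm the exact metrics exposed for your SKU in Azure Monitor.

# Assumes the Az.Monitor and Az.Storage modules and a signed-in session.
$rg      = "rg-marketplace-storage"
$account = "stmarketplacedemo001"
$sa = Get-AzStorageAccount -ResourceGroupName $rg -Name $account

# Hourly end-to-end success latency over the last 24 hours.
# "Transactions" and "Availability" are other account-level metrics worth charting.
Get-AzMetric -ResourceId $sa.Id `
    -MetricName "SuccessE2ELatency" `
    -TimeGrain 01:00:00 `
    -StartTime (Get-Date).AddDays(-1) `
    -EndTime (Get-Date) `
    -AggregationType Average |
    Select-Object -ExpandProperty Data |
    Select-Object TimeStamp, Average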
Migrating your AWS offer to Microsoft Marketplace - Database services

For software development companies looking to expand or replicate their marketplace offerings from AWS to Microsoft Azure, one of the most critical steps is selecting the right Azure database services. While both AWS and Azure provide robust managed database options, their architecture, service availability, and design approaches vary. To deliver reliable performance, scale globally, and meet operational requirements, it’s essential to understand how Azure databases work—and how they compare to AWS—before you replicate your app. Broaden your customer base and enhance your app’s exposure by bringing your AWS-based solution to Azure and listing it on Microsoft Marketplace.

This guide walks you through how Azure database services compare to those on AWS—spotlighting differences in architecture, scalability, and feature sets—so you can make confident choices when replicating your app’s data layer to Azure.

This post is part of a series on replicating apps from AWS to Azure. View all posts in this series.

AWS to Azure database mapping
When replicating your app from AWS to Azure, start by mapping your existing database services to the closest Azure equivalents. Both clouds offer relational, NoSQL, and analytics databases, but they differ in architecture, features, and integration points. Choosing the right Azure service helps keep your app performant, secure, and manageable—and aligns with Azure Marketplace requirements for an Azure-native deployment. Each row below maps an AWS service to its Azure equivalent, with recommended use cases and key differences:

Amazon RDS (MySQL/PostgreSQL) → Azure Database for MySQL / PostgreSQL: Fully managed relational DB with built-in HA, scaling, and security; also a fit for building generative AI apps.
Amazon RDS (SQL Server) → Azure SQL Database or Azure SQL Managed Instance: Use Azure SQL Database for modern apps; choose Managed Instance for near 100% compatibility with on-prem SQL Server.
SQL Server on EC2 → SQL Server on Azure VMs: Best for lift-and-shift scenarios requiring full OS-level control.
Amazon RDS (Oracle) → Oracle Database@Azure: Managed Oracle workloads with Azure integration.
Amazon Aurora (PostgreSQL/MySQL) → Azure Database for PostgreSQL (Flexible Server) or Azure Database for MySQL: Similar managed experience for large workloads; also consider Azure HorizonDB (public preview), built on PostgreSQL to compete with Aurora and AlloyDB. Learn more.
Amazon DynamoDB → Azure Cosmos DB (NoSQL API): Global distribution, multi-model support, and guaranteed SLAs for latency and throughput.
Amazon Keyspaces (Cassandra) → Azure Managed Instance for Apache Cassandra: Managed Cassandra with elastic scaling and Azure-native security.
Cassandra on EC2 → Azure Managed Instance for Apache Cassandra: Same as above; ideal for lift-and-shift Cassandra clusters.
Amazon DocumentDB, MongoDB Atlas, or MongoDB on EC2 → Azure DocumentDB: Drop-in compatibility for MongoDB workloads with global replication and vCore-based pricing.
Amazon Redshift → Azure Synapse Analytics: Enterprise analytics with integrated data lake and Power BI connectivity.
Amazon ElastiCache (Redis) → Azure Cache for Redis: Low-latency caching with clustering and persistence options.
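As a concrete starting point for the DynamoDB → Cosmos DB row, here is a minimal PowerShell sketch that provisions a Cosmos DB account with the NoSQL (Core SQL) API and session consistency, then adds a database and a container. All names, regions, and throughput values are placeholders, and the parameters shown are the common ones; double-check the Az.CosmosDB module reference for your target API before using it.

# Assumes the Az.CosmosDB module and a signed-in session (Connect-AzAccount).
# Placeholder names; the first location listed becomes the write region.
$rg      = "rg-marketplace-data"
$account = "cosmos-marketplace-demo"   # must be globally unique, lowercase

New-AzCosmosDBAccount -ResourceGroupName $rg `
    -Name $account `
    -Location "westeurope","northeurope" `
    -ApiKind Sql `
    -DefaultConsistencyLevel Session

# A database and a container partitioned the way your DynamoDB table was keyed.
New-AzCosmosDBSqlDatabase -ResourceGroupName $rg -AccountName $account -Name "appdb"
New-AzCosmosDBSqlContainer -ResourceGroupName $rg -AccountName $account -DatabaseName "appdb" `
    -Name "orders" -PartitionKeyKind Hash -PartitionKeyPath "/customerId" -Throughput 400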
Match your use case
After mapping AWS services to Azure equivalents, the next step is selecting the right service for your workload. Start by considering the data model (relational, document, key-value), then factor in performance, consistency, and global reach.

Building AI apps: Generative AI, vector search, and advanced analytics.
Relational workloads: Use Azure SQL Database, Azure SQL Managed Instance, or Azure Database for MySQL/PostgreSQL for transactional apps; enable zone redundancy for HA. Review schema compatibility, stored procedures, triggers, and extensions. Inventory all databases, tables, indexes, users, and dependencies before migration. Document any required refactoring for Azure.
NoSQL workloads: Choose Azure Cosmos DB for globally distributed apps; select the API (NoSQL, MongoDB, Cassandra) that matches your existing schema. Validate data model mapping and test migration in a sandbox environment to ensure data integrity and application connectivity.
Analytics: For large-scale queries and BI integration, Azure Synapse Analytics offers MPP architecture and tight integration with Azure Data Lake. Inventory all analytics assets, ETL pipelines, and dependencies. Plan for migration using Azure Data Factory or Synapse pipelines. Test performance benchmarks and optimize query plans post-migration.
Caching: Azure Cache for Redis accelerates app performance with in-memory data and clustering. Update application connection strings and drivers to use Azure endpoints. Implement retry logic and connection pooling for reliability. Validate cache warm-up and failover strategies.
Hybrid scenarios: Combine Cosmos DB with Synapse Link (for Synapse as target) or Fabric Mirroring (for Fabric as target) for real-time analytics without ETL overhead. Assess network isolation, security, and compliance requirements. Deploy Private Endpoints and configure RBAC as needed. Document integration points and monitor hybrid data flows.

Factor in security and compliance
Encryption: Confirm default encryption meets compliance requirements; enable customer-managed keys (CMK) if needed. Enable Transparent Data Encryption (TDE) and review encryption for backups and in-transit data.
Access control: Apply Azure RBAC and database-level roles for granular permissions. Audit user roles and permissions regularly to ensure least privilege.
Network isolation: Use Private Endpoints within a virtual network to keep traffic off the public internet. Configure Network Security Groups (NSGs) and firewalls for additional protection.
Identity integration: Prefer Managed Identities for secure access to databases. Integrate with Azure Active Directory for centralized identity management.
Compliance checks: Verify certifications like GDPR, HIPAA, or industry-specific standards. Use Azure Policy and Compliance Manager to automate compliance validation.
Audit logging and threat detection: Enable audit logging and advanced threat detection with Microsoft Defender for all database services. Review logs and alerts regularly.

Optimize for cost
Compute tiers: Choose General Purpose for balanced workloads; Business Critical for low latency and high IOPS. Review workload sizing and adjust tiers as needed for cost efficiency.
Autoscaling: Enable autoscale for Cosmos DB and flexible servers to avoid overprovisioning. Monitor scaling events and set thresholds to control spend.
Reserved capacity: Commit to 1–3 years for predictable workloads to unlock discounts. Evaluate usage patterns before committing to reservations.
Serverless: Use serverless compute for workloads with completely ad hoc usage and low frequency of access. This eliminates the need for pre-provisioned resources and reduces costs for unpredictable workloads.
Monitoring: Use Azure Cost Management and query performance insights to optimize spend. Set up budget alerts and analyze cost trends monthly, and include basic resource monitoring to detect adverse usage patterns early (a budget-alert sketch follows this section).
Storage and backup costs: Review storage costs, backup retention policies, and configure lifecycle management for backups and archives.
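Tying into the monitoring bullet above, the following PowerShell sketch creates a monthly cost budget with an email notification at 80% of the threshold. The amount, dates, and email address are placeholders, and the cmdlet comes from the Az.Billing module, so confirm the parameter set in your installed version before adopting it.

# Assumes the Az.Billing module; the budget is created at the current subscription scope.
# Placeholder values throughout - adjust the amount, dates, and contact email.
$start = (Get-Date -Day 1).Date
New-AzConsumptionBudget -Name "db-layer-monthly-budget" `
    -Amount 500 `
    -Category Cost `
    -TimeGrain Monthly `
    -StartDate $start `
    -EndDate $start.AddYears(1) `
    -ContactEmail "ops@contoso.com" `
    -NotificationKey "Warning80" `
    -NotificationThreshold 80 `
    -NotificationEnabled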
Data migration from AWS to Azure
Migrating your data from AWS to Azure is a key step in replicating your app’s database layer for Azure Marketplace. The goal is a one-time transfer—after migration, your app runs fully on Azure.
Azure Database Migration Service (DMS): Automates migration from RDS, Aurora, or on-premises sources to Azure SQL Database, Azure SQL Managed Instance, Azure Database for MySQL/PostgreSQL, and SQL Server on Azure VMs (for MySQL/PostgreSQL/SQL Server). Supports online and offline migrations; run pre-migration assessments and schema validation.
Azure Data Factory: Orchestrates data movement from DynamoDB, Redshift, or S3 to Azure Cosmos DB or Synapse. Use mapping data flows for transformations and data cleansing.
MongoDB migrations: Use the online migration utility designed for medium to large-scale migrations to Azure DocumentDB. Ensure schema compatibility and validate performance benchmarks before cutover.
Cassandra migrations: Use a Cassandra hybrid cluster or dual-write proxy for Azure Managed Instance for Apache Cassandra. Validate schema compatibility and test migration in a sandbox environment.
Offline transfers: For very large datasets, use Azure Data Box for secure physical migration. Plan logistics and security for device handling.
Migration best practices: Schedule migration during a maintenance window, validate data integrity post-migration, and perform cutover only after successful data validation and verification (a simple row-count check is sketched at the end of this section).

Final readiness before marketplace listing
Validate performance: Benchmark with real data and confirm chosen SKUs deliver required throughput and latency. Test application functionality under expected load and validate query performance for all critical scenarios.
Lock down security: Ensure RBAC roles, Private Endpoints, and encryption meet compliance requirements. Review audit logs, enable threat detection, and verify access controls for all database and storage resources.
Control costs: Verify autoscaling, reserved capacity, and cost alerts are active. Review storage and backup policies, and set up budget alerts for ongoing cost control.
Enable monitoring: Set up dashboards for query performance, latency, and capacity. Configure alerts for failures, anomalies, and capacity thresholds. Monitor with Azure Monitor and Log Analytics for real-time operational insights.
Documentation and support: Update migration runbooks, operational guides, troubleshooting documentation, and escalation contacts for post-migration support.

Key Resources
SaaS Workloads - Microsoft Azure Well-Architected Framework | Microsoft Learn
Metered billing for SaaS offers in Partner Center
Create plans for a SaaS offer in Microsoft Marketplace
Get over $126K USD in benefits and technical consultations to help you replicate and publish your app with ISV Success
Maximize your momentum with step-by-step guidance to publish and grow your app with App Advisor
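As a follow-up to the migration best practices above, here is a rough PowerShell sketch of a post-migration row-count comparison between a SQL Server source on AWS and the Azure target. Server names, credentials, and the table list are placeholders, and Invoke-Sqlcmd (from the SqlServer module) is assumed to be available on the machine running the check; adapt the query for non-SQL Server engines.

# Assumes the SqlServer PowerShell module and network access to both endpoints.
$sourceServer = "source-db.example.aws"             # placeholder: AWS-hosted source (e.g., RDS endpoint)
$targetServer = "target-sql.database.windows.net"   # placeholder: Azure target
$database     = "appdb"
$tables       = @("dbo.Customers", "dbo.Orders")    # tables to spot-check

foreach ($t in $tables) {
    $q   = "SELECT COUNT(*) AS Cnt FROM $t"
    $src = (Invoke-Sqlcmd -ServerInstance $sourceServer -Database $database -Query $q).Cnt
    $dst = (Invoke-Sqlcmd -ServerInstance $targetServer -Database $database -Query $q).Cnt
    if ($src -eq $dst) {
        Write-Output "$t OK ($src rows)"
    } else {
        Write-Warning "$t mismatch: source=$src target=$dst"
    }
}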
How do I add two personal account to microsoft learn portal ?

Hi Community,
How do I add two personal accounts to the Microsoft Learn portal so that I can see all my certifications in one place? I completed my Microsoft certifications using two different personal accounts. When I try to link them, I get the error "You can only have one personal account linked". Can someone advise or point me in the right direction on how to approach Microsoft to resolve this issue?
Regards

Azure Migrate Physical Server Discovery - ServerDiscoveryService.exe Crash Bug
Summary
The Azure Migrate appliance for physical server discovery fails to complete discovery due to a crash bug in ServerDiscoveryService.exe. The service successfully connects to target servers but crashes during WSMan transport cleanup before any discovery data is collected.

Environment
Appliance OS: Windows Server 2022 Standard Evaluation (Build 20348)
Appliance Type: Physical server discovery (script-based installation)
ServerDiscoveryService.exe Version: 2.0.3300.663
.NET Version: 8.0.22 (CoreCLR 8.0.2225.52707)
Target Servers: Windows Server (various) and Linux, all on-premises
Discovery Agent Version: 2.0.03300.663
Appliance Configuration Manager Version: 6.1.294.1847

Symptoms
Target server validation succeeds in the appliance configuration manager
CIM sessions connect successfully (logs show "TestConnection succeeded for CIM Session with HTTP protocol")
Connections are immediately disposed with "Disposing all connections when the process is shutdown"
No discovery data is collected
Azure portal shows error 60001 with a misleading "Could not load file or assembly 'Microsoft.Management.Infrastructure'" message
Discovery status remains "Discovery Incomplete" for all Windows servers

Root Cause
The ServerDiscoveryService.exe process crashes repeatedly with an unhandled NullReferenceException in the WSMan transport finalizer. This is visible in the Windows Application Event Log:

Application: ServerDiscoveryService.exe
CoreCLR Version: 8.0.2225.52707
.NET Version: 8.0.22
Description: The process was terminated due to an unhandled exception.
Exception Info: System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Management.Automation.Remoting.Client.BaseClientTransportManager.CloseAsync()
   at System.Management.Automation.Remoting.Client.WSManClientSessionTransportManager.CloseAsync()
   at System.Management.Automation.Remoting.Client.BaseClientTransportManager.Finalize()

The crash also triggers an access violation:

Faulting application name: ServerDiscoveryService.exe, version: 2.0.3300.663
Exception code: 0xc0000005
Faulting application path: C:\Program Files\Microsoft Azure Server Discovery Service\ServerDiscoveryService.exe

These crashes occur approximately every 10 minutes.

Troubleshooting Completed
Verified manual connectivity works: PowerShell Invoke-Command and New-CimSession both succeed from the appliance to target servers using the same credentials
Verified WinRM configuration: Targets have a WinRM HTTP listener on port 5985, and LocalAccountTokenFilterPolicy is set to 1
Verified assemblies exist: Microsoft.Management.Infrastructure.dll is present in the GAC on both the appliance and target servers
Tested both FQDNs and IP addresses: Same failure occurs with both
Tested both local and domain credentials: Same failure with properly formatted credentials (domain\user)
Verified time synchronization: Appliance clock is accurate
Verified appliance is up to date: All components show current versions
Tested with fresh appliance: Previously tried the OVA-based appliance with similar results; rebuilt using Microsoft's PowerShell script installer on clean Server 2022—same issue

Relevant Log Locations
C:\ProgramData\Microsoft Azure\Logs\ConfigManager\ClientOperations_*.log - Shows successful CIM connections followed by immediate disposal
C:\ProgramData\Microsoft Azure\Logs\ConfigManager\ApplianceOnboarding-Portal-*.log - Shows error 60000 "UnhandledException" with message "Internal error occured."
(note: typo is in original)
Windows Event Log (Application) - Contains the actual crash stack traces

Conclusion
This is a code defect in ServerDiscoveryService.exe—a null reference exception in a finalizer is a programming error that cannot be caused by configuration or environmental factors. The service connects successfully but crashes before completing its work.

Request
Please escalate to the Azure Migrate engineering team for a bug fix in ServerDiscoveryService.exe version 2.0.3300.663.
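For anyone trying to confirm they are hitting the same failure signature, a quick PowerShell check of the appliance's Application event log is sketched below. The provider and process names match the entries quoted above, but treat the filter values as assumptions and widen them if nothing is returned.

# Run on the Azure Migrate appliance. Pulls recent .NET Runtime and Application Error
# events and filters for the ServerDiscoveryService.exe crash signature.
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; StartTime = (Get-Date).AddHours(-24) } |
    Where-Object {
        $_.ProviderName -in @('.NET Runtime', 'Application Error') -and
        $_.Message -match 'ServerDiscoveryService'
    } |
    Select-Object TimeCreated, ProviderName, Id |
    Format-Table -AutoSize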
Automating Microsoft Sentinel: A blog series on enabling Smart Security

This entry guides readers through building custom Playbooks in Microsoft Sentinel, highlighting best practices for trigger selection, managed identities, and integrating built-in tools and external APIs. It offers practical steps and insights to help security teams automate incident response and streamline operations within Sentinel.

Understand New Sentinel Pricing Model with Sentinel Data Lake Tier
Introduction to Sentinel and its New Pricing Model
Microsoft Sentinel is a cloud-native Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platform that collects, analyzes, and correlates security data from across your environment to detect threats and automate response. Traditionally, Sentinel stored all ingested data in the Analytics tier (Log Analytics workspace), which is powerful but expensive for high-volume logs. To reduce cost and enable customers to retain all security data without compromise, Microsoft introduced a new dual-tier pricing model consisting of the Analytics tier and the Data Lake tier. The Analytics tier continues to support fast, real-time querying and analytics for core security scenarios, while the new Data Lake tier provides very low-cost storage for long-term retention and high-volume datasets. Customers can now choose where each data type lands—analytics for high-value detections and investigations, and data lake for large or archival types—allowing organizations to significantly lower cost while still retaining all their security data for analytics, compliance, and hunting.

The flow diagram below depicts the new Sentinel pricing model.

Now let's walk through this new pricing model with three scenarios:
Scenario 1A (Pay-As-You-Go)
Scenario 1B (Usage Commitment)
Scenario 2 (Data Lake Tier Only)

Scenario 1A (Pay-As-You-Go)
Requirement
Suppose you need to ingest 10 GB of data per day, and you must retain that data for 2 years. However, you will only frequently use, query, and analyze the data for the first 6 months.

Solution
To optimize cost, you can ingest the data into the Analytics tier and retain it there for the first 6 months, where active querying and investigation happen. The remaining 18 months of retention can then be shifted to the Data Lake tier, which provides low-cost storage for compliance and auditing needs. Note that you will be charged separately for Data Lake tier querying and analytics, which is depicted as Compute (D) in the pricing flow diagram.

Pricing Flow / Notes
The first 10 GB/day ingested into the Analytics tier is free for 31 days under the Analytics logs plan.
All data ingested into the Analytics tier is automatically mirrored to the Data Lake tier at no additional ingestion or retention cost.
For the first 6 months, you pay only for Analytics tier ingestion and retention, excluding any free capacity.
For the next 18 months, you pay only for Data Lake tier retention, which is significantly cheaper.

Azure Pricing Calculator Equivalent
Assuming no data is queried or analyzed during the 18-month Data Lake tier retention period: although the Analytics tier retention is set to 6 months, the first 3 months of retention fall under the free retention limit, so retention charges apply only for the remaining 3 months of the analytics retention window. The Azure pricing calculator adjusts accordingly.

Scenario 1B (Usage Commitment)
Now, suppose you are ingesting 100 GB per day. If you follow the same pay-as-you-go pricing model described above, your estimated cost would be approximately $15,204 per month. However, you can reduce this cost by choosing a Commitment Tier, where Analytics tier ingestion is billed at a discounted rate. Note that the discount applies only to Analytics tier ingestion—it does not apply to Analytics tier retention costs or to any Data Lake tier–related charges. Please refer to the pricing flow and the equivalent pricing calculator results shown below.
Monthly cost savings: $15,204 – $11,184 = $4,020 per month.

Now the question is: what happens if your usage reaches 150 GB per day? Will the additional 50 GB be billed at the pay-as-you-go rate? No. The entire 150 GB/day will still be billed at the discounted rate associated with the 100 GB/day commitment tier bucket.

Azure Pricing Calculator Equivalent (100 GB/day)
Azure Pricing Calculator Equivalent (150 GB/day)

Scenario 2 (Data Lake Tier Only)
Requirement
Suppose you need to store certain audit or compliance logs amounting to 10 GB per day. These logs are not used for querying, analytics, or investigations on a regular basis, but must be retained for 2 years as per your organization's compliance or forensic policies.

Solution
Since these logs are not actively analyzed, you should avoid ingesting them into the Analytics tier, which is more expensive and optimized for active querying. Instead, send them directly to the Data Lake tier, where they can be retained cost-effectively for future audit, compliance, or forensic needs.

Pricing Flow
Because the data is ingested directly into the Data Lake tier, you pay both ingestion and retention costs there for the entire 2-year period. If, at any point in the future, you need to perform advanced analytics, querying, or search, you will incur additional compute charges based on actual usage. Even with occasional compute charges, the cost remains significantly lower than storing the same data in the Analytics tier.

Realized Savings
Scenario 1 (10 GB/day in the Analytics tier): $1,520.40 per month
Scenario 2 (10 GB/day directly into the Data Lake tier): $202.20 per month without compute; $257.20 per month with a sample compute price

Savings with no compute activity: $1,520.40 – $202.20 = $1,318.20 per month
Savings with some compute activity (sample value): $1,520.40 – $257.20 = $1,263.20 per month

Azure calculator equivalent without compute
Azure calculator equivalent with sample compute

Conclusion
The combination of the Analytics tier and the Data Lake tier in Microsoft Sentinel enables organizations to optimize cost based on how their security data is used. High-value logs that require frequent querying, real-time analytics, and investigation can be stored in the Analytics tier, which provides powerful search performance and built-in detection capabilities. At the same time, large-volume or infrequently accessed logs—such as audit, compliance, or long-term retention data—can be directed to the Data Lake tier, which offers dramatically lower storage and ingestion costs. Because all Analytics tier data is automatically mirrored to the Data Lake tier at no extra cost, customers can use the Analytics tier only for the period they actively query data, and rely on the Data Lake tier for the remaining retention. This tiered model allows different scenarios—active investigation, archival storage, compliance retention, or large-scale telemetry ingestion—to be handled at the most cost-effective layer, ultimately delivering substantial savings without sacrificing visibility, retention, or future analytical capabilities.
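To make the tier comparison easy to replay with your own numbers, here is a small PowerShell sketch that estimates monthly cost for the two scenarios above. The per-GB rates are placeholders, not published prices; plug in the current list prices for your region from the Azure pricing calculator before drawing conclusions.

# Placeholder unit prices in USD per GB - replace with current list prices for your region.
$analyticsIngestPerGB      = 0.0
$analyticsRetainPerGBMonth = 0.0
$dataLakeIngestPerGB       = 0.0
$dataLakeRetainPerGBMonth  = 0.0

$gbPerDay     = 10
$daysPerMonth = 30.44   # average month length, as used by the pricing calculator
$gbPerMonth   = $gbPerDay * $daysPerMonth

# Scenario 1: ingest and retain the month's data in the Analytics tier.
$analyticsMonthly = ($gbPerMonth * $analyticsIngestPerGB) + ($gbPerMonth * $analyticsRetainPerGBMonth)

# Scenario 2: the same data goes straight to the Data Lake tier.
$dataLakeMonthly = ($gbPerMonth * $dataLakeIngestPerGB) + ($gbPerMonth * $dataLakeRetainPerGBMonth)

"Analytics tier: {0:N2}  Data Lake tier: {1:N2}  Monthly savings: {2:N2}" -f `
    $analyticsMonthly, $dataLakeMonthly, ($analyticsMonthly - $dataLakeMonthly)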
Run a SQL Query with Azure Arc

Hi All,
In this article, you can find a way to retrieve database permissions from all your onboarded databases through Azure Arc. This idea was born from a customer request around maintaining a standard permission set in a very wide environment (about 1,000 SQL Servers).

This solution is based on Azure Arc, so first you need to onboard your SQL Servers to Azure Arc and enable the SQL Server extension. If you want to try Azure Arc in a test environment, you can use Azure Jumpstart; in that repo you will find ready-to-deploy ARM templates that deploy demo environments. The other solution components are an Automation account, a Log Analytics workspace, and a Data Collection Rule / Endpoint. Here is a little recap of the purpose of each component:

Automation account: with this resource you can run and schedule a PowerShell script, and you can also store the credentials securely
Log Analytics workspace: here you will create a custom table and store all the data that comes from the script
Data Collection Endpoint / Data Collection Rule: give you a public ingestion endpoint and the routing rule that let you land the collected data in the Log Analytics workspace

In this section you will discover how I composed the six phases of the script:

1. Obtain the bearer token and authenticate: first of all you need to authenticate against Azure to enumerate the SQL instances and to get the token used to send your assessment data to Log Analytics.

# Service principal credentials are stored securely in the Automation account.
$tenantId = "XXXXXXXXXXXXXXXXXXXXXXXXXXX"
$cred = Get-AutomationPSCredential -Name 'appreg'
Connect-AzAccount -ServicePrincipal -Tenant $tenantId -Credential $cred
$appId = $cred.UserName
$appSecret = $cred.GetNetworkCredential().Password
$endpoint_uri = "https://sampleazuremonitorworkspace-weu-a5x6.westeurope-1.ingest.monitor.azure.com" # Logs ingestion URI for the DCR
$dcrImmutableId = "dcr-sample2b9f0b27caf54b73bdbd8fa15908238799" # the immutableId property of the DCR object
$streamName = "Custom-MyTable"
# Request a token for the Logs Ingestion API using the client credentials flow.
$scope = [System.Web.HttpUtility]::UrlEncode("https://monitor.azure.com//.default")
$body = "client_id=$appId&scope=$scope&client_secret=$appSecret&grant_type=client_credentials"
$headers = @{"Content-Type"="application/x-www-form-urlencoded"}
$uri = "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token"
$bearerToken = (Invoke-RestMethod -Uri $uri -Method "Post" -Body $body -Headers $headers).access_token

2. Get all the SQL instances: in my example I took all the instances, but you can also use a tag to filter some resources. For example, if you want to assess only the production environment, you can use the tag as a filter.

$servers = Get-AzResource -ResourceType "Microsoft.AzureArcData/SQLServerInstances"

3. When you have all the SQL instances, you can run your T-SQL query to obtain all the permissions. Remember, here we are looking for permissions, but you can use this pattern for any query you want, or in any other situation where you need to run a command on a generic server.

$SQLCmd = @'
Invoke-SQLcmd -ServerInstance .
-Query "USE master; BEGIN IF LEFT(CAST(Serverproperty('ProductVersion') AS VARCHAR(1)),1) = '8' begin IF EXISTS (SELECT TOP 1 * FROM tempdb.dbo.sysobjects (nolock) WHERE name LIKE '#TUser%') begin DROP TABLE #TUser end end ELSE begin IF EXISTS (SELECT TOP 1 * FROM tempdb.sys.objects (nolock) WHERE name LIKE '#TUser%') begin DROP TABLE #TUser end end CREATE TABLE #TUser (DBName SYSNAME,[Name] SYSNAME,GroupName SYSNAME NULL,LoginName SYSNAME NULL,default_database_name SYSNAME NULL,default_schema_name VARCHAR(256) NULL,Principal_id INT); IF LEFT(CAST(Serverproperty('ProductVersion') AS VARCHAR(1)),1) = '8' INSERT INTO #TUser EXEC sp_MSForEachdb ' SELECT ''?'' as DBName, u.name As UserName, CASE WHEN (r.uid IS NULL) THEN ''public'' ELSE r.name END AS GroupName, l.name AS LoginName, NULL AS Default_db_Name, NULL as default_Schema_name, u.uid FROM [?].dbo.sysUsers u LEFT JOIN ([?].dbo.sysMembers m JOIN [?].dbo.sysUsers r ON m.groupuid = r.uid) ON m.memberuid = u.uid LEFT JOIN dbo.sysLogins l ON u.sid = l.sid WHERE (u.islogin = 1 OR u.isntname = 1 OR u.isntgroup = 1) and u.name not in (''public'',''dbo'',''guest'') ORDER BY u.name ' ELSE INSERT INTO #TUser EXEC sp_MSforeachdb ' SELECT ''?'', u.name, CASE WHEN (r.principal_id IS NULL) THEN ''public'' ELSE r.name END GroupName, l.name LoginName, l.default_database_name, u.default_schema_name, u.principal_id FROM [?].sys.database_principals u LEFT JOIN ([?].sys.database_role_members m JOIN [?].sys.database_principals r ON m.role_principal_id = r.principal_id) ON m.member_principal_id = u.principal_id LEFT JOIN [?].sys.server_principals l ON u.sid = l.sid WHERE u.TYPE <> ''R'' and u.TYPE <> ''S'' and u.name not in (''public'',''dbo'',''guest'') order by u.name '; SELECT DBName, Name, GroupName,LoginName FROM #TUser where Name not in ('information_schema') and GroupName not in ('public') ORDER BY DBName,[Name],GroupName; DROP TABLE #TUser; END" '@ $command = New-AzConnectedMachineRunCommand -ResourceGroupName "test_query" -MachineName $server1 -Location "westeurope" -RunCommandName "RunCommandName" -SourceScript $SQLCmd In a second, you will receive the output of the command, and you must send it to the log analytics workspace (aka LAW). In this phase, you can also review the output before sending it to LAW, for example, removing some text or filtering some results. In my case, I’m adding the information about the server where the script runs to each record. $array = ($command.InstanceViewOutput -split "r?n" | Where-Object { $.Trim() }) | ForEach-Object { $line = $ -replace '\', '\\' ù$array = $array | Where-Object { $_ -notmatch "DBName,Name,GroupName,LoginName" } | Where-Object {$_ -notmatch "------"} The last phase is designed to send the output to the log analytics workspace using the dce \ dcr. $staticData = @" [{ "TimeGenerated": "$currentTime", "RawData": "$raw", }]"@; $body = $staticData; $headers = @{"Authorization"="Bearer $bearerToken";"Content-Type"="application/json"}; $uri = "$endpoint_uri/dataCollectionRules/$dcrImmutableId/streams/$($streamName)?api-version=2023-01-01" $rest = Invoke-RestMethod -Uri $uri -Method "Post" -Body $body -Headers $headers When the data arrives in log analytics workspace, you can query this data, and you can create a dashboard or why not an alert. Now you will see how you can implement this solution. 
Now let's see how to implement this solution.

For the Log Analytics workspace, DCE, and DCR, you can follow the official docs: Tutorial: Send data to Azure Monitor Logs with Logs ingestion API (Resource Manager templates) - Azure Monitor | Microsoft Learn. After you create the DCR and the Log Analytics workspace with its custom table, you can proceed with the Automation account.

Create an Automation account using the creation wizard; you can proceed with the default parameters. When the Automation account creation is completed, create a credential in the Automation account. This lets you avoid exposing the credential used to connect to Azure: insert the enterprise application ID and its key here.

Now you are ready to create the runbook (basically the script that we will schedule). Give it any name you want and click Create. Then go to the Automation account, open Runbooks, and choose Edit in Portal; here you can paste your own script or the script in this link. Remember to replace the tenant ID (you will find it in the Entra ID section) and the enterprise application values. You can test the runbook using the Test Pane, and when you are ready you can Publish it and link a schedule, for example daily at 5 AM.

Remember, today we talked about database permissions, but the scenarios are endless: checking a requirement, deploying a small fix, or removing/adding a configuration — at scale. In the end, as you can see, Azure Arc is not just another agent; it is a chance to empower every environment (and every other cloud provider 😉) with Azure technology. See you in the next techie adventure.

**Disclaimer**
The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.