microsoft fabric
92 TopicsApproaches to Integrating Azure Databricks with Microsoft Fabric: The Better Together Story!
Azure Databricks and Microsoft Fabric can be combined to create a unified and scalable analytics ecosystem. This document outlines eight distinct integration approaches, each accompanied by step-by-step implementation guidance and key design considerations. These methods are not prescriptive—your cloud architecture team can choose the integration strategy that best aligns with your organization’s governance model, workload requirements and platform preferences. Whether you prioritize centralized orchestration, direct data access, or seamless reporting, the flexibility of these options allows you to tailor the solution to your specific needs.5.7KViews9likes1CommentTableau to Power BI Migration: Semantic Layer-First Approach for Cloud Architects
Author's: Mahjabin Ahmed, Yassine El Ouardi, Lavanya Sreedhar LavanyaSreedhar, Peter Lo PeterLo, Aryan Anmol aryananmol, Shreya Harvu shreyaharvu and Rafia Aqil Rafia_Aqil In this guide, we provide practical guidance for migrating from Tableau to Power BI, with a focus on technical best practices and architecture. Unifying business intelligence on the Microsoft Fabric platform, enterprises gain closer integration with Microsoft 365 (Teams, Copilot, Excel). For cloud solution architects and BI developers, a successful migration is not just about rebuilding dashboards in a new tool. It requires thoughtful architectural planning and a shift to a more model-centric approach to BI. Why Semantic Layer-First Architecture Matters The Traditional Migration Challenge Most Tableau to Power BI migrations follow a dashboard-centric approach: teams attempt to replicate existing Tableau workbooks, calculated fields, and LOD (Level of Detail) expressions directly into Power BI reports. While this may seem efficient initially, it creates significant downstream challenges: Duplicated logic: Each report embeds its own calculations and business rules, leading to conflicting KPIs across the organization Maintenance overhead: Changes to business logic require updating dozens or hundreds of individual reports Governance gaps: Without centralized definitions, semantic drift occurs—different teams calculate "Revenue" or "Active Customer" differently Scalability issues: As data volumes grow, report-level transformations become performance bottlenecks The Semantic Layer-First Alternative Microsoft's recommended approach centers on semantic models (formerly called datasets)—centralized, governed data models that separate business logic from visualization. In this architecture: The payoff is substantial: when data evolves or business rules change, you update the semantic model once, and all dependent reports automatically reflect the changes—no manual redesign required. Understanding Migration Complexity: Simple to Very Complex Dashboards Not all Tableau dashboards are created equal. The migration strategy should align with dashboard complexity, and the semantic layer approach becomes increasingly valuable as complexity grows. Follow a Step-by-Step Migration Strategy Migrating from Tableau to Power BI is not a one-click effort – it requires a mix of automated and manual refactoring, plus a sound change management plan. Below are key strategies and best practices for a successful migration: Audit your Tableau estate: Start by taking inventory of all existing Tableau workbooks, data sources, and dashboards. Determine what needs to be migrated (focus on high-value, widely used reports first) and identify any redundant or obsolete content that can be retired rather than converted. Conduct a proof-of-concept (PoC): Before migrating everything, pick a representative complex dashboard (or a subset of your data) and perform a pilot migration. This will help you validate that Power BI can connect to your data (e.g. setting up the Power BI gateways for on-premises sources), test performance (Import vs DirectQuery modes), and experiment with replicating key visuals or calculations. Use the PoC to uncover any surprises early – for example, test that any Level of Detail expressions or table calculations in Tableau can be re-created in DAX. The lessons learned here should inform your overall project plan. Use a phased migration approach: Plan to run Tableau and Power BI in parallel for some period, rather than switching everything at once. Migrate in waves – for example, by business unit or subject area – and incorporate user feedback as you go. This phased approach reduces risk and allows your team to improve the process with each iteration. It also gives end users time to adjust gradually. Migrate high-impact dashboards first: Prioritize the migration of key reports and dashboards that are critical to the business or have the most usage. Delivering these early wins will not only surface any technical challenges to solve but will also help demonstrate the value of Power BI’s capabilities to stakeholders. Early success builds buy-in and momentum for the rest of the migration. Reimagine (don’t just replicate) the experience: It’s rarely possible – or desirable – to exactly re-create every Tableau visualization pixel-for-pixel in Power BI. Embrace the opportunity to focus on business questions and improve user experience with Power BI’s features. For example, rather than replicating a complex Tableau workaround, you might implement a cleaner solution in Power BI using native features (like bookmarks, drilldowns, or simpler navigation between pages). Engage business users and subject matter experts during this redesign to ensure the new reports meet their needs. Enable dataset reusability: One major benefit of the Power BI approach is the ability to create shared datasets and dataflows. As you migrate, look for opportunities to create central semantic models (datasets) that can serve multiple reports. For instance, if several Tableau workbooks are all using similar data about sales, you can create one central Sales dataset in Power BI. Report creators across the organization can then build different Power BI reports on that single dataset without duplicating data or logic. This reduces maintenance and promotes a “build once, reuse often” strategy. Provide training and support: Expect a learning curve for teams moving to Power BI – especially those who are very fluent in Tableau. Plan for user upskilling and training programs. Establish a support community or office hours where new users can ask questions and get help. If possible, identify Power BI champions or recruit a Power BI Center of Excellence (COE) team who can guide others. During the transition, ensure there are subject matter experts (SMEs) available to address questions and validate that the new reports are correct. Manage change and expectations: It’s important to communicate why the organization is moving to Power BI (e.g. benefits like deeper integration, lower TCO, better governance) to get buy-in from end users. Some users may be resistant to change, especially if they’ve invested a lot of time in mastering Tableau. Prepare to handle varying responses – emphasize the personal benefits (like improved performance, new capabilities, or career growth with popular skills) to encourage adoption. Also, involve influential business users early and gather their feedback, so they feel ownership in the new solution. Establish governance from Day 1: Don’t wait until after migration to think about governance. Use this chance to set up Power BI governance aligned to best practices. Decide on important aspects such as workspace naming conventions, who can create or publish content, how you’ll monitor usage and costs, and how to manage data access and security (for example, designing a strategy for RLS/OLS/CLS, and deciding when to use per-user datasets vs. organizational semantic models). Good governance will ensure your shiny new Power BI environment doesn’t sprawl into chaos over time. Allow time for adjustment and iteration: Finally, be patient and iterative. Depending on the scale of your organization and the number of Tableau assets, a full migration can take months or even a year or more. Plan realistic transition periods where both systems might coexist. Continuously refine your approach with each wave of migration. Power BI’s frequent update cadence (monthly releases) means new features may emerge even during your project – stay updated, as new capabilities could simplify your migration (for example, the introduction of field parameters or Copilot might let you modernize certain Tableau features more easily). Reimagine (don’t just replicate) the experience (Step 5): Phase 1: Assessment and Planning 1. Audit Your Tableau Estate Inventory all workbooks, data sources, and calculated fields Identify high-traffic dashboards (prioritize for early migration) Categorize by complexity (Simple/Medium/Complex/Very Complex) 2. Design Your Semantic Architecture Map Tableau data sources to Power BI data sources (DirectQuery, Import, or Direct Lake) Plan star schema for fact/dimension tables Identify shared calculations that should live in semantic models vs. report-specific logic 3. Choose Storage Modes Source Type Recommended Mode Rationale Databricks Delta Lake Direct Lake Real-time analytics, no refresh lag Azure SQL Database DirectQuery or Import Based on data volume and refresh SLAs On-Premises SQL Server Import (via Gateway) Network latency considerations Excel/CSV files Import Small reference data Phase 2: Build the Semantic Layer 1. Create Star Schema Data Models Tableau often relies on flat, denormalized datasets. Power BI performs best with star schemas: Fact tables: Transactional data (sales, orders, events) with foreign keys to dimensions Dimension tables: Descriptive attributes (customers, products, dates) with primary keys Relationships: One-to-many from dimension to fact, leveraging bidirectional filtering sparingly 2. Migrate Calculations to DAX Measures Convert Tableau calculated fields to DAX measures in the semantic model: --Example of DAX: -- Define as measure: Total Revenue = SUMX( 'Sales', 'Sales'[Quantity] * 'Sales'[Unit Price] ) 2.1 Use Copilot to Accelerate DAX Development Leverage Copilot in Power BI Desktop to generate and validate DAX: Describe the calculation in natural language Copilot suggests DAX syntax Review, test, and refine 2.2 Document your Semantic Model Invest in creating an AI-ready foundation for your semantic model. AI systems need to understand unique business contexts in order to prioritize correct information to provide consistent and reliable responses to your end users. Name Tables and Columns Clearly: Avoid ambiguity in your semantic model. Use human-readable, business-friendly names. Avoid abbreviations, acronyms, or technical terms. This improves Copilot’s ability to interpret user intent. Create Meaningful Measures: Define reusable DAX measures for key business metrics (e.g., Revenue, Profit Margin). AI features rely on these to generate insights and summaries. Document Semantic Model objects: Add descriptions and synonyms to your Tables, Columns and measures. This enhances natural language querying and improves Copilot’s contextual understanding. Build an AI Data Schema: prepare your semantic model for AI by utilizing tooling features such as Prep data for AI. Phase 3: Understanding Migration Complexity: Simple to Very Complex Dashboards Not all Tableau dashboards are created equal. The migration strategy should align with dashboard complexity, and the semantic layer approach becomes increasingly valuable as complexity grows. 1. Dashboard Conversion Best Practices Think in "pages" not "sheets": Power BI reports combine multiple visuals per page; group related visuals logically Use slicers for interactivity: Replace Tableau filters with Power BI slicers and filter pane Leverage bookmarks for navigation: Create dynamic report experiences with show/hide containers Simple Complexity Level Category Tableau Feature Power BI Equivalent Microsoft Fabric Enhancements Best Practice Notes Data Model Single custom SQL Power Query for data shaping and ETL. OneLake Shortcuts for unified data access. Use star schema for optimized performance; push logic into the semantic layer rather than visuals. Calculations Basic IF/ELSE, SUM Data Analysis Expressions (DAX) for measures and calculated columns. Copilot for Power BI to assist with DAX creation. Fabric IQ for natural language queries. Centralize calculations in semantic models for consistency and governance. Medium Complexity Level Category Tableau Feature Power BI Equivalent Fabric Enhancements Best Practice Notes Data Model Multiple custom SQL (up to 3) Connect live to databases (Azure Databricks): DirectQuery in Power BI Connect with cloud data sources: Power BI data sources OneLake Shortcuts for unified access without databricks compute cost. Semantic Models can combine multiple sources. Optimize with star schema; Prefer OneLake Shortcuts for performance; avoid heavy transformations in visuals. Calculations Nested IFs, CASE Data Analysis Expressions (DAX) for measures and calculated columns. Copilot for Power BI to assist with DAX creation. Fabric Data Agent for conversational BI. Fabric IQ for natural language queries: Fabric IQ Centralize logic in semantic models; use Copilot for automation and validation; keep calculations reusable. Reporting Tooltip format in Bar and Map visuals Select All/Clear option for Single Select dropdown Standard tooltips offer help tooltips, text, and background formatting. Dynamic tooltip will be able to create the Tooltip page and reuse it in multiple visuals The customization is so much better than the OOB tooltips Create report tooltip pages in Power BI - Power BI | Microsoft Learn Use Clear All Slicers Button. Disable Single Select, Add Clear All Slicers button, Customize the Button and Use the Button Complex Complexity Level Category Tableau Feature Power BI Equivalent Fabric Enhancements Best Practice Notes Data Model Multiple sources Create relationship using more than one column Composite Models in Power BI (DirectQuery + Import) for combining multiple sources, also connect to various cloud services. Dataflows for pre-processing. Power BI allows a relationship between 2 tables based on only one active column. OneLake Shortcuts for unified access without Azure Databricks compute cost; Microsoft Fabric Dataflows Gen2 offers multiple ways to ingest, transform, and load data efficiently. Consolidate sources into semantic models; use Direct Lake for performance; Plan and design data model to comply with star schema supported by Power BI Relationship DAX USERELATIONSHIP DAX for activating relationships in Power BI for a specific calculation Calculations LOD, window functions Data Analysis Expressions (DAX) for measures and calculated columns. Copilot to assist with complex DAX. Fabric IQ Ontology for semantic alignment. Change how visuals interact in a Power BI report. Centralize calculations in semantic layer; use variables in DAX for readability and performance. Fabric Data Agent for a conversational BI. Very Complex Complexity Level Category Tableau Feature Power BI Equivalent Fabric Enhancements Best Practice Notes Data Model Multi-source, Excel, SQL Composite Models in Power BI (DirectQuery + Import) for combining multiple sources, also connect to various cloud services. Dataflows for pre-processing. OneLake Shortcuts for unified access; Connector overview build-in support. Mirroring for real-time sync. Combine multiple sources into well-structured semantic models for consistency and optimized performance. Calculations Predictive logic Data Analysis Expressions (DAX) for measures and calculated columns. Fabric AutoML, ML models, AI Insights, Python/R, Notebook‑based ML (Spark/Scikit‑Learn), Fabric AI Functions, Fabric IQ Ontology Fabric Data Agent for a conversational BI. Centralize logic in semantic models; leverage Copilot for automation and parameter-driven workflows. Prepare for Copilot. 2. Tableau Feature Equivalents Tableau Feature Power BI Equivalent Microsoft Learn Link Calculated Fields DAX Measures DAX Documentation Parameters Field Parameters / Bookmarks Use report readers to change visuals Actions Drillthrough / Bookmarks Drillthrough Tableau Prep Power Query / Dataflows Differences between Dataflow Gen1 and Dataflow Gen2 Tableau Server Power BI Service What is Power BI? Overview of Components and Benefits Phase 4: Governance and Deployment Workspace Planning (Dev / Test / Prod Separation) A proper workspace strategy is essential for governed deployments in Fabric and Power BI. Fabric supports separate Development, Test, and Production stages using Deployment Pipelines, enabling controlled promotions of semantic models, reports, dataflows, notebooks, lakehouses, and other items. You can assign each workspace to a pipeline stage (Dev → Test → Prod) to ensure safe lifecycle management. Sensitivity Labeling (Microsoft Purview Information Protection) Sensitivity labels allow governed classification and protection of data across Fabric items. Sensitivity labels can be applied directly to Fabric items (semantic models, reports, dataflows, etc.) through the item's header flyout or the item settings. Labels from Microsoft Purview Information Protection enforce data access rules and help organizations meet compliance requirements. Endorsement & Certification (Promoted, Certified, Master Data) Endorsement improves discoverability and trust in shared organizational content. Promoted: Item creators mark content as recommended for broader use. Certified: Administrators or authorized reviewers validate content meets organizational quality standards. Master Data: Indicates authoritative single‑source‑of‑truth items such as semantic models or lakehouses. All Fabric items except dashboards can be promoted or certified; data‑containing items can be designated as Master Data. Monitoring & Capacity Planning Determine the appropriate size for fabric capacity when migrating from Tableau to PowerBI. The Fabric SKU Estimator can generate a SKU recommendation (estimate) for your capacity requirements. Ensuring performance and cost efficiency requires ongoing monitoring of your Fabric capacity. Microsoft recommends evaluating workloads using Fabric Capacity Metrics and planning SKU sizes based on real usage. Fabric uses bursting and smoothing to handle spikes while enforcing capacity limits. Monitoring helps identify high compute usage, background refreshes, and interactive workloads to optimize performance. Fabric Data Source Connections (OneLake+ Manage Connections) Microsoft Fabric is designed as an end‑to‑end analytics platform that integrates data from many different source systems into a unified environment powered by OneLake, Data Factory, Real‑Time Analytics, Dataflows , Lakehouses, Warehouses, and Mirrored Databases. The Strategic Advantage: Semantic Layer + Fabric IQ The semantic layer-first approach sets the foundation for the next evolution in enterprise analytics. Fabric IQ (announced at Ignite 2025) is Microsoft's semantic intelligence platform that auto-elevates semantic models into ontologies—structured knowledge graphs that power AI agents, Copilot experiences, and cross-domain data reasoning. What this means for your migration: Semantic models you build today become the foundation for AI-driven analytics tomorrow Data Agents can reason across multiple semantic models, answering questions that span domains Business users transition from "report consumers" to "data explorers" via natural language interfaces Conclusion: Build for the Future, Not Just for Today Migrating from Tableau to Power BI is more than a technology swap—it's an opportunity to re-architect your analytics strategy for the cloud-native, AI-powered era. The semantic layer-first approach requires upfront investment in data modeling, DAX expertise, and Fabric platform adoption. But the payoff is transformative: Consistency: Single source of truth for all business metrics Scalability: Semantic models that serve hundreds of reports and thousands of users Agility: Changes to business logic propagate instantly across the enterprise Future-readiness: Foundation for Fabric IQ, Data Agents, and AI-driven insights Start your migration with the end in mind: not just convert dashboards, but a modern, governed, AI-ready analytics platform that scales with your business. Addressing Key Migration Concerns (1) Why a semantic‑layered model approach is better than recreating Tableau dashboards A semantic‑layered modeling approach is the optimal strategy for migration and is significantly more effective than attempting to recreate Tableau dashboards exactly as they exist. By contrast, Power BI and Fabric encourage a semantic model–first architecture, where all business rules, relationships, calculations, and transformations are centralized in a governed model that serves many dashboards. The approach not only provides consistency and reuse across the enterprise but also ensures that report authors build on a single certified version of the truth. (2) How semantic-layered model approach reduces the constant redesign caused by changing data needs. A semantic‑layered modeling approach directly addresses concern about constant changes and frequent redesigns of dashboards when data evolves. With a semantic layer, changes are absorbed in the model layer—so the logic is updated once and flows automatically into all dependent reports. Combined with Fabric features like OneLake shortcuts, Direct Lake mode, and centralized governance, the semantic layer drastically reduces breakage, minimizes rework, and ensures scalability as data continues to grow and shift. Additional Resources Direct Lake in Microsoft Fabric Create Fabric Data Agents OneLake Shortcuts Write DAX queries with Copilot - DAX Prepare Your Data for AI - Power BI | Microsoft Learn3.3KViews4likes2CommentsOperations Context for AI | Ontology in Fabric IQ
Generate a full business ontology from an existing Power BI semantic model, map entities and relationships, and embed real-time operational signals alongside business rules that live inside the ontology with the data and its meaning. From there, trace cascading operational impacts across your business through the relationship graph, stand up Operations Agents in natural language with Teams-based actions, and connect the same ontology as a knowledge source in Copilot Studio or Azure AI Foundry. Chafia Aouissi, Fabric IQ Principal PM Manager, shares how to model your business operations, embed intelligence in your data, and deploy agents that act on it. Logic lives with your data. Embed business rules directly in the Fabric IQ ontology. Define thresholds, trigger notifications, and cascade decisions through connected entities. Watch the demo. Surface cascading impacts with a single query. Filter the Fabric IQ relationship graph by any entity and trace downstream operational impact across your entire business. Check out a built-in ontology graph in Fabric IQ. Build a Fabric IQ Operations agent in natural language. Define monitoring goals, connect your ontology as its knowledge base, configure Teams actions, & deploy. See the full build. QUICK LINKS: 00:00 — Unify models & data with Fabric IQ 01:12 — Generate an ontology 02:27 — Bring in Power BI reports 03:08 — View across multiple data sources 04:23 — Define rules 05:18 — Built-in ontology graph 06:03 — Fabric IQ agents 08:24 — Fabric IQ as Knowledge Source 08:54 — Wrap up Link References Get started at https://aka.ms/FabricIQ Unfamiliar with Microsoft Mechanics? As Microsoft’s official video series for IT, you can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft. Subscribe to our YouTube: https://www.youtube.com/c/MicrosoftMechanicsSeries Talk with other IT Pros, join us on the Microsoft Tech Community: https://techcommunity.microsoft.com/t5/microsoft-mechanics-blog/bg-p/MicrosoftMechanicsBlog Watch or listen from anywhere, subscribe to our podcast: https://microsoftmechanics.libsyn.com/podcast Keep getting this insider knowledge, join us on social: Follow us on Twitter: https://twitter.com/MSFTMechanics Share knowledge on LinkedIn: https://www.linkedin.com/company/microsoft-mechanics/ Enjoy us on Instagram: https://www.instagram.com/msftmechanics/ Loosen up with us on TikTok: https://www.tiktok.com/@msftmechanics Video Transcript: -The agents you build and use need the right operational context of how your business runs to deliver the best outcomes. Today, that context is often fragmented across systems, defined differently by different teams, or buried in dashboards and logic, making outcomes inconsistent and agent behavior hard to predict. That’s where Microsoft Fabric IQ comes in. Fabric IQ introduces a semantic foundation that unifies models and data through an ontology. It defines the shared business entities and their relationships, and connects them to your data. It provides the operational context needed to understand how the business actually runs, without altering any underlying data. Analysts can not only work with the data they already trust, but also model how the business works. And agents can use that same shared context to reason and act more consistently. Today, I’ll show you both sides. -First, how a data analyst leverages Fabric IQ inside a Fabric workspace, using ontology to model the business concepts. Then, how Fabric IQ uses the same context to drive more reliable and predictable insights from agents. I’ll start from the point of view of an analyst looking to create a full fidelity view of how an airline operates, including processes such as ticketing, maintenance, and more. The first thing I need to do is to create a new ontology. I can either build one from scratch, or jumpstart by using an existing Power BI semantic model. As you see here, there’s a new option to generate an ontology from this semantic model. I just need to choose the workspace, give it a name. I’ll choose AirlineOperationsOntology. Then confirm by hitting Create. In just a few clicks, I’m able to see the different entities of our airline business. All data is now linked not only through keys, but also business relationships and semantics. We can see flights, airlines, routes, and more. If I click into airports, because Fabric IQ is semantically aware of the relationships between entities, it shows the routes connected to runways which are in turn connected to airports. -And for any entity, I can choose to add more live operational signals. In this case, I want to add details about the runway conditions using real-time data, including contamination, visual range for visibility, as well as the available cleared width and more. And you can also bring in your Power BI reports for a canonical view of how to monitor and manage these aircrafts. From the ontology, I’ll head over to the Report links tab that opens the OneLake catalog with all of my reports. I’ll search for air and there are three matching reports for gates, ground service, plus safety and runway. So I’ll add them and hit Connect to confirm. -So, in just a few clicks, we’ve expanded our ontology with the live operational view, using real-time signal, geospatial data, and more. Now, as an analyst, I can work immediately with it, and our agents can act on it as well. Let’s fast‑forward and see what I’ve unlocked. You can see that I now have a richer view over my data, which is connected to real‑world operations. We’re no longer optimizing one report or one dataset at a time. We’re looking at our operations across multiple data sources, in the language of our business, with meaning and relationships already understood. -Now let me show you how this makes it easier to turn insights into concrete decisions and actions. First, in the flight entity type overview, I can see how it relates to my other business processes. I see entities like bookings, gates, airlines, and more as a graph. I have links to all of my connected Power BI reports. There is a real-time weather data, including wind knots, as well as geo-spatial insights showing all of my flights. Using Fabric Maps, I have a fleet level view of all my active flights, and I can see live air traffic across the fleet. This, in fact, is a heat map view of three New York City area airports, and we can see that JFK in this case is impacted with lots of runway activity. -I can now understand the system as a whole, across bookings, flights, airports, and real‑time conditions. And I can drill in further to understand what is going on: I have opened the runways entity, and you’ll remember some of these categories from before. Since there’s snow in the area, I can immediately see the runway conditions that affect operations, things like surface friction and contamination levels, so I understand how safe it is for planes to take off and land. -Beyond connecting raw data, I can also define rules directly in the ontology, so this logic lives with the data and its business meaning, instead of being hard coded somewhere else. In this case, I’ll add a rule that says if runway contamination exceeds high threshold value of 25%, notify the passengers proactively of upcoming delays. We’ll also notify the ground crew, so they know that the runways need to be cleared. The rules are now embedded in the ontology, and the value comes from seeing how runway conditions impact the rest of the operations. That’s where the built‑in ontology graph helps. Let’s look at the relationship graph. I’ll expand the graph view. And add a filter for JFK airport Then run the query. No code needed here. And I get a filtered view for JFK. And immediately, I can see a poor condition that’s affecting Runway 25R. -From that insight, it’s easy to see the downstream impact. This runway issue is already affecting related gates and baggage operations that will need to be rescheduled. This is a unified view of our entire operations and how connected events will cascade across related business entities. This is how ontology helps you as an analyst. But remember, the same operational context is also available to AI agents, no matter how you build them. -Let me demonstrate this in the context of one our built-in Fabric IQ agents. From the New Item catalog, you can find the built-in agents by searching for agent. There is a Data agent designed to answer questions, and an Operations agent designed for real-time data and business action recommendations. The Operations agent is a perfect fit for our airline operations scenario, so I’ll choose that one. I want this agent to help with runway-related analysis and actions, so I’ll name it RunwayConditionsAgent, leave the location, and create it. -From there, I can add a bit more information to set up the agent, like adding the business goals for what it should accomplish I want this one to monitor surface conditions for runways and ensure things run smoothly based on logic like we used before. In fact, in the Agent instructions, using natural language, no code, I’ll describe that if surface contamination is reported above 10%, send ground crews to take care of it. Likewise, the clear width should be more than 25 meters, and the agent should send ground crew to visually assess whether planes can safely brake. -Now let’s add some knowledge. And for that, I’ll choose our AirlineOntology. And here’s where I can add actions. I’ll add one to assign ground crew for clearing, along with description for what needs to be done. Then I’ll give it the Runway ID as the one to clear and the Temperature to predict the type of clearing needed. And Create to add that one. Now I’ll add another for requesting visual assessment, and perform similar steps for the parameters. These will send status updates in Microsoft Teams. Now everything is defined and ready. I just need to save this new agent. That takes a moment. And once it’s finished, it creates a nice agent playbook with what it’s designed to do. -Now, with the agent running, the right people will get notified of what to do in Microsoft Teams Here, I’m looking at the Operations agent It’s alerting us that Runway 29L has only 22 meters of clear path. This is under our 25 meter threshold. It recommends to deploy the ground crew for a runway clearance operation. As the human in the loop, I can choose whether or not to proceed with the recommendation. I’ll do that. -Then it asks to confirm a few details. They look good, so I’ll confirm, and the ground crew is on its way. And here is the good news. If you’re building your own agent in Microsoft Copilot Studio or using Microsoft Foundry, Fabric IQ ontology will be an integrated knowledge source that you will be able to choose from. As you choose your knowledge types, you can select Fabric IQ. This will ground agents in the same semantic foundation that already runs your operations. And of course, the agents you build and connect to Fabric IQ will respect the permissions and security policies you already use in Fabric today. -As I have shown, Microsoft Fabric IQ gives agents shared understanding, with entities, relationships, rules, and actions so they can move from insight to decision more reliably. To learn more and get started, check out aka.ms/FabricIQ. Keep watching Microsoft Mechanics for the latest news and deep dives. And thank you for watching.219Views0likes0CommentsSimplifying Migration to Fabric Real-Time Intelligence for Power BI Real Time Reports
Power BI with real-time streaming has been the preferred solution for users to visualize streaming data. Real-Time streaming in PowerBI is being retired. We recommend users to start planning the migration of their data processing pipeline to Fabric Real-Time Intelligence.6.7KViews2likes1CommentGeneral Availability Refresh: Mirroring Azure Database for PostgreSQL in Microsoft Fabric
Introduction We are excited to announce the General Availability Refresh of Mirroring Azure Database for PostgreSQL in Microsoft Fabric. This update introduces several new capabilities designed to reduce friction, improve transparency, and increase trust in PostgreSQL data for analytics and AI workloads. Database professionals, DBAs, and developers can now leverage a more robust, flexible, and transparent solution for integrating PostgreSQL data in Microsoft Fabric. Mirroring Azure Database for PostgreSQL enables seamless integration of transactional data into Microsoft Fabric, supporting advanced analytics and AI scenarios. Previously, users faced limitations in data type support, operational requirements, and troubleshooting transparency. The General Availability Refresh directly addresses these challenges, delivering a more reliable and user-friendly experience. Native Data Type Support One of the most significant enhancements is support for PostgreSQL native data types, including JSON and JSONB. Users can now mirror complex, semi-structured data without conversion, preserving the full fidelity of source data. Additionally, transparent replication to varchar(max) or varbinary is now supported, ensuring that even custom or less common types can be accurately mirrored in Fabric. Previously, data type limitations often required workarounds or manual transformations, introducing complexity and risk. With this update, data flows more naturally from PostgreSQL to Fabric, streamlining analytics and AI pipelines and reducing the time spent on data preparation. Mirroring now preserves PostgreSQL-native types—including JSON and JSONB—without requiring schema transformations or type coercion. This ensures: Full fidelity replication of semi-structured data Elimination of intermediate serialization (e.g., string encoding) Tables containing JSON/JSONB columns are mirrored into Fabric with their structure intact, enabling: Direct querying of nested JSON fields Consistent schema representation across source and analytical layers Once mirrored, JSON content can be queried natively within Fabric workloads, allowing: Hybrid analytical scenarios (relational + semi-structured) Use of familiar SQL patterns for JSON traversal and extraction Flexible Server High Availability Support for PG versions earlier than 17 The refresh expands support for Azure Database for PostgreSQL Flexible Server with high availability enabled on versions earlier than 17. This means organizations can leverage mirroring with HA configurations without needing to upgrade their database engine, improving operational flexibility and minimizing disruption. In the past, mirroring required specific PostgreSQL versions, limiting adoption for organizations with established HA deployments. Now, the broader compatibility allows teams to maintain their preferred configurations and benefit from seamless data integration in Fabric. Improved Transparency: Enhanced Error Messaging and Dedicated PostgreSQL UDFs Transparency is critical for troubleshooting and maintaining trust in data pipelines. The General Availability Refresh introduces enhanced error messaging, providing clear and actionable insights into replication issues. Dedicated PostgreSQL user-defined functions (UDFs) further support monitoring and diagnostics, enabling users to quickly identify and resolve problems. --Validates that all system and configuration requirements are met before starting CDC mirroring. SELECT * FROM azure_cdc.check_prerequisites(); -- Quickly verify which extension version is deployed — critical for troubleshooting SELECT azure_cdc.azure_cdc_version(); -- Returns a detailed list of errors and issues detected during CDC operations SELECT * FROM azure_cdc.get_health_status('', ''); -- Scans every eligible user table in the database and returns the mirroring readiness status for each SELECT * FROM azure_cdc.get_all_tables_mirror_status(); These features streamline troubleshooting, reduce downtime, and empower teams to maintain robust mirroring operations. The improvements are a substantial step forward from previous releases, which offered limited visibility into replication health and errors. Unblocking Mirroring for servers with Read Replica enables We also removed existing block for creating Mirrored databases on Flexible Servers with Read Replicas created. Now primary servers with one or more Read Replicas can serve as source for Mirroring their databases in Fabric with no limitations. Conclusion The new capabilities significantly reduce friction for analytics and AI workloads in Microsoft Fabric. By removing technical barriers and increasing operational simplicity, the General Availability Refresh empowers organizations to unlock the full potential of their PostgreSQL data for advanced analytics and AI applications. The General Availability Refresh of Mirroring Azure Database for PostgreSQL in Microsoft Fabric delivers critical enhancements: native data type support, expanded Flexible Server HA compatibility, replication identity improvements, and better transparency. Together, these features make mirroring more robust, flexible, and trustworthy for analytics and AI workloads. We encourage technical professionals, DBAs, and developers to adopt these new capabilities and explore how they can transform data integration in Microsoft Fabric. For more information, visit the official documentation and resources: Azure Database for PostgreSQL Documentation Microsoft Fabric Documentation346Views1like1CommentStep by Step Guide to Ontology and Plan for Financial Service
What We Will Build In this guide, we will construct a complete Fabric IQ solution that accomplishes the following: First, a Lakehouse that ingests publicly available data including bank financials, P2P lending statistics, borrower demographics, and licensing information. Second, a Semantic Model that defines the analytical layer with proper dimensions, measures, and relationships. Third, an Ontology that elevates these tables into business entities such as Bank, P2P Platform, Borrower, and Loan, connected by meaningful relationships and governed by regulatory rules. Fourth, a Planning sheet that enables supervisors to forecast enforcement workloads, allocate examination budgets, and model scenarios based on live data. Step 1: Preparing the Data Foundation in Fabric Lakehouse Every Fabric IQ solution begins with data. Before we can model business semantics or build planning sheets, we need a well structured Lakehouse that holds our source data in a governed and queryable format. Creating the Lakehouse Navigate to your Fabric workspace and create a new Lakehouse. In this example, we have named it P2PLendingLH, housed within the workspace P2P Lending CrossSector Demo. The Lakehouse serves as the Bronze and Silver layer of our medallion architecture, storing both raw ingested data and transformed analytical tables. Data Sources and Tables The Lakehouse is populated with data from publicly available publications. The table structure follows a dimensional modeling pattern with clear separation between dimension tables (prefixed with dim_) and relationship tables (prefixed with rel_). The following tables form the foundation of our model: Table Name Description dim_bank Bank profiles including KBMI tier, total assets, CAR, NPL, channeling exposure percentage dim_borrower Borrower demographics with credit score, employment type, province, and risk segment dim_p2p_platform Licensed P2P lending operators with TWP90 rate, outstanding balance, and total borrowers dim_loan Individual loan records with amount, tenure, interest rate, and repayment status dim_supervisor_team supervisory teams and their regional assignments dim_channeling_agreement Bank to P2P channeling contracts and exposure limits In addition to dimension tables, several relationship tables capture the connections between entities. These include rel_bank_channels_platform (which bank funds which P2P platform), rel_borrower_takes_loan (linking borrowers to their loans), rel_loan_funded_by_bank (tracing the funding chain), rel_platform_issues_loan (connecting platforms to the loans they originate), and rel_supervisor_oversees_platform and rel_supervisor_oversees_bank (mapping supervisory responsibility). Step 2: Creating the Semantic Model With data in the Lakehouse, the next step is to create a Semantic Model that defines the analytical interface. The Semantic Model is a Power BI construct that organizes your tables into a star schema with proper relationships, hierarchies, and measures. More importantly for our purpose, this Semantic Model will later serve as the blueprint from which we generate our Ontology. Generating the Model from Lakehouse From within the Lakehouse, click on "New semantic model" in the toolbar. A dialog appears allowing you to name your model and select which tables to include. In our case, we select all dimension and relationship tables to ensure the Ontology will have full visibility into the data landscape. Figure 1. Creating a new Direct Lake semantic model from the P2PLendingLH Lakehouse, selecting dimension and relationship tables for inclusion. Notice that the dialog shows the workspace name (P2P Lending CrossSector Demo) and provides a searchable list of all available tables. The Direct Lake mode is automatically selected, which means the Semantic Model will query data directly from the Lakehouse parquet files without importing a copy. This is important for our use case because it ensures that when regulator publishes updated monthly statistics and the Lakehouse is refreshed, the Semantic Model and subsequently the Ontology will reflect the latest data. Configuring Relationships and Properties After creation, the Semantic Model opens in the editing view where you can configure relationships, add calculated measures, and define display properties. The model view shows the entity cards with their fields and the lines connecting related tables. Figure 2. The Semantic Model editor showing entity cards for dim_bank and dim_borrower, with relationship lines and the full table listing in the Data panel. In the screenshot above, you can see two of the core dimension tables. The dim_bank table contains fields such as bank_id, bank_type, channeling_exposure_pct, channeling_total, name, regulator_team, and total_assets. The dim_borrower table holds borrower_id, credit_score, employment_type, name, province, and risk_segment. The Data panel on the right reveals the complete set of tables available in this model, including all the relationship tables that define the connections between entities. At this stage, you should verify that all necessary relationships are correctly established. For example, dim_bank should connect to rel_bank_channels_platform through bank_id, and dim_p2p_platform should connect to rel_platform_issues_loan through platform_id. These relationships are what enable the Ontology to reason across domains in the next step. You may also want to add calculated measures at this point, such as a weighted average TWP90 across all platforms funded by a specific bank, or a total channeling exposure as a percentage of the bank's total assets. These measures will be carried forward into the Ontology and can be used by AI agents for natural language querying. Step 3: Generating the Ontology This is the step where the magic of Fabric IQ truly comes alive. The Ontology transforms your Semantic Model from a reporting layer into an intelligence layer. While the Semantic Model answers the question "what does the data look like," the Ontology answers the question "what does the data mean." What the Ontology Does An Ontology in Fabric IQ is a machine understandable vocabulary of your business. It consists of entity types (the things in your environment, such as Bank, Borrower, or P2P Platform), properties (the facts about those entities, such as a bank's NPL ratio or a platform's TWP90 rate), and relationships (the ways entities connect, such as a Bank channels funding to a P2P Platform). Beyond static modeling, the Ontology also supports rules and constraints that can trigger automated actions when business conditions are met. Generating from the Semantic Model To create the Ontology, open your Semantic Model and look for the "Generate Ontology" button in the toolbar. Clicking it opens the generation dialog, which presents three key value propositions: Unify models into a semantic layer allows you to align concepts across domains and modeling paradigms, bringing banking data and P2P lending data into a shared vocabulary. Model expressively enables you to capture complex relationships, domain specific rules, and actions that drive business workflows, such as triggering an alert when a P2P platform's TWP90 crosses the 5 percent regulatory threshold. Reason over events and temporal patterns means that the Ontology can use sequences and trends to inform decisions and automation, such as detecting three consecutive months of TWP90 deterioration. Figure 3. The Ontology generation dialog, creating a new Ontology named NewP2P from the existing Semantic Model within the P2P Lending CrossSector Demo workspace. In the dialog, you specify the workspace (P2P Lending CrossSector Demo) and give your Ontology a name (in this example, NewP2P). After clicking Create, Fabric IQ analyzes the Semantic Model's structure, identifies entity types from dimension tables, infers relationships from the foreign key connections, and generates a navigable graph that represents your business domain. Enriching the Ontology with Rules Once the Ontology is generated, you can enrich it with business rules that reflect regulatory requirements. For the P2P lending use case, the following rules are particularly relevant: Rule Name Condition Action Elevated TWP90 P2P Platform TWP90 exceeds 5 percent Flag platform as high risk and alert PVML supervisor Contagion Risk Bank channeling exposure to flagged P2P platform exceeds 10 percent of portfolio Alert Banking supervisor and recommend joint examination Youth Overleveraged Borrowers aged 19 to 34 represent more than 60 percent of a platform's portfolio AND TWP90 is above average Trigger consumer protection review and education program allocation CAR Threshold Bank CAR drops below 10 percent while having active P2P channeling agreements Escalate to Kepala Eksekutif Pengawas Perbankan These rules integrate with Fabric Activator, enabling the Ontology to automatically initiate business processes through alerts and automated actions. This means that when new monthly P2P statistics are ingested and a platform's TWP90 crosses the threshold, the system does not wait for an analyst to discover it manually. The rule fires, the alert is sent, and the supervisory workflow begins. Querying with Natural Language One of the most powerful capabilities enabled by the Ontology is the ability to query across domains using natural language through a Data Agent. Because the Ontology defines the business vocabulary and binds it to real data, a supervisor can ask questions like: "Which banks have channeling agreements with P2P platforms whose TWP90 is currently above 5 percent, and what is their total exposure?" The Data Agent resolves this query by traversing the Ontology graph: from the Bank entity through the channels_funding_to relationship to P2P Platform, filtering by the TWP90 property, and aggregating the channeling_total measure. Step 4: Setting Up Planning Sheets While the Ontology tells you what is happening in your business right now, the Plan item in Fabric IQ helps you decide what should happen next. Planning in Fabric IQ brings budgeting, forecasting, and scenario modeling directly into the same environment where your data lives, eliminating the disconnect between analytical insights and forward looking decisions. Creating a Planning Sheet To create a Plan, navigate to your workspace and select New Item followed by Plan (preview). After naming the plan and connecting it to your Semantic Model, you can begin building Planning sheets that pull dimensions and measures directly from the same data that powers your Ontology. In the screenshot below, we see a Planning sheet named "Planning P2P" that presents a tabular view of all P2P lending platforms alongside their key risk metrics. Figure 4. The Planning sheet showing P2P lending platforms with their TWP90 rates, total outstanding balances (in trillions of Rupiah), total borrower counts (in thousands), and risk categories. The Planning sheet is structured with the platform name and risk_category as row dimensions, and three critical measures as values: Sum of twp90_rate, Sum of total_outstanding (displayed in trillions of Rupiah), and Sum of total_borrowers (displayed in thousands). The risk_category column provides an immediate visual classification of each platform's health status, with categories such as Elevated and Very High clearly indicating where supervisory attention should be directed. Looking at the data, several insights emerge immediately. DanaBijak and DanaCepat both carry a Very High risk category, with TWP90 rates of 18.77 and 17.79 respectively. CashWagon ID shows an Elevated risk designation despite a comparatively modest TWP90 of 8.26, likely due to its substantial outstanding balance of 144.97 thousand borrowers. The aggregate row at the top reveals the industry total: a combined TWP90 of 365.38 (this is a sum across all platforms), total outstanding of 29.86 trillion Rupiah, and 7,281.55 thousand borrowers across the monitored universe. Using Planning for Supervisory Resource Allocation The real power of the Planning sheet becomes apparent when supervisors begin using it for forward looking decisions. Consider the following scenarios that can be modeled directly within the Planning interface: Enforcement Forecasting: Based on the current data showing multiple platforms in the Very High risk category, supervisors can forecast the expected volume of warning letters and administrative sanctions for the coming quarter. If historical patterns show that each Very High platform typically receives two to three rounds of correspondence before resolution, the planning sheet can project staffing requirements for the enforcement team. Budget Allocation: The Planning sheet can incorporate budget dimensions alongside risk metrics. If the current quarterly examination budget allows for on site visits to 15 platforms, the risk category column helps prioritize which platforms should be visited first. The forecast capability can then project whether the budget is sufficient given the current risk trajectory, or whether a reallocation request should be submitted.272Views0likes0CommentsMicrosoft Fabric Operations Agent Step by Step Walkthrough
Fabric Capacity and Workspace You need a Microsoft Fabric workspace backed by a paid capacity. Trial capacities are not supported for Operations Agent. Your capacity must be provisioned in a supported region. As of April 2026, Operations Agent is available in all Microsoft Fabric regions except South Central US and East US. If your capacity is outside the US or EU, you will also need to enable cross geo processing and storage for AI through the tenant settings. Your workspace must contain an Eventhouse with at least one KQL database. The Eventhouse is the telemetry backbone, and the KQL database holds the tables the agent will monitor. In the screenshot below, you can see a workspace named OperationAgent-WS that contains an Eventhouse (ops_eventhouse), two KQL databases (ops_db and ops_eventhouse), and a Lakehouse (ops_lakehouse). This is the environment used throughout this guide. Figure 1. Workspace contents showing the Eventhouse, KQL databases, and Lakehouse ready for the Operations Agent. Enabling the Operations Agent in the Admin Portal A Fabric administrator must enable the Operations Agent preview toggle in the Admin Portal before anyone in the organization can create an agent. Navigate to the Admin Portal, locate the section for Real Time Intelligence, and find the setting labeled Enable Operations Agents (Preview). Toggle it to Enabled for the entire organization or for specific security groups depending on your governance requirements. In addition to this toggle, ensure that Microsoft Copilot and Azure OpenAI Service are also enabled at the tenant level. The Operations Agent relies on Azure OpenAI to generate its playbook and to reason about data when conditions are met. Figure 2. The Admin Portal showing the Enable Operations Agents (Preview) toggle set to Enabled for the entire organization. Note that messages sent to Operations Agents are processed through the Azure AI Bot Service. If your capacity is outside the EU Data Boundary, data may be processed outside your geographic or national cloud boundary. Be sure to communicate this to your compliance stakeholders before enabling the feature in production tenants. Microsoft Teams Account Every person who will receive recommendations from the agent must have a Microsoft Teams account. The Operations Agent delivers its findings and action suggestions through a dedicated Teams app called Fabric Operations Agent. You can install this app from the Teams app store by searching for its name. Once installed, the agent will be able to send messages containing data summaries and recommended actions directly to the designated recipients. Creating and Configuring the Operations Agent With your prerequisites in place, you are ready to create the Operations Agent. The following steps walk you through the entire configuration process using the Fabric portal. Step 1: Create a New Operations Agent Open the Microsoft Fabric portal and navigate to your workspace. On the Fabric home page, select the ellipsis icon and then select Create. In the Create pane, scroll to the Real Time Intelligence section and select Operations Agent. A dialog will appear asking you to name your agent and select the target workspace. Choose a descriptive name that reflects the agent’s purpose. In this guide, the agent is named OperationsAgent_1 and is deployed to the OperationAgent-WS workspace. Step 2: Define Business Goals and Agent Instructions Once the agent is created, you are taken to the Agent Setup page. This page is divided into two halves. On the left side, you configure the agent’s behavior. On the right side, you see the generated Agent Playbook after saving. The first field is Business Goals, where you describe the high level objective the agent should accomplish. Write this in clear, outcome oriented language. In this demo, the business goal is set to: “Monitor data pipeline execution and alert on failures.” The second field is Agent Instructions, where you provide more specific guidance on how the agent should reason about the data. Think of this as a brief you would hand to an analyst who will be watching your systems overnight. Be explicit about the table name, the column to watch, and the condition that constitutes an alert. In this demo, the instruction reads: “Monitor pipeline_runs table. Alert when status is failed.” Together, the business goals and instructions give the underlying large language model enough context to generate an accurate playbook. The more specific your instructions, the more reliable the agent’s behavior will be. Figure 3. The Agent Setup page showing business goals, agent instructions, and the generated playbook on the right. On the right side of the screen, you can see the Agent Playbook that was generated after saving. The playbook includes a Business Term Glossary, which shows the business objects the agent inferred from your goals and data. In this case, it identified an object called PipelineRun, mapped to the pipeline_runs table, with two properties: status (the pipeline run status from the status column) and runId (the unique identifier from the run_id column). It also displays the Rules section, which contains the conditions the agent will evaluate. Review the playbook carefully. Since it is generated by an AI model, there may be occasional misinterpretations. Verify that every property maps to the correct column and that the rules reflect your intended thresholds. If something is off, update your goals or instructions and save again to regenerate the playbook. Step 3: Add a Knowledge Source Scroll down on the Agent Setup page to find the Knowledge section. This is where you connect the agent to the data it will monitor. When you first open this section, it will display a message indicating that no knowledge source has been added yet. Figure 4. The Knowledge section before any data source has been added. Select the Add Data button to browse the available data sources. A panel will appear listing the KQL databases and Eventhouses accessible within your Fabric environment. In this demo, three sources are available: ops_db in the OperationAgent-WS workspace, wms_eventhouse in the WMS-CDC-Demo workspace, and ops_eventhouse in the OperationAgent-WS workspace. Select the database that contains the table you want the agent to monitor. For this guide, select ops_db, which holds the pipeline_runs table referenced in the agent instructions. Figure 5. Selecting the knowledge source from available KQL databases and Eventhouses. Once the knowledge source is connected, the agent will be able to query this database at regular intervals (approximately every five minutes) to evaluate its rules. Make sure the table in your selected database is actively receiving data, especially if you plan to demonstrate the agent detecting a condition in real time. Step 4: Define Actions Actions are the responses the agent can recommend when it detects a condition that matches its rules. Scroll further down the Agent Setup page to find the Actions section. Select the Add Action button to define a new custom action. A dialog titled New Custom Action will appear. It has three fields. The Action Name is a short, descriptive label for the action. The Action Description explains the purpose of the action and gives the agent context about when to use it. The Parameters section allows you to define input fields that pass dynamic values (such as names, dates, or identifiers) into the Power Automate flow that will be triggered. Figure 6. The New Custom Action dialog where you define the action name, description, and optional parameters. In this demo, the action is named Send Email Alert with a description indicating that it should send an email notification when a pipeline failure is detected. Once created, you can see the action listed in the Actions section with a green status indicator showing that the action is successfully connected. Figure 7. The Actions section showing the Send Email Alert action with a connected status. Step 5: Configure the Custom Action with Power Automate After creating the action, you need to configure it by linking it to an activator item and a Power Automate flow. Select the action you just created to open the Configure Custom Action pane. In this pane, you will see several fields. First, select the Workspace where the activator item resides. In this demo, the workspace is OperationAgent-WS. Next, select the Activator, which is the Fabric item that bridges the Operations Agent and Power Automate. Here, the activator is named Email_Alert_Activator. Once the connection is created, a Connection String is generated. This string is a unique identifier that links the Operations Agent to the Power Automate flow. Select the Copy button to copy this connection string to your clipboard. You will need it in the next step. Below the connection string, you will find the Open Flow Builder button. Select this to launch the Power Automate flow designer where you will build the email notification flow. Figure 8. The Configure Custom Action pane showing the workspace, activator, connection string, and the button to open the flow builder. Step 6: Build the Power Automate Flow When you select Open Flow Builder, a new browser tab opens with the Power Automate designer. The flow is pre-configured with a trigger called When an Activator Rule is Triggered. This trigger fires whenever the Operations Agent approves an action. In the Parameters tab of the trigger, you will see a field labeled Connection String. Paste the connection string you copied from the previous step into this field. This is the critical link that connects the Power Automate flow back to your Operations Agent. If this string is incorrect or missing, the flow will not fire when the agent recommends the action. Figure 9. The Power Automate flow builder with the activator trigger and the Connection String field. Below the trigger, you can add any actions your workflow requires. For an email alert scenario, add an Office 365 Outlook action to send an email to the operations team. You can use dynamic content from the trigger to include details such as the pipeline run ID, the failure status, and any parameters passed through from the Operations Agent. Save the flow and return to the Fabric portal. Your action is now fully configured and ready to be triggered by the agent. Step 7: Generate the Playbook and Start the Agent With all configuration complete (business goals, instructions, knowledge source, and actions), select Save on the Agent Setup page. Fabric will use the underlying large language model to generate the agent’s playbook. The playbook is a structured summary of everything the agent knows: its goals, the properties it monitors, and the rules it evaluates. You can also select Generate Playbook at the top of the page to regenerate the playbook if you have made changes. Review the playbook one final time to confirm that properties map correctly to your table columns and that rules reflect the exact conditions you want to monitor. When you are satisfied, select Start in the toolbar at the top of the page. The agent will begin actively monitoring your data. It queries the knowledge source approximately every five minutes, evaluating the playbook rules against the latest data. If a condition is met, the agent uses the LLM to summarize the data, generate a recommendation, and send a message to the designated recipients through Microsoft Teams. To pause the agent at any time, select Stop. This is useful during demos when you want to control the timing of the demonstration. How the Agent Operates at Runtime Once started, the Operations Agent follows a continuous loop. Every five minutes, it queries the connected KQL database to evaluate the rules defined in the playbook. If no conditions are met, it continues silently. If a condition is matched (for example, a pipeline run with a status of "failed" appears in the pipeline_runs table), the agent proceeds through the following sequence. First, the agent uses the large language model to analyze the data that triggered the condition. It summarizes the context, identifies the relevant business object (such as a specific pipeline run), and determines which action to recommend. Second, the agent sends a message to the designated recipients through Microsoft Teams. This message contains a summary of the detected insight, the data context that triggered it, and a suggested action. Recipients can approve the action by selecting Yes or reject it by selecting No. If parameters are included (such as a run ID or a severity level), they can be reviewed and adjusted before final approval. Third, if the recipient approves the action, the agent executes it on behalf of the creator using the creator’s credentials. In this demo, approving the action would trigger the Power Automate flow that sends an email alert. It is important to note that if a recommendation is not responded to within three days, the operation is automatically canceled. After cancellation, the action can no longer be approved or interacted with.505Views1like0CommentsAnnouncing Fabric Mirroring integration for Azure Database for MySQL - Public Preview at FabCon 2026
At FabCon 2026, we’re excited to announce the Public Preview of Microsoft Fabric Mirroring integration for Azure Database for MySQL. This integration makes it easier than ever to analyze MySQL operational data using Fabric’s unified analytics platform, without building or maintaining ETL pipelines. This milestone brings near real-time data replication from Azure Database for MySQL into Microsoft Fabric OneLake, unlocking powerful analytics, reporting, and AI scenarios while keeping transactional workloads isolated and performant. Why Fabric integration for Azure Database for MySQL? MySQL is widely used to power business‑critical applications, but operational databases aren’t optimized for analytics. Traditionally, teams rely on complex ETL pipelines, custom connectors, or batch exports — adding cost, latency, and operational overhead. With Fabric integration, Azure Database for MySQL now connects directly to Microsoft Fabric, enabling: Zero‑ETL analytics on MySQL operational data Near real-time synchronization into OneLake Analytics‑ready open formats for BI, data engineering, and AI A unified experience across Power BI, Lakehouse, Warehousing, and notebooks All without impacting your production workloads. What’s new in the Public Preview? The Public Preview introduces a first‑class integration between Azure Database for MySQL and Microsoft Fabric, designed for simplicity and scale. It introduces a solid set of core operational and enterprise‑readiness capabilities, enabling end-users to confidently get started and scale their analytics scenarios. Core replication operations Start, monitor, and stop replication directly from the integrated experience Support for both initial data load and continuous change data capture (CDC) to keep data in sync with minimal latency. Network and security Firewall and gateway support, enabling replication from secured MySQL environments. Support for Azure Database for MySQL servers configured with customer‑managed keys (BYOK), aligning with enterprise security and compliance requirements. Broader data coverage and troubleshooting Ability to mirror tables containing previously unsupported data types, expanding schema compatibility and reducing onboarding friction. Support for up to 1,000 tables per server, enabling larger and more complex databases to participate in Fabric analytics. Basic error messaging and visibility to help identify replication issues and monitor progress during setup and ongoing operations. What scenarios does it unlock? With Fabric integration in place, you can now analyze data in Azure Database for MySQL without impacting production, combine it with other data in Fabric for richer reporting, and use Fabric’s built‑in analytics and AI tools to get insights faster. Learn more about exploring replicated data in Fabric in Explore data in your mirrored database using Microsoft Fabric. How does it work (high level)? Fabric integration for Azure Database for MySQL follows a simple but powerful pattern: Enable and Configure - Enable replication in the Azure portal, then use the Fabric portal to provide connection details and select MySQL tables to mirror. Initial snapshot - Fabric takes a bulk snapshot of the selected tables, converts the data to Parquet, and writes it to a Fabric landing zone. Continuous change capture - Ongoing inserts, updates, and deletes are captured from MySQL binlogs and continuously written as incremental Parquet files. Analytics‑ready in Fabric - The Fabric Replicator processes snapshot and change files and applies them to Delta tables in OneLake, keeping data in sync and ready for analytics. This design ensures low overhead on the source, while providing fresh data for analytics and AI workloads. Below is a more detailed workflow illustrating how this works: Getting started with the Public Preview To try Fabric integration for Azure Database for MySQL during Public Preview, you’ll need: An Azure Database for MySQL instance An active Microsoft Fabric capacity (trial or paid) Access to a Fabric workspace Once enabled, you can select the MySQL databases and tables you want to replicate and begin analyzing data in Fabric. For step-by-step tutorial, please refer to - https://learn.microsoft.com/azure/mysql/integration/fabric-mirroring-mysql The demo video below showcases how the mirroring integration works and walks through the end-to-end experience of mirroring MySQL data into Microsoft Fabric. Stay Connected We’re thrilled to share this milestone at FabCon 2026 and can’t wait to see how you use Fabric integration for Azure Database for MySQL to simplify analytics and unlock new insights. We welcome your feedback and invite you to share your experiences or suggestions at AskAzureDBforMySQL@service.microsoft.com Try the Public Preview, share your feedback, and help shape what’s next. Thank you for choosing Azure Database for MySQL!Azure Databricks & Fabric Disaster Recovery: The Better Together Story
Author's: Amudha Palani amudhapalani, Oscar Alvarado oscaralvarado, Eric Kwashie ekwashie, Peter Lo PeterLo and Rafia Aqil Rafia_Aqil Disaster recovery (DR) is a critical component of any cloud-native data analytics platform, ensuring business continuity even during rare regional outages caused by natural disasters, infrastructure failures, or other disruptions. Identify Business Critical Workloads Before designing any disaster recovery strategy, organizations must first identify which workloads are truly business‑critical and require regional redundancy. Not all Databricks or Fabric processes need full DR protection; instead, customers should evaluate the operational impact of downtime, data freshness requirements, regulatory obligations, SLAs, and dependencies across upstream and downstream systems. By classifying workloads into tiers and aligning DR investments accordingly, customers ensure they protect what matters most without over‑engineering the platform. Azure Databricks Azure Databricks requires a customer‑driven approach to disaster recovery, where organizations are responsible for replicating workspaces, data, infrastructure components, and security configurations across regions. Full System Failover (Active-Passive) Strategy A comprehensive approach that replicates all dependent services to the secondary region. Implementation requirements include: Infrastructure Components: Replicate Azure services (ADLS, Key Vault, SQL databases) using Terraform Deploy network infrastructure (subnets) in the secondary region Establish data synchronization mechanisms Data Replication Strategy: Use Deep Clone for Delta tables rather than geo-redundant storage Implement periodic synchronization jobs using Delta's incremental replication Measure data transfer results using time travel syntax Workspace Asset Synchronization: Co-deploy cluster configurations, notebooks, jobs, and permissions using CI/CD Utilize Terraform and SCIM for identity and access management Keep job concurrencies at zero in the secondary region to prevent execution Fully Redundant (Active-Active) Strategy The most sophisticated approach where all transactions are processed in multiple regions simultaneously. While providing maximum resilience, this strategy: Requires complex data synchronization between regions Incurs highest operational costs due to duplicate processing Typically needed only for mission-critical workloads with zero-tolerance for downtime Can be implemented as partial active-active, processing most workload in primary with subset in secondary Enabling Disaster Recovery Create a secondary workspace in a paired region. Use CI/CD to keep Workspace Assets Synchronized continuously. Requirement Approach Tools Cluster Configurations Co-deploy to both regions as code Terraform Code (Notebooks, Libraries, SQL) Co-deploy with CI/CD pipelines Git, Azure DevOps, GitHub Actions Jobs Co-deploy with CI/CD, set concurrency to zero in secondary Databricks Asset Bundles, Terraform Permissions (Users, Groups, ACLs) Use IdP/SCIM and infrastructure as code Terraform, SCIM Secrets Co-deploy using secret management Terraform, Azure Key Vault Table Metadata Co-deploy with CI/CD workflows Git, Terraform Cloud Services (ADLS, Network) Co-deploy infrastructure Terraform Update your orchestrator (ADF, Fabric pipelines, etc.) to include a simple region toggle to reroute job execution. Replicate all dependent services (Key Vault, Storage accounts, SQL DB). Implement Delta “Deep Clone” synchronization jobs to keep datasets continuously aligned between regions. Introduce an application‑level “Sync Tool” that redirects: data ingestion compute execution Enable parallel processing in both regions for selected or all workloads. Use bi‑directional synchronization for Delta data to maintain consistency across regions. For performance and cost control, run most workloads in primary and only subset workloads in secondary to keep it warm. Implement Three-Pillar DR Design Primary Workspace: Your production Databricks environment running normal operations Secondary Workspace: A standby Databricks workspace in a different(paired) Azure region that remains ready to take over if the primary fails. This architecture ensures business continuity while optimizing costs by keeping the secondary workspace dormant until needed. The DR solution is built on three fundamental pillars that work together to provide comprehensive protection: 1. Infrastructure Provisioning (Terraform) The infrastructure layer creates and manages all Azure resources required for disaster recovery using Infrastructure as Code (Terraform). What It Creates: Secondary Resource Group: A dedicated resource group in your paired DR region (e.g., if primary is in East US, secondary might be in West US 2) Secondary Databricks Workspace: A standby Databricks workspace with the same SKU as your primary, ready to receive failover traffic DR Storage Account: An ADLS Gen2 storage account that serves as the backup destination for your critical data Monitoring Infrastructure: Azure Monitor Log Analytics workspace and alert action groups to track DR health Protection Locks: Management locks to prevent accidental deletion of critical DR resources Key Design Principle: The Terraform configuration references your existing primary workspace without modifying it. It only creates new resources in the secondary region, ensuring your production environment remains untouched during setup. 2. Data Synchronization (Delta Notebooks) The data synchronization layer ensures your critical data is continuously backed up to the secondary region. How It Works: The solution uses a Databricks notebook that runs in your primary workspace on a scheduled basis. This notebook: Connects to Backup Storage: Uses Unity Catalog with Azure Managed Identity for secure, credential-free authentication to the secondary storage account Identifies Critical Tables: Reads from a configuration list you define (sales data, customer data, inventory, financial transactions, etc.) Performs Deep Clone: Uses Delta Lake's native CLONE functionality to create exact copies of your tables in the backup storage Tracks Sync Status: Logs each synchronization operation, tracks row counts, and reports on data freshness Authentication Flow: The synchronization process leverages Unity Catalog's managed identity capabilities: An existing Access Connector for Unity Catalog is granted "Storage Blob Data Contributor" permissions on the backup storage. Storage credentials are created in Databricks that reference this Access Connector. The notebook uses these credentials transparently—no storage keys or secrets are required. What Gets Synced: You define which tables are critical to your business operations. The notebook creates backup copies including: Full table data and schema Table partitioning structure Delta transaction logs for point-in-time recovery 3. Failover Automation (Python Scripts) The failover automation layer orchestrates the switch from primary to secondary workspace when disaster strikes. Microsoft Fabric Microsoft Fabric provides built‑in disaster recovery capabilities designed to keep analytics and Power BI experiences available during regional outages. Fabric simplifies continuity for reporting workloads, while still requiring customer planning for deeper data and workload replication. Power BI Business Continuity Power BI, now integrated into Fabric, provides automatic disaster recovery as a default offering: No opt-in required: DR capabilities are automatically included. Azure storage geo-redundant replication: Ensures backup instances exist in other regions. Read-only access during disasters: Semantic models, reports, and dashboards remain accessible. Always supported: BCDR for Power BI remains active regardless of OneLake DR setting. Microsoft Fabric Fabric's cross-region DR uses a shared responsibility model between Microsoft and customers: Microsoft's Responsibilities: Ensure baseline infrastructure and platform services availability Maintain Azure regional pairings for geo-redundancy. Provide DR capabilities for Power BI as default. Customer Responsibilities: Enable disaster recovery settings for capacities Set up secondary capacity and workspaces in paired regions Replicate data and configurations Enabling Disaster Recovery Organizations can enable BCDR through the Admin portal under Capacity settings: Navigate to Admin portal → Capacity settings Select the appropriate Fabric Capacity Access Disaster Recovery configuration Enable the disaster recovery toggle Critical Timing Considerations: 30-day minimum activation period: Once enabled, the setting remains active for at least 30 days and cannot be reverted. 72-hour activation window: Initial enablement can take up to 72 hours to become fully effective. Azure Databricks & Microsoft Fabric DR Considerations Building a resilient analytics platform requires understanding how disaster recovery responsibilities differ between Azure Databricks and Microsoft Fabric. While both platforms operate within Azure’s regional architecture, their DR models, failover behaviors, and customer responsibilities are fundamentally different. Recovery Procedures Procedure Databricks Fabric Failover Stop workloads, update routing, resume in secondary region. Microsoft initiates failover; customers restore services in DR capacity. Restore to Primary Stop secondary workloads, replicate data/code back, test, resume production. Recreate workspaces and items in new capacity; restore Lakehouse and Warehouse data. Asset Syncing Use CI/CD and Terraform to sync clusters, jobs, notebooks, permissions. Use Git integration and pipelines to sync notebooks and pipelines; manually restore Lakehouses. Business Considerations Consideration Databricks Fabric Control Customers manage DR strategy, failover timing, and asset replication. Microsoft manages failover; customers restore services post-failover. Regional Dependencies Must ensure secondary region has sufficient capacity and services. DR only available in Azure regions with Fabric support and paired regions. Power BI Continuity Not applicable. Power BI offers built-in BCDR with read-only access to semantic models and reports. Activation Timeline Immediate upon configuration. DR setting takes up to 72 hours to activate; 30-day wait before changes allowed.1.1KViews4likes0CommentsSmart Pipelines Orchestration: Designing Predictable Data Platforms on Shared Spark
Introduction In mature data platforms, scaling compute is rarely the primary challenge. Shared, elastic Spark pools already provide sufficient processing capacity for most workloads. The harder problem is achieving predictable execution when multiple pipelines compete for the same resources. In Azure Synapse, Spark pools are commonly shared across pipelines to optimize cost and utilization. While this model is efficient, it introduces a key limitation: execution order is determined by scheduling behavior, not business priority. This post describes an orchestration pattern that makes priority explicit, allowing critical workloads to run predictably on shared Spark compute without modifying Spark code, configuration, or cluster capacity. Goal This work does not aim to optimize Spark performance. Its goal is to ensure that, when pipelines share a Spark pool: latency-sensitive workloads run first heavy backfills do not delay critical pipelines execution order is deterministic under contention All of this needed to be achieved without changes to Spark configuration, notebook logic, or cluster size. Why This Problem Occurs In a naïve orchestration model, pipelines are triggered in parallel. From Spark’s perspective: all jobs are equivalent all jobs attempt to acquire executors at the same time scheduling decisions are based on availability and timing As a result, priority is implicit and often incorrect. A heavy workload may acquire executors before a lightweight but critical one simply because it requests more resources earlier. This behavior is expected from Spark. The issue lies in orchestration, not in compute. Core Concept: Priority as Execution Ordering In shared Spark platforms, priority is enforced through execution ordering, not compute tuning. The orchestration layer controls when workloads are admitted to shared compute. Once execution begins, Spark processes each workload normally. This preserves Spark’s execution model while providing deterministic workload ordering. Step 1: Workload Classification In the demo presented in this blog, workloads are classified during configuration based on business impact: Category Description Priority example Light (critical) SLA sensitive dashboard and downstream consumers High priority , low resource weight(data volume) Medium (High) Core reporting workloads Medium priority Heavy(Best Effort) Backfills and historical computes Low priority, high resource weight(data volume) This classification is external to Spark and external to code. It represents business intent, not implementation. As a future phase, classification can be automated,for example, an agent may adjust priority based on observed failure rates or execution stability. Workload classification is expressed as orchestration metadata, for example: [ {"name":"ExecDashboard","pipeline":"PL_Light_ExecDashboard","weight":1,"tier":"Critical"}, {"name":"FinanceReporting","pipeline":"PL_Medium_FinanceReporting","weight":3,"tier":"High"}, {"name":"Backfill","pipeline":"PL_Heavy_Backfill","weight":8,"tier":"BestEffort"} ] What Runs in Each Workload Category All pipelines execute on the same shared Spark pool, but the work they perform differs in scope, data volume, and sensitivity to contention. Light workloads power SLA-sensitive dashboards and downstream consumers. Their notebooks perform targeted reads with strong filtering, limited joins, and small aggregations. Execution time is short, and overall pipeline duration is dominated by executor availability rather than computation. Medium workloads represent core reporting and analytics logic. These notebooks process larger datasets, perform joins across multiple sources, and apply aggregations that are more expensive than Light workloads but still time-bounded and business-critical. Heavy workloads are best-effort pipelines such as backfills and historical recomputation. Their notebooks scan large data volumes, apply expensive transformations, and are optimized for throughput rather than responsiveness. These workloads tolerate delay but place significant pressure on shared compute when admitted concurrently. All workloads use the same Spark pool, executor configuration, and runtime. The distinction reflects business intent and execution characteristics, not Spark tuning. Example notebooks for each category are available in the accompanying GitHub repository. Step 2: Naïve Orchestration (Baseline) The following pipeline run illustrates the baseline behavior when all workloads are triggered in parallel against a shared Spark pool. All Light, Medium, and Heavy pipelines are admitted concurrently. Executor acquisition and execution order depend on timing rather than business priority, resulting in non-deterministic behavior under contention. Although Light workloads require minimal compute, they are delayed by executor contention caused by Medium and Heavy pipelines entering the Spark pool at the same time. Step 3: Smart Orchestration (Priority-Aware) Orchestration Model The same child pipelines and notebooks are reused. The parent pipeline enforces admission order: Light (Critical) Medium (High) Heavy (Best Effort) Dependencies control admission to the Spark pool. Parallelism is preserved within a priority class. Effect on Shared Spark Light workloads enter the Spark pool without contention Medium workloads run after Light completes Heavy workloads are intentionally delayed Executor acquisition aligns with business priority Light pipelines execute first and complete before medium pipelines are admitted. Heavy workloads run last by design. No Spark configuration changes are introduced. The Spark pool, notebooks, and executor configuration are identical to the naïve run. Only the orchestration graph differs. Step 4: Impact on Light Workloads Light workloads are particularly sensitive to orchestration because their runtime is dominated by queueing time, not computation. Comparing the naïve and priority-aware runs shows that Spark execution time is unchanged, but pipeline duration improves due to earlier admission to the Spark pool and immediate executor access Naïve Execution Spark execution time: short and unchanged Pipeline duration: minutes under contention Delay caused by executor unavailability Smart Execution Spark execution time: unchanged Pipeline duration closely matches compute time Immediate access to executors The improvement comes from removing admission contention, not from increasing resources. Results and Performance Compared to naïve orchestration, priority-aware orchestration ensures that Light workloads complete in minutes rather than tens of minutes under contention, while Spark execution time itself remains unchanged. Heavy workloads no longer delay latency-sensitive pipelines, and execution order is deterministic across runs. These improvements are achieved solely by controlling admission to the shared Spark pool, without modifying Spark configuration, notebook logic, or cluster capacity. Next Steps: 1. Optimizing Heavy Workloads Once heavy workloads are isolated by priority, they can be optimized independently: retries with backoff tolerance for transient failures increased executor counts or larger pools Without admission control, these optimizations increase contention, with smart orchestration, they do not impact critical pipelines. 2. Moving Beyond Static Classification In this implementation, workload classification is static and configuration-driven, which is sufficient for stabilization. A next phase is adaptive classification: collect execution metrics and failure rates detect unstable pipelines reclassify pipelines that exceed thresholds (e.g., >20% failures in a rolling window) This prevents unstable workloads from impacting critical execution paths and makes the pipeline reliable with minimal maintenance. 3. Assisted Classification with Copilot agent At scale, priority decisions benefit from automation. A Copilot-style agent can use historical execution data to recommend classification changes, grounding decisions in observed behavior while keeping engineers in control. Example: Changing workload classification from Light to Medium Consider a pipeline initially classified as Light because it powers an SLA-sensitive dashboard and typically executes quickly with minimal resource usage. Over time, execution telemetry shows a change in behavior: The pipeline fails in 4 of the last 10 runs due to transient Spark errors Average duration has increased by 3×, even when admitted early Retry attempts amplify contention for other Light workloads Based on these signals, an automated agent flags the workload as unstable and recommends reclassifying it from Light to Medium. After reclassification: The pipeline is admitted after Light workloads but before Heavy workloads It no longer blocks latency-critical paths when retries occur Execution remains predictable, while instability is isolated from critical workloads The notebook logic and Spark configuration remain unchanged, only the workload’s admission priority is updated via orchestration metadata. This approach allows the platform to adapt to changing workload characteristics while preserving deterministic execution for critical pipelines. Conclusion Parallel execution is a default, not a strategy. In shared environments, orchestration must explicitly encode business intent rather than relying on scheduler behavior. Enforcing priority at the orchestration layer restores predictability without sacrificing efficiency and provides a foundation for adaptive, policy-driven execution as platforms evolve. Links Orchestrating data movement and transformation in Azure Data Factory - Training | Microsoft Learn How to Optimize Spark Jobs for Maximum Performance: A Complete Guide GitHub repo for notebook reference: sallydabbahmsft/Smart-pipelines-orchestration Feedback: Sally Dabbah | LinkedIn660Views1like2Comments