data & ai
229 Topics

Save the Date: FY26 Fabric Partner Community Year-End Celebration
As we prepare to wrap up FY26, we're closing the year the same way we built it: together. This year-end celebration will be held as part of the final Fabric Engineering Connection calls of FY26, giving us space to pause, look back on what we built together, and celebrate the partners who make this community what it is.

Americas & EMEA: Wednesday, June 24 | 8:00-9:00 AM PT
APAC: Thursday, June 25 | 1:00-2:00 AM UTC (Wednesday, June 24 | 5:00-6:00 PM PT)

What to expect:
- A look back at the moments that defined FY26, along with partner updates to take you into FY27
- Fun & games, including a Mad Libs-style community story built live by partners
- A community toast and a few surprises along the way

Important: This call is open to members of the Fabric Partner Community on Microsoft Teams. If you're not already a member, you can join here: https://aka.ms/JoinFabricPartnerCommunity

This isn't just a year-end recap. It's a thank-you to the partners who showed up, shared openly, asked great questions, and helped each other grow real Microsoft Fabric practices. Mark your calendars. We can't wait to celebrate with you!

On the Next Fabric Engineering Connection
Coming up on the next Fabric Engineering Connection calls, we're focusing on one of the most important areas for partners right now: data protection, networking, and security in Microsoft Fabric.

What to expect

Recent Data Protection Value and Announcements (Americas & EMEA), presented by Yael Biss
Covering the latest data protection capabilities in Fabric, designed to help partners meet security, compliance, and governance requirements while enabling customers to scale with confidence.

Updates + AMA with Networking and Data Security Team (Americas/EMEA & APAC), presented by Sarabjit D., Sumiran Tandon, Advaitha Karthikeyan, PMP®, and Bodhisatva Gautam
This is a great opportunity to engage directly with the engineering team working on key scenarios including:
- Private Links
- Managed Private Endpoints (MPEs)
- Outbound Access Protection
- Customer Managed Keys (CMK)

If you're advising customers on secure Fabric deployments, networking isolation, or enterprise-grade governance, this session is definitely one to join.

Note: Fabric Engineering Connection calls are hosted exclusively in the Fabric Partner Community. Microsoft partners can join the community by submitting the form at https://aka.ms/JoinFabricPartnerCommunity.

Looking forward to the discussion next week!

Fabric Data Agents
Hello All,

Hoping to get some information on the permissions our users need in order to use Fabric Data Agents properly.

Scenario: We created a Fabric Data Agent that uses data from a Fabric lakehouse table, published the agent to M365 Copilot, and shared it with a business user. We also gave the business user direct access to the lakehouse. The user can open the Fabric Data Agent from M365 Copilot but gets an error stating they do not have access to the data.

How do we solve this? What are we missing?

Corinna

Databricks Lakebase: The operational database for AI agents and apps
Understanding the Evolution: From Lakehouse to Lakebase

The modern data landscape has long been characterized by a fundamental schism: Online Transaction Processing (OLTP) systems, designed for high-frequency, low-latency transactions in applications, and Online Analytical Processing (OLAP) systems, optimized for complex queries, reporting, and machine learning on vast datasets. This division historically necessitated intricate and often fragile Extract, Transform, Load (ETL) processes to move and synchronize data between these disparate environments, leading to increased complexity, data duplication, and governance challenges.

Databricks Lakehouse architecture emerged to unify data warehousing and data lake functionalities for analytical workloads, offering the flexibility of data lakes with the performance and governance of data warehouses. However, a critical piece remained: native, high-performance OLTP capabilities directly within this unified environment. This is where Databricks Lakebase enters the picture, representing a significant evolution by bringing fully managed PostgreSQL OLTP capabilities directly into the Databricks Data Intelligence Platform.

Lakebase addresses the need for a single, governed platform that can seamlessly handle both transactional and analytical workloads, thereby simplifying data architectures, reducing operational overhead, and accelerating the development of real-time applications and AI agents. By integrating OLTP at the core of the lakehouse, Databricks aims to create a truly unified data and AI platform.

The Architectural Innovation: Separation of Compute and Storage

At the heart of Databricks Lakebase's efficiency and scalability lies its innovative architecture, which fundamentally separates compute from storage.
Unlike traditional monolithic databases where these components are tightly coupled, Lakebase decouples them, offering distinct advantages:

Elastic Scaling and Cost Efficiency

The transactional compute layer in Lakebase is serverless and ephemeral, meaning it can scale up or down dynamically based on demand. This includes the ability to scale to zero during periods of inactivity, significantly optimizing cost by ensuring you only pay for the compute resources actively used. Data, on the other hand, is persisted directly into low-cost, durable cloud object storage (e.g., Azure Blob Storage) using open formats like Delta Lake. This design not only reduces storage costs but also prevents vendor lock-in and allows other engines within the Databricks platform to access the data directly.

Open Data Formats and Interoperability

By storing data in open formats, Lakebase ensures high interoperability within the Databricks ecosystem and beyond. This approach eliminates the need for complex and time-consuming ETL processes to move transactional data to the analytical layer, as the data is inherently accessible to both. This foundational integration streamlines data pipelines and provides a unified view of data across all workloads.

Key Technical Capabilities and Features

Databricks Lakebase offers a rich set of features that make it a compelling solution for modern data architectures:

- PostgreSQL Compatibility: Lakebase provides full PostgreSQL semantics, including ACID transactions, indexing capabilities, and support for standard JDBC/psql clients. This familiarity allows developers to leverage existing skills and tools, minimizing the learning curve.
- Fully Managed Service: Databricks handles the complexities of provisioning, scaling, patching, backups, and ensuring high availability, freeing up development teams to focus on application logic rather than database administration.
- Managed Change Data Capture (CDC): A crucial feature, managed CDC ensures that operational data in Lakebase remains synchronized with Delta Lake tables for analytical consumption. This continuous synchronization is vital for keeping BI models and AI applications updated with the freshest transactional data.
- Autoscaling (Lakebase Autoscaling): The latest iteration of Lakebase features intelligent autoscaling of compute resources. It dynamically adjusts Compute Units (CU) based on various metrics like CPU load, memory usage, and working set size, preventing performance bottlenecks and out-of-memory (OOM) issues. It also supports branching and instant restore, enhancing developer agility and operational resilience.
- Databricks Apps Synergy: Lakebase is designed to serve as the transactional backend for Databricks Apps, enabling the creation and deployment of interactive applications directly on the platform, leveraging governed data and powerful analytics.

Governance, Security, and Cost Efficiency with Lakebase

Adopting Databricks Lakebase brings significant benefits in terms of data governance, security, and overall cost management, aligning with the principles of a modern data intelligence platform.

Unified Governance through Unity Catalog

One of Lakebase's most powerful integrations is with Unity Catalog, Databricks' unified governance solution. This integration provides a single pane of glass for managing data assets across the entire Databricks Data Intelligence Platform. Lakebase databases can be registered as catalogs within Unity Catalog, extending its robust governance framework to operational data. This means:

- Consistent Access Control: Policies defined for your lakehouse data automatically apply to Lakebase, ensuring uniform security and access management across both operational and analytical workloads.
- Centralized Auditing and Lineage: Unity Catalog provides comprehensive auditing capabilities and data lineage tracking for Lakebase assets, simplifying compliance and offering transparent insights into data flows.
- Simplified Security Management: By unifying governance, organizations can reduce the complexity of managing security policies across disparate systems, enhancing overall data security posture.

Robust Security and Data Protection

Lakebase is designed with enterprise-grade security in mind, leveraging existing cloud infrastructure and Databricks' security features:

- Network Integration: It integrates seamlessly with cloud networking services (e.g., Azure Private Link) for secure, private connectivity.
- Identity Management: Integration with enterprise identity providers (e.g., Microsoft Entra ID) ensures secure authentication and authorization.
- Data Encryption: Data is encrypted at rest and in transit, protecting sensitive information throughout its lifecycle.
- High Availability and Disaster Recovery: As a fully managed service, Lakebase inherently provides features for high availability and point-in-time recovery, ensuring operational resilience.

Optimized Cost Efficiency

The architectural separation of compute and storage, coupled with advanced autoscaling capabilities, contributes to significant cost savings compared to traditional database architectures:

- Pay-as-you-go Compute: With serverless and autoscaling compute, you only pay for the resources consumed during active processing, with the ability to scale down to zero when idle.
- Low-Cost Storage: Leveraging economical cloud object storage for data persistence drastically reduces storage costs.
- Reduced ETL Overhead: By eliminating the need for complex ETL pipelines between OLTP and OLAP, organizations save on infrastructure, development, and maintenance costs associated with data movement and transformation. This can lead to reported savings of 40-50% in many environments.
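The pay-as-you-go point comes down to simple arithmetic, which can be sketched as below. This is only an illustration: the rate, the Compute Unit count, and the active hours are placeholders rather than Databricks pricing, and `monthly_compute_cost` is a hypothetical helper, not part of any Databricks API.

```python
def monthly_compute_cost(active_hours: float,
                         avg_compute_units: float,
                         rate_per_cu_hour: float) -> float:
    """Compute-only cost for one month: billed hours x average CU x rate."""
    if min(active_hours, avg_compute_units, rate_per_cu_hour) < 0:
        raise ValueError("inputs must be non-negative")
    return active_hours * avg_compute_units * rate_per_cu_hour


HOURS_IN_MONTH = 730        # average number of hours in a month
HYPOTHETICAL_RATE = 0.10    # placeholder price per CU-hour, NOT a real rate
AVG_CU = 2                  # assumed average Compute Units while running

# An always-on instance is billed for every hour of the month.
always_on = monthly_compute_cost(HOURS_IN_MONTH, AVG_CU, HYPOTHETICAL_RATE)

# A scale-to-zero instance is billed only while active (assumed 200 hours).
scale_to_zero = monthly_compute_cost(200, AVG_CU, HYPOTHETICAL_RATE)

# Fraction of compute spend avoided by scaling to zero when idle.
saving_fraction = 1 - scale_to_zero / always_on
print(f"always-on: {always_on:.2f}, scale-to-zero: {scale_to_zero:.2f}, "
      f"saving: {saving_fraction:.0%}")
```

The saving depends entirely on how idle the workload really is, so substitute measured active hours and the published rate for your region before drawing conclusions.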
Lakebase in Action: Powering Real-Time Applications and AI Agents

Databricks Lakebase opens up new possibilities for building intelligent, data-driven applications that require both transactional capabilities and deep analytical insights. Its unified approach simplifies development and accelerates time-to-market for innovative solutions.

Real-World Use Cases

- Personalized Recommendations: Build real-time recommendation engines that leverage fresh transactional data from Lakebase to provide immediate and highly relevant suggestions to users.
- Customer Segmentation and Real-Time Updates: Maintain and update customer profiles and segments in real-time, enabling personalized experiences and targeted marketing campaigns.
- Feature Stores for Machine Learning: Utilize Lakebase as a feature store to serve low-latency features to AI models, ensuring that predictions and decisions are based on the most current data.
- Stateful AI Agents: Develop AI agents that can maintain conversational state and interact dynamically with users, using Lakebase as a reliable backend for transactional data.
- Order Processing Systems: Implement operational applications that require high-frequency reads, writes, and updates, such as order management or inventory systems, directly on the Databricks platform.
- Interactive Workflow Tools: Create interactive data applications and dashboards that allow users to both view analytical insights and perform transactional updates within the same environment.

A Practical Code Snippet

Developing with Lakebase feels familiar due to its PostgreSQL compatibility.
Here's a simple example demonstrating basic create, insert, update, and query operations on a Lakebase table:

    -- Create a schema for your application
    CREATE SCHEMA app AUTHORIZATION CURRENT_USER;

    -- Create a table to store session data for an AI agent
    CREATE TABLE app.sessions (
      session_id UUID PRIMARY KEY,
      user_id    TEXT NOT NULL,
      state      JSONB NOT NULL,
      created_at TIMESTAMPTZ DEFAULT now(),
      updated_at TIMESTAMPTZ
    );

    -- Create an index to optimize queries on agent status
    CREATE INDEX ON app.sessions ((state->>'agentStatus'));

    -- Insert a new session record
    INSERT INTO app.sessions (session_id, user_id, state)
    VALUES (gen_random_uuid(), 'u-123', '{"agentStatus":"active","score":0.82}');

    -- Update an existing session's state (this touches every session for the user)
    UPDATE app.sessions
    SET state = jsonb_set(state, '{score}', '0.91'::jsonb),
        updated_at = now()
    WHERE user_id = 'u-123';

    -- Query active sessions
    SELECT user_id, state->>'score' AS current_score
    FROM app.sessions
    WHERE (state->>'agentStatus') = 'active';

This SQL snippet showcases how developers can interact with Lakebase using standard PostgreSQL syntax, enabling rapid application development within the Databricks environment.

The Lakebase Advantage: Performance and Reliability

Beyond its unified architecture, Lakebase is engineered for predictable performance and robust reliability, essential for mission-critical operational applications.

The radar chart above provides an opinionated comparison of Databricks Lakebase against traditional OLTP systems across several key attributes. Lakebase demonstrates superior performance predictability, dynamic scalability, cost efficiency, and ease of management, coupled with strong data governance due to its integration with Unity Catalog. Traditional OLTP systems, while effective for their specific purposes, often score lower in these cloud-native, unified data platform metrics.
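Returning to the `app.sessions` example, the same statements can be issued from application code. The sketch below assumes a Python DB-API cursor that uses `%s` placeholders (psycopg is one such driver); the helper names are hypothetical, and how the connection to Lakebase is obtained is deliberately left out.

```python
import json
import uuid

# Parameterized versions of the INSERT and UPDATE statements shown earlier.
INSERT_SESSION = (
    "INSERT INTO app.sessions (session_id, user_id, state) "
    "VALUES (%s, %s, %s::jsonb)"
)
UPDATE_SCORE = (
    "UPDATE app.sessions "
    "SET state = jsonb_set(state, '{score}', %s::jsonb), updated_at = now() "
    "WHERE session_id = %s"
)


def insert_session(cur, user_id: str, state: dict) -> str:
    """Insert one session row and return the generated session id."""
    session_id = str(uuid.uuid4())
    # Values are passed as parameters, never formatted into the SQL string.
    cur.execute(INSERT_SESSION, (session_id, user_id, json.dumps(state)))
    return session_id


def update_score(cur, session_id: str, score: float) -> None:
    """Overwrite the 'score' key in the JSONB state of a single session."""
    cur.execute(UPDATE_SCORE, (json.dumps(score), session_id))


# Usage with a real driver (connection details are an assumption):
#   with conn.cursor() as cur:
#       sid = insert_session(cur, "u-123", {"agentStatus": "active", "score": 0.82})
#       update_score(cur, sid, 0.91)
#   conn.commit()
```

Unlike the SQL above, this sketch updates by `session_id` rather than `user_id`, so only one session is modified per call; that is a design choice of the sketch, not something the original snippet prescribes.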
Reliability Features for Business Continuity

Lakebase integrates several critical reliability features that ensure business continuity and data integrity:

- Branching: This feature allows developers to create isolated, production-like environments for testing changes without affecting the main operational database. It promotes safer development practices and faster iteration cycles.
- Instant Restore and Point-in-Time Recovery (PITR): In the event of data corruption or accidental deletion, Lakebase enables quick restoration to a previous state, minimizing downtime and ensuring data resilience.
- High Availability: As a managed service, Lakebase is designed for high availability, with automated failover mechanisms and robust infrastructure ensuring continuous operation.

Validation and Troubleshooting: Ensuring a Smooth Lakebase Experience

Successful implementation and ongoing operation of Databricks Lakebase rely on proper validation and an understanding of common troubleshooting steps. This section provides a framework for ensuring your Lakebase deployment meets performance and reliability expectations.

Video: An introductory video to Lakebase, explaining its core functionality and benefits for data apps and AI agents.

Key Validation Steps

After provisioning and configuring your Lakebase instance, it's crucial to perform a series of validation tests:

- Connectivity Verification: Confirm successful connections from your applications or development tools (e.g., psql, JDBC clients) to the Lakebase instance. Ensure that Unity Catalog registration is visible and properly configured for governance.
- Performance Baseline: Conduct baseline QPS (Queries Per Second) tests and monitor latency under expected load conditions. Validate that autoscaling events occur as anticipated and that performance targets are met.
- Data Synchronization (CDC): Test the end-to-end data flow by inserting/updating records in Lakebase and verifying their timely appearance in Delta Lake tables via managed CDC.
  If reverse synchronization (Delta to Lakebase) is configured, validate that as well.
- Governance and Security Checks: Confirm that Unity Catalog permissions are correctly enforced for Lakebase assets and that audit logs accurately reflect data access and modification events. Verify network security configurations (e.g., Private Link) are functioning as intended.

Common Troubleshooting Scenarios

While Lakebase is designed for stability, understanding potential issues and their resolutions is key to efficient operation:

| Problem Area | Symptom | Potential Cause(s) | Troubleshooting Step(s) |
| --- | --- | --- | --- |
| Performance | High latency, slow queries, autoscaling not triggering as expected. | Inefficient queries, missing indexes, insufficient compute resources, working set exceeding memory. | Inspect query plans, add appropriate indexes, monitor CU utilization, review autoscaling logs, consider increasing initial compute capacity if persistently underperforming. |
| Data Sync (CDC) | Stale data in Delta Lake, sync job failures, data inconsistencies. | Incorrect Unity Catalog permissions, CDC configuration errors, network issues, regional feature limitations. | Verify Unity Catalog access for CDC process, check CDC job logs for errors, confirm network connectivity between Lakebase and Delta Lake, consult Databricks documentation for regional CDC availability. |
| Connectivity | Unable to connect from application, authentication failures. | Incorrect connection strings, firewall rules blocking access, misconfigured private endpoints, invalid credentials/tokens. | Double-check connection parameters, review network security group (NSG) and firewall rules, validate Private Link configuration, ensure correct user/service principal credentials. |
| Governance | Unauthorized access, unexpected data visibility, audit log discrepancies. | Incorrect Unity Catalog access policies, schema mismatches, misconfigured external locations. | Review and refine Unity Catalog grants on Lakebase catalogs and schemas, verify external location configurations, ensure consistent data object naming conventions. |
| Feature Limitations | Specific PostgreSQL features or extensions not working. | Managed environment restrictions, unsupported extensions. | Consult Databricks documentation for supported PostgreSQL versions and extensions in Lakebase. Adapt application logic to use supported alternatives if necessary. |

By proactively monitoring and understanding these aspects, Cloud Solution Architects can ensure robust and efficient operation of Lakebase within their Databricks ecosystem.

Conclusion

Databricks Lakebase represents a pivotal advancement in data architecture, fundamentally reshaping how organizations approach operational and analytical workloads. By seamlessly integrating a fully managed PostgreSQL OLTP engine directly into the Databricks Data Intelligence Platform, Lakebase addresses the long-standing challenge of data fragmentation. This unification not only simplifies complex ETL processes and reduces operational overhead but also extends robust governance and security through Unity Catalog across the entire data estate. The innovative separation of compute and storage, coupled with intelligent autoscaling, delivers unparalleled cost efficiency and dynamic performance. For Cloud Solution Architects, Lakebase offers a compelling path to building scalable, real-time applications and sophisticated AI agents, leveraging fresh transactional data alongside comprehensive analytical insights, all within a single, consistent, and highly performant environment. This strategic evolution of the lakehouse architecture empowers enterprises to unlock new levels of agility, innovation, and data-driven decision-making.

Deploying DNS Private Resolvers and Private DNS Zones for Azure AI Supported Services
Private Networks:

- Private DNS Zones: Resolve domain names to private IPs within Azure virtual networks without exposing them to the internet. Private DNS Zones are global, so you don't need to create the same zone more than once; you can reuse the same zones across regions.
- DNS Private Resolvers: A fully managed service that enables DNS resolution between Azure VNets and on-premises networks without custom DNS servers. DNS Private Resolvers are regional, which means that if you have the Azure East US and West US 2 regions, you need to create DNS Private Resolvers in both regions, linked to the Private DNS Zones. You can adopt centralized or distributed DNS Private Resolvers; I will discuss both options later in this article.

Public Networks (not the focus of this part):

- Public DNS Zones: Resolve internet-facing domain names to publicly accessible IP addresses.
- Traffic Manager: A DNS-based traffic load balancer that routes client requests to the best available global endpoint.
- DNS Security Policy: Controls and protects DNS resolution behavior (e.g., filtering, forwarding, and access rules) to secure name resolution and prevent misuse.

**Note:
1. Follow the Prerequisites to deploy resources.
2. A common misconception is that VNet peering enables DNS resolution. In reality, private DNS zones are only accessible to VNets that are explicitly linked to them; peering provides connectivity, but not name resolution.
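One quick way to confirm that a VNet link (and not just peering) is in place is to resolve the service name from a machine inside the network and check that every answer is a private address. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the hostname in the usage comment is a placeholder.

```python
import ipaddress
import socket


def resolve_addresses(hostname: str) -> list[str]:
    """Return the distinct IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def all_private(addresses) -> bool:
    """True only if at least one address is given and every one is private."""
    addrs = [a.split("%")[0] for a in addresses]  # drop any IPv6 zone suffix
    return bool(addrs) and all(ipaddress.ip_address(a).is_private for a in addrs)


# Usage from a VM in a linked VNet (placeholder name, replace with your own):
#   ips = resolve_addresses("<storage-account>.file.core.windows.net")
#   print(ips, "private" if all_private(ips) else "NOT private: check VNet links")
```

Note that `is_private` also counts loopback and link-local ranges as private, so treat a `True` result as a necessary check rather than proof that the address belongs to your Private Endpoint.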
In the following snapshot (Azure Portal > Network Foundations > DNS), let's explore the individual DNS services offered; later in this document, we will interconnect them.

**Credits to the Microsoft Azure Portal design team for creating the new grouped views. You can check out more, such as Compute Infrastructure, Hybrid, and Backup.

Now, let's delve into scenario 01. I have taken the following snapshot from Azure AI Landing Zones and removed the non-network Azure resources to focus only on the private network components.

**Credits to the AI Landing Zone team for the diagram. Original version:

Inbound: zoomed-in view with end-to-end flow

| Hop | Summary |
| --- | --- |
| 1 | Client initiates request |
| 2 | DNS query sent to on-prem DNS |
| 3 | DNS query forwarded to Azure |
| 4 | Azure DNS Resolver processes query |
| 5 | Private DNS resolves to Private Endpoint IP |
| 6 | Traffic routed via VNet peering |
| 7 | Traffic hits Private Endpoint |
| 8 | Request served by Azure Files |

*Link Private DNS to DNS resolvers in other regions; Private DNS is global and DNS resolvers are regional.

Example snapshots of the entire flow:
Nslookup from client machine
Domain - DNS Conditional Forwarder configuration

Note 1: Make sure you selected "All DNS servers in this forest" for replication; otherwise users pointed to some other domain will be unable to resolve the name.

Verifying Connectivity with PsPing (credit to the Sysinternals team for PsPing)

PsPing, a tool from Sysinternals, is highly effective for verifying network connectivity from on-premises environments to Azure resources on specific ports. This is particularly useful when you need to ensure connectivity to ports such as 445, 443, 1433, 1521, or any other port required by the Azure services you intend to access from on-premises locations or other cloud environments. By using PsPing, you can confirm that the necessary ports are open and accessible, which is crucial for troubleshooting connectivity issues between your on-premises infrastructure and Azure-hosted resources.
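Where PsPing is not available (for example on a Linux jump host), a similar TCP reachability check can be approximated with the Python standard library. This is a rough stand-in for illustration, not PsPing itself; `tcp_check` is a hypothetical helper and the host in the usage comment is a placeholder.

```python
import socket
import time


def tcp_check(host: str, port: int, timeout: float = 3.0):
    """Try one TCP connection; return (reachable, elapsed_ms or None)."""
    start = time.perf_counter()
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.perf_counter() - start) * 1000
            return True, elapsed_ms
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False, None


# Usage (placeholder host; 445 is the SMB port used by Azure Files):
#   for port in (445, 443):
#       ok, ms = tcp_check("<storage-account>.file.core.windows.net", port)
#       print(port, "open" if ok else "blocked or unreachable", ms)
```

A failed check does not say which hop dropped the traffic, so it is best read together with the nslookup result and the firewall rules.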
Ensure your firewall is set to allow this traffic.

DNS Private Resolver - inbound configuration
Private DNS configuration
Virtual network links enabled on your private DNS zone
Make sure the hub and spoke VNets are peered
Private Endpoint configuration
Storage Account configuration

"Replace the file share with any supported Azure service that uses Private Endpoints, and follow the same guidance."

2. Outbound (flow and resources colored in blue): part 2 coming soon

Azure AI Foundry vs. Azure Databricks - A Unified Approach to Enterprise Intelligence
Key Insights into Azure AI Foundry and Azure Databricks

- Complementary Powerhouses: Azure AI Foundry is purpose-built for generative AI application and agent development, focusing on model orchestration and rapid prototyping, while Azure Databricks excels in large-scale data engineering, analytics, and traditional machine learning, forming the data intelligence backbone.
- Seamless Integration for End-to-End AI: A critical native connector allows AI agents developed in Foundry to access real-time, governed data from Databricks, enabling contextual and data-grounded AI solutions. This integration facilitates a comprehensive AI lifecycle from data preparation to intelligent application deployment.
- Specialized Roles for Optimal Performance: Enterprises leverage Databricks for its robust data processing, lakehouse architecture, and ML model training capabilities, and then utilize AI Foundry for deploying sophisticated generative AI applications, agents, and managing their lifecycle, ensuring responsible AI practices and scalability.

In the rapidly evolving landscape of artificial intelligence, organizations seek robust platforms that can not only handle vast amounts of data but also enable the creation and deployment of intelligent applications. Microsoft Azure offers two powerful, yet distinct, services in this domain: Azure AI Foundry and Azure Databricks. While both contribute to an organization's AI capabilities, they serve different primary functions and are designed to complement each other in building comprehensive, enterprise-grade AI solutions.

Decoding the Core Purpose: Foundry for Generative AI, Databricks for Data Intelligence

At its heart, the distinction between Azure AI Foundry and Azure Databricks lies in their core objectives and the types of workloads they are optimized for. Understanding these fundamental differences is crucial for strategic deployment and maximizing their combined potential.
Azure AI Foundry: The Epicenter for Generative AI and Agents

Azure AI Foundry emerges as Microsoft's unified platform specifically engineered for the development, deployment, and management of generative AI applications and AI agents. It represents a consolidation of capabilities from what were formerly Azure AI Studio and Azure OpenAI Studio. Its primary focus is on accelerating the entire lifecycle of generative AI, from initial prototyping to large-scale production deployments.

Key Characteristics of Azure AI Foundry:

- Generative AI Focus: Foundry streamlines the development of large language models (LLMs) and customized generative AI applications, including chatbots and conversational AI. It emphasizes prompt engineering, Retrieval-Augmented Generation (RAG), and agent orchestration.
- Extensive Model Catalog: It provides access to a vast catalog of over 11,000 foundation models from various publishers, including OpenAI, Meta (Llama 4), Mistral, and others. These models can be deployed via managed compute or serverless API deployments, offering flexibility and choice.
- Agentic Development: A significant strength of Foundry is its support for building sophisticated AI agents. This includes tools for grounding agents with knowledge, tool calling, comprehensive evaluations, tracing, monitoring, and guardrails to ensure responsible AI practices. Foundry Local further extends this by allowing offline and on-device development.
- Unified Development Environment: It offers a single management grouping for agents, models, and tools, promoting efficient development and consistent governance across AI projects.
- Enterprise Readiness: Built-in capabilities such as Role-Based Access Control (RBAC), observability, content safety, and project isolation ensure that AI applications are secure, compliant, and scalable for enterprise use.

Figure 1: Conceptual Architecture of Azure AI Foundry illustrating its various components for AI development and deployment.
Azure Databricks: The Powerhouse for Data Engineering, Analytics, and Machine Learning

Azure Databricks, on the other hand, is an Apache Spark-based data intelligence platform optimized for large-scale data engineering, analytics, and traditional machine learning workloads. It acts as a collaborative workspace for data scientists, data engineers, and ML engineers to process, analyze, and transform massive datasets, and to build and deploy diverse ML models.

Key Characteristics of Azure Databricks:

- Unified Data Analytics Platform: Central to Databricks is its lakehouse architecture, built on Delta Lake, which unifies data warehousing and data lakes. This provides a single platform for data engineering, SQL analytics, and machine learning.
- Big Data Processing: Excelling in distributed computing, Databricks is ideal for processing large datasets, performing ETL (Extract, Transform, Load) operations, and real-time analytics at scale.
- Comprehensive ML and AI Workflows: It offers a specialized environment for the full ML lifecycle, including data preparation, feature engineering, model training (both classic and deep learning), and model serving. Tools like MLflow are integrated for tracking, evaluating, and monitoring ML models.
- Data Intelligence Features: Databricks includes AI-assistive features such as Databricks Assistant and Databricks AI/BI Genie, which enable users to interact with their data using natural language queries to derive insights.
- Unified Governance with Unity Catalog: Unity Catalog provides a centralized governance solution for all data and AI assets within the lakehouse, ensuring data security, lineage tracking, and access control.

Figure 2: The Databricks Data Intelligence Platform with its unified approach to data, analytics, and AI.
The Symbiotic Relationship: Integration and Complementary Use Cases

While distinct in their primary functions, Azure AI Foundry and Azure Databricks are explicitly designed to work together, forming a powerful, integrated ecosystem for end-to-end AI development and deployment. This synergy is key to building advanced, data-driven AI solutions in the enterprise.

Seamless Integration for Enhanced AI Capabilities

The integration between the two platforms is a cornerstone of Microsoft's AI strategy, enabling AI agents and generative applications to be grounded in high-quality, governed enterprise data.

Key Integration Points:

- Native Databricks Connector in AI Foundry: A significant development in 2025 is the public preview of a native connector that allows AI agents built in Azure AI Foundry to directly query real-time, governed data from Azure Databricks. This means Foundry agents can leverage Databricks AI/BI Genie to surface data insights and even trigger Databricks Jobs, providing highly contextual and domain-aware responses.
- Data Grounding for AI Agents: This integration enables AI agents to access structured and unstructured data processed and stored in Databricks, providing the necessary context and knowledge base for more accurate and relevant generative AI outputs. All interactions are auditable within Databricks, maintaining governance and security.
- Model Crossover and Availability: Foundation models, such as the Llama 4 family, are made available across both platforms. Databricks DBRX models can also appear in the Foundry model catalog, allowing flexibility in where models are trained, deployed, and consumed.
- Unified Identity and Governance: Both platforms leverage Microsoft Entra ID for authentication and access control, and Unity Catalog provides unified governance for data and AI assets managed by Databricks, which can then be respected by Foundry agents.
Here's a breakdown of how a typical flow might look:

Mindmap 1: Illustrates the complementary roles and integration points between Azure Databricks and Azure AI Foundry within an end-to-end AI solution.

When to Use Which (and When to Use Both)

Choosing between Azure AI Foundry and Azure Databricks, or deciding when to combine them, depends on the specific requirements of your AI project:

Choose Azure AI Foundry When You Need To:

- Build and deploy production-grade generative AI applications and multi-agent systems.
- Access, evaluate, and benchmark a wide array of foundation models from various providers.
- Develop AI agents with sophisticated capabilities like tool calling, RAG, and contextual understanding.
- Implement enterprise-grade guardrails, tracing, monitoring, and content safety for AI applications.
- Rapidly prototype and iterate on generative AI solutions, including chatbots and copilots.
- Integrate AI agents deeply with Microsoft 365 and Copilot Studio.

Choose Azure Databricks When You Need To:

- Perform large-scale data engineering, ETL, and data warehousing on a unified lakehouse.
- Build and train traditional machine learning models (supervised, unsupervised learning, deep learning) at scale.
- Manage and govern all data and AI assets centrally with Unity Catalog, ensuring data quality and lineage.
- Conduct complex data analytics, business intelligence (BI), and real-time data processing.
- Leverage AI-assistive tools like Databricks AI/BI Genie for natural language interaction with data.
- Require high-performance compute and auto-scaling for data-intensive workloads.

Use Both for Comprehensive AI Solutions:

The most powerful approach for many enterprises is to leverage both platforms. Azure Databricks can serve as the robust data backbone, handling data ingestion, processing, governance, and traditional ML model training.
Azure AI Foundry then sits atop this foundation, consuming the prepared and governed data to build, deploy, and manage intelligent generative AI agents and applications. This allows for:

Domain-Aware AI: Foundry agents are grounded in enterprise-specific data from Databricks, leading to more accurate, relevant, and trustworthy AI responses.
End-to-End AI Lifecycle: Databricks manages the "data intelligence" part, and Foundry handles the "generative AI application" part, covering the entire spectrum from raw data to intelligent user experience.
Optimized Resource Utilization: Each platform focuses on what it does best, leading to more efficient resource allocation and specialized toolsets for different stages of the AI journey.

Comparative Analysis: Features and Capabilities

To further illustrate their distinct yet complementary nature, let's examine a detailed comparison of their features, capabilities, and typical user bases.

Radar Chart 1: This chart visually compares Azure AI Foundry and Azure Databricks across several key dimensions, illustrating their specialized strengths. Azure AI Foundry excels in generative AI and agent orchestration, while Azure Databricks dominates in data engineering, unified data governance, and traditional ML workflows.

A Detailed Feature Comparison

Feature Category | Azure AI Foundry | Azure Databricks
Primary Focus | Generative AI application & agent development, model orchestration | Large-scale data engineering, analytics, traditional ML, and AI workflows
Data Handling | Connects to diverse data sources (e.g., Databricks, Azure AI Search) for grounding AI agents. Not a primary data storage/processing platform. | Native data lakehouse architecture (Delta Lake), optimized for big data processing, storage, and real-time analytics.
AI/ML Capabilities | Foundation models (LLMs), prompt engineering, RAG, agent orchestration, model evaluation, content safety, responsible AI tooling. | Traditional ML (supervised/unsupervised), deep learning, feature engineering, MLflow for lifecycle management, Databricks AI/BI Genie.
Development Style | Low-code agent building, prompt flows, unified SDK/API, templates. | Code-first (Python, SQL, Scala, R), notebooks, IDE integrations.
Model Access & Deployment | Extensive model catalog (11,000+ models), serverless API, managed compute deployments, model benchmarking. | Training and serving custom ML models, including deep learning. Models available for deployment through MLflow.
Governance & Security | Azure-based security & compliance, RBAC, project isolation, content safety guardrails, tracing, evaluations. | Unity Catalog for unified data & AI governance, lineage tracking, access control, Entra ID integration.
Key Users | AI developers, business analysts, citizen developers, AI app builders. | Data scientists, data engineers, ML engineers, data analysts.
Integration Points | Native connector to Databricks AI/BI Genie, Azure AI Search, Microsoft 365, Copilot Studio, Power Platform. | Microsoft Fabric, Power BI, Azure AI Foundry, Microsoft Purview, Azure Monitor, Azure Key Vault.

Table 1: A comparative overview of the distinct features and functionalities of Azure AI Foundry and Azure Databricks

Concluding Thoughts

In essence, Azure AI Foundry and Azure Databricks are not competing platforms but rather essential components of a unified, comprehensive AI strategy within the Azure ecosystem. Azure Databricks provides the scalable foundation for all data engineering, analytics, and traditional machine learning workloads, acting as the "data intelligence platform." Azure AI Foundry then builds on this foundation to specialize in the rapid development, deployment, and operationalization of generative AI applications and intelligent agents. Together, they enable enterprises to transform raw data into domain-aware, governed intelligent solutions.
Frequently Asked Questions (FAQ)

What is the main difference between Azure AI Foundry and Azure Databricks?
Azure AI Foundry is specialized for building, deploying, and managing generative AI applications and AI agents, focusing on model orchestration and prompt engineering. Azure Databricks is a data intelligence platform for large-scale data engineering, analytics, and traditional machine learning, built on a lakehouse architecture.

Can Azure AI Foundry and Azure Databricks be used together?
Yes, they are designed to work together. Azure AI Foundry can use a native connector to access real-time, governed data from Azure Databricks, allowing AI agents to be grounded in enterprise data for more accurate and contextual responses.

Which platform should I choose for training large machine learning models?
For training large-scale traditional machine learning and deep learning models, Azure Databricks is generally the preferred choice because of its capabilities for data processing, feature engineering, and ML lifecycle management (MLflow). Azure AI Foundry focuses more on the deployment and orchestration of pre-trained foundation models and generative AI applications.

Does Azure AI Foundry replace Azure Machine Learning or Databricks?
No, Azure AI Foundry complements these services. It provides a specialized environment for generative AI and agent development, often integrating with data and models managed by Azure Databricks or Azure Machine Learning for comprehensive AI solutions.

How do these platforms handle data governance?
Azure Databricks uses Unity Catalog for unified data and AI governance, providing centralized control over data access and lineage. Azure AI Foundry integrates with Azure-based security and compliance features, supporting responsible AI practices and data privacy within its generative AI applications.

Migrating Azure Data Factory and Synapse Pipelines to Fabric Data Factory
Migrating data pipelines from Azure Data Factory (ADF) and Azure Synapse Pipelines to Microsoft Fabric Data Factory represents a significant modernization opportunity and a catalyst for accelerating AI innovation across the enterprise. With Fabric Data Factory, customers can unify their data estate, streamline data engineering workflows, and more effectively use real-time analytics, generative AI, and machine learning at scale. This article outlines the key technical considerations for a successful migration from ADF/Synapse pipelines to Fabric Data Factory.

Fabric Data Factory vs. ADF and Synapse Pipelines: What's Different?

Fabric Data Factory is officially described by Microsoft as the next generation of Azure Data Factory, built to handle your most complex data integration challenges with a simpler, more powerful approach. It retains ADF's core engine capabilities while introducing major improvements enabled by Fabric's unified, AI-centric platform, including OneLake, expanded activities, and native Copilot experiences. A fundamental shift is the move to a fully managed SaaS model, with several important differences:

No infrastructure management: Fabric eliminates Azure Integration Runtimes entirely. Compute is managed automatically within a Fabric capacity. For on-premises connectivity, the On-Premises Data Gateway (OPDG) replaces ADF's Self-Hosted Integration Runtime.
No publish step: Pipelines are authored directly in the Fabric portal and can be saved or executed immediately, removing the separate publish step required in ADF.
Simplified data connections: Traditional Linked Services and Datasets are replaced by Connections and inline data properties within activities, reducing configuration complexity.
New native activities: Fabric introduces capabilities not available in ADF/Synapse pipelines, including Office 365 Outlook email, Teams messaging, semantic model refresh, Fabric notebooks, Invoke SSIS (preview), and Lakehouse maintenance (preview).
Enhanced CI/CD: Built-in deployment pipelines support cherry-picking, individual item promotion, Git integration, and SaaS-native CI/CD beyond ADF's ARM template-based approach.
AI Copilot: Fabric Data Factory includes Copilot to assist with pipeline creation and management, a capability not available in ADF or Synapse pipelines.

For more details see: Differences between Data Factory in Fabric and Azure - Microsoft Fabric | Microsoft Learn

Common Migration Challenges and Recommended Mitigations

Migrating to Fabric Data Factory introduces new choices and challenges. While the move to Fabric offers substantial benefits, success depends on understanding the key differences and migration challenges and planning accordingly. The table below summarizes the most important considerations for a smooth transition.

Table 1. Migration Challenges and Mitigation

Challenge | Description | Recommended Mitigation
Feature Gaps | Some ADF/Synapse features (e.g., SSIS IR, Managed VNets, certain triggers) are not yet fully supported in Fabric. | Delay migration of affected pipelines or redesign using Fabric-native alternatives. Monitor updates via the Fabric roadmap at https://roadmap.fabric.microsoft.com
Mapping Data Flows | ADF Mapping Data Flows don't directly map to Fabric equivalents. | Rebuild using Dataflow Gen2, Fabric Warehouse SQL, or Spark notebooks. Validate transformation logic and data types post-migration.
Trigger Redesign | Fabric lacks centralized trigger management; scheduling must be defined at the pipeline level. | Recreate triggers per pipeline and apply standardized naming conventions and documentation to maintain operational clarity.
Global Parameters | ADF Global Parameters must be converted to Fabric Variable Libraries. | Use Microsoft's conversion guidance and account for differences in data types and runtime usage patterns. See Convert Azure Data Factory Global Parameters to Fabric Variable Libraries.
Dynamic Connections | Fabric does not support dynamic linked service properties in the same way as ADF. | Parameterize connection objects within pipeline activities using dynamic content.
Deployment Performance | Some environments report slower execution of deployment pipelines in Fabric. | Break deployments into smaller logical units and validate performance during pilot phases prior to production rollout.
Capacity Planning | Fabric uses a fixed-capacity compute model instead of ADF's elastic pay-as-you-go runtime. | Right-size Fabric capacity based on peak load testing and continuously monitor usage with tools such as the Fabric Capacity Estimator.

Migration Tooling

Migration Assistant: Microsoft Fabric includes a built-in Migration Assistant for both ADF and Synapse pipelines, designed specifically to support pipeline migrations. To assess migration readiness, open your ADF/Synapse pipeline instance, go to the authoring canvas, and select Migrate to Fabric (Preview) > Get started (Preview).

The assessment summary groups pipelines into migration readiness categories: Ready, Needs Review, Coming Soon, and Unsupported. This classification gives engineering teams early visibility into potential migration risks. Needs Review highlights activities or configurations that may behave differently in Fabric and require validation or adjustment after migration. Coming Soon marks features that are not currently supported in Fabric but are planned for future availability. Unsupported marks features that are not available in Fabric and will require redesign or re-implementation. In enterprise environments with large pipeline estates, this insight is critical for avoiding unexpected failures or delays during migration.

After completing the assessment, you can proceed with the migration wizard and mount your ADF pipelines into Microsoft Fabric. Mounting does not migrate your ADF pipelines to Fabric Data Factory at this stage.
Instead, it creates a reference to your existing instances within the Fabric workspace without consuming Fabric capacity. After mounting, run pipelines side by side to validate behavior and results. Once the side-by-side runs have been validated, select the Migrate to Fabric button to proceed with connection mapping and the actual migration to Fabric Data Factory.

After completing the migration process, you will be presented with the Migration Results page. This view provides a summary of all selected pipeline resources along with their migration status and corresponding Fabric resource names. Successfully migrated pipelines are now available as Fabric-native items within the workspace, while any errors or unmapped dependencies are flagged for further review.

For Synapse Analytics pipelines, you transition directly into the Fabric Data Factory experience (assess -> map -> migrate flow) rather than first mounting the Synapse pipelines as external references.

For detailed migration steps, follow this link: Assess your Azure Data Factory and Synapse pipelines for migration to Fabric - Azure Data Factory | Microsoft Learn

PowerShell automation tool: Microsoft provides a PowerShell upgrade utility to accelerate migration from Azure Data Factory to Fabric Data Factory. Using the Microsoft.FabricPipelineUpgrade module, you can translate a large subset of ADF pipeline JSON into Fabric-native definitions, giving you a fast, scalable starting point for migration. The tool covers common patterns such as Copy, Lookup, Stored Procedure, and standard control flow. Manual follow-up is still required for edge cases (custom connectors, complex expressions, and some data flow scenarios).
Import-AdfFactory -SubscriptionId <your Subscription ID> -ResourceGroupName <your Resource Group Name> -FactoryName <your Data Factory Name> -PipelineName "pipeline1" -AdfToken $adfSecureToken |
    ConvertTo-FabricResources |
    Export-FabricResources -Region <region> -Workspace <workspaceId> -Token $fabricSecureToken

For step-by-step guidance, see: Detailed Tutorial for PowerShell-based Migration of Azure Data Factory Pipelines to Fabric - Microsoft Fabric | Microsoft Learn

Open-Source Migration Tooling

In addition to Microsoft-supported migration utilities, the Fabric Toolbox provides a set of open-source tools designed to assist with migration planning, readiness analysis, and pipeline translation from ADF and Synapse to Fabric Data Factory.

Fabric Data Factory Migration Assistant PowerShell: An open-source tool from the Fabric Toolbox that supports migration from both Azure Data Factory and Synapse ARM templates and is built as a browser-based single-page application (SPA). https://github.com/microsoft/fabric-toolbox/tree/main/tools/FabricDataFactoryMigrationAssistant
Fabric Assessment Tool: An open-source command-line utility that connects to and scans workspaces to extract inventory data and assess migration scope, producing a structured export of assets for planning and analysis. https://github.com/microsoft/fabric-toolbox/tree/main/tools/fabric-assessment-tool

When to Use What?

Organizations typically adopt one of three migration strategies when transitioning ADF or Synapse pipelines to Fabric Data Factory:

Lift-and-Shift to accelerate transition timelines with minimal pipeline refactoring.
Modernization to re-architect orchestration logic and fully use Fabric-native analytics and AI capabilities.
Hybrid to balance migration velocity with targeted modernization of high-value or low-parity workloads.
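As a rough illustration of how feature parity can steer the choice between these three strategies, the sketch below takes per-pipeline readiness labels (using the category names from the assessment summary) and suggests a strategy from the share of pipelines marked Ready. The function name, the pipeline names, and both thresholds are hypothetical assumptions made for this example; they are not Microsoft guidance and not part of any Fabric tool.

```python
from collections import Counter

# Readiness categories as named in the assessment summary.
READY, NEEDS_REVIEW, COMING_SOON, UNSUPPORTED = (
    "Ready", "Needs Review", "Coming Soon", "Unsupported",
)


def suggest_strategy(readiness_by_pipeline, high_parity=0.8, low_parity=0.4):
    """Suggest a migration strategy from per-pipeline readiness labels.

    The thresholds are illustrative assumptions, not Microsoft guidance.
    """
    if not readiness_by_pipeline:
        raise ValueError("no pipelines to assess")
    counts = Counter(readiness_by_pipeline.values())
    unknown = set(counts) - {READY, NEEDS_REVIEW, COMING_SOON, UNSUPPORTED}
    if unknown:
        raise ValueError(f"unknown readiness labels: {sorted(unknown)}")
    # Share of pipelines that can move without redesign.
    ready_share = counts[READY] / len(readiness_by_pipeline)
    if ready_share >= high_parity:
        return "Lift-and-Shift"
    if ready_share < low_parity:
        return "Modernization"
    return "Hybrid"


# Hypothetical assessment export: pipeline name -> readiness category.
assessment = {
    "ingest_sales": "Ready",
    "ingest_hr": "Ready",
    "nightly_ssis": "Unsupported",
    "transform_orders": "Needs Review",
    "copy_reference_data": "Ready",
}
strategy = suggest_strategy(assessment)
print(strategy)
```

In practice the decision would also weigh business priorities and the desired pace of transformation, so a ratio like this is at most a first filter for planning conversations.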
The appropriate migration path should align with business priorities, existing integration patterns, and the desired pace of platform transformation, and is largely determined by the feature parity between existing ADF/Synapse assets and their Fabric Data Factory equivalents.

A range of migration tooling options is available depending on migration scope and pipeline complexity:

Built-In Fabric UI Assistant (Migrate to Fabric): Use this assistant to assess pipeline readiness across both ADF and Synapse environments, mount existing ADF pipelines into a Fabric workspace, perform side-by-side validation, or migrate supported Synapse pipelines directly into the Fabric Data Factory experience.
PowerShell Upgrade Tool (Microsoft-supported): Use this for bulk ADF migrations at scale, repeatable upgrades, and CI/CD-driven pipeline conversion with a supported path.
Fabric Data Factory Migration Assistant PowerShell (Open Source): Use for early analysis, connector mapping, and generating a migration starting point outside the Fabric UI.
Fabric Assessment Tool (Open Source): Use before migration to understand scope, inventory, dependencies, and readiness across your Fabric and data estate.
Manual migration: Best suited for complex, low-parity pipelines. It provides an opportunity to modernize architecture using Fabric's native capabilities, delivering long-term benefits in maintainability, performance, and cost.

Key Considerations for a Smooth Transition

Before migrating, it's important to understand the architectural differences between Azure Data Factory or Synapse pipelines and Fabric Data Factory. Reviewing these differences early helps determine which pipeline components can be reused, translated, or redesigned for Fabric-native execution. Start by prioritizing low-risk, high-parity pipelines that can be migrated with minimal redesign.
Mounting existing ADF pipelines into Fabric enables gradual migration and side-by-side testing, allowing teams to validate compatibility before using conversion tools or replatforming workloads. For larger environments, the Microsoft.FabricPipelineUpgrade PowerShell module or the open-source tools can be used to migrate pipelines at scale while mapping linked services to Fabric connections. Where possible, use Fabric-native capabilities such as Copilot for pipeline authoring and code fixes, deployment pipelines for CI/CD, and OneLake shortcuts to access external data without duplication. It's also recommended to validate migrated pipelines under production-like workloads to confirm performance and reliability before cutover.

For complex or large-scale enterprise migrations, engaging Microsoft partners can help accelerate modernization efforts while minimizing operational risk. See: Partners | Microsoft Fabric

For detailed best practices guidance, refer to: Migration Best Practices for Azure Data Factory to Fabric Data Factory - Microsoft Fabric | Microsoft Learn

Summary

Migrating from Azure Data Factory or Synapse pipelines to Microsoft Fabric Data Factory represents a key step toward building a unified, AI-ready analytics platform. By using the built-in migration assessment and associated tooling, organizations can perform pipeline-level compatibility analysis, identify unsupported activities or configuration dependencies, and implement a phased modernization strategy aligned with workload readiness. Successful transitions require a clear understanding of the architectural shift from ADF/Synapse's PaaS model to Fabric's SaaS-managed model, where compute is fully managed within the Fabric capacity, traditional Integration Runtimes are no longer required, and datasets and linked services are replaced with connection-based configurations defined inline within pipeline activities.
By adopting Fabric-native capabilities such as deployment pipelines for CI/CD, Copilot-assisted pipeline authoring, and OneLake, organizations can standardize pipeline lifecycle management, enable governed access to shared data assets across domains, and support multi-cloud integration through virtualized data access. Pipelines can then operate on distributed datasets without duplicating or relocating data across Lakehouse, Data Warehouse, and Real-Time Analytics workloads within a unified Fabric workspace.

This Week on the Fabric Engineering Connection
After a two-week pause for FABCON & SQLCON - The Microsoft Fabric & SQL Community Conferences, we're excited to welcome partners back for our first Fabric Engineering Connection call since the conference. Welcome back, and what a great way to restart the conversation!

This week's sessions bring partners closer to the people building Microsoft Fabric, with timely insights and takeaways straight from FabCon.

What's on the agenda:
Fabric AI-Powered Automation for Pro-Developers (Americas & EMEA), presented by Evelina Alroy-Brin and Hasan Abo-Shally
Recap of Data Warehouse announcements from FabCon, presented by Rakesh Krishnan and Tino Tereshko

Session times:
Americas & EMEA: Wednesday, March 25 | 8-9 AM PT
APAC: Thursday, March 26 | 1-2 AM UTC / Wednesday, March 25 | 5-6 PM PT

These calls are a great opportunity to reconnect after FabCon, hear directly from engineering, and dig deeper into what's new, and what's next, for Microsoft Fabric.

Participation is open to members of the Fabric Partner Community. Join here: https://aka.ms/JoinFabricPartnerCommunity

Partner-Exclusive Event: AMA with Fabric Leadership
We're excited to invite Fabric Partner Community members to a live Ask Me Anything (AMA) with Fabric leadership, a rare opportunity to get direct answers and insights from the team shaping Azure Data and Microsoft Fabric.

Featured Guest: Shireesh Thota, CVP, Azure Data Databases
Tuesday, March 24 | 8:00-9:00 AM PT

With FabCon + SQLCon wrapping just days before, this session is designed for partners who want to go deeper: ask follow-up questions, pressure-test ideas, and understand what's next as they plan with customers.

Topics may include:
What's next for Azure SQL, Cosmos DB, and PostgreSQL
Guidance on SQL Server roadmap direction
Deep-dive questions on SQL DB in Fabric
Questions about the new DP-800 Analytics Engineer exam going into beta this month

Partners can submit any questions, whether technical, roadmap-focused, certification-related, or customer-scenario driven.

This event is exclusively available to members of the Fabric Partner Community. Not a member yet? Join the Fabric Partner Community to attend this AMA and unlock access to partner-only events like this: https://aka.ms/JoinFabricPartnerCommunity