Azure Databricks
- Announcing the new Databricks Job activity in ADF! We're excited to announce that Azure Data Factory now supports the orchestration of Databricks Jobs! Databricks Jobs let you schedule and orchestrate one or more tasks in a workflow in your Databricks workspace. Since any operation in Databricks can be a task, this means you can now run anything in Databricks via ADF, such as serverless jobs, SQL tasks, Delta Live Tables, batch inferencing with model serving endpoints, or automatically publishing and refreshing semantic models in the Power BI service. With this update, you can trigger these workflows from your Azure Data Factory pipelines. To use the new activity, look for the new Job activity under the Databricks activity group. Once you've added the Job activity (Preview) to your pipeline canvas, you can connect to your Databricks workspace and configure the settings to select your Databricks job, allowing you to run the job from your pipeline. We also know that parameterization matters, because it lets you build generic, reusable pipeline models. ADF continues to support these patterns and extends this capability to the new Databricks Job activity: under the activity's settings, you can configure parameters to send to your Databricks job, giving you maximum flexibility and power for your orchestration jobs. To learn more, read Azure Databricks activity - Microsoft Fabric | Microsoft Learn. Have any questions or feedback? Leave a comment below!
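  The announcement itself doesn't include code, but to make the orchestration concrete, here is a minimal sketch of the Jobs REST API call that the new Job activity effectively wraps: triggering an existing Databricks job with job-level parameters. The workspace URL, job ID, and parameter name are placeholders, and this illustrates what ADF does behind the scenes rather than anything you'd have to write yourself.

  ```scala
  import java.net.URI
  import java.net.http.{HttpClient, HttpRequest, HttpResponse}

  object RunDatabricksJob {
    def main(args: Array[String]): Unit = {
      // Hypothetical workspace URL and job ID -- substitute your own values.
      val workspaceUrl = "https://adb-1234567890123456.7.azuredatabricks.net"
      val token        = sys.env("DATABRICKS_TOKEN") // personal access token
      val jobId        = 123L

      // run-now payload: the job_parameters map mirrors the parameters you
      // would set on the ADF Job activity's Settings tab.
      val body =
        s"""{"job_id": $jobId, "job_parameters": {"run_date": "2024-01-01"}}"""

      val request = HttpRequest.newBuilder()
        .uri(URI.create(s"$workspaceUrl/api/2.1/jobs/run-now"))
        .header("Authorization", s"Bearer $token")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

      val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
      println(s"${response.statusCode}: ${response.body}") // run_id on success
    }
  }
  ```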
- Different pools for workers and driver in ADF-triggered ADB jobs (Solved). Hello All, Azure Databricks allows the use of separate compute pools for drivers and workers when you create a job via the native Databricks workflows. For customers using ADF as an orchestrator for ADB jobs, is there a way to achieve the same when invoking notebooks/jobs via ADF? The linked service configuration in ADF seems to allow only one instance pool. Appreciate any pointers. Thanks!
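  One workaround, sketched under assumptions rather than quoted from the accepted answer: since the ADF linked service exposes only a single instance pool, define the cluster spec on the Databricks side, where the Jobs API's `new_cluster` accepts both `instance_pool_id` (workers) and `driver_instance_pool_id` (driver), and then have ADF invoke that job (for example, with the Job activity from the first post above). All IDs, names, and paths below are placeholders.

  ```scala
  import java.net.URI
  import java.net.http.{HttpClient, HttpRequest, HttpResponse}

  object CreateJobWithSplitPools {
    def main(args: Array[String]): Unit = {
      val workspaceUrl = "https://adb-1234567890123456.7.azuredatabricks.net" // placeholder
      val token        = sys.env("DATABRICKS_TOKEN")

      // new_cluster supports separate pools: instance_pool_id for workers and
      // driver_instance_pool_id for the driver. Pool IDs are placeholders.
      val body =
        """{
          |  "name": "adf-triggered-job",
          |  "tasks": [{
          |    "task_key": "main",
          |    "notebook_task": {"notebook_path": "/Shared/my_notebook"},
          |    "new_cluster": {
          |      "spark_version": "14.3.x-scala2.12",
          |      "num_workers": 4,
          |      "instance_pool_id": "pool-workers-id",
          |      "driver_instance_pool_id": "pool-driver-id"
          |    }
          |  }]
          |}""".stripMargin

      val request = HttpRequest.newBuilder()
        .uri(URI.create(s"$workspaceUrl/api/2.1/jobs/create"))
        .header("Authorization", s"Bearer $token")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()

      val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
      println(response.body) // returns the job_id to reference from ADF
    }
  }
  ```

  With the pool split baked into the job definition, the ADF pipeline only needs to trigger the job; it never has to express the two-pool cluster spec through the linked service.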
- Query serverless SQL pool from an Apache Spark Scala notebook. Apache Spark notebooks in an Azure Synapse Analytics workspace can execute T-SQL queries on a serverless Synapse SQL pool. This way you can load data from a SQL table or view into your Apache Spark data frames and apply advanced data processing. In this article you will learn how to call SQL code from a Spark notebook.
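  As a minimal sketch of the pattern the article describes: inside a Synapse Spark Scala notebook (where `spark` is predefined), read the result of a T-SQL query on the serverless SQL pool over JDBC into a DataFrame. The server name, database, and credentials are placeholders, and the article may well use Azure AD token authentication instead of the SQL credentials shown here for brevity.

  ```scala
  // Runs inside a Synapse Spark Scala notebook; `spark` is already in scope.
  // The serverless endpoint is typically <workspace>-ondemand.sql.azuresynapse.net.
  val serverName = "myworkspace-ondemand.sql.azuresynapse.net" // placeholder

  val df = spark.read
    .format("jdbc")
    .option("url", s"jdbc:sqlserver://$serverName;databaseName=master;encrypt=true")
    .option("query", "SELECT TOP (10) name, create_date FROM sys.databases")
    .option("user", "sqladminuser")         // placeholder credential
    .option("password", sys.env("SQL_PWD")) // placeholder credential
    .load()

  df.show() // the T-SQL result is now an ordinary Spark DataFrame
  ```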
- How to – Advanced Properties of Linked Services. Azure Data Factory and Synapse Pipelines have a wealth of linked service connection types that allow them to connect and interact with many services and data stores. The Workspace UI provides the most important properties that are needed for the connection. However, at times we need more control than the UI offers. In this tutorial, we will focus on a customer scenario to see how you can find all the available properties for the linked service you are working with, then how to incorporate that property into your linked service directly from within the UI.
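  To illustrate the kind of customization the tutorial covers, here is a hedged sketch: a full linked service JSON definition that includes `accountKind`, taken as an example of a property from the JSON reference that the UI form may not surface, applied via the ARM REST API. All identifiers are placeholders, and the same JSON body could equally be pasted into the linked service's code editor in the Workspace UI, which is the path the tutorial itself takes.

  ```scala
  import java.net.URI
  import java.net.http.{HttpClient, HttpRequest, HttpResponse}

  object UpdateLinkedService {
    def main(args: Array[String]): Unit = {
      // All identifiers are placeholders; the ARM bearer token would come from
      // your own AAD flow (e.g. `az account get-access-token`).
      val sub      = "00000000-0000-0000-0000-000000000000"
      val rg       = "my-rg"
      val factory  = "my-adf"
      val ls       = "MyBlobStorage"
      val armToken = sys.env("ARM_TOKEN")

      // Linked service definition including `accountKind`, an example of a
      // reference-documented property beyond the basic UI fields.
      val body =
        """{
          |  "properties": {
          |    "type": "AzureBlobStorage",
          |    "typeProperties": {
          |      "serviceEndpoint": "https://myaccount.blob.core.windows.net/",
          |      "accountKind": "StorageV2"
          |    }
          |  }
          |}""".stripMargin

      val uri = s"https://management.azure.com/subscriptions/$sub/resourceGroups/$rg" +
        s"/providers/Microsoft.DataFactory/factories/$factory/linkedservices/$ls" +
        "?api-version=2018-06-01"

      val request = HttpRequest.newBuilder()
        .uri(URI.create(uri))
        .header("Authorization", s"Bearer $armToken")
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(body))
        .build()

      println(HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString()).body)
    }
  }
  ```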
- Microsoft® and the .NET Foundation announce the release of version 1.0 of .NET for Apache® Spark™. It is our pleasure to announce the release of version 1.0 of .NET for Apache® Spark™, an open source package that brings .NET development to the Apache® Spark™ platform. .NET for Apache® Spark™ is available as an OSS project on the .NET Foundation's GitHub and can be downloaded from NuGet. It is built into Azure Synapse Analytics and Azure HDInsight, where you can enjoy an out-of-the-box experience; version 1.0 will arrive in these products in their next major release. .NET for Apache® Spark™ can also be used in other Apache Spark cloud offerings, including Azure Databricks as well as AWS EMR Spark. For on-prem deployments, it offers multi-platform support for Windows, macOS, and Linux.