Databricks
9 Topics

Announcing the new Databricks Job activity in ADF!
We’re excited to announce that Azure Data Factory now supports the orchestration of Databricks Jobs! Databricks Jobs let you schedule and orchestrate one or more tasks in a workflow in your Databricks workspace. Since any operation in Databricks can be a task, you can now run anything in Databricks via ADF, such as serverless jobs, SQL tasks, Delta Live Tables, batch inferencing with model serving endpoints, or automatically publishing and refreshing semantic models in the Power BI service. With this update, you can trigger these workflows from your Azure Data Factory pipelines.

To make use of this new activity, you’ll find a new activity called Job under the Databricks activity group. Once you’ve added the Job activity (Preview) to your pipeline canvas, you can connect to your Databricks workspace and configure the settings to select your Databricks job, allowing you to run it from your pipeline.

We also know that parameterization is important in your pipelines, as it allows you to create generic, reusable pipeline models. ADF continues to support these patterns and extends this capability to the new Databricks Job activity: under the settings of the Job activity, you can configure parameters to send to your Databricks job, giving you maximum flexibility and power for your orchestration jobs.

To learn more, read Azure Databricks activity - Microsoft Fabric | Microsoft Learn. Have any questions or feedback? Leave a comment below!
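Conceptually, running a Databricks job with parameters boils down to a "run now" call against the Databricks Jobs API 2.1. As a rough illustration only (not ADF's actual implementation; the workspace URL, job ID, and parameter names below are made up), the equivalent call from Python might look like this:

```python
# Minimal sketch: trigger a Databricks job run with job-level parameters,
# roughly what the ADF Job activity does on your behalf. The URL, token,
# job_id, and parameter names are placeholders, not values from the post.
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # hypothetical
TOKEN = "<personal-access-token-or-aad-token>"

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "job_id": 123,  # the Databricks job you selected in the activity settings
        # Analogous to the parameters configured under the activity's settings.
        "job_parameters": {"run_date": "2024-01-01", "env": "dev"},
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["run_id"])  # the run you can then monitor in the workspace
```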
Azure Databricks - SQL query - Configuration not available

I spun up a FINOS Legend Studio instance locally and was able to establish connectivity between the application and my Azure Databricks resource. However, when I run a SQL query from Legend Studio, which is supposed to execute on Databricks, I get a "Configuration legend_databricks_http_path is not available" error from Databricks. By going to the "Query History" in Azure Databricks, I can confirm that Legend Studio is reaching Databricks, but Databricks responds with the error mentioned above. The "See error" button doesn't provide any additional error details. Is anyone familiar with this "Configuration is not available" type of error in Azure Databricks SQL queries?
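One speculative lead (not a confirmed fix): this error message matches what Spark SQL raises when a query references an unresolved `${variable}` via variable substitution, i.e., the generated SQL mentions `${legend_databricks_http_path}` but no configuration with that name exists in the session. A sketch of how to reproduce and clear that condition in a Databricks notebook, assuming that is indeed the cause (the value below is a placeholder):

```python
# Assumes a Databricks notebook, where `spark` is predefined.
# If the SQL Legend generates contains ${legend_databricks_http_path},
# setting a conf with that exact name lets substitution resolve it instead
# of failing with "Configuration legend_databricks_http_path is not available".
spark.conf.set("legend_databricks_http_path", "/sql/1.0/warehouses/<warehouse-id>")

# With the configuration set, the reference now resolves:
spark.sql("SELECT '${legend_databricks_http_path}' AS http_path").show()
```

Whether Legend's JDBC path hits the same substitution layer on a SQL warehouse is an assumption; the root fix may instead be in how the Legend connection/store configuration supplies the HTTP path.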
Data archiving of delta table in Azure Databricks

Hi all, I am currently researching data archiving for Delta table data on the Azure platform, as there is a data retention policy within the company. I have studied the official Databricks documentation on archival support (https://docs.databricks.com/en/optimizations/archive-delta.html). It says: "If you enable this setting without having lifecycle policies set for your cloud object storage, Databricks still ignores files based on this specified threshold, but no data is archived." Therefore, I am thinking about how to configure the lifecycle policy in the Azure storage account, and I have read the Microsoft documentation (https://learn.microsoft.com/en-us/azure/storage/blobs/lifecycle-management-overview).

Say the Delta table data is stored in "test-container/sales" and there are lots of "part-xxxx.snappy.parquet" data files stored in that folder. Should I simply specify "tierToArchive", "daysAfterCreationGreaterThan: 1825", and "prefixMatch: ["test-container/sales"]"? However, I am worried about whether this archive mechanism will impact normal Delta table operations. Besides, what if a Parquet data file moved to the archive tier contains both data created before the 5-year cutoff and after it; is that possible? Could data end up in the archive tier before it is 5 years old?

Highly appreciate it if someone could help me out with the questions above. Thanks in advance.
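For the Databricks half of this setup, the archival-support docs referenced above describe a per-table property that tells Databricks how old a file must be before queries may ignore it, so that readers don't fail when the lifecycle rule tiers those files away. A minimal sketch, assuming the `delta.timeUntilArchived` table property from those docs (table name and interval are placeholders matching the 1825-day example):

```python
# Assumes a Databricks notebook, where `spark` is predefined.
# Enable archival support on the Delta table so Databricks skips files older
# than the threshold; the Azure lifecycle rule (tierToArchive on the matching
# prefix) is what actually moves the underlying Parquet files.
spark.sql("""
    ALTER TABLE sales
    SET TBLPROPERTIES ('delta.timeUntilArchived' = '1825 days')
""")
```

Per those docs, the intent is that queries touching only recent data keep working normally, while queries that would need archived files report an error rather than reading unreadable archive-tier blobs; the threshold should be aligned with (or shorter than) the lifecycle rule's age condition so a file is never tiered before Databricks is prepared to skip it.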
Harnessing Retail Data with Azure: Integrating Blob Storage and Databricks for Advanced Analytics

Learn how a retail company leverages Azure Blob Storage and Azure Databricks to store, process, and analyze its massive sales data. You will see how the company uses PySpark to transform data into insights that help it optimize its product strategy and marketing campaigns. You will also find some learning resources to help you get started with data engineering on Microsoft Azure.
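To give a flavor of the kind of PySpark transformation such a pipeline involves (the storage path and column names below are invented for illustration, not taken from the article):

```python
# Illustrative sketch only: read raw sales records from Blob Storage / ADLS
# and aggregate them into per-product revenue. Path and columns are made up.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

sales = spark.read.parquet("abfss://sales@retailstore.dfs.core.windows.net/raw/")

revenue_by_product = (
    sales.groupBy("product_id")
         .agg(F.sum(F.col("quantity") * F.col("unit_price")).alias("revenue"))
         .orderBy(F.desc("revenue"))
)
revenue_by_product.show(10)  # top products feeding the strategy dashboards
```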
Empowering Startups: The Introductory Guide to Databricks for Entrepreneur's Data-Driven Success

Unlock the key to entrepreneurial success with Databricks—a journey where data empowers startups to thrive. Get ready to embark on a transformative quest for data-driven excellence!
Loading Parquet and Delta files into Azure Synapse using ADB or Azure Synapse?

I have the following scenario. We are using Azure Databricks to pull data from several sources, generate Parquet and Delta files, and load them into our ADLS Gen2 containers. We are now planning to create our data warehouse inside Azure Synapse SQL pools, where we will create external tables for dimension tables (using the Delta files) and hash-distributed fact tables (using the Parquet files). Now, the question is: to automate this data warehouse loading activity, which method is better? Is it better to use Azure Databricks to write our transformation logic to create dim and fact tables and load them regularly into Azure Synapse SQL pools, or is it better to write that transformation logic in Azure Synapse itself and load them from there? Please help.
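If the Databricks-driven route is chosen, one common pattern is the Azure Synapse connector, which stages data through ADLS with PolyBase/COPY and writes it into a dedicated SQL pool. A sketch under assumed names (JDBC URL, paths, and table names are placeholders; this is one option, not a verdict on which side should own the logic):

```python
# Assumes a Databricks notebook, where `spark` is predefined.
# Load a curated dimension from Delta, then push it to a dedicated SQL pool
# via the Azure Synapse connector. All names below are placeholders.
dim_customer = spark.read.format("delta").load(
    "abfss://curated@mydatalake.dfs.core.windows.net/dim_customer"
)

(dim_customer.write
    .format("com.databricks.spark.sqldw")
    .option("url", "jdbc:sqlserver://<workspace>.sql.azuresynapse.net:1433;database=<dw>")
    .option("tempDir", "abfss://staging@mydatalake.dfs.core.windows.net/tmp")  # staging area
    .option("forwardSparkAzureStorageCredentials", "true")
    .option("dbTable", "dbo.DimCustomer")
    .mode("overwrite")
    .save())
```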
Train your Model on Spark/Databricks, score it on ADX

Are you using Spark/Databricks to build machine learning models? Do you need to score new data that is streamed into Azure Data Explorer? If this is your scenario, please read on! In this blog we show how to train an ML model on Azure Databricks, export it to ADX, and score new samples directly on ADX, in near real time, using inline Python code embedded in a KQL query.
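As a hedged sketch of the export step this workflow implies: train a model on Databricks, then serialize it into a plain string that can be ingested into an ADX table and later unpickled inside ADX's inline-Python (KQL) plugin. The dataset, model choice, and hex encoding below are illustrative assumptions, not the blog's exact code:

```python
# Sketch only: train a scikit-learn model and serialize it to a hex string
# suitable for storage in an ADX table column. Data and model are synthetic.
import pickle

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1_000, n_features=10, random_state=42)
model = LogisticRegression(max_iter=1_000).fit(X, y)

# Hex-encode the pickled model so it survives round-tripping as plain text;
# the scoring side would bytes.fromhex(...) and pickle.loads(...) it in KQL's
# inline-Python plugin.
model_blob = pickle.dumps(model).hex()
print(len(model_blob))  # this string is what would be ingested into ADX
```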
Getting started on Azure

I work with large datasets and I am just getting started with learning Azure. I am familiar with Python and Power BI. I am planning to integrate Synapse and Databricks for analytics and visualisation using Power BI. What books do you recommend for me to understand these modules?