Here are three examples of how to rebuild hand-coded Databricks Notebooks ETL as automated, visually designed ETL processes in ADF with Mapping Data Flows. In each of the examples I outline below, it takes just a few minutes to recreate these coded ETL routines in ADF Mapping Data Flows without writing any code.
When you see Databricks ETL code that reads files, represent that action as a Source transformation. The location of the data is defined in an ADF dataset, while the schema and data types are defined in the Source projection, Select transformations, and Derived Column transformations. The credentials for your file systems and databases are stored in the Linked Service.
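As a rough illustration of what this maps from, here is a minimal sketch of the kind of Notebook file-read you would replace with a Source transformation. The storage account, container, path, and options are placeholders, not values from any of the samples.

```python
# `spark` is the SparkSession that Databricks notebooks provide automatically.
# Hypothetical read of raw CSV files; the path below is a placeholder.
df = (spark.read
      .format("csv")                 # file format -> handled by the ADF dataset
      .option("header", "true")      # header handling -> dataset settings
      .option("inferSchema", "true") # inferred types -> Source projection in the Data Flow
      .load("abfss://raw@mystorageacct.dfs.core.windows.net/sales/*.csv"))
```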
A common operation in data lake file processing for ETL is mapping codes to values. There are several ways to build these enumerations in Data Flows. In the sample linked at #1, I show how to build this as a case statement inside a Derived Column transformation, or with an external lookup file via a Join or Lookup transformation. The lookup file is essentially just a mapping of keys to values.
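For context, these are the two Notebook-style patterns that the Derived Column case statement and the Join/Lookup transformation take the place of. This is only a sketch; the column names, codes, and lookup file path are made up for illustration.

```python
from pyspark.sql import functions as F

# Pattern 1: inline case logic -> becomes a case expression in a Derived Column transformation
df_mapped = df.withColumn(
    "region_name",
    F.when(F.col("region_code") == "01", "North")
     .when(F.col("region_code") == "02", "South")
     .otherwise("Unknown"))

# Pattern 2: external lookup file (key -> value) -> becomes a Join or Lookup transformation
lookup = spark.read.option("header", "true").csv(
    "abfss://ref@mystorageacct.dfs.core.windows.net/region_lookup.csv")
df_mapped = df.join(lookup, on="region_code", how="left")
```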
When you see an operation in a Notebook that displays stats or row values, you accomplish the same task with the Debug session switch in Data Flows. Go into the Data Preview tab to interact with the data and see your live transformation results there. You can also click on each column to see column statistics as charts and descriptive summaries.
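These are the sorts of Notebook inspection calls that a Data Flow debug session and the Data Preview tab stand in for; the column names below are placeholders and df is whatever DataFrame you are inspecting.

```python
from pyspark.sql import functions as F

# Inspect sample rows -> Data Preview tab in a Data Flow debug session
df.show(20, truncate=False)   # or display(df) inside a Databricks notebook

# Column-level statistics -> click a column in Data Preview to see its stats
df.describe("sale_amount").show()
df.select(F.countDistinct("region_code")).show()
```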
For grouping, use the Aggregate transformation, and for filtering, use the Filter transformation. You can filter and clean values using common equality functions as well as regular expressions in ADF.
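To make the mapping concrete, here is a hedged sketch of the Notebook patterns that translate to Aggregate and Filter transformations. The column names and the regular expression are invented for the example.

```python
from pyspark.sql import functions as F

# Grouping -> Aggregate transformation (group-by columns plus aggregate expressions)
daily_totals = (df.groupBy("region_code", "sale_date")
                  .agg(F.sum("sale_amount").alias("total_sales"),
                       F.count("*").alias("order_count")))

# Filtering with an equality/comparison check -> Filter transformation
valid_rows = df.filter(F.col("sale_amount") > 0)

# Filtering or cleaning with a regular expression -> Filter (or Derived Column) using regex functions
clean_rows = df.filter(F.col("product_id").rlike("^[A-Z]{3}-[0-9]{4}$"))
```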