ADF adds TTL to Azure IR to reduce Data Flow activity times
Published Sep 26 2019 09:33 PM
Microsoft

ADF has added a TTL (time-to-live) option to the Azure Integration Runtime for Data Flow properties to reduce data flow activity times.

azureir2.png

This setting is only used during ADF pipeline executions of Data Flow activities. Debug executions from pipelines and data preview debugging will continue to use the debug settings, which have a preset TTL of 60 minutes.

 

If you leave the TTL at 0, ADF will always spawn a new Spark cluster environment for every Data Flow activity that executes. This means that an Azure Databricks cluster is provisioned each time and takes about 5-7 minutes to become available and execute your job.

 

However, if you set a TTL, ADF will maintain a pool of VMs that can be used to spin up each subsequent data flow activity against that same Azure IR. This reduces the amount of time needed to start up the environment before your job is executed.
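
As a rough illustration, the TTL sits in the Data Flow properties of the Azure IR definition. The Python sketch below builds such a definition as a dictionary and prints it as JSON. The IR name `DataFlowIRWithTTL`, the 8-core General compute, and the 10-minute TTL are illustrative assumptions, and the property layout is a best-effort rendering of the integration runtime JSON; compare it with the JSON that your own factory generates before relying on it.

```python
import json


def build_azure_ir(name: str, ttl_minutes: int, core_count: int = 8,
                   compute_type: str = "General") -> dict:
    """Return an Azure IR definition with Data Flow runtime properties."""
    if ttl_minutes < 0:
        raise ValueError("TTL must be zero or a positive number of minutes")
    return {
        "name": name,  # hypothetical IR name
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": "AutoResolve",
                    "dataFlowProperties": {
                        "computeType": compute_type,
                        "coreCount": core_count,
                        # 0 means a new cluster for every activity
                        "timeToLive": ttl_minutes,
                    },
                }
            },
        },
    }


ir = build_azure_ir("DataFlowIRWithTTL", ttl_minutes=10)
print(json.dumps(ir, indent=2))
```

Data Flow activities that should share the warm pool would all need to reference this same IR.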

 

ADF will maintain that pool for the TTL time after the last data flow pipeline activity executes. Note that this extends the billing period for a data flow by the length of your TTL. However, your data flow job execution time will decrease because the VMs from the compute pool are reused. The compute resources are not provisioned until your first data flow activity is executed using that Azure IR.
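
The trade-off can be sketched with some simple arithmetic. The Python below is a simplified model, not a billing calculator: it assumes activities run one after another on the same Azure IR, that each one starts before the TTL expires, and that a warm start takes 1 minute. The warm start figure is a hypothetical placeholder, and the 6-minute cold start is simply the midpoint of the 5-7 minute range quoted above.

```python
def startup_overhead_minutes(activities: int, cold_start: float,
                             warm_start: float, ttl_enabled: bool) -> float:
    """Total cluster start-up time across sequential Data Flow activities."""
    if activities <= 0:
        return 0.0
    if not ttl_enabled:
        # TTL = 0: every activity provisions a new cluster.
        return activities * cold_start
    # TTL > 0: only the first activity pays the cold start, assuming each
    # later activity begins before the TTL window expires.
    return cold_start + (activities - 1) * warm_start


def extra_billed_minutes(ttl_minutes: float) -> float:
    """Idle pool time that stays billable after the last activity finishes."""
    return float(ttl_minutes)


COLD = 6.0  # midpoint of the 5-7 minute range quoted above
WARM = 1.0  # hypothetical warm start-up time, not a figure from ADF

no_ttl = startup_overhead_minutes(5, COLD, WARM, ttl_enabled=False)
with_ttl = startup_overhead_minutes(5, COLD, WARM, ttl_enabled=True)
print("start-up minutes without TTL:", no_ttl)
print("start-up minutes with TTL:", with_ttl)
print("extra billed idle minutes:", extra_billed_minutes(10))
```

Under these assumptions, a TTL pays off when the start-up time saved across activities outweighs the idle minutes billed after the last one.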

 

Read more about the Azure Integration Runtime here. And here is an ADF Data Flow performance guide to help you optimize your environment.

21 Comments
Copper Contributor

Hi @Mark Kromer, where within my Azure Data Factory (or data flows) do I set this Time to Live (TTL) property? I have looked quite a bit but can't find the screen you are showing above.

Microsoft

From the ADF pipeline designer UI, go to Connections > Integration Runtimes > New. Select Azure IR and then open the Data Flow Run Time properties section.

Copper Contributor

Hi @Mark Kromer 

I can find the option you indicate. On the Integration Runtime Setup window there are only two options: Azure, Self-Hosted and Azure-SSIS. However, I do not see the Data Flow Run Time properties in either of them.

Microsoft

Try creating a new Azure IR to see the options.

 

ir.png

Copper Contributor

Is there any option to update the TTL value for an existing IR?

Copper Contributor

Also, is there an option to pass custom-created IRs as parameters to pipelines?

Microsoft

@saikumare Currently, we do not support updating the TTL value; you must create a new Azure IR instead. The IRs are also not parameterizable. However, if you set the IR in the Data Flow activity to "Auto-Resolve", you can parameterize the core count and compute type properties there, making them dynamic.
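
For readers who want to see what that parameterization might look like, here is a minimal Python sketch that builds an Execute Data Flow activity definition whose compute settings are pipeline expressions. The activity name `RunMyDataFlow`, the data flow name `MyDataFlow`, and the pipeline parameters `coreCount` and `computeType` are all hypothetical, and the property layout is an assumption about the activity JSON, so verify it against the code view of your own pipeline.

```python
import json


def build_data_flow_activity(activity_name: str, data_flow_name: str) -> dict:
    """Return an Execute Data Flow activity with expression-based compute."""
    return {
        "name": activity_name,  # hypothetical activity name
        "type": "ExecuteDataFlow",
        "typeProperties": {
            "dataflow": {
                "referenceName": data_flow_name,  # hypothetical data flow
                "type": "DataFlowReference",
            },
            "compute": {
                # Both values are read from pipeline parameters at run time.
                "coreCount": {
                    "value": "@pipeline().parameters.coreCount",
                    "type": "Expression",
                },
                "computeType": {
                    "value": "@pipeline().parameters.computeType",
                    "type": "Expression",
                },
            },
        },
    }


activity = build_data_flow_activity("RunMyDataFlow", "MyDataFlow")
print(json.dumps(activity, indent=2))
```

No integration runtime reference is set in this sketch, on the assumption that the activity then falls back to the Auto-Resolve IR.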

Copper Contributor

But this "Auto-Resolve" IR (with TTL=0) does not have the option of setting or updating the TTL, so each job would spend more than 4 minutes on cluster startup.

Is there any other option to handle both core count and TTL dynamically?

Microsoft

@saikumare TTL is a cluster-controlling mechanism, so it cannot be dynamic. But making the compute settings inside the Azure IR configuration parameterized, instead of only inside the activity, is a common ask. You can add the feedback to Azure User Voice so that we can prioritize the feature request accordingly.

Copper Contributor

I still cannot see the option to set the TTL. Can you please advise?

 

 

TTL.JPG

Microsoft

@Reasat Are you using ADF or Synapse?

Copper Contributor

We are using Synapse

Microsoft

@Reasat Synapse has not yet implemented TTL in the Azure IR feature. This is something that they are currently working on as a top priority.

Copper Contributor

@Mark Kromer I came across the same issue. Do you have a timeline when TTL for Synapse Data Flows will be available?

Copper Contributor

@Mark Kromer, I can't find this setting in ADF. Can you please help?

Copper Contributor

@Mark Kromer Do you have a tentative timeline when TTL for Synapse Data Flows will be available? We need this to plan our production migration. Thanks in advance!

Microsoft

@Reasat No ETA/timeline at this time

Copper Contributor

@Mark Kromer, I can't find this setting in ADF. Can you please help? I need to set the TTL for the cluster.

Microsoft

@Abhijeetuk Go to Azure Integration Runtime under "Manage" in the ADF UI ... Click on the Data flow runtime properties accordion at the bottom of the panel and set a TTL. Then choose "Quick re-use".

 

Copper Contributor

@Mark Kromer Any update on when TTL for Synapse Data Flows will be available?

Thanks.

Microsoft

@Reasat It is coming to Synapse in CY21.

Version history
Last update: Sep 26 2019 09:34 PM