Best practice recommendation for SSIS package execution retries
Published Oct 29 2019 08:40 PM 17.9K Views
Microsoft

 

Transient issues

Your typical SSIS package often needs to access external resources, like databases, files, or services. If your package execution and those resources are in the same on-premises environment managed by yourself, the execution should be completed without any issue. However, if your package execution and/or some of those resources are in a cloud environment, e.g. running your package on SSIS Integration Runtime (IR) in in Azure Data Factory (ADF), sometimes the execution fails due to transient network/third-party service related issues. You can learn more about them here. These issues might disappear by retrying your package execution.

 

Recommended solutions

To ensure the completion of your package execution and avoid its repetitive manual invocations, you can configure it to retry automatically. Depending on the logic of your package, you can retry its execution completely or partially, but please be aware that package execution retries can only solve transient issues. Effective package execution retries depend on a suitable retry count and interval. Too small a retry count might not be sufficient to complete the package execution. Too big a retry count and too short a retry interval might occupy resources and block their usage for extended periods of time. Depending on your package execution scenario, there are many retry options:

  1. If your package execution can be retried directly, where the next execution won’t be impacted by previous failure:
    1. If you use T-SQL to run your package, see Retry package executions via T-SQL 
    2. If you use Execute SSIS Package activity in ADF pipeline to run your package, see Retry Execute SSIS Package activity 
    3. If you use Stored Procedure activity in ADF pipeline to run your package, see Retry package executions via Stored Procedure activity 
  2. If your package execution requires some preparations, like cleaning up dirty data, before it can be retried:
    1. If your package is editable, see Retry preparations for editable packages
    2. If your package is non-editable:
      1. If you use T-SQL to run your package, see Retry preparations for non-editable packages via T-SQL 
      2. If you use Execute SSIS Package/Stored Procedure activity in ADF pipeline to run your package, see Retry preparations for non-editable packages in ADF

    3. If you want to partially retry your package execution, see Retry package executions partially

 

Retry package executions via T-SQL

You can use built-in SSISDB stored procedures to trigger package executions. The [catalog].[start_execution] stored procedure accepts an argument called @retry_count to specify the maximum number of retries for your package execution. Its default value is 0, which means not to retry, so you can assign different values to retry your package execution automatically when it fails. For example:

 

Declare @execution_id bigint
EXEC [SSISDB].[catalog].[create_execution] @package_name=N'<packge name>',
       @execution_id=@execution_id OUTPUT,
       @folder_name=N'<folder name>',
       @project_name=N'<project name>',
     @use32bitruntime=False,
       @reference_id=Null
Declare @retry_count int = 5
EXEC [SSISDB].[catalog].[start_execution] @execution_id ,
       @retry_count=@retry_count

 

clipboard_image_0.png

Here we’ve set @retry_count to 5 and the execution report on SSMS shows Executed Count to be 6, which means after the first failed package execution, it’s automatically retried another 5 times.

 

Retry Execute SSIS Package activity

You can use Execute SSIS Package activity in ADF pipeline to trigger package executions, see Run SSIS packages with Execute SSIS Package activity. ADF provides a built-in feature to retry activities with 2 properties, Retry and Retry interval (in seconds), which allow you to configure the maximum retry count and interval between each retry. You can find them in General tab of Execute SSIS Package activity. For example:

clipboard_image_1.pngclipboard_image_2.png

Here ADF UI shows 4 execution records that can also be found in SSMS, which means after the first failed Execute SSIS Package activity, it’s automatically retried another 3 times with 30 seconds interval between each retry.

 

Retry package executions via Stored Procedure activity

You can also use Stored Procedure activity in ADF pipeline to trigger package executions, see Run SSIS packages with Stored Procedure activity. With this activity, you have two methods to retry your package execution:

  1. Built-in activity retry. Similar to Execute SSIS Package activity (see above) and other ADF activities, you can use the Retry and Retry interval properties in General tab. For example:

clipboard_image_3.pngclipboard_image_4.png

  1. Retry argument in SSISDB stored procedure. As already mentioned in Retry package executions via T-SQL  (see above), you can use the @retry_count argument. For example:

 

DECLARE @return_value INT, @exe_id BIGINT, @err_msg NVARCHAR(150)   EXEC @return_value=[SSISDB].[catalog].[create_execution] @folder_name=N'retry', @project_name=N'Retry', @package_name=N'RetryNumber.dtsx', @use32bitruntime=0, @runinscaleout=1, @useanyworker=1, @execution_id=@exe_id OUTPUT   EXEC [SSISDB].[catalog].[set_execution_parameter_value] @exe_id, @object_type=50, @parameter_name=N'SYNCHRONIZED', @parameter_value=1   EXEC [SSISDB].[catalog].[start_execution] @execution_id=@exe_id, @retry_count=0   IF(SELECT [status] FROM [SSISDB].[catalog].[executions] WHERE execution_id=@exe_id)<>7 BEGIN SET @err_msg=N'Your package execution did not succeed for execution ID: ' + CAST(@exe_id AS NVARCHAR(20)) RAISERROR(@err_msg,15,1) END

 

These two retry methods differ as follows:

  1. Using the activity retry will generate multiple activity runs, while using the retry argument will only generate one activity run.
  2. You can set a retry interval when using the activity retry, but not when using the retry argument.

Retry preparations for editable packages

In some cases, a failed package execution can’t be retried immediately, so you need to do some preparations before retrying it. If your package is editable, you can open it in SSDT and add preparation tasks you need at the beginning of its Control Flow. For example:

clipboard_image_5.png

You can use a Script Task (Do preparation in the above package example) to clean dirty data, create/delete tables, or do anything else upstream to prepare for the following downstream tasks, ensuring that your package is retriable. Afterwards, you can follow the above Retry package executions via T-SQL, Retry Execute SSIS Package activity, or Retry package executions via Stored procedure activity sections to retry your package execution automatically.

 

Retry preparations for non-editable packages via T-SQL

If you’re using T-SQL to run your package and it isn’t editable, you can use a loop in T-SQL to prepare and retry its execution. For example:

 

Declare @max_retry_count int = 3
Declare @retry_interval nvarchar(10) = N'00:00:30'
 
Declare @exec_count int = 0
WHILE @exec_count <= @max_retry_count
BEGIN
       -- Do anything here to prepare execution
       -- Example: clean tmp table
       -- truncate table [example].[tmp]
 
       -- Create executoin
       Declare @execution_id bigint
       EXEC [SSISDB].[catalog].[create_execution] @package_name=N'<package name>',
              @execution_id=@execution_id OUTPUT,
              @folder_name=N'<folder name>',
              @project_name=N'<project name>',
            @use32bitruntime=False,
              @reference_id=Null
 
       SELECT @execution_id
       -- Wait until package execution is done
       EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,
              @object_type=50,
              @parameter_name=N'SYNCHRONIZED',
              @parameter_value= 1
 
       -- Start execution
       EXEC [SSISDB].[catalog].[start_execution]
              @execution_id
 
       -- Check execution result
       Declare @exec_status int
       SELECT @exec_status = status from [catalog].[executions] where execution_id = @execution_id
 
       -- Check execution status to determine end execution or retry
       IF @exec_status = 4
       BEGIN
              SET @exec_count = @exec_count + 1
              IF @exec_count <= @max_retry_count
              BEGIN
                     WAITFOR DELAY @retry_interval
              END
       END
       ELSE
       BEGIN
              SET @exec_count = @max_retry_count + 1
       END
END

 

At the beginning of the loop, you can do anything to prepare for your package execution. After your preparations, you can use SYNCHRONIZED parameter to ensure that SSISDB stored procedure waits until your package execution is completed, so you can get its resulting status to determine whether to retry it or not. If its status is failed (= 4), you can wait for your specified interval and then retry it.

 

Retry preparations for non-editable packages in ADF

If you’re using Execute SSIS Package activity in ADF pipeline to run your package and it isn’t editable, you can use another ADF activity to prepare for its execution in the same pipeline. For example:

clipboard_image_7.png

You can use Stored Procedure activity (Do preparation in the above pipeline example) to clean dirty data left over by the previously failed package execution. Next, you can implement your retry logic programmatically using the framework of your choice, such as .NET, to run your pipeline, monitor its result, and rerun it if Execute SSIS Package activity fails. For example:

 

int maxRetryCount = 3;
TimeSpan retryIntervalInSecond = TimeSpan.FromSeconds(30);
TimeSpan checkStatusIntervalInSecond = TimeSpan.FromSeconds(5);
 
int execCount = 0;
while (execCount <= maxRetryCount)
{
   // Create a pipeline run
   Console.WriteLine("Creating pipeline run...");
   CreateRunResponse runResponse = client.Pipelines.CreateRunWithHttpMessagesAsync(
       resourceGroup, dataFactoryName, pipelineName
   ).Result.Body;
   Console.WriteLine("Pipeline run ID: " + runResponse.RunId);
 
   // Monitor and wait for pipeline finish
   Console.WriteLine("Checking pipeline run status...");
   PipelineRun pipelineRun;
   while (true)
   {
       pipelineRun = client.PipelineRuns.Get(
           resourceGroup, dataFactoryName, runResponse.RunId);
       Console.WriteLine("Status: " + pipelineRun.Status);
       if (pipelineRun.Status == "InProgress")
       {
           Thread.Sleep(checkStatusIntervalInSecond);
       }
       else
       {
           break;
       }
   }
 
   // Check Execute SSIS package activity status
   Console.WriteLine("Checking Execute SSIS activity run details...");
   RunFilterParameters filterParams = new RunFilterParameters(
       DateTime.UtcNow.AddMinutes(-10), DateTime.UtcNow.AddMinutes(10));
   ActivityRunsQueryResponse queryResponse = client.ActivityRuns.QueryByPipelineRun(
       resourceGroup, dataFactoryName, runResponse.RunId, filterParams);
   var ssisActivityRun = queryResponse.Value.Where((run) => run.ActivityName == "Execute SSIS package1").First();
 
   // According to Execute SSIS package activity status, decides to do retry or not
   if (ssisActivityRun.Status == "Failed")
   {
       Console.WriteLine("Execute SSIS package is Failed.");
       execCount++;
       if (execCount <= maxRetryCount)
       {
           Console.WriteLine("Sleep and retry");
           Thread.Sleep(retryIntervalInSecond);
       }
   }
   else
   {
       Console.WriteLine("Execute SSIS package is Succeeded. No need to retry.");
       break;
   }
}

 

sdkretry.PNG

 

Retry package executions partially

If only a specific part of your package is impacted by transient issues and you prefer to run other parts only once to minimize the overall running costs, then you can redesign your package to only retry the execution of that specific part. For example:

clipboard_image_8.png

clipboard_image_9.png

Let’s imagine a Data Flow task in your package that copies on-premises data to a cloud staging area (Staging Data Flow in the above package example), it’s the only part that sometimes fails due to connectivity/network issues and this always leads to a failure of the whole package execution. To avoid the complete failure, you can add a For Loop container in your package that makes Staging Data Flow task retriable.

clipboard_image_10.png

To implement your retry logic, you can first add 3 user variables and assign values to them (StagingRetryCount/StagingRetryIntervalInSecond/StagingRetryMaxCount = 0/3/5, respectively, in the above example).

clipboard_image_11.png

After adding a For Loop container in your Control Flow, you can open its Properties table and set the value of its MaximumErrorCount property to StagingRetryMaxCount + 1 (= 6 in the above example), ensuring that Staging Data Flow task will be retried at most StagingRetryMaxCount times.

Double-clicking on the For Loop container, you can open its editor and add an expression for each of its loop iteration properties.

clipboard_image_12.png

You can set the values for InitExpression/EvalExpression/AssignExpression properties as @StagingRetryCount = 0/@StagingRetryCount <= @StagingRetryMaxCount/@StagingRetryCount = @StagingRetryCount + 1, respectively, ensuring that the first loop iteration is the initial package execution, the second iteration will be the first retry, subsequent iterations will increment the retry count, and the iteration that increments the retry count beyond its maximum will be the last retry, so the loop will be exited after that.

clipboard_image_13.png

Next, you can link the Success/Failure paths of Staging Data Flow task to Expression/Script Tasks (Exit Loop/Sleep Interval in the above For Loop container example), respectively, and move all tasks into the container.

clipboard_image_14.png

In Exit Loop task, you can add an expression that sets StagingRetryCount to StagingRetryMaxCount, so the loop will be exited immediately.

clipboard_image_15.pngclipboard_image_16.png

In Sleep Interval task, you can implement a retry interval programmatically using C# based on StagingRetryIntervalInSecond variable to avoid network/service throttling.

clipboard_image_17.png

clipboard_image_18.png

At last, you can now link the For Loop container with upstream/downstream tasks from your original package (Do something/Do something else in the above package example, respectively), such that when you run the new package, if Staging Data Flow task fails, it will be retried automatically a maximum of StagingRetryMaxCount times with an interval of StagingRetryIntervalInSecond seconds between each retry.

Version history
Last update:
‎Nov 01 2019 12:24 AM
Updated by: