Forum Discussion

Mhya19
Copper Contributor
Sep 06, 2023

Incremental load of JSON from Azure Blob Storage to Azure SQL

Hello, 

I am trying to create a pipeline in ADF that will allow incremental loading of data from Azure Blob Storage to Azure SQL. Unfortunately I have tried many options, but since the source of my data is a list of variable JSONs, it seems I am not able to use the upsert function because I have null fields, the last modified option is not working, and neither CDC nor passing a timestamp from a dataflow from the last trigger to a copy activity works.

Upsert gives me a lot of duplicates in the database.

Last Modified and CDC are simply not working.

Passing the timestamp gives me a date format error no matter what.

 

Has anyone had this same issue and could give me some tips on what would be a valid solution for this?

Thanks, 

Mhya

4 Replies

  • NathanMoyle
    Copper Contributor

    If you are using a copy activity with Blob Storage as the source and Azure SQL as the sink, you can use these features:
    On the source side, you can filter by last modified to include only files that have been modified since the last successful run of your pipeline, provided you store run dates from your pipeline in your SQL server or anywhere else.
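
    As an illustration, here is a minimal sketch of a watermark table for storing those run dates; the table, column, and pipeline names are hypothetical, not something from this thread:

    -- Hypothetical watermark table that records the last successful run per pipeline.
    CREATE TABLE dbo.PipelineWatermark (
        PipelineName nvarchar(200) NOT NULL PRIMARY KEY,
        LastRunUtc   datetime2(3)  NOT NULL
    );

    -- Read the watermark at the start of the run (for example with a Lookup activity).
    SELECT LastRunUtc
    FROM dbo.PipelineWatermark
    WHERE PipelineName = N'BlobToSqlIncremental';

    -- Advance the watermark after a successful run (for example with a Stored Procedure activity).
    UPDATE dbo.PipelineWatermark
    SET LastRunUtc = SYSUTCDATETIME()
    WHERE PipelineName = N'BlobToSqlIncremental';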

    Also, on the sink side you can insert the copied data using a stored procedure, which involves creating a table type in SQL and a stored procedure that takes that type as an input. That stored procedure can then merge new data into your database based on the last modified field in your JSON and any other conditions that are important to you. Bear in mind that, depending on your SQL Server settings, NULL != NULL.
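
    To illustrate the table type and stored procedure, here is a minimal sketch; the type, procedure, table, and column names are all hypothetical and would need to match your actual JSON fields and target table:

    -- Hypothetical table type matching the shape of the incoming rows.
    CREATE TYPE dbo.StagedRecordType AS TABLE (
        RecordId     nvarchar(50)  NOT NULL,
        Payload      nvarchar(max) NULL,
        LastModified datetime2(3)  NULL
    );
    GO

    -- Stored procedure used as the copy activity sink; the copied rows arrive in @rows.
    CREATE PROCEDURE dbo.usp_UpsertStagedRecords
        @rows dbo.StagedRecordType READONLY
    AS
    BEGIN
        SET NOCOUNT ON;

        MERGE dbo.TargetTable AS tgt
        USING @rows AS src
            ON tgt.RecordId = src.RecordId  -- match on a non-nullable key, not on nullable fields
        WHEN MATCHED AND (tgt.LastModified IS NULL OR src.LastModified > tgt.LastModified) THEN
            UPDATE SET tgt.Payload      = src.Payload,
                       tgt.LastModified = src.LastModified
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (RecordId, Payload, LastModified)
            VALUES (src.RecordId, src.Payload, src.LastModified);
    END;

    In the copy activity sink settings you would then reference the stored procedure name, the table type name, and the table type parameter name so ADF knows where to hand the rows over.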

     

    I hope this was helpful

    • Mhya19
      Copper Contributor
      Hello Nathan,
      Thank you for your idea. I finally managed to find a solution, using Lookup and Set Variable activities to pass the last modified date to the start time (UTC). It was not so easy because the Lookup returned an object, even though the output preview looked like a date. Here is what I used to convert the object to a string and extract the datetime I am interested in: Variable = @substring(string(activity('Max_Time').output),34,23). I hope this will help others.
      • NathanMoyle
        Copper Contributor
        I'm glad you managed to find a solution.
        This approach could cause you issues later if the length or format of the text output changes in any way. Instead, you can navigate the JSON data directly to get the value you're after.
        For example, if the data returned from the activity is:
        {
            "firstrow": {
                "date": "2023-09-22T14:40:00"
            }
        }

        Then you will be able to extract just the date by using the formula:
        @activity('Max_Time').output.firstrow.date
