Forum Discussion

Roop_s610
Copper Contributor
May 08, 2025

copy data fails - best practice?

Hi Everyone,

We need to start copying (some) data from various in-house SQL database tables to an online Azure database.

So far, to keep update times to a minimum, I have split the large tables into static and live tables. The static tables hold data up to the start of the year and are loaded by a one-off pipeline; the live tables are kept current by a daily upsert pipeline.
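
For context, the daily upsert boils down to something like the following (a simplified sketch; the table and column names are placeholders rather than our real schema):

    -- Placeholder names for illustration: dbo.Orders_Live is the target,
    -- dbo.Orders_Staging holds the rows landed by the daily pipeline.
    MERGE dbo.Orders_Live AS target
    USING dbo.Orders_Staging AS source
        ON target.OrderID = source.OrderID
    WHEN MATCHED THEN
        UPDATE SET target.Status = source.Status,
                   target.Amount = source.Amount,
                   target.ModifiedDate = source.ModifiedDate
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (OrderID, Status, Amount, ModifiedDate)
        VALUES (source.OrderID, source.Status, source.Amount, source.ModifiedDate);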

After a lot of tweaking I got the live pipelines working, and they now run largely failure-free.

However, with the large tables I am really struggling to copy the data over; the copy usually fails after about 5 hours of progress. These tables generally have 4-5 million rows and maybe 50 columns.

I've played around with various settings, but I am now wondering whether this is even the best method for copying large amounts of data from on-premises to cloud databases?

1 Reply

  • petevern
    Copper Contributor

    Hi,

    Split the load (partitioned copy)
    - Partition by date, ID range, or another stable key.
    - Run several Copy Data activities in parallel, each with its own filter (WHERE clause) on the source query, as in the sketch below.
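
    As a rough illustration, each parallel copy gets a bounded source query; the table, key column and range values below are made up, so substitute a key that is stable and evenly distributed in your data:

        -- Copy activity 1: first slice
        SELECT *
        FROM dbo.Orders
        WHERE OrderID >= 1 AND OrderID < 1000000;

        -- Copy activity 2: next slice (and so on, each activity running in parallel)
        SELECT *
        FROM dbo.Orders
        WHERE OrderID >= 1000000 AND OrderID < 2000000;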

    Use a staging (intermediate) location via Azure Blob Storage
    ADF/Synapse usually performs better with a two-hop copy:
    - SQL Server > Azure Blob (Parquet or CSV)
    - Then Blob > Azure SQL Database (see the sketch below for one way to drive the second hop yourself)
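
    In ADF the two hops can also be handled automatically by ticking "Enable staging" on the Copy activity and pointing it at a Blob container. If you would rather drive the second hop yourself, Azure SQL Database can bulk-load staged CSV files directly from Blob Storage; a minimal sketch, assuming CSV files, a SAS token, and placeholder account/container/table names:

        -- One-time setup (assumes a database master key already exists).
        -- Do not include the leading '?' of the SAS token in the secret.
        CREATE DATABASE SCOPED CREDENTIAL BlobCred
            WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
                 SECRET = '<sas-token>';

        CREATE EXTERNAL DATA SOURCE StagingBlob
            WITH (TYPE = BLOB_STORAGE,
                  LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
                  CREDENTIAL = BlobCred);

        -- Load one staged file into the target table.
        BULK INSERT dbo.Orders
        FROM 'orders/part-0001.csv'
        WITH (DATA_SOURCE = 'StagingBlob',
              FORMAT = 'CSV',
              FIRSTROW = 2,   -- skip the header row
              TABLOCK);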
