Azure Data Factory Copy Data Activity changes data while copying Parquet data to Dedicated SQL Pool


Hi All,

We have a Parquet file in an ADLS Gen2 storage container that has over 7 million rows of data.

We created a Copy Data Activity in Azure Data Factory to move this data to a table in a Dedicated SQL Pool. All of the data from the Parquet file lands in the database table accurately, except for one row: a decimal value of 78.6 in the Parquet file arrives in the SQL table as 78.5.
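In case it helps with a repro, here is a minimal sketch (the file path, key column, and value column names below are placeholders, not our real names) of how the affected value can be read back from the Parquet file at full precision with pyarrow, to confirm whether the file really stores 78.6 or a binary floating point approximation such as 78.5999...:

```python
from decimal import Decimal

import pyarrow.parquet as pq

# Placeholder path and column names -- substitute the real ones.
table = pq.read_table("sample.parquet", columns=["row_id", "amount"])
df = table.to_pandas()

# Locate the affected row and print its value at full precision.
value = df.loc[df["row_id"] == 12345, "amount"].iloc[0]
print(value)                  # what pandas displays
print(Decimal(float(value)))  # exact binary value, e.g. 78.5999999999999943...
```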

Here is more context on the steps we have taken so far to trace the root cause of this issue:

  • We renamed the Parquet file and pushed it to this table again -- the value still goes into the SQL table as 78.5 (while the Parquet file shows 78.6).

  • We created a second (V2) table in the SQL database and pushed the data into it using the Copy Data Activity -- the value still goes in as 78.5.

  • We checked the compression type our Python code uses to create the Parquet file (it is GZIP) against the compression type set on Data Factory's dataset connection for reading the file -- it was previously Snappy; we changed it to GZIP and re-ran the Copy Data Activity -- and the value still goes in as 78.5.

  • We checked the decimal data type's precision and scale, as well as the data type mapping from source to sink (see the schema-inspection sketch after this list). If these were off, the whole column should be affected, yet only this one row goes into the SQL table incorrectly.
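For anyone who wants to see what we checked, here is a rough sketch (placeholder path; pyarrow assumed) of how the Parquet column's declared type can be inspected, i.e. whether the column is stored as DOUBLE or as DECIMAL(p, s), and which compression codec each column chunk actually uses:

```python
import pyarrow.parquet as pq

# Placeholder path -- substitute the real file.
parquet_file = pq.ParquetFile("sample.parquet")

# Arrow-level schema: shows whether the column is double or decimal(p, s).
print(parquet_file.schema_arrow)

# Physical metadata for the first row group: physical type, compression codec, etc.
row_group = parquet_file.metadata.row_group(0)
for i in range(row_group.num_columns):
    col = row_group.column(i)
    print(col.path_in_schema, col.physical_type, col.compression)
```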

Ask: Has anyone encountered this issue before? If so, how did you solve it?

Any suggestions are welcome. Thank you!!
