
Azure Data Factory Blog

Cast transformation added to mapping data flows

Mark Kromer
Microsoft
Aug 03, 2022

We have added a useful new transformation primitive to Mapping Data Flows in Azure Data Factory and Azure Synapse Analytics. The Cast transformation makes data type conversions easy and includes built-in type checking.
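
For illustration, here is a minimal sketch of the data flow script behind a Cast transformation. The stream and column names (source1, orderId, orderTotal, orderDate) are placeholders, and the errors: true setting corresponds to the "Assert type check" option described below:

    source1 cast(output(
            orderId as integer,
            orderTotal as double,
            orderDate as timestamp
        ),
        errors: true) ~> castTypes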


Use the column drop-down to select columns from your stream's metadata for type conversion. If you select "Assert type check", ADF will automatically tag rows that fail the type conversion so that you can trap them later in your data flow. You can even use the Assert error row handling to log rows that fail type conversion.
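
As a sketch of trapping those tagged rows, continuing from the example above, a Conditional Split can route failed casts into their own branch. This assumes that hasError() with no assert ID returns true when any assertion on the row has failed; depending on your version you may need to pass a specific assert ID instead:

    castTypes split(hasError(),
        disjoint: false) ~> splitOnCastErrors@(failedCasts, cleanRows)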


For more complex data type conversions and pattern matching, use the Derived Column transformation. We also plan to bring complex data type conversions to the Cast transformation a little later.
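
For example, a Derived Column sketch like the one below (column names and formats are illustrative) can parse a non-default date format and strip stray characters before a numeric conversion:

    source1 derive(
            orderDate = toDate(orderDateRaw, 'dd/MM/yyyy'),
            orderTotal = toDouble(regexReplace(orderTotalRaw, '[^0-9.]', ''))
        ) ~> deriveComplexTypes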


The Cast transformation also makes it easier to trap type conversion errors. The Derived Column transformation is built for resiliency, meaning that a failed type conversion there results in NULL unless you explicitly add an Assert; Cast flags the failure for you.
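
To make that difference concrete, here is a hypothetical sketch (qtyRaw and qty are placeholder names). In a Derived Column, a value like "foobar" silently becomes NULL, so trapping it requires an explicit Filter or Assert, whereas Cast tags the failing row for you:

    source1 derive(qty = toInteger(qtyRaw)) ~> deriveQty
    deriveQty filter(isNull(qty) && !isNull(qtyRaw)) ~> trapSilentFailures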



3 Comments

  • interceptor
    Copper Contributor

    As soon as I add a Cast step, I get an error with a long message:

    Resolved attribute(s) (...) missing from (...)

    Could this be related to using Parquet files as the source?

  • Ratty1967UK
    Copper Contributor

    I also tried the "Assert type check" option, casting "foobar" as a double, but I just got a Java exception rather than an error marked on the row:

    Job aborted due to stage failure: Task 0 in stage 47.0 failed 1 times, most recent failure: Lost task 0.0 in stage 47.0 (TID 34, vm-5fd56064, executor 1): java.text.ParseException: Unparseable number: "foobar"
    at java.text.NumberFormat.parse(NumberFormat.java:385)
    at org.apache.spark.sql.extensions.ParseNumber.nullSafeEval(FunctionExtensions.scala:418)
    at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:510)
    at org.apache.spark.sql.catalyst.expressions.IsNull.eval(nullExpressions.scala:325)...
  • Ratty1967UK
    Copper Contributor

    Needs more powerful formatting options to allow for alternative separators. In particular I'm thinking of the decimal point being "," instead of "." when a file has originated from one of our European sites, but this could equally apply to other data types.
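
    In the meantime, one workaround sketch, assuming illustrative column names and the Derived Column functions replace() and toDouble(), is to normalize the separator before converting:

        source1 derive(amount = toDouble(replace(amountRaw, ',', '.'))) ~> normalizeDecimal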