Forum Discussion

shiva_ch155's avatar
shiva_ch155
Copper Contributor
Apr 07, 2023

Parse/Flatten the csv file in which one column is XML using Dataflow in Azure Data Factory

We have CSV file in this format.

UID

Catalog_Names

152376<catalogs>
<Catalog Name="NEWYORK"
RequireApprove="True"/>
</catalogs>
413683<catalogs>
<Catalog Name="BOSTON"
RequireApprove="True"/>
</catalogs>
347687<catalogs>
<Catalog Name="DALLAS"
RequireApprove="True"/>
</catalogs>

Need to parse/flatten the columns

I need file with column names: UID, Catalog Name, RequireApprove w.r.t data using Dataflow in Azure DataFactory.

 

I tried using Parse activity in Dataflow with the expression 

(catalogs as {replace(toString(u_catalog.catalogs),'@','')Catalog Name} as string,
{replace(toString(u_catalog.catalogs),'@','')RequireApprove} as string)
but it is not working, getting error Any help on this please.

 

Thank you.

  • Gunjan_Kanani's avatar
    Gunjan_Kanani
    Copper Contributor
    Using Parse Transformation in Data flow :
    Source XML with Attribute data:
    <cars>
    <car model="camaro">
    <year>1989</year>
    </car></cars>
    Expression: (cars as (car as ({@model} as string, year as integer)))
    Note: If you run into errors extracting attributes (i.e. @model) from a complex type, a workaround is to convert the complex type to a string, remove the @ symbol (i.e. replace(toString(your_xml_string_parsed_column_name.cars.car),'@','') ), and then use the parse JSON transformation activity.
    • shiva_ch155's avatar
      shiva_ch155
      Copper Contributor

      Gunjan_Kanani 

      <catalogs>
      <Catalog Name="NEWYORK"
      RequireApprove="True"/>
      </catalogs>

      Exp:

      (catalogs as {replace(toString(u_catalog.catalogs),'@','')Catalog Name} as string,
      {replace(toString(u_catalog.catalogs),'@','')RequireApprove} as string)
      not working

Resources