Parse/Flatten the csv file in which one column is XML using Dataflow in Azure Data Factory

Copper Contributor

We have CSV file in this format.

UID

Catalog_Names

152376<catalogs>
<Catalog Name="NEWYORK"
RequireApprove="True"/>
</catalogs>
413683<catalogs>
<Catalog Name="BOSTON"
RequireApprove="True"/>
</catalogs>
347687<catalogs>
<Catalog Name="DALLAS"
RequireApprove="True"/>
</catalogs>

Need to parse/flatten the columns

I need file with column names: UID, Catalog Name, RequireApprove w.r.t data using Dataflow in Azure DataFactory.

 

I tried using Parse activity in Dataflow with the expression 

(catalogs as {replace(toString(u_catalog.catalogs),'@','')Catalog Name} as string,
{replace(toString(u_catalog.catalogs),'@','')RequireApprove} as string)
but it is not working, getting error Any help on this please.

 

Thank you.

2 Replies
Using Parse Transformation in Data flow :
Source XML with Attribute data:
<cars>
<car model="camaro">
<year>1989</year>
</car></cars>
Expression: (cars as (car as ({@model} as string, year as integer)))
Note: If you run into errors extracting attributes (i.e. @model) from a complex type, a workaround is to convert the complex type to a string, remove the @ symbol (i.e. replace(toString(your_xml_string_parsed_column_name.cars.car),'@','') ), and then use the parse JSON transformation activity.

@Gunjan_Kanani 

<catalogs>
<Catalog Name="NEWYORK"
RequireApprove="True"/>
</catalogs>

Exp:

(catalogs as {replace(toString(u_catalog.catalogs),'@','')Catalog Name} as string,
{replace(toString(u_catalog.catalogs),'@','')RequireApprove} as string)
not working