Parquet format support added to Wrangling Data Flow in Azure Data Factory

Published Feb 05 2020 01:25 PM
Microsoft

Wrangling Data Flow (WDF) in Azure Data Factory (ADF) now supports the Parquet format. You can store your data in ADLS Gen2 or Azure Blob Storage in Parquet format and use it for agile data preparation with Wrangling Data Flow.

 

Create a Parquet format dataset in ADF and use it as an input in your wrangling data flow.
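For readers who define datasets as JSON rather than through the UI, the following is a minimal sketch of what a Parquet dataset definition for ADLS Gen2 might look like, built here as a Python dictionary. The dataset name, linked service name, file system, folder, and file name are all hypothetical placeholders, and the helper function `parquet_dataset` is only an illustration, not part of any ADF SDK.

```python
import json


def parquet_dataset(name, linked_service, file_system, folder_path, file_name):
    """Build an ADF dataset definition (as a dict) for a Parquet file in ADLS Gen2.

    All argument values are supplied by the caller; nothing here is
    validated against a real data factory.
    """
    return {
        "name": name,
        "properties": {
            # "Parquet" selects the Parquet format dataset type.
            "type": "Parquet",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                # "AzureBlobFSLocation" targets ADLS Gen2. For Azure Blob
                # Storage the location type is "AzureBlobStorageLocation"
                # and "container" replaces "fileSystem".
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": file_system,
                    "folderPath": folder_path,
                    "fileName": file_name,
                },
                "compressionCodec": "snappy",
            },
        },
    }


# Hypothetical names, for illustration only.
dataset = parquet_dataset(
    name="SalesParquet",
    linked_service="MyAdlsGen2LinkedService",
    file_system="raw",
    folder_path="sales/2020",
    file_name="sales.parquet",
)
print(json.dumps(dataset, indent=2))
```

The printed JSON can be compared with the definition ADF shows in the dataset's code view after you create the dataset in the UI.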

 

2020-02-05_13h17_42.png

 

You can then use the Parquet dataset as an input to your Wrangling Data Flow to do agile data preparation at cloud scale via Spark execution.

 

2020-02-05_13h20_00.png

 

 

2020-02-05_13h22_50.png

 

Learn more about using Wrangling Data Flow to do data preparation at cloud scale at https://aka.ms/wranglingdfdocs.

 

2 Comments
@Gaurav Malhotra, it is exciting to see Parquet in Power Query. Do you know if this same capability is coming to Power Query in the Power Platform, such as Power BI dataflows? Or is this feature dependent on ADF-specific Parquet interpretation tech?

Is managed identity auth planned for the near term? My understanding is that if your ADLS instance is in a VNet, service principal auth can't be used with an Azure auto-resolve IR, so until MI auth is supported this feature can't be used.