Read and Write Complex Data Types in ADF

Published Oct 12 2020 12:56 PM
Microsoft

ADF has connectors for the Parquet, Avro, and ORC data lake file formats. However, datasets used by Copy Activity do not currently support the complex data types (such as MAP, LIST, and STRUCT) that those formats can contain. Here is how to read and write those complex columns in ADF by using data flows.

There is a description of this technique in each file format's documentation page in the ADF online docs:

https://docs.microsoft.com/en-us/azure/data-factory/format-orc#dataset-properties
https://docs.microsoft.com/en-us/azure/data-factory/format-parquet#data-type-support
https://docs.microsoft.com/en-us/azure/data-factory/format-avro#data-flows

Step 1: Create a new dataset and choose the file format type. In this example, I am using Parquet. Set the schema option to None.
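
For readers who author resources as JSON rather than through the UI, the dataset from Step 1 might look roughly like the sketch below. It follows the dataset layout shown in the Parquet format documentation. The names ParquetComplex and MyBlobStorage, the container, and the file path are all hypothetical, and representing the None schema choice as an empty schema array is an assumption.

```python
import json

# All names and paths below are hypothetical placeholders.
dataset = {
    "name": "ParquetComplex",
    "properties": {
        "type": "Parquet",
        "linkedServiceName": {
            "referenceName": "MyBlobStorage",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "data",
                "folderPath": "complex",
                "fileName": "sample.parquet",
            },
            "compressionCodec": "snappy",
        },
        # Assumption: choosing None for the schema leaves this list empty,
        # so no column definitions are stored on the dataset itself.
        "schema": [],
    },
}

dataset_json = json.dumps(dataset, indent=2)
print(dataset_json)
```

With no columns stored on the dataset, the column names and types are defined later, in the data flow source.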

Step 2: Create a data flow that uses this new dataset as the source.
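
As a rough illustration of Step 2, a mapping data flow resource that points its source at the dataset could be assembled as below. The overall shape (a MappingDataFlow with sources, sinks, transformations, and a script) is what ADF exports, but the names ReadComplexTypes, ParquetComplex, and source1 are hypothetical, and the exact script text ADF generates for a given source may differ.

```python
import json

# Hypothetical names: the data flow, the dataset it references, and the source stream.
SOURCE_NAME = "source1"

# Source script before any projection has been imported (no output(...) clause yet).
script = "\n".join([
    "source(allowSchemaDrift: true,",
    "\tvalidateSchema: false,",
    "\tformat: 'parquet') ~> " + SOURCE_NAME,
])

data_flow = {
    "name": "ReadComplexTypes",
    "properties": {
        "type": "MappingDataFlow",
        "typeProperties": {
            "sources": [
                {
                    "dataset": {
                        "referenceName": "ParquetComplex",
                        "type": "DatasetReference",
                    },
                    "name": SOURCE_NAME,
                }
            ],
            "sinks": [],
            "transformations": [],
            "script": script,
        },
    },
}

data_flow_json = json.dumps(data_flow, indent=2)
print(data_flow_json)
```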

Step 3: In the source transformation, go to Projection -> Import Projection. Importing a projection requires an active data flow debug session.
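
To make the idea of a projection with complex columns more concrete, here is a small sketch that derives column names and nested types from one sample record. It is only an analogy for the result of Import Projection, not how ADF detects the schema. The sample record is made up, and the type strings use a Spark-style notation chosen for illustration rather than ADF's own.

```python
def describe_type(value):
    """Return a Spark-style type string for a nested Python value."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        # The element type is taken from the first item; an empty list is unknown.
        element = describe_type(value[0]) if value else "unknown"
        return f"array<{element}>"
    if isinstance(value, dict):
        fields = ",".join(f"{key}:{describe_type(item)}" for key, item in value.items())
        return f"struct<{fields}>"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


# Made-up record with one array column and one struct column.
record = {
    "id": 1,
    "name": "Contoso",
    "tags": ["retail", "emea"],
    "address": {"city": "Seattle", "zip": "98101"},
}

projection = {column: describe_type(value) for column, value in record.items()}
for column, type_name in projection.items():
    print(f"{column}: {type_name}")
```

The tags and address columns are the kind of nested columns that a flat, Copy Activity style schema cannot describe, which is why the projection is defined in the data flow instead.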

Step 4: You'll see your data, including the complex columns, under Data Preview.

Version history
Last update: Oct 12 2020 12:58 PM
Updated by: