My customer has an on-premises Hadoop environment with a data volume of approximately 80 petabytes. They have started evaluating Azure to store the data of interest, which will be extracted from the on-premises Hadoop environment and loaded into Azure Storage.
This data will then be further processed and analyzed using Azure Synapse Analytics.
We started by sizing this data of interest, which comes to close to 5 petabytes. This volume needs to be loaded into Azure Data Lake Storage Gen2 for further processing.
With the latest resume copy feature, customers loading very large volumes of data gain better control over the migration process and save the effort and time of partitioning the data or building custom solutions purely for resilience.
Below is how to resume the copy activity from the last failure point at the file level:
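As a hedged illustration, here is a minimal sketch of triggering such a rerun programmatically with the azure-mgmt-datafactory Python SDK. The resource names, pipeline name, and run ID below are placeholders, and the rerun parameters mirror the Pipelines - Create Run REST API (they may vary slightly by SDK version). In practice the same rerun is typically done from the ADF Monitor view by choosing to rerun from the failed activity.

```python
# Minimal sketch (assumptions: placeholder resource names/run ID, and a pipeline
# whose copy activity does a binary copy between file-based stores, which is
# what file-level resume applies to).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

subscription_id = "<subscription-id>"        # placeholder
resource_group = "<resource-group>"          # placeholder
factory_name = "<data-factory-name>"         # placeholder
pipeline_name = "CopyHdfsToAdlsGen2"         # hypothetical pipeline name
failed_run_id = "<failed-pipeline-run-id>"   # taken from the Monitor view

client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Rerun the pipeline in recovery mode, starting from the failed activity.
# The copy activity then resumes from the last failure point at file level
# instead of recopying files that already completed successfully.
run_response = client.pipelines.create_run(
    resource_group,
    factory_name,
    pipeline_name,
    reference_pipeline_run_id=failed_run_id,
    is_recovery=True,
    start_from_failure=True,
)
print(f"New recovery run ID: {run_response.run_id}")
```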
More guidance on best practices for using ADF for data lake migration:
Data Factory offers two basic approaches for migrating data from on-premises HDFS to Azure. You can select the approach based on your scenario.
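As a rough sketch of what the two approaches look like in a copy activity (endpoints and paths below are placeholders; check the ADF HDFS connector reference for the exact property shape for your dataset format and ADF version), the HDFS source can either be read directly by the self-hosted integration runtime or handed off to a DistCp job running on the Hadoop cluster:

```python
# Hedged sketch: the HDFS source portion of a copy activity, written as Python
# dicts that mirror the activity JSON you would author in ADF.

# Approach 1: native integration runtime copy - the self-hosted IR machines
# read from HDFS and write to Azure Storage themselves.
native_source = {
    "type": "HdfsSource",
    "recursive": True,
}

# Approach 2: DistCp mode - ADF submits a DistCp job to the Hadoop cluster,
# so the cluster's own compute pushes the data to Azure.
distcp_source = {
    "type": "HdfsSource",
    "recursive": True,
    "distcpSettings": {
        "resourceManagerEndpoint": "http://<yarn-resource-manager>:8088",  # placeholder
        "tempScriptPath": "/tmp/adf-distcp-scripts",                       # placeholder
        "distcpOptions": "-m 100",                                         # example mapper count
    },
}
```

DistCp mode is generally attractive at this scale because it reuses the cluster's existing capacity for the bulk transfer, while native mode keeps everything inside the self-hosted integration runtime.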
Network security: By default, Data Factory transfers data from on-premises HDFS to Blob storage or Azure Data Lake Storage Gen2 using an encrypted connection over the HTTPS protocol. HTTPS provides data encryption in transit and prevents eavesdropping and man-in-the-middle attacks.
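For instance, the Azure Data Lake Storage Gen2 linked service points at an HTTPS endpoint, so the sink side of the copy is encrypted in transit. Below is a hedged sketch of such a linked-service definition as a Python dict mirroring the JSON; the account name is a placeholder and the authentication method is omitted:

```python
# Hedged sketch of an ADLS Gen2 linked service (type "AzureBlobFS").
adls_gen2_linked_service = {
    "name": "AzureDataLakeStorageGen2LinkedService",
    "properties": {
        "type": "AzureBlobFS",
        "typeProperties": {
            "url": "https://<storage-account>.dfs.core.windows.net",  # HTTPS endpoint
            # Authentication (managed identity, service principal, or account key)
            # is omitted here; choose one to match your security requirements.
        },
    },
}
```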