Seamless data sharing between organizations eliminates data silos, facilitates data-empowered decisions and unlocks tremendous competitive advantages. Traditionally, organizations have shared data with internal teams or external partners by generating data feeds, which requires investment in data copy and refresh pipelines. The result is higher cost for data storage and movement, data proliferation (i.e., multiple copies of data), and delayed access to time-sensitive data. Near real-time access to data is the key to harnessing the true power and scale of big data in enterprise data lakes and to making consistent, reliable data-driven decisions.
We are excited to announce the public preview of Microsoft Purview Data Sharing, which enables in-place data sharing for Azure Data Lake Storage (ADLS Gen2) and Blob Storage. Data providers can now share data in-place from ADLS Gen2 and Blob storage accounts without data duplication, both within and across organizations. They can also centrally manage sharing activities within Microsoft Purview, a unified data governance solution. Data consumers can now have near real-time access to shared data. They can use this shared data for any of their processing and insights needs and gain value at cloud scale faster than ever before. Storage data access and transactions are charged to the data consumers based on what they use, at no additional cost to the data providers.
Microsoft Purview in-place data sharing for ADLS Gen2 and Blob storage can help with a variety of scenarios, including:
Collaborate with external partners
Organizations often need to collaborate and share data with external partners. For example, organizations might need to share sales, forecast, inventory, and freight data across a supply chain to improve operational efficiency and agility. With Microsoft Purview in-place data sharing for ADLS Gen2 and Blob storage, organizations can share data at scale and in near real time with multiple external partners, without any data duplication across storage accounts, and centrally manage those sharing relationships from a single dashboard. By using time-sensitive data, organizations and external partners can make decisions quickly to optimize inventory stocks and operations, which lowers costs and improves customer satisfaction.
Leverage third-party ISVs or data aggregators for data processing and analytics
Many organizations leverage third-party ISVs or data aggregators to help them perform data normalization, transformation, and analysis. It’s critical to be able to share data at scale and receive results easily. With ADLS Gen2 and Blob storage in-place sharing, organizations can share data from storage accounts at petabyte scale with a few clicks, with no additional infrastructure to provision or maintain. Data aggregators and ISVs can combine shared data from data provider storage accounts with their own data and perform analytics using their proprietary algorithms. Once complete, they can provide results back to the data provider using the same in-place data sharing capability. Similarly, organizations using third-party SaaS applications can receive their own data, processed and stored by the SaaS providers, using in-place data sharing.
Automate data sharing
As data volume continues to grow, data providers looking to share or monetize their data assets need an efficient way to provide secure access to large amounts of data to multiple data consumers. For example, scientific data, seismic data, satellite/surveillance videos and images, retail data or financial data stored in an ADLS Gen2 or Blob storage account can reach millions of files or petabytes in scale. It is critical to enable data consumers to access portions of this data based on their specific business use cases without making a full copy of the data the provider shares. ADLS Gen2 and Blob storage in-place sharing provides an easy-to-use data sharing mechanism for bulk and near real-time data access. The entire process of secure data sharing can be automated through the SDK or API.
Share data internally within an organization to improve data-driven decisions
Organizations are looking for more ways to empower their employees to make data-driven decisions. With ADLS Gen2 and Blob storage in-place sharing, data providers can easily share data between different business units and departments without data duplication, while centrally monitoring their sharing relationships. Data consumers can use their own tools to analyze shared data without burdening the data providers with their access costs.
How in-place data sharing works
Microsoft Purview enables sharing of files and folders in-place from ADLS Gen2 and Blob storage accounts. A data provider creates a share by specifying the files and folders to be shared and who to share them with (one or more data consumers). An invitation is sent to each data consumer, who accepts the invitation and specifies the target storage account in their own Azure subscription through which to access the shared data. This establishes a sharing relationship between the provider and consumer storage accounts. The sharing relationship gives the data consumer read-only access to shared data through the consumer’s target storage account. Any changes to the data in the provider’s source storage account are reflected in near real time in the consumer’s target storage account. The data provider pays for data storage and their own data access, while the data consumer pays for their own data access transactions. The data provider can revoke access to the share or set a share expiration time for time-bound access to data. The data consumer can also terminate access to the share at any time.
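The sharing lifecycle described above (create a share, invite, accept, then revoke, expire, or terminate) can be illustrated with a small in-memory model. This is only a sketch for reasoning about the states involved. It does not call Microsoft Purview, and the `Share` class, its methods, and the account and email values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class Share:
    """Hypothetical in-memory model of a share; NOT the Purview SDK."""
    provider_account: str
    paths: List[str]                       # files and folders being shared
    expires_at: Optional[datetime] = None  # optional time-bound access
    revoked: bool = False
    # consumer -> target storage account (None until the invitation is accepted)
    consumers: Dict[str, Optional[str]] = field(default_factory=dict)

    def invite(self, consumer: str) -> None:
        """Provider invites a consumer; no access exists yet."""
        self.consumers.setdefault(consumer, None)

    def accept(self, consumer: str, target_account: str) -> None:
        """Consumer accepts and names their own target storage account."""
        if consumer not in self.consumers:
            raise PermissionError(f"{consumer} was not invited")
        self.consumers[consumer] = target_account

    def revoke(self) -> None:
        """Provider revokes access to the share."""
        self.revoked = True

    def terminate(self, consumer: str) -> None:
        """Consumer ends their own access at any time."""
        self.consumers.pop(consumer, None)

    def can_read(self, consumer: str, now: Optional[datetime] = None) -> bool:
        """Read-only access holds only for an accepted, unrevoked, unexpired share."""
        now = now or datetime.now(timezone.utc)
        if self.revoked:
            return False
        if self.expires_at is not None and now >= self.expires_at:
            return False
        return self.consumers.get(consumer) is not None


# Illustrative walk-through with made-up values.
start = datetime(2022, 6, 1, tzinfo=timezone.utc)
share = Share("providerstore", ["sales/2022/"], expires_at=start + timedelta(days=30))
share.invite("partner@example.com")
share.accept("partner@example.com", "consumerstore")
print(share.can_read("partner@example.com", now=start))
```

In this model, access is always derived from the share's current state rather than stored separately, which mirrors the idea that the consumer never holds an independent copy of the data.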
Provider and consumer storage accounts must be in the same Azure region. Data can be shared from ADLS Gen2 to ADLS Gen2, and Blob to Blob storage accounts.
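These two constraints are easy to check before setting up a sharing relationship. The sketch below is a minimal pre-flight check, assuming you already know each account's region and type. The `StorageAccount` class, the `sharing_problems` function, and the `"adls_gen2"` / `"blob"` labels are hypothetical and are not part of any Azure SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StorageAccount:
    """Hypothetical description of a storage account for illustration only."""
    name: str
    region: str  # e.g. "canadacentral" (illustrative value)
    kind: str    # "adls_gen2" or "blob" (labels assumed for this sketch)


def sharing_problems(provider: StorageAccount, consumer: StorageAccount) -> List[str]:
    """Return the reasons the two accounts cannot share in-place (empty if none)."""
    problems = []
    if provider.region.lower() != consumer.region.lower():
        problems.append(
            f"region mismatch: {provider.region} vs {consumer.region}"
        )
    if provider.kind != consumer.kind:
        problems.append(
            f"account type mismatch: {provider.kind} vs {consumer.kind}"
        )
    return problems


# Illustrative usage with made-up accounts.
source = StorageAccount("providerstore", "canadacentral", "adls_gen2")
target = StorageAccount("consumerstore", "uksouth", "blob")
for problem in sharing_problems(source, target):
    print(problem)
```

Returning a list of problems rather than a single boolean lets a caller report every mismatch at once.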
This public preview is initially available for ADLS Gen2 and Blob Storage accounts in the Canada Central, Canada East, UK South, UK West, Australia East, Japan East, Korea South, and South Africa North Azure regions. The following are resources to help you get started. We look forward to your feedback.