Feb 05 2021 06:46 AM - edited Feb 05 2021 06:47 AM
Article published by Clinton W Ford
Data validation, data transformation and de-identification can be complex and time-consuming. As data volumes grow, new downstream use cases and applications emerge, and expectations of timely delivery of high-quality data increase the importance of fast and reliable data transformation, validation, de-duplication and error correction.
To abstract their entire ETL process and achieve consistent data through data quality and master data management services, the City of Spokane leveraged DQLabs and Azure Databricks. They merged a variety of data sources, removed duplicate data and curated the data in Azure Data Lake Storage (ADLS).