Forum Discussion

datanerdcg
Copper Contributor
Apr 08, 2025

Excel column header verification using schema in database

I have a requirement to run data quality checks on Excel files stored in Azure Blob Storage, validating them against a schema stored in a database.

The blob storage account has a container holding multiple Excel files with data. Each file generally follows a fixed structure and a few business rules. For example, if the data is about employees, there will be 10 columns: every row in colA must equal 'abc' (the same value), colB must be a date in a specific format, colC must be a number less than 5, and so on. Different Excel files have different headers, column counts, structures, and business rules.

A table in the database stores the structure and business rules for each template:

| ExcelTemplateId | ExcelTemplateName | ColumnName | MaxLength | DataType | DefaultValue |
|---|---|---|---|---|---|
| 1 | abc | name | 255 | varchar | |
| 1 | abc | empId | 10 | int | |
| 1 | abc | dept | 100 | | xyz |
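To make the header check concrete, here is a minimal Python sketch of comparing a file's header row against the schema rows above. It is only an illustration of the logic, not an ADF feature: the names `ColumnRule`, `expected_headers`, and `check_headers` are hypothetical, the schema rows are hard-coded rather than read from the database, and the `dept` row is read as having an empty DataType and a DefaultValue of `xyz`, which is an assumption about the table.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ColumnRule:
    """One row of the schema table (hypothetical in-memory form)."""
    template_id: int
    template_name: str
    column_name: str
    max_length: Optional[int]
    data_type: Optional[str]
    default_value: Optional[str]


# Hard-coded copy of the sample schema rows; in practice these would be
# queried from the database.
SCHEMA_ROWS = [
    ColumnRule(1, "abc", "name", 255, "varchar", None),
    ColumnRule(1, "abc", "empId", 10, "int", None),
    ColumnRule(1, "abc", "dept", 100, None, "xyz"),  # assumed reading of the row
]


def expected_headers(rules: List[ColumnRule], template_name: str) -> List[str]:
    """Column names the schema expects for one template, in schema order."""
    return [r.column_name for r in rules if r.template_name == template_name]


def check_headers(actual: List[str], expected: List[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) column names.

    Both lists empty means the header row matches the schema
    (column order is not checked in this sketch).
    """
    actual_clean = [h.strip() for h in actual]
    missing = [c for c in expected if c not in actual_clean]
    unexpected = [c for c in actual_clean if c not in expected]
    return missing, unexpected
```

For example, `check_headers(["name", "salary"], expected_headers(SCHEMA_ROWS, "abc"))` would report `empId` and `dept` as missing and `salary` as unexpected.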

I need to create an ADF pipeline that reads the Excel files from the source one by one, compares each file against the schema stored in the database, and copies the good data to location01 and the bad data to location02. Location01 and location02 can be tables in the database.
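The good/bad split described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not a finished implementation or an ADF activity: the rule dictionaries, `validate_row`, and `split_rows` are hypothetical names, only the MaxLength and `int` DataType checks are shown, and treating a blank value as acceptable when a DefaultValue exists (and as an error otherwise) is an assumption the schema table does not confirm.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory form of the schema rows for template "abc".
RULES = [
    {"column": "name", "max_length": 255, "data_type": "varchar", "default": None},
    {"column": "empId", "max_length": 10, "data_type": "int", "default": None},
    {"column": "dept", "max_length": 100, "data_type": None, "default": "xyz"},
]


def validate_row(row: Dict[str, Any], rules: List[Dict[str, Any]]) -> List[str]:
    """Return a list of rule violations for one row (empty list = good row)."""
    errors = []
    for rule in rules:
        name = rule["column"]
        value = row.get(name)
        if value is None or str(value).strip() == "":
            # Assumption: a blank is fine if the schema supplies a default.
            if rule["default"] is None:
                errors.append(f"{name}: missing")
            continue
        text = str(value)
        if rule["max_length"] is not None and len(text) > rule["max_length"]:
            errors.append(f"{name}: longer than {rule['max_length']}")
        if rule["data_type"] == "int":
            try:
                int(text)
            except ValueError:
                errors.append(f"{name}: not an integer")
    return errors


def split_rows(
    rows: List[Dict[str, Any]], rules: List[Dict[str, Any]]
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split rows into (good, bad); bad rows carry their error messages."""
    good, bad = [], []
    for row in rows:
        errors = validate_row(row, rules)
        if errors:
            bad.append({**row, "errors": errors})
        else:
            good.append(row)
    return good, bad
```

Because the rules are data rather than code, the same two functions would work for any template; only the rule list loaded for each file changes, which is the dynamic behaviour being asked for.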

I do not want to create one pipeline for each Excel sheet; it should be a single dynamic pipeline that handles all the Excel files. How can I achieve this?
