I’ve been getting many requests to publish the files and KBs used in the DQS TechEd presentation (available
So I know it’s a bit late, but still – here it is…
For those of you who haven’t viewed the presentation yet, the high-level story is about a major service provider for professional sports teams in the US.
Joe, a data analyst in that company, gets up one morning and looks at the sports sales report from the last quarter, and sees a big mess – sports types contain duplicates, team names are spelled wrong, and even the types of revenue are all jumbled (see pics below).
So he wants to clean the data with DQS (see more details in the recording).
The relevant files for this demo are attached:
Sports Sales Report – the original sales report, unclean
Sports Sales Report Clean – the cleaned data after the DQS cleansing
DQS TechEd Sample – sample Excel file for running knowledge discovery to enrich the KB
DQS TechEd – Excel file for cleansing the various company-specific fields
Matching Sample – sample file for creating and tuning a matching policy
Matching – Excel file for running a matching project
.DQS files – the relevant KBs for the various stages of the project (out-of-the-box Sports KB, Music KB and final Sports KB). You can save them and import them into your DQS client.
To run the full scenario, just do the following:
From the client main menu, choose New Knowledge Base and create the KBs from the supplied .DQS files
Follow the scenario from the TechEd demo
Enjoy and share your enjoyment with others :)
I hope you find the demo data useful. In one of the next posts, we’re going to introduce you to the DQS matching functionality.