In the previous blog article , I described how to use SSIS DQS Domain Value Import . An error may occur during the execution of the project due to problems with the data. There are several available configuration options to deal with the error, related to:
In the SSIS DQS Domain Value Import configuration editor, there is an option for "Specify how to handle the incorrect values":
The following are the options:
There are two main options for recording an error:
When using Fail Component or Ignore Failure , you can select option to Write every error as a warning to the Log . Let's use the Knowledge Base from the last article :
Consider to import the following data:
CREATE TABLE DQSCOLORS5 (
INSERT INTO DQSCOLORS5
('white' ,0 ,NULL)
,('gray' ,NULL ,NULL)
,('orange' ,3 ,NULL)
,('clear' ,NULL ,'tbd')
,('dark blue' ,NULL ,'deep blue')
,('light green' ,NULL ,'green')
Set the option to Write every error as a warning to the log :
Execute the component, when it fails, click on the Progress window:
You should see the warning log entries as follows:
The option for Write every error as a warning to the log is unselected by default to minimize performance overhead. However, this is a useful option during troubleshooting.
Alternatively, you can output the error information and write to a file.
Execute the project and open the resulting error file.
Let's review the error log. The first three errors are self-explanatory. Below is the additional explanation for the failure in the last three:
|Input||Reason for error|
|'clear', NULL, 'tbd'||Both 'clear' and 'tbd' already exists in the domain. Setting synonym when both values already exists in the domain is not supported. You will need to do this through DQS client|
|'dark blue', NULL, 'deep blue'||You can't set synonym to a value that is not a leading value. 'dark blue' is already set to be corrected to 'blue'. Chaining leading value from 'deep blue' to 'dark blue' to 'blue' is not supported. You must set synonym ('deep blue') directly to ultimate leading value ('blue')|
|'light green', NULL, 'green'||This is a tricky one. The reason for the error is because the 'light green' and 'green' are in the wrong order. 'green' should be set as leading value (the first column) while the new synonym should be set in the last column. Reversing the order may partially corrupt the domain as the 'light green' will be imported (see the screenshot below) while the synonym is not set.|
Ok, I admit that this option is not necessary related to error handling, but more on how you manage the integrity of the DQS Knowledge Base. When you updating the Knowledge Base, you effectively creating a new version of the Knowledge Base and it is "checked out" to you during editing. While you are working on updating the Knowledge Base, any DQS projects continue to use the (previously) published version of the Knowledge Base.
When you automate updating DQS Domain Value through SSIS, there is a possibility that the Knowledge Base is being "checked out" for editing by a Knowledge Base owner through DQS Client. SSIS DQS Domain Value Import fails when the target Knowledge Base is not published to prevent conflict. Hence, you should check to make sure the Knowledge Base is published before running SSIS. However, what happen when there is a failure in updating the Knowledge Base during SSIS execution?
The following are the available options:
The default option works in most cases, but the other options provide nice flexibility when designing your process flow.
In this article, we discussed the error handling option for SSIS DQS Domain Value Import. It is a good complement to the DQS Cleansing component and enables DQS to be integrated with other system. Let us know what you think about the component.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.