In this post, we are going to explore a couple of new capabilities brought to the Data Mapper in the form of Regular Expression functions. We can use these new functions to help with data validation and conformity.
For those of you who are new to regular expressions, a regular expression is "a sequence of characters that specifies a match pattern in text. Usually, such patterns are used by string-searching algorithms for find or find and replace operations on strings, or for input validation." - Wikipedia
As for how Regular Expressions are relevant to the Data Mapper, we can leverage them in the following ways:
- Validating data
- Transforming data in the interest of consistency
- Finding and replacing text
Some use cases where we may find the functionality useful:
- Email addresses
- Postal/Zip Codes
- URLs/IP Addresses
- Dates
- Phone Numbers
- Credit Card Validation
- Identity: Social Security/Social Insurance
When we explore the available functions in the Data Mapper, two are of particular interest to us:
- Regular expression matches
- Regular expression replaces
Let's now explore a couple of scenarios:
- Scenario 1: Zip Code Validation
In this scenario, we need to map a Zip code from our source document to our destination document. There are 3 different Zip code formats that we want to support:
- #####
- #####-####
- ##### ####
Should the Zip code be presented in any of these formats, we want to map it across. If it isn't, we don't want to map the value at all.
We can use the Regular expression matches function and pass in a Pattern expression of ^\d{5}(?:[-\s]\d{4})?$ to enforce this business rule. Since the Regular expression matches function returns a Boolean value, we can pass its output to an if function. If true is passed in, we map the ZipCode source value to the ZipCode destination value.
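To see how a pattern for these three formats behaves outside the Data Mapper, here is a minimal Python sketch. It assumes the pattern ^\d{5}(?:[-\s]\d{4})?$ (five digits, optionally followed by a hyphen or whitespace character and four more digits). The function name map_zip_code is hypothetical, and Python's re module is only a stand-in for whatever regex engine the Data Mapper uses, so edge-case behavior may differ.

```python
import re

# Assumed pattern: five digits, optionally followed by a hyphen or a
# whitespace character and four more digits.
ZIP_PATTERN = re.compile(r"^\d{5}(?:[-\s]\d{4})?$")


def map_zip_code(source_zip):
    """Mimic 'matches' feeding an 'if': return the value when it is valid,
    otherwise None to signal that nothing should be mapped."""
    if ZIP_PATTERN.fullmatch(source_zip):
        return source_zip
    return None


for candidate in ["98052", "98052-1234", "98052 1234", "9805", "98052_1234"]:
    print(candidate, "->", map_zip_code(candidate))
```

The first three candidates are returned unchanged, while the last two yield None because they do not fit any of the supported formats.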
- Scenario 2: Enforcing consistency
There may be times when abbreviations or acronyms are used instead of the complete, or proper, term. This practice can wreak havoc in downstream systems as it creates data integrity issues. For example, we have a company called Contoso, but the corporate entity may be referred to in additional ways, with suffixes like LLC, Corp, Inc, etc. If we want to standardize on just one term, we can look for all of these different variations and replace them with a common term.
To achieve this, we can use the Regular expression replaces function. We give it a list of terms to look for and, should any of them match, a replacement string to use instead.
To accomplish our goal, we can provide a Regular expression pattern of LLC|llc|Corp|corp|company|Company and then provide a Replacement string of Corporation.
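The effect of this pattern and replacement string can be sketched in Python. This is an illustration under assumptions, not the Data Mapper's implementation: the function names are hypothetical, and the second variant with word boundaries (\b) is an addition that is not part of the original pattern.

```python
import re

# The pattern from the scenario, used as-is.
TERM_PATTERN = re.compile(r"LLC|llc|Corp|corp|company|Company")

# Hypothetical variant: word boundaries stop the pattern from matching
# inside longer words such as "Corporation".
BOUNDED_PATTERN = re.compile(r"\b(?:LLC|llc|Corp|corp|company|Company)\b")


def standardize_company_name(name):
    """Replace every match of the scenario's pattern with 'Corporation'."""
    return TERM_PATTERN.sub("Corporation", name)


def standardize_with_boundaries(name):
    """Same replacement, but only for whole-word matches."""
    return BOUNDED_PATTERN.sub("Corporation", name)


print(standardize_company_name("Contoso LLC"))
print(standardize_company_name("Contoso Corporation"))
print(standardize_with_boundaries("Contoso Corporation"))
```

With the pattern as written, "Contoso LLC" becomes "Contoso Corporation", but a value that already reads "Contoso Corporation" turns into "Contoso Corporationoration" because "Corp" matches inside the longer word. If the engine supports \b, the bounded variant leaves already-standardized values untouched.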
Hopefully this post has provided you with some ideas on how you can use these Regular expression functions in your own maps. If you would like to see a recorded demo of these scenarios, please check out the following video: