Forum Discussion
Fabio Augusto de Almeida Carlos
Oct 04, 2018 (Copper Contributor)
How to use regex capturing-group in custom sensitive information in DLP Office365?
I'm trying to create a custom sensitive information type in Office365 (Security & Compliance Center) to match possible passwords (at least 8 digits, a letter, a number and a special character). R...
vindmil
Jul 31, 2019 (Copper Contributor)
Looks like you ran into a typical issue. If I remember correctly, O365 uses the Boost.Regex engine, which regex101 doesn't offer (it has PCRE, ECMAScript, Python and GoLang), so validating there only catches issues that are common to both engines.
At first sight: 8-to-infinite matches, positive lookaheads, and "anything between 0 and infinite occurrences" (aka "I have no idea what's there, but would like to match it"), which are not supported by Office's regex engine. Apart from that, they are also very dangerous and it's better to avoid them (even though they are very convenient in many circumstances).
Here is some more information from MS:
https://docs.microsoft.com/en-us/office365/securitycompliance/create-a-custom-sensitive-information-type-in-scc-powershell#potential-validation-issues-to-be-aware-of
Btw, your regex for passwords won't work the way you might think. Counter-examples for your regex (they will be matched, but cannot be passwords if a strong password configuration is set):
000aaa0000
10.10.10.10
mydomain.com
hellooo!
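To see why those four strings can't be strong passwords, here is a rough Python sketch that checks each requirement separately (length, letter, digit, special character) without any lookaheads. It is not a DLP rule and does not run in Office 365. The function name `looks_like_strong_password` is made up for illustration, and treating `string.punctuation` as the set of "special characters" is an assumption on my part.

```python
import string


def looks_like_strong_password(candidate: str, min_length: int = 8) -> bool:
    """Return True only if the candidate is long enough and contains
    at least one letter, one digit and one special character."""
    if len(candidate) < min_length:
        return False
    has_letter = any(ch.isalpha() for ch in candidate)
    has_digit = any(ch.isdigit() for ch in candidate)
    # Assumption: "special character" means ASCII punctuation.
    has_special = any(ch in string.punctuation for ch in candidate)
    return has_letter and has_digit and has_special


# The counter-examples from this reply: each one is missing a character class.
counter_examples = ["000aaa0000", "10.10.10.10", "mydomain.com", "hellooo!"]
results = {text: looks_like_strong_password(text) for text in counter_examples}

for text, ok in results.items():
    print(f"{text!r}: {ok}")
```

Each counter-example fails at least one check: `000aaa0000` has no special character, `10.10.10.10` has no letter, and `mydomain.com` and `hellooo!` have no digit. Any pattern that matches them is looser than the stated password rule.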