Forum Discussion
A way to filter gibberish Emails and Usernames
Hi NikolinoDE
Liked your post as I learned something with your SEARCH("^[A-Za-z]{3,}[0-9]{4,}$", A2)
Quick questions if you don't mind...
#1 SEARCH being case-insensitive wouldn't the following work as well?
SEARCH("^[a-z]{3,}[0-9]{4,}$", A2)
or
SEARCH("^[A-Z]{3,}[0-9]{4,}$", A2)
#2 What's the meaning of the $ sign at the end? Can you provide a pointer to doc. or briefly explain?
Thanks Much
- Case Insensitivity in SEARCH:
Yes, you're correct. The SEARCH function in Excel is case-insensitive by default. Therefore, both of the following patterns would work:
- SEARCH("^[a-z]{3,}[0-9]{4,}$", A2): This pattern checks if the string starts with at least 3 lowercase letters and ends with at least 4 numbers.
- SEARCH("^[A-Z]{3,}[0-9]{4,}$", A2): This pattern checks if the string starts with at least 3 uppercase letters and ends with at least 4 numbers.
You can use either of these patterns based on your specific needs and the characteristics of the data you're working with.
- Meaning of the $ sign:
In regular expressions (regex), the $ sign represents the end of a line or string. In the context of the pattern ^[A-Za-z]{3,}[0-9]{4,}$:
- ^[A-Za-z]{3,}: This part ensures that the string starts (^) with at least 3 uppercase or lowercase letters.
- [0-9]{4,}$: This part ensures that the string ends ($) with at least 4 numeric digits.
So, the $ ensures that the pattern is matched only when the specified conditions apply from the beginning to the end of the string.
- LorenzoNov 15, 2023Silver Contributor
(THANKS for all your explanations - Liked)
I feel embarrassed to ask but after reading re. RegEx and trying even very simplified versions of your formula I keep getting #VALUE! errors:
Wonder where my bad is or if I missed something???
In case this would matter I run 365 Personal/Windows v2310 Build 16924.20124 (so not Insider)
and only Add-in currently installed is DAX Studio- djclementsNov 15, 2023Silver Contributor
Lorenzo To the best of my knowledge (and after a quick Google search), regular expressions or RegEx is not supported by any of the built-in functions in Excel. However, there are plenty of VBA and UDF solutions available online. Regarding Add-Ins, though, when I searched for RegEx in the Add-Ins Store, nothing promising came up.
When using "^[a-z]" in the SEARCH function, you are searching for the literal string "^[a-z]". For example, if the string in cell A2 of your screenshot were "abc^[a-z]def", the SEARCH function would return 4.
Unfortunately, the SEARCH function only works with basic wildcards (*?~) and does not appear to work with RegEx notation.
- NikolinoDENov 15, 2023Gold Contributor
You are correct, and I appreciate your clarification. I apologize for any confusion caused by my previous responses
. The SEARCH function in Excel does not support full regular expressions; it primarily performs simple substring searches.
To achieve more complex pattern matching, especially using regular expressions, a more suitable approach would be to utilize VBA within Excel or a third-party tool that supports regex.
Here's an example of a VBA code snippet that you can use in Excel:
Vba code is untested, please backup your file first.
Function IsPotentialGibberish(email As String, username As String) As String Dim regex As Object Set regex = CreateObject("VBScript.RegExp") ' Email pattern: at least 2 letters, followed by at least 1 number, then @, some letters, dot, more letters regex.Pattern = "^[A-Za-z]{2,}[0-9]{1,}@[a-zA-Z]+\.[a-zA-Z]+$" If regex.Test(email) And Not (regex.Test(username) And regex.Test(username)) Then IsPotentialGibberish = "Potential Gibberish" Else IsPotentialGibberish = "Normal" End If End Function
You can use this function in your Excel worksheet. Assuming your email is in cell A2 and the username is in cell B2, you would enter the following formula in another cell:
=IsPotentialGibberish(A2, B2)
This function uses the VBScript.RegExp object to perform regular expression-based pattern matching. It checks both the email and username against the specified pattern and returns "Potential Gibberish" if the conditions are met.
Remember to enable macros and use the VBA editor to insert this code into a module in your Excel workbook.
Again, I appreciate your understanding, and I hope this solution better fits your needs.