Forum Discussion
Extract text between two characters
- Aug 23, 2022
Here is a small custom VBA function:
Function ExtractURL(ByVal s As String) As String Dim p1 As Long Dim p2 As Long Dim p3 As Long s = Replace(s, Chr(160), " ") p1 = InStr(1, s, ".") p2 = InStrRev(s, " ", p1 - 1) p3 = InStr(p1 + 1, s, " ") If p3 > 0 Then s = Mid(s, p2 + 1, p3 - p2 - 1) Else s = Mid(s, p2 + 1) End If If Left(s, 4) <> "www." Then s = "www." & s End If ExtractURL = s End Function
Use like this: in a cell formula:
=ExtractURL(A2)
=LET(
a,FILTERXML("<y><z>"&SUBSTITUTE(A2," ","</z><z>")&"</z></y>","//z"),
b,SEARCH(".",a),
LOOKUP(9^99,b,a))
Detlef_Lewin good evening,
I am using your code on the daily basis and it works perfectly.
But when I try to understand its logic it becomes complicated, can you please explain it to me?
I don't understand why you divide the text by:
FILTERXML("<y><z>"&SUBSTITUTE(A2," ","</z><z>")&"</z></y>","//z")
And what 9^99 stands for?
LOOKUP(9^99,b,a))
Thank you in advance
- mtarlerAug 31, 2022Silver Contributor
Molekula I can try to explain his code and my variation on it:
=LET( a,FILTERXML("<y><z>"&SUBSTITUTE(A2," ","</z><z>")&"</z></y>","//z"), b,SEARCH(".",a), LOOKUP(9^99,b,a))
FILTERXML will perform a filter on XML text. XML text is a markup language using <> and </> to tag sections of the text. So in this case the function will "filter" out all the sections tagged with <z>...</z> based on that last argument "//z". So the trick here is to convert the text into XML text where what you want is between the <z> and </z> tags. So step 1 is to 'initialize' the XML text with the <y><z> tags and then inside that function is a SUBSTITUTE(A2, " ", "</z><z>") which replaces every instance of a space (" ") with the end tag </z> and then starts a new z container with <z>. Finally it ends the text with </z> </y> to properly close the containers. So FILTERXML will now create an array output from the original text broken up based on the location of the spaces.
The next line (line 3) then searches for the presence of a "." inside each array element and returns a number if it is there (corresponding to its location) or an error if not
The LOOKUP in line 4 finds the numeric value from the array b (i.e. the element that found a ".") and returns the corresponding element that came out from the FILTERXML function.
In my variation:
=LET( a,FILTERXML("<y><z>"&SUBSTITUTE(A27," ","</z><z>")&"</z></y>","//z"), CONCAT(IF(IFERROR(LEN(WEBSERVICE(a)),FALSE),a,"")))
after creating that array of parts I then check if WEBSERVICE() of that item returns any data and if it finds anything it returns the text from that part back again or returns blank if not. All the elements are CONCAT together so all the blanks + the web address + blanks