Formula for extracting nth field from long delimited text string

%3CLINGO-SUB%20id%3D%22lingo-sub-1961285%22%20slang%3D%22en-US%22%3EFormula%20for%20extracting%20nth%20field%20from%20long%20delimited%20text%20string%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-1961285%22%20slang%3D%22en-US%22%3E%3CP%3EI'm%20currently%20using%20the%20following%20formula%20to%20extract%20the%20nth%20field%20(n%3D7)%20from%20a%20text%20string%20located%20in%20cell%20A1%20and%20delimited%20by%20%C2%BB%20Char(187)%3A%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3E%3DMID(A1%2CFIND(%22%C2%AE%22%2CSUBSTITUTE(A1%2CCHAR(187)%2C%22%C2%AE%22%2C7))%2B1%2CFIND(%22%C2%AE%22%2CSUBSTITUTE(A1%2CCHAR(187)%2C%22%C2%AE%22%2C7%2B1))-(FIND(%22%C2%AE%22%2CSUBSTITUTE(A1%2CCHAR(187)%2C%22%C2%AE%22%2C7))%2B1))%3C%2FP%3E%3CP%3EIt%20works%2C%20but%20is%20there%20a%20simpler%20formula%20that%20is%20more%20inefficient%20as%20I%20am%20using%20the%20formula%20many%20times%3F%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EA1%20sample%3A%20%C2%BB44165%C2%BBNo%C2%BB44166%C2%BB%C2%BBDesign%20%26amp%3B%20implement%20a%20new%20XXX%20system%C2%BB%C2%BBInsert%20Name%C2%BB%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EExcel%202016%20%26amp%3B%20Windows%2C%20but%20would%20prefer%20open%20platform%20solution%20(eg%20Excel%20for%20web)%3B%20no%20vba%20please%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EI%20have%20looked%20around%20and%20also%20tried%3A%3CBR%20%2F%3ETRIM(MID(SUBSTITUTE(A1%2C%22%20%22%2CREPT(%22%20%22%2CLEN(A1)))%2C%20(N-1)*LEN(A1)%2B1%2C%20LEN(A1))).%3C%2FP%3E%3CP%3EThis%20formula%20is%20simpler%2C%20but%20doesn't%20seem%20to%20work%20on%20large%20strings%20%26gt%3B%20256%20chars%3C%2FP%3E%3CP%3E%26nbsp%3B%3C%2FP%3E%3CP%3EFILTERXML(%22%3CT%3E%3CS%3E%22%20%26amp%3B%20SUBSTITUTE(SUBSTITUTE(E51%2C%22%26amp%3B%22%2C%22%26amp%3Bamp%3B%22)%2CCHAR(187)%2C%22%3C%2FS%3E%3CS%3E%22)%20%26amp%3B%20%22%3C%2FS%3E%3C%2FT%3E%22%2C%22%2F%2Fs%5B%22%26amp%3BE50%26amp%3B%22%5D%22)%3C%2FP%3E%3CP%3EThis%20formula%20has%20reserved%20chars%20(eg%20%26amp%3B)%20and%20does%20not%20work%20on%20web%20excel%3C%2FP%3E%3C%2FLINGO-BODY%3E%3CLINGO-LABS%20id%3D%22lingo-labs-1961285%22%20slang%3D%22en-US%22%3E%3CLINGO-LABEL%3EExcel%3C%2FLINGO-LABEL%3E%3CLINGO-LABEL%3EExcel%20for%20web%3C%2FLINGO-LABEL%3E%3CLINGO-LABEL%3EFormulas%20and%20Functions%3C%2FLINGO-LABEL%3E%3C%2FLINGO-LABS%3E%3CLINGO-SUB%20id%3D%22lingo-sub-1961402%22%20slang%3D%22en-US%22%3ERe%3A%20Formula%20for%20extracting%20nth%20field%20from%20long%20delimited%20text%20string%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-1961402%22%20slang%3D%22en-US%22%3E%3CP%3E%3CA%20href%3D%22https%3A%2F%2Ftechcommunity.microsoft.com%2Ft5%2Fuser%2Fviewprofilepage%2Fuser-id%2F892922%22%20target%3D%22_blank%22%3E%40Katoomba%3C%2FA%3E%26nbsp%3BNot%20sure%20if%20this%20will%20meet%20you%20demands%2C%20but%20you%20indicated%20that%20you%20would%20like%20a%20solution%20that%20works%20in%20Excel%20for%20the%20Web.%20Well%2C%20then%20perhaps%20the%20LET%20function%20will%20come%20in%20handy.%3C%2FP%3E%3CPRE%20class%3D%22lia-code-sample%20language-excel%22%3E%3CCODE%3E%3DIFERROR(%0ALET(%0AmyText%2C%20A2%2C%0ANth%2CB2%2C%0AMarker%2C%20%22%C2%BB%22%2C%0ARepl1%2C%22%7C%22%2C%0ARepl2%2C%22%7C%7C%22%2C%0AMarkedText%2C%20SUBSTITUTE(SUBSTITUTE(myText%2CMarker%2CRepl1%2CNth)%2CMarker%2CRepl2%2CNth)%2C%0AFrom%2C%20FIND(Repl1%2CMarkedText)%2B1%2C%0ATo%2C%20FIND(Repl2%2CMarkedText)%2C%0ANrOfchar%2C%20To-From%2C%0AExtractedText%2C%20MID(MarkedText%2CFrom%2CNrOfchar)%2C%0AExtractedText%0A)%2C%3C%2FCODE%3E%3C%2FPRE%3E%3CP%3EI%20had%20some%20difficulties%20with%20the%20CHAR(187)%20part%20of%20your%20formula%20so%20I%20just%20entered%20the%20%22%C2%BB%22%20as%20the%20marker%20to%20search%20for.%20Also%20used%20pipe-symbols%20to%20mark%20the%20beginning%20and%20the%20end%20of%20the%20Nth%20field%20with.%3C%2FP%3E%3CP%3EThe%20formula%20is%20certainly%20not%20shorter%20than%20yours%2C%20but%20it%20more%20efficient%20and%20easier%20to%20read%20and%20maintain.%20A%20working%20example%20is%20located%20in%20the%20attached%20workbook.%20I%20hope%20you'll%20find%20it%20helpful.%3C%2FP%3E%3C%2FLINGO-BODY%3E%3CLINGO-SUB%20id%3D%22lingo-sub-1961512%22%20slang%3D%22en-US%22%3ERe%3A%20Formula%20for%20extracting%20nth%20field%20from%20long%20delimited%20text%20string%3C%2FLINGO-SUB%3E%3CLINGO-BODY%20id%3D%22lingo-body-1961512%22%20slang%3D%22en-US%22%3EPerhaps%20you%20could%20try%3A%3CBR%20%2F%3E%3CBR%20%2F%3E%3DTRIM(MID(SUBSTITUTE(A1%2CCHAR(187)%2CREPT(%22%20%22%2CLEN(A1))%2C8)%2CFIND(%22%C2%AE%22%2CSUBSTITUTE(A1%2CCHAR(187)%2C%22%C2%AE%22%2C7))%2B1%2CLEN(A1)))%3CBR%20%2F%3E%3CBR%20%2F%3EBut%2C%20the%20first%20part%20of%20the%20formula%20will%20double%20the%20length%20of%20the%20string%2C%20so%20if%20you%20have%20very%20long%20strings%20(like%2016k%20characters)%20it%20could%20exceed%20excel's%20maximum%20length%20(and%20may%20be%20why%20your%20other%20attempt%20isn't%20working%20as%20it's%20replacing%20each%20space%20with%20a%20string%20of%20spaces%20equal%20to%20the%20length%20of%20the%20string).%3C%2FLINGO-BODY%3E
Occasional Visitor

I'm currently using the following formula to extract the nth field (n=7) from a text string located in cell A1 and delimited by » Char(187):

 

=MID(A1,FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7))+1,FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7+1))-(FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7))+1))

It works, but is there a simpler formula that is more inefficient as I am using the formula many times?

 

A1 sample: »44165»No»44166»»Design & implement a new XXX system»»Insert Name»

 

Excel 2016 & Windows, but would prefer open platform solution (eg Excel for web); no vba please

 

I have looked around and also tried:
TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), (N-1)*LEN(A1)+1, LEN(A1))).

This formula is simpler, but doesn't seem to work on large strings > 256 chars

 

FILTERXML("<t><s>" & SUBSTITUTE(SUBSTITUTE(E51,"&","&amp;"),CHAR(187),"</s><s>") & "</s></t>","//s["&E50&"]")

This formula has reserved chars (eg &) and does not work on web excel

2 Replies

@Katoomba Not sure if this will meet you demands, but you indicated that you would like a solution that works in Excel for the Web. Well, then perhaps the LET function will come in handy.

 

=IFERROR(
LET(
myText, A2,
Nth,B2,
Marker, "»",
Repl1,"|",
Repl2,"||",
MarkedText, SUBSTITUTE(SUBSTITUTE(myText,Marker,Repl1,Nth),Marker,Repl2,Nth),
From, FIND(Repl1,MarkedText)+1,
To, FIND(Repl2,MarkedText),
NrOfchar, To-From,
ExtractedText, MID(MarkedText,From,NrOfchar),
ExtractedText
),

 

I had some difficulties with the CHAR(187) part of your formula so I just entered the "»" as the marker to search for. Also used pipe-symbols to mark the beginning and the end of the Nth field with.

The formula is certainly not shorter than yours, but it more efficient and easier to read and maintain. A working example is located in the attached workbook.

 

Now, if you are not bound to the Excel for the Web, consider using Text-To-Columns on the data ribbon, and then select the Nth column from the output. Works in every Excel version I'm aware of except for the on-line version. You'll need an add-in to achieve it there, I believe, but I may be mistaken.

I hope you'll find it helpful.

 

 

Perhaps you could try:

=TRIM(MID(SUBSTITUTE(A1,CHAR(187),REPT(" ",LEN(A1)),8),FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7))+1,LEN(A1)))

But, the first part of the formula will double the length of the string, so if you have very long strings (like 16k characters) it could exceed excel's maximum length (and may be why your other attempt isn't working as it's replacing each space with a string of spaces equal to the length of the string).