Formula for extracting nth field from long delimited text string

Copper Contributor

I'm currently using the following formula to extract the nth field (n=7) from a text string located in cell A1 and delimited by » Char(187):

 

=MID(A1,FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7))+1,FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7+1))-(FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7))+1))

It works, but is there a simpler formula that is more inefficient as I am using the formula many times?

 

A1 sample: »44165»No»44166»»Design & implement a new XXX system»»Insert Name»

 

Excel 2016 & Windows, but would prefer open platform solution (eg Excel for web); no vba please

 

I have looked around and also tried:
TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), (N-1)*LEN(A1)+1, LEN(A1))).

This formula is simpler, but doesn't seem to work on large strings > 256 chars

 

FILTERXML("<t><s>" & SUBSTITUTE(SUBSTITUTE(E51,"&","&amp;"),CHAR(187),"</s><s>") & "</s></t>","//s["&E50&"]")

This formula has reserved chars (eg &) and does not work on web excel

3 Replies

@Katoomba Not sure if this will meet you demands, but you indicated that you would like a solution that works in Excel for the Web. Well, then perhaps the LET function will come in handy.

 

=IFERROR(
LET(
myText, A2,
Nth,B2,
Marker, "»",
Repl1,"|",
Repl2,"||",
MarkedText, SUBSTITUTE(SUBSTITUTE(myText,Marker,Repl1,Nth),Marker,Repl2,Nth),
From, FIND(Repl1,MarkedText)+1,
To, FIND(Repl2,MarkedText),
NrOfchar, To-From,
ExtractedText, MID(MarkedText,From,NrOfchar),
ExtractedText
),

 

I had some difficulties with the CHAR(187) part of your formula so I just entered the "»" as the marker to search for. Also used pipe-symbols to mark the beginning and the end of the Nth field with.

The formula is certainly not shorter than yours, but it more efficient and easier to read and maintain. A working example is located in the attached workbook.

 

Now, if you are not bound to the Excel for the Web, consider using Text-To-Columns on the data ribbon, and then select the Nth column from the output. Works in every Excel version I'm aware of except for the on-line version. You'll need an add-in to achieve it there, I believe, but I may be mistaken.

I hope you'll find it helpful.

 

 

Perhaps you could try:

=TRIM(MID(SUBSTITUTE(A1,CHAR(187),REPT(" ",LEN(A1)),8),FIND("®",SUBSTITUTE(A1,CHAR(187),"®",7))+1,LEN(A1)))

But, the first part of the formula will double the length of the string, so if you have very long strings (like 16k characters) it could exceed excel's maximum length (and may be why your other attempt isn't working as it's replacing each space with a string of spaces equal to the length of the string).

@Riny_van_Eekelen Thank you for your response and my apologies for not replying earlier as I only picked up the discussion now.

Thank you for your suggestion, but your formula is probably a bit long and complicated for my purposes  

I am currently using a large delimited string (eg >15K) to pass data between > 20 networked workbooks and a central master workbook (ie a single string greatly simplifies links between workbooks). I am then using the MID formula many thousands of times to automatically extract data at the other end. MID works surprisingly well (with some performance overhead), but I am hoping that there is a simpler standard formula that ideally would only require parameters to be included once or uses fewer SUBSTITUTES.  

I will ultimately need to write up and handover the workbooks to someone else and would like all formulae to be simple enough for someone with middling Excel skills to understand - ie no VBA or array formulae

Excel provides the Textjoin and Concatenate formulae to create delimited strings but unfortunately offers no simple formula to extract data from a delimited string 

Excel on the Web compatibility is not required (at this point) but is a nice to have